Who's This For

You run STEM programs that matter. Summer research experiences, cross-institutional partnerships, broader impacts initiatives that connect universities with communities. You know these programs create real change in real people's lives. But when reporting time comes, you are stuck translating rich, messy, human experiences into numbers on a spreadsheet. You have felt the gap between what your program actually does and what your evaluation captures. You have watched important moments disappear because no survey question was designed to find them. This guide is for you -- the faculty member, the research administrator, the partnership broker who knows that evaluation should serve the work, not just document it.

The Partnership Moment

You are running a summer research experience for undergraduates. Ten weeks of lab work, mentoring, professional development seminars. The students are engaged. Something is clearly happening -- you can feel it in the hallway conversations, in the way a first-generation student starts referring to herself as a scientist, in the connections forming at the poster session between people from institutions that have never talked before.

Then the dean's office sends out a Qualtrics survey. Rate your satisfaction on a scale of 1-5. Would you recommend this program? The response rate comes back at 40 percent. The data says students were satisfied. The report goes to NSF.

Everyone moves on.

But you know what got lost. That moment in week three when a student's mentor made an offhand comment about her data analysis that shifted how she saw herself. The networking that happened spontaneously at the poster session, where a community college instructor discovered that her institution's location near an agricultural processing corridor made her students uniquely positioned for environmental monitoring research. The gradual shift in how students talked about their futures -- from "if I go to grad school" to "when I apply."

None of that made it into the report. And the program that looks "satisfactory" on paper is actually transformative in practice.

Under the Surface

This is not just a data collection problem. It is a design problem.

Bolt-on evaluation -- the survey sent after the event, the rubric scored by external judges, the satisfaction rating requested weeks later -- measures what is easy to measure, not what matters. It depends on participants being willing and able to complete instruments after the fact. It lives in a separate timeline from the program itself. And when logistics shift, which they always do, the evaluation breaks first.

More fundamentally, bolt-on evaluation assumes that the program and the evaluation are two different things. The program does its work. Then evaluation comes along afterward to check whether the work happened. This separation is so deeply embedded in how we think about funded projects that most PIs do not even question it.

But think about what this separation costs you. Every hour spent on a post-program survey is an hour not spent on a program activity. Every evaluation instrument that interrupts the participant experience is a small withdrawal from the engagement account you have been building. And the data you collect through these bolted-on mechanisms captures retrospective self-report -- what participants remember thinking, filtered through time and social desirability -- rather than what actually happened in the moment.

There is a different way to think about this. And it starts with a deceptively simple principle: the design of bringing people together IS the evaluation.

Consider what embedded evaluation looks like in the same summer program. Every activity serves double duty. The poster session is not just a presentation opportunity -- it is designed with a structured response card system where attendees write specific observations. Those cards become evaluation data about what participants learned and what connections formed. The weekly mentor meetings are not just check-ins -- they use a structured conversation protocol, and each conversation generates notes that document skill development and identity formation over time. The final showcase is not just a celebration -- it is designed with a storytelling prompt that asks students to share a moment that mattered, generating rich qualitative data about what actually produced impact.

And when a scheduling conflict eliminates the pre-program survey window, embedded evaluation continues uninterrupted -- because the evaluation activities are the program activities. Nothing breaks.
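If you capture those response cards digitally, even a minimal data structure makes the double duty concrete. Here is one possible sketch in Python -- the schema and field names are illustrative assumptions, not a prescribed format:

```python
from dataclasses import dataclass
from collections import Counter

@dataclass
class ResponseCard:
    poster_id: str         # which poster the attendee responded to
    attendee_role: str     # e.g., "student", "faculty", "community partner"
    observation: str       # the structured "one specific thing I noticed" prompt
    new_connection: bool   # did this card record a new cross-institution contact?

def summarize(cards: list[ResponseCard]) -> dict:
    """Turn a stack of cards into simple evaluation evidence."""
    return {
        "cards_collected": len(cards),
        "new_connections": sum(c.new_connection for c in cards),
        "responses_by_role": dict(Counter(c.attendee_role for c in cards)),
    }
```

The point is not the tooling. The point is that the poster session itself produced the records -- no follow-up survey required.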

The Partnership Shift

What if evaluation were not something you added to your program, but something your program already did? What if every activity you designed served double duty -- building participant skills AND generating evidence of impact? What if the moments you struggle to capture were designed into the structure of the experience itself?

This is the shift from bolt-on to embedded evaluation. And it changes everything about how you design programs, write proposals, and report results. You are not stealing time from the program to evaluate it. The evaluation IS the program. That principle -- deceptively simple, radically different in practice -- is the foundation of everything in this guide.

The Partnership Pattern: Translation Architecture

The translation architecture has three mechanisms, and understanding how they work together is what separates a generic broader impacts plan from a competitive one.

Mechanism 1: Research Translation

Research translation takes findings from the scholarly literature and embeds them into your program activities. You are already doing this when you design curricula based on evidence-based practices. The key move is making the research base visible in your evaluation design.

Consider mentoring. The research on effective mentoring tells us that structured approaches with clear expectations outperform informal "my door is always open" models. So instead of pairing students with mentors and hoping for the best, you design specific conversation protocols based on mentoring research: first meetings focused on expectation-setting, mid-program check-ins focused on skill identification, and end-of-program conversations focused on next steps. Each of those conversations generates evaluation data. You can see whether the structured approach produces the outcomes the research predicted, and where your specific context diverges from the literature.

The evaluation is not bolted on. It is built into the mentoring structure itself.
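To make that concrete, here is a minimal sketch of what a structured conversation log might look like, assuming the three protocol stages described above. The stage names and fields are illustrative, not a fixed template:

```python
from dataclasses import dataclass
from datetime import date

# The three protocol stages named in the text.
STAGES = ("expectation_setting", "skill_identification", "next_steps")

@dataclass
class MentoringNote:
    student_id: str
    stage: str                  # one of STAGES
    meeting_date: date
    skills_observed: list[str]  # e.g., ["data cleaning", "literature review"]
    notes: str = ""             # free-text observations from the mentor

def protocol_coverage(notes: list[MentoringNote], student_id: str) -> dict[str, bool]:
    """Which protocol stages has this student completed? Gaps in coverage
    are themselves evaluation data about how faithfully the protocol ran."""
    done = {n.stage for n in notes if n.student_id == student_id}
    return {stage: stage in done for stage in STAGES}
```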

Mechanism 2: Informal Translation

Informal translation takes facilitation practices from informal education, science communication, and community engagement and embeds them into your program structure. Think about how a good science museum designs an exhibit: there is a hook, an interactive element, a moment of surprise, and a takeaway.

You can use those same principles in professional development workshops. Instead of a lecture on "how to write a broader impacts statement," design a gallery walk where participants move through stations, each featuring a real broader impacts example -- some strong, some weak. Participants place sticky notes with observations, ask questions, rank elements. The activity is engaging and educational for participants. But it is also generating rich qualitative data about what faculty actually understand and misunderstand about broader impacts. The facilitation design creates the evaluation opportunity.
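If you wanted to tally those sticky notes systematically, a rough first coding pass might look like the sketch below. The themes and keywords are hypothetical placeholders, not a validated codebook:

```python
from collections import Counter

# Hypothetical coding scheme: theme -> keywords that signal it.
THEMES = {
    "equates_bi_with_outreach": ["outreach", "k-12", "open house"],
    "names_a_specific_audience": ["audience", "community", "partner"],
    "mentions_evidence_or_measurement": ["measure", "evidence", "assess"],
}

def code_notes(notes: list[str]) -> Counter:
    """Assign each sticky note to every theme whose keywords appear in it;
    notes that match nothing are flagged for a human read."""
    counts: Counter = Counter()
    for note in notes:
        text = note.lower()
        hits = [theme for theme, keywords in THEMES.items()
                if any(kw in text for kw in keywords)]
        counts.update(hits or ["needs_review"])
    return counts
```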

Mechanism 3: Workforce Translation

Workforce translation weaves actual career-relevant skill development into the program in ways that simultaneously create evaluation evidence. Students are not just learning research methods -- they are building portfolios, practicing elevator pitches, conducting informational interviews with industry professionals. Each of these activities develops real workforce skills AND produces artifacts you can assess.

A student's elevator pitch early in the program versus at the end shows growth in communication ability, disciplinary identity, and professional confidence -- all documentable without a single survey item.
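One possible way to document that growth: score the paired artifacts against a simple rubric and compute the per-dimension change. The dimensions and the 1-4 scale below are illustrative assumptions; adapt them to your program:

```python
from dataclasses import dataclass

# Illustrative rubric dimensions, matching the capacities named above.
DIMENSIONS = ("clarity", "disciplinary_identity", "confidence")

@dataclass
class PitchScore:
    student_id: str
    timepoint: str          # e.g., "early" or "late"
    scores: dict[str, int]  # dimension -> rubric score (1-4)

def growth(early: PitchScore, late: PitchScore) -> dict[str, int]:
    """Per-dimension change between two scored artifacts -- impact
    evidence generated without a single survey item."""
    return {d: late.scores[d] - early.scores[d] for d in DIMENSIONS}
```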

When the Three Work Together

When all three translation mechanisms operate in your program, reviewers see a design where everything connects. The research base informs the activity design. The facilitation practices make activities engaging and data-rich. The workforce skill-building produces tangible evidence of impact. Each mechanism reinforces the others.

For your evaluation plan, name these three mechanisms explicitly. Show how each activity in your program draws from research, uses effective facilitation, and builds workforce-relevant skills. Then show how each activity generates evaluation evidence.

Your Partnership Move

Start with one activity in your current program. Just one. Ask yourself three questions:

What research informs this activity? If you are running a mentoring program, what does the literature say about effective mentoring practices? If you are hosting a networking event, what does the research say about how professional connections form? Make the research base explicit, and design the activity to test whether the research holds in your context.

What facilitation approach would make this activity both engaging and data-generating? Could you add structured response cards to a poster session? Could you build a reflection prompt into a workshop that captures learning in real time? Could you design a conversation protocol that produces documentation as a natural byproduct?

What skills does this activity actually build? Not just the content learning, but the professional capacities -- communication, collaboration, project management, presenting to different audiences. Could participants produce an artifact that demonstrates growth?

If you can answer all three questions for a single activity, you have the foundation of embedded evaluation. The activity serves the program AND the evaluation simultaneously. No bolt-on required.

Your Team's Turn

  • Which of your current evaluation activities feel "bolted on" versus "embedded"? What makes the difference?

  • Pick one program activity and brainstorm: how could it generate evaluation evidence without adding a separate instrument?

  • Where are you currently losing important data because your evaluation cannot capture what actually happens?

  • What would change about your next proposal if you designed evaluation into the program from the start?

Your Local Action

Find the person on your campus who runs faculty professional development around assessment and evaluation. This might sit in your Center for Teaching and Learning, your Office of Institutional Research, or your Office of Sponsored Programs. Ask them: "Who on campus is doing embedded assessment -- where the assessment activity is also a learning activity?" You will likely discover colleagues in other departments who have been thinking about this same problem in different contexts. Their approaches may translate directly to your partnership work.

Also look for your campus's broader impacts support office, if one exists. Many research universities now have dedicated staff who help faculty develop competitive broader impacts plans. They are natural allies for this kind of evaluation thinking, and they see the gap between bolt-on and embedded approaches in every proposal cycle.

Continue the Journey

This is Issue 1 of the Participatory Evaluation for Researchers Super Team Guide -- a 12-issue series designed to give you a complete toolkit for designing evaluation that serves your programs, satisfies your funders, and actually improves your practice.

In Issue 2, we tackle one of the most important and most overlooked aspects of partnership evaluation: Partner-Defined Impact. Your partners define success differently than you do. That is not a problem -- it is a feature. We will show you how to surface those different definitions and design your evaluation to serve all of them.

The One Thing

Every program activity can serve double duty -- building participant capacity AND generating evaluation evidence. The question is not "how do I evaluate this program?" The question is "how do I design this program so that evaluation is already happening?"

Participatory Evaluation for Researchers is part of the STEMsaic Super Team Guide Collection.

Your campus has people who think about evaluation every day. Find them. They are waiting for someone to ask better questions.

Next Issue: Partner-Defined Impact

[Subscribe to continue the journey]

Issue 1 Connected Assets

Slide Deck: Translation Architecture Introduction

  • Bolt-on versus embedded evaluation comparison framework

  • The three translation mechanisms with examples

  • Workshop activity for mapping your current evaluation approach

Interactive: Translation Architecture Explorer

  • Navigate all three translation mechanisms with guided prompts

  • Identify which mechanisms are active in your current program design

Download: Translation Architecture Quickstart

  • One-page overview of the three translation mechanisms

  • Decision framework for choosing which mechanisms fit your context

  • Quick-start checklists for each mechanism
