Program evaluations are “individual systematic studies conducted periodically or on an ad hoc basis to assess how well a program is working1.” What was your reaction to this definition? Has the prospect of undertaking a “research study” ever deterred you from conducting a program evaluation? Good news: program evaluation is not the same as research, and it usually does not need to be as complicated.
In fact, evaluation is a process in which we all unconsciously engage to some degree or another on a daily, informal basis. How do you choose a pair of boots? Unconsciously you might consider criteria such as looks, how well the boots fit, how comfortable they are, and how appropriate they are for their particular use (walking long distances, navigating icy driveways, etc.).
Though evaluation and research use the same techniques, and though both are equally systematic and rigorous (“exhaustive, thorough and accurate”2), here are a few differences between them:
Program Evaluation Focuses on a Program vs. a Population
Research aims to produce new knowledge within a field. Ideally, researchers design studies so that findings can be generalized to the whole population: every single individual within the group being studied. Evaluation focuses only on the particular program at hand. Evaluations may also face added resource and time constraints.
Program Evaluation Improves vs. Proves
Daniel L. Stufflebeam, Ph.D., a noted evaluator, captured it succinctly: “The purpose of evaluation is to improve, not prove3.” In other words, research strives to establish that a particular factor caused a particular effect. For example, smoking causes lung cancer. The requirements to establish causation are very high. The goal of evaluation, however, is to help improve a particular program. In order to improve a program, program evaluations get down-to-earth. They examine all the pieces required for successful program outcomes, including the practical inner workings of the program such as program activities.
Program Evaluation Determines Value vs. Being Value-free
Another prominent evaluator, Michael J. Scriven, Ph.D., notes that evaluation assigns value to a program while research seeks to be value-free4. Researchers collect data, present results and then draw conclusions that expressly link to the empirical data. Evaluators add extra steps. They collect data, examine how the data lines up with previously determined standards (also known as criteria or benchmarks) and determine the worth of the program. So while evaluators also draw conclusions that must faithfully reflect the empirical data, they take the extra steps of comparing the program data to performance benchmarks and judging the value of the program. While this may seem to cast evaluators in the role of judge, we must remember that evaluations determine the value of programs so they can help improve them.
Program Evaluations ask “Is it working?” vs. “Did it work?”
Tom Chapel, MA, MBA, Chief Evaluation Officer at the Centers for Disease Control and Prevention (CDC) differentiates between evaluation and research on the basis of when they occur in relation to time:
Researchers must stand back and wait for the experiment to play out. To use the analogy of cultivating tomato plants, researchers ask, “How many tomatoes did we grow?” Evaluation, on the other hand, is a process unfolding “in real time.” In addition to counting tomatoes, evaluators also inquire about related areas: “How much watering and weeding is taking place?” “Are there nematodes on the plants?” If evaluators realize that activities are insufficient, staff are free to adjust accordingly.5
To summarize, evaluation: 1) focuses on programs vs. populations, 2) improves vs. proves, 3) determines value vs. stays value-free and 4) happens in real time. In light of these four points, evaluations, when carried out properly, have great potential to be relevant and useful for program-related decision-making. How do you feel?
- U.S. Government Accountability Office. (2005). Performance Measurement and Evaluation. Retrieved January 8, 2012 from http://www.gao.gov/special.pubs/gg98026.pdf
- Definition of “rigorous.” Retrieved January 8, 2012 from google.com
- Stufflebeam, D.L. (2007). CIPP Evaluation Model Checklist. Retrieved January 8, 2012 from http://www.wmich.edu/evalctr/archive_checklists/cippchecklist_mar07.pdf
- Coffman, J. (2003). Ask the Expert: Michael Scriven on the Differences Between Evaluation and Social Science Research. The Evaluation Exchange, 9(4). Retrieved January 8, 2012 from http://www.hfrp.org/evaluation/the-evaluation-exchange/issue-archive/reflecting-on-the-past-and-future-of-evaluation/michael-scriven-on-the-differences-between-evaluation-and-social-science-research
- Chapel, T.J. (2011). American Evaluation Association Coffee Break Webinar: 5 Hints to Make Your Logic Models Worth the Time and Effort. Attended online on January 5, 2012
Priya Small has extensive experience in collaborative evaluation planning, instrument design, data collection, grant writing and facilitation. Contact her at firstname.lastname@example.org. Visit her website at http://www.priyasmall.wordpress.com. See her profile at http://www.linkedin.com/in/priyasmall/
Module 10: Distinguishing Evaluation from Research
“The purpose of evaluation is to improve, not prove.” - D.L. Stufflebeam (1)
Research and evaluation share similar features that center on the common objective of answering a question. However, it is important to distinguish between the two disciplines: the purpose of evaluation is essentially to improve the existing program for its target population, while research is intended to prove a theory or hypothesis. Although both use similar data collection and analysis methods, the two disciplines diverge again during use and dissemination. This relationship can be visualized using an hourglass shape:
[Hourglass figure: research and evaluation diverge in purpose, converge in methods, and diverge again in use and dissemination.] Source: T. Beney (2011)
Considering these aspects of research and evaluation, there is validity in the opening quote by Stufflebeam. However, from what we know about the purpose of evaluations, some evaluations do seek to “prove” a theory; probability evaluations prove that the outcomes or impact of a program are the result of program activities.(2) Therefore, although the main purpose of evaluation is to improve a program, certain circumstances enlist evaluations to “prove.” For example, an evaluation has the ability to demonstrate that microcredit programs for women reduce child mortality. According to the World Bank, although results may be generalizable in robust plausibility evaluations, the primary purpose of program evaluations is to benefit the specific program target audience.(3) It is possible to say that evaluation is a subset of research, because it would be impossible to conduct an evaluation without incorporating basic constructs of research, such as question development and study design.
In his article about the differences between evaluation and research, Scriven (2003/2004) distinguishes the skills of evaluators from those of social science researchers. Scriven notes that an evaluator’s ability to determine the context and unexpected effects of a program distinguishes them from a researcher.(4) Whereas a researcher would seek to determine whether microcredit programs accomplish the intended goal of reducing child mortality, an evaluator would also look for side effects of the same microcredit program: How does it improve the household’s quality of life? How does the program increase spending on health care? What village-specific confounding factors also reduce child mortality? These are just a few questions specific to evaluators in the course of program analysis.
Research is intended to increase the body of knowledge on a particular issue; any subjective opinion limits the researcher’s credibility. On the other hand, evaluators must balance the need to remain objective and the expectation to make recommendations for stakeholders. Evaluators must determine what information is valuable, what method is best for data collection, how to analyze the data, and how to relay findings to stakeholders. This requires interpretation and a certain level of judgment by the evaluator that is absent from the role of the traditional researcher.(5)
There are extensive debates about the differences between research and evaluation and their relevance to practical use. The hourglass visualization provides one perspective on this debate. In practice, the two disciplines hold different objectives, but as tools, they use the same methods of analysis. The particular intersection of evaluation and research will depend on the context, but it is important to identify the distinction: evaluation is conducted for the purpose of improvement, while traditional research is conducted to expand the knowledge base.
(1) Stufflebeam, D.L. (1983). The CIPP Model for program evaluation. In G.F. Madaus, M. Scriven, and D.L. Stufflebeam (Eds.), Evaluation Models: Viewpoints on Educational and Human Services Evaluation. Boston: Kluwer-Nijhoff.
(2) Victora, C.G., Habicht, J.P., and Bryce, J. (2004). Evidence-based public health: Moving beyond randomized trials. American Journal of Public Health, 94(3): 400-405.
(3) The World Bank. (n.d.). HIV monitoring and evaluation resource library. Global AIDS Monitoring and Evaluation Team (GAMET).
(4) Scriven, M. (2003/2004). Differences between evaluation and social science research. The Evaluation Exchange Harvard Family Research Project, 9(4).
(5) Levin-Rozalis, M. (2003). Evaluation and research: Differences and similarities. The Canadian Journal of Program Evaluation, 18(2):1-31.