Performance Evaluation Designs


FEATURED

Whole-of-Project Evaluations

Whole-of-Project evaluations are a new category of evaluation in USAID. They address questions about the combined results of multiple activities on important development outcomes under a CDCS, and every Mission is now required to complete one whole-of-project evaluation during the lifetime of its CDCS.

DOWNLOAD PDF


Conducting Mixed Methods Evaluations

A mixed methods evaluation is one that combines two or more evaluation methods in an integrated effort to address a set of evaluation questions. Mixed methods approaches are the preferred modality for many of USAID’s performance evaluations. This Technical Note explains how methods are combined and sequenced to gather and integrate both quantitative and qualitative information to address USAID evaluation questions.

DOWNLOAD PDF

CONSIDER AS WELL

Addressing Attribution of Cause and Effect in Small N Impact Evaluations: Towards an Integrated Framework 
This paper from 3ie includes an overview of, and individual annexes on, non-experimental approaches for examining “why” questions involving cause and effect in evaluations that have too few observations to be candidates for full-scale (experimental or quasi-experimental) evaluations, or that for other reasons need to examine cause-and-effect questions but cannot do so using approaches that qualify as impact evaluation designs under USAID’s Evaluation Policy.

DOWNLOAD PDF


RealWorld Evaluation 
This volume was developed specifically to address the need for practical strategies for ensuring the highest level of methodological rigor consistent with the circumstances under which evaluations are conducted. It is used as a companion volume for USAID’s Evaluation for Evaluation Specialists (EES) course. Note that the PDF linked below is a summary of this volume; the book itself can be purchased online.

DOWNLOAD PDF

ALSO SEE

USAID Evaluation Policy

USAID Evaluation Toolkit

Road to Results

USAID TIPS: Conducting a Participatory Evaluation

The Sustainability Evaluation Checklist (Canada)

USAID Trade Project Evaluations 2002 to 2013

Joint Evaluations: Recent Experiences, Lessons Learned and Options for the Future

Including in a Project MEL Plan an analysis of the need for evaluations during the project (tied to some threshold or key decision) and at the end of the project (either to inform decisions or to capture learning) lays the foundation for allocating sufficient evaluation resources and for planning in a way that allows the best methods to be used for a quality evaluation.

Performance evaluations, as defined in ADS 201, encompass a broad range of evaluation methods. They often incorporate before–after comparisons but generally lack a rigorously defined counterfactual. Performance evaluations may address descriptive, normative, and/or cause-and-effect questions. Performance evaluation questions may include, but are not limited to, the following topics:

  • What a particular strategy, project, or activity has achieved;
  • How it is being implemented;
  • How it is perceived and valued;
  • Contribution of USAID assistance to the results achieved;
  • Possible unintended outcomes from USAID assistance; and
  • Other questions pertinent to strategy, project or activity design, management, and operational decision-making.

No single evaluation design or approach will be privileged over others; rather, the selection of method or methods for a particular evaluation should principally consider the appropriateness of the evaluation design for answering the evaluation questions as well as balance cost, feasibility, and the level of rigor needed to inform specific decisions.

In recent years, studies carried out for PPL/LER have shown that:

  • Roughly half of the performance evaluations USAID has undertaken have been mid-course evaluations, while most others are conducted closer to the end of the project, and are sometimes called “final” evaluations. USAID also undertakes a limited number of ex-post evaluations.
  • Most USAID performance evaluations have been external, meaning they have a Team Leader who is external to the Agency and not otherwise linked to the project or activity being evaluated. Smaller numbers of performance evaluations have been undertaken by USAID’s own staff or by USAID implementing partner staff or consultants they engage. Both USAID staff and implementing partner evaluations are called “internal” evaluations.
  • The evaluation questions and approaches used in performance evaluations vary widely. Questions addressed are more closely linked to the timing of an evaluation and its purpose than to the methods used.

Illustrative Types of Performance Evaluation

There is no commonly accepted taxonomy of performance evaluations on which all evaluators would agree. The range of evaluation questions addressed by performance evaluations is wide, and includes questions about project or activity results or outcomes; implementation processes and their effectiveness; what has been sustained since a project or activity ended; how cost-effective the program was compared to existing practice or another approach; whether the project or activity was viewed as relevant, or given positive ratings by intended beneficiaries; and whether men and women, the elderly, or the poor were differentially affected by the project or activity. The range of data collection methods used may be as expansive as the list of questions, and thus many performance evaluations are self-described as Mixed Methods Evaluations. Using a mix of methods, evaluators undertake all of the sub-types of performance evaluations described below – and more.

Process or Formative Evaluations

Process or formative evaluations are most often undertaken mid-way through USAID activities. Process evaluations focus on how the activity is working and whether implementation and beneficiary response to the activity are in line with expectations. Both process and formative evaluations may include questions about the initial results of an activity, such as what goods or services have been delivered, or to what degree outputs have been achieved, such as teachers trained or new seed varieties planted. While USAID does not have a Technical or How-To Note on process evaluation, guides produced by other organizations are helpful for understanding this type of performance evaluation and where it may be useful, including the Evaluation Brief on Conducting a Process Evaluation that can be downloaded here.

Outcome or Summative Evaluation

Among USAID "final" evaluations, most fall into the outcome or summative sub-types of performance evaluations, and address questions that focus on whether planned results and targets were achieved, as well as whether activities had unintended consequences. While many evaluations in this cluster are single point-in-time studies, some are more formal "pre-post" evaluations that fund both a baseline and an endline round of data collection, but only for the activity’s intended beneficiaries. This performance evaluation subtype does not include a comparison group, which is one of the characteristics that differentiates it from a USAID impact evaluation. USAID outcome and summative evaluations tend to consider results at all levels of an activity or project Logical Framework, i.e., whether outputs were produced, to what degree the purpose was achieved, and whether any change in the status of the goal could be detected. UNDP, which has a useful guide on outcome-level evaluation, uses a narrower definition of the term "outcome," focusing on what USAID would call the purpose and goal levels of an activity or project, but not the outputs.
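
To illustrate the "pre-post" logic described above, the following is a minimal sketch assuming hypothetical baseline and endline measurements for the same intended beneficiaries; the outcome, sample size, and numbers are invented for illustration only, and because there is no comparison group the measured change cannot be attributed to the activity alone.

    # Minimal pre-post comparison for an outcome evaluation (no comparison group).
    # All data below are simulated for illustration.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    baseline = rng.normal(60, 10, 200)            # e.g., beneficiary test scores at baseline
    endline = baseline + rng.normal(5, 8, 200)    # scores for the same beneficiaries at endline

    t_stat, p_value = stats.ttest_rel(endline, baseline)   # paired test on the change
    print(f"mean change = {np.mean(endline - baseline):.2f}, p = {p_value:.4f}")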

What Has Been Sustained? (Ex-Post)

The presence of this question in a performance evaluation tends to signal that it is an ex-post evaluation, often with an outcome focus, though in some cases USAID may be just as interested in whether services or processes were sustained beyond the funding period for the activity as in whether benefits to particular beneficiaries were sustained. Performance evaluations in this cluster can be asked to empirically determine what was sustained, which is generally not possible in mid-course or final USAID evaluations that ask whether an activity is likely to be sustained. The latter involves a hypothetical, and often the best that evaluators can do is determine whether protocols that would help sustain an activity are in place. Since sustaining services and benefits from an activity often involves the people who live in a project or activity location and will continue to live there, USAID’s Local Systems Framework, and systems thinking in evaluation more generally, can be helpful for structuring this type of performance evaluation. Given the retrospective nature of this type of evaluation, it may also be important to be able to reconstruct baseline data, which the World Bank paper highlighted under this heading addresses.

In the trade arena, the European Commission uses ex-post evaluations to examine the results of regulatory changes, such as evaluations of Free Trade Agreements.

Whole-of-Project Evaluation

In its 2016 update of USAID’s evaluation guidance in ADS 201, USAID introduced a requirement for evaluations that look beyond the results of a single activity implemented by a single partner, calling upon each Mission to conduct at least one "whole-of-project" evaluation during the lifetime of a CDCS. As USAID’s ADS 201 Additional Help paper indicates, evaluations in this cluster are characterized more by their scope and questions than by the mix of techniques they use to gather information.

Whole-of-Project Evaluation Questions

The following are illustrative evaluation questions for Missions to consider and revise per their learning needs.

To examine the contribution from all constituent parts of a project to the Project Purpose:

  • To what extent has progress been made in achieving [the Project Purpose]?
  • To what extent have all of the project’s constituent activities contributed to achieving [the Project Purpose]?
  • How did positive and negative unintended outcomes of the project and all constituent activities contribute to or detract from achieving [the Project Purpose]?
  • To what extent have all of the project’s constituent activities contributed to the sustainability of [project outcomes]?

To examine strengths and weaknesses of the project theory of change:

  • To what extent were the programmatic and contextual assumptions identified in the project theory of change sufficient to achieve [the Project Purpose]?
  • What are the characteristics of the project design that positively or negatively influence activity contributions to [the Project Purpose]?

To examine the interaction among activities as they contribute to the Project Purpose:

  • What were the benefits of coordinating all of the constituent activities of a project portfolio to achieve [the Project Purpose]? What were the challenges?
  • How well did the internal and external project management protocols and practices implemented in all constituent activities support progress toward [the Project Purpose]?

Causality Questions in Performance Evaluations

The presence of questions about cause and effect in a performance evaluation does not define a cluster of evaluations in the same way as the other headings above. Questions about causality are found in USAID final evaluations as well as in ex-post evaluations. What draws attention to them in a performance evaluation is the need this kind of question creates for an approach to causality that does not involve a counterfactual comparison of the results for activity beneficiaries to the results for some rigorously constituted comparison group. Over the years, a variety of evaluation approaches for addressing questions about cause and effect have emerged for use in situations where an impact evaluation is either not feasible or will not be undertaken for other reasons. This range of techniques is often grouped under the term “non-experimental designs.” Illustrative of these techniques are the following.

  • Interrupted Time Series – in this type of performance evaluation, an activity’s target group or area is compared to itself, using multiple data points both before and after an intervention. It is a useful technique for examining causality in policy projects, where no comparison group can be formed. The first use of this technique for understanding causality, and its name, emerged from a study of the effects of a policy change (a minimal illustrative sketch follows this list).
  • Modus Operandi or General Elimination Method (GEM) – like a courtroom process, or epidemiology, this approach to questions of causality begins with an observed result and uses a process of elimination to arrive at its cause.
  • Correlational Research – examines the strength of specific relationships and uses what is learned to predict what will happen in the future absent certainty about causality, and/or takes additional steps to bolster hypotheses about causality that may have emerged.
  • Contribution Analysis – starts with the proposition that no single activity or intervention will explain how or to what degree a particular goal was achieved. Rather, it examines all of the various actors present in an environment and what they were doing, positive or negative, with respect to the goal. By mapping out each actor’s contribution, the role each played in whatever change can be detected emerges.
  • Outcome Harvesting – examines what has changed, often in a complex situation, using verbal and other information from and about participant-observers and locations, without having clear expectations about results in mind, and then applies modus operandi techniques to work backwards to identify causes of important observed changes.
  • Most Significant Change – like Outcome Harvesting, looks for causality often in complex situations where multiple donors may be trying to bring about change in a particular suite of outcomes, such as health status. The approach relies on stories about what has changed collected from participants and observers, and the analysis of these data for common threads that zero in on the most significant changes that have occurred and their perceived causes.
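
The Interrupted Time Series item above notes that a target group or area is compared to itself before and after an intervention; the following is a minimal segmented-regression sketch of that idea, using simulated data and a hypothetical intervention date, and showing only one of several ways such a model can be specified.

    # Minimal segmented-regression interrupted time series (ITS) sketch.
    # Hypothetical data: 24 monthly observations for one target area, with an
    # intervention after month 12. No comparison group; the pre-intervention
    # trend serves as the implicit counterfactual.
    import numpy as np
    import statsmodels.api as sm

    months = np.arange(1, 25)                           # time index (1..24)
    post = (months > 12).astype(int)                    # 1 after the intervention, else 0
    time_since = np.where(post == 1, months - 12, 0)    # months elapsed since the intervention

    rng = np.random.default_rng(0)                      # simulated outcome for illustration only
    outcome = 50 + 0.5 * months + 4 * post + 1.0 * time_since + rng.normal(0, 1.5, 24)

    # Regressors capture the baseline trend, the level change at the interruption,
    # and the change in slope after the interruption.
    X = sm.add_constant(np.column_stack([months, post, time_since]))
    model = sm.OLS(outcome, X).fit()
    print(model.params)    # [intercept, pre-trend, level shift, slope change]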

A number of these techniques are treated in greater detail in a volume from 3ie entitled Addressing Attribution of Cause and Effect in Small N Impact Evaluations: Towards an Integrated Framework, which is highlighted on this page.

Performance Evaluation Staffing

As ADS 201.3.5.13 states, all required evaluations must be external evaluations, meaning that the Team Leader will be an independent expert from outside the Agency who has no fiduciary relationship with the implementing partner for the project or activity to be evaluated. The requirement for evaluations to be external applies to all three of USAID’s required types of evaluations.

  • Performance Evaluations – at least one evaluation of every project a Mission or other Operating Unit undertakes
  • All required impact evaluations, as defined in ADS 201, whether funded by a Mission or other OU
  • One Whole-of-Project evaluation to be completed by each Mission during its CDCS period.

Beyond this, USAID's evaluation policy encourages USAID staff as well as evaluators from partner countries to serve as members of evaluation teams. More generally, USAID guidance and experience indicate that, on occasion, USAID may elect to undertake an evaluation on a joint basis, together with its country partner or other donors. Evaluations of this type require close coordination at a number of points, and may require that both USAID staff and the evaluation team dedicate more time to this type of effort than might be expected for other evaluations. Similarly, when USAID elects to undertake a Participatory Evaluation, in which beneficiaries play a more active role, additional evaluator and USAID staff time may be required to facilitate this process. Decisions about team composition for mid-term and final evaluations have M&E budget implications that are worth considering when the evaluation component of a Project M&E Plan is developed.

Other options USAID may wish to consider include joint evaluations conducted collaboratively with other donors or country partners. Useful Guidance for Managing Joint Evaluations has been developed by the OECD and can be accessed through the link above.

Performance Evaluation Methods

Decisions about the specific data collection and analysis methods to be applied to address performance evaluation questions are an important step in the planning of every performance and impact evaluation, but they need not be made prematurely.

USAID requires that the main types of evaluations to be undertaken during a CDCS period be specified in a PMP, noting that PMPs can and should be updated, as needed, to reflect additional evaluations a Mission schedules in response to facts that “trigger” supplementary evaluations during the CDCS implementation period. Indicating in a PMP which evaluations will be performance evaluations and which will be impact evaluations, including the identification of the one whole-of-project evaluation that will be undertaken, provides USAID with an early basis for understanding whether all of the Agency’s evaluation requirements will be met.

At the Project MEL Plan stage and in Activity MEL Plans, USAID may decide to identify the specific subtypes of performance and impact evaluations under consideration. Data collection methods associated with performance indicators are needed at these stages in the program cycle, but ADS 201 does not focus on the data collection and analysis methods for evaluations until it describes the required elements of an evaluation Statement of Work (SOW), and even there it is flexible, saying that USAID should:

Specify proposed data collection and analysis method(s) or request that prospective evaluators propose data collection and analysis method(s) that will generate the highest quality and most credible evidence on each evaluation question—taking time, budget, and other practical considerations into account.

 


ProjectStarter

BETTER PROJECTS THROUGH IMPROVED
MONITORING, EVALUATION AND LEARNING

A toolkit developed and implemented by:
Office of Trade and Regulatory Reform
Bureau of Economic Growth, Education, and Environment
US Agency for International Development (USAID)

For more information, please contact Paul Fekete.