Use of Indicators in Program Evaluation
Defining Program Evaluation
In this database the term “program evaluation” is used to encompass (1) routine monitoring and (2) the different forms of evaluation: process, results (or performance), and impact. “Evaluation” may refer to any aspect of program evaluation.
Routine monitoring and process evaluation – as well as evaluation of results – measure how well a program is working to achieve the desired results (Adamchak et al., 2000). Some organizations, including USAID, consider these part of “performance monitoring” because they are used for decision making about whether and how to modify programs, continue to invest in programs, etc. By contrast, impact assessment relates to whether desired changes occur and whether those changes can be attributed to the program itself. More specifically:
Monitoring is the routine tracking of a program’s activities by measuring on a regular, ongoing basis whether planned activities are being carried out. It is used to track changes in program performance over time. Monitoring has been referred to as process evaluation because it focuses on the implementation process and seeks to determine how well a program has been implemented, how much the implementation varies from site to site, and whether the program benefited the intended people and at what cost (MEASURE Evaluation, 2007).
Evaluation of results measures the extent to which change occurs consistent with the program objectives. Many evaluations focus on change in the knowledge, attitudes, and behaviors of (1) clients/participants in the program or (2) members of the intended audience in the population at large (e.g., women of reproductive age). However, “results” may also refer to changes in policies related to reproductive health, management procedures, logistics systems, quality of care in health facilities, and other aspects of the supply environment. Evaluation of results is also referred to as performance evaluation. According to the “USAID Evaluation Policy” (2011), performance evaluations often incorporate comparisons from before and after a project or intervention, but they generally lack a rigorously defined counterfactual.
Impact evaluations measure not only the change that has occurred, but also the extent to which this change is attributable to a defined program intervention. They require data collection at the start of a program (to provide a baseline) and again at the end, rather than at repeated intervals during program implementation. They also require a well-planned study design with a control or comparison group in order to measure whether the changes in outcomes can be attributed to the program. Impact evaluations in which comparisons are made between beneficiaries who are randomly assigned to either treatment or control groups provide the strongest evidence of a relationship between the new approach (e.g., intervention) and the outcome. This type of evaluation is most likely to be undertaken when an activity or project involves a new approach that is expected to be expanded in scale or scope.
Most programs have the objective of achieving some type of change at the population level (among the general public) or at the program level (among clients or participants in the program). The purpose of many program evaluations is to measure whether that change occurs. Yet relatively few evaluations go as far as establishing cause and effect between the program and the change (i.e., impact assessment that allows for attribution).
The indicators per se do not define whether an evaluation measures impact. Rather, the study design determines whether one can establish impact (causality). A program can achieve its objective, but one cannot rule out factors other than the program (“confounding factors”) that might be responsible for the change. Tracking change in a given population (in the absence of a control group) is also known as monitoring of results or “trend monitoring.” Many policy makers are entirely satisfied with this type of evaluation, especially if the results show the desired change in the outcome variable; they are not concerned about confounding factors that might also explain the change. Indeed, many would conclude that if the desired change occurs, then the “program has had impact.” By contrast, evaluation specialists recognize that simple tracking of change does not demonstrate causality.
To demonstrate causality, an impact assessment is needed. It should include an intervention and a comparison (control) group with data collected before and after the program is implemented to monitor change over time. Randomizing participants or sites to the intervention and control groups provides the most rigorous test of causality. As stated above, this kind of evaluation is most useful for new and innovative approaches.
Note: in these two examples, the indicators could have been the same. What differed — and allowed for impact assessment in the second case — was the study design. (For a more detailed discussion, see Bertrand, Magnani, and Rutenberg, 1996, Chapter IV.)
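The contrast between trend monitoring and impact assessment can be illustrated with a small calculation. The sketch below uses difference-in-differences, which is one common way to analyze a before-and-after design with a comparison group; the source does not prescribe a particular estimator, so this choice is an assumption. All indicator values, group names, and function names are hypothetical and invented for illustration only.

```python
# Hypothetical values of one indicator (e.g., percent of women using a
# modern contraceptive method). All numbers are invented for illustration.
baseline = {"intervention": 20.0, "comparison": 21.0}
endline = {"intervention": 32.0, "comparison": 26.0}


def trend_change(before, after, group):
    """Change in one group over time (what trend monitoring observes)."""
    return after[group] - before[group]


def difference_in_differences(before, after):
    """Change in the intervention group minus change in the comparison group."""
    return (trend_change(before, after, "intervention")
            - trend_change(before, after, "comparison"))


# Trend monitoring: only the intervention population is tracked.
observed = trend_change(baseline, endline, "intervention")

# Impact assessment: the comparison group's change is subtracted out.
net_of_comparison = difference_in_differences(baseline, endline)

print(f"Observed change in intervention group: {observed:.1f} points")
print(f"Change net of comparison group: {net_of_comparison:.1f} points")
```

With these invented numbers, the same indicator shows a 12-point rise under trend monitoring but only a 7-point rise once the comparison group's change is removed. The net figure can be read as program impact only if the two groups would otherwise have changed in parallel, an assumption that random assignment makes more credible.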
Levels of Reporting for USAID Cooperating Agencies
The indicators are presented for use at the national level (e.g., prevalence of breastfeeding, the percentage of facilities that offer postabortion contraception, the percentage of pregnant women who are anemic). However, evaluators should adapt these indicators to suit the needs of the organization using them. For example, if a project works in a single district or set of districts, this geographical area becomes the population of interest for the project. Similarly, if an adolescent project only works in the capital city, youth of a certain age (possibly defined by socioeconomic or geographic variables) in this city become the intended audience.
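As a rough illustration of adapting an indicator to a project's population of interest, the sketch below computes the same indicator (the percentage of facilities that offer postabortion contraception) first for all facilities and then for a single project district. The facility records, district names, and the `percent_offering` helper are hypothetical; they are not drawn from the database.

```python
# Hypothetical facility records: (district, offers_postabortion_contraception).
facilities = [
    ("North", True), ("North", False), ("North", True), ("North", True),
    ("South", False), ("South", False), ("South", True), ("South", False),
    ("Capital", True), ("Capital", True),
]


def percent_offering(records, districts=None):
    """Percentage of facilities offering the service.

    If `districts` is given, only facilities in those districts count
    toward both the numerator and the denominator.
    """
    selected = [offers for district, offers in records
                if districts is None or district in districts]
    if not selected:
        raise ValueError("no facilities in the selected area")
    return 100.0 * sum(selected) / len(selected)


# National-level indicator: every facility is in the denominator.
national = percent_offering(facilities)

# Project-level indicator: the project works only in the South district.
project_area = percent_offering(facilities, districts={"South"})

print(f"National: {national:.1f}%")
print(f"Project area: {project_area:.1f}%")
```

The definition of the indicator is unchanged; only the denominator shifts from all facilities in the country to the facilities in the project's area.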
The USAID cooperating agencies (organizations supported by USAID funding that give technical assistance worldwide) must satisfy multiple needs for indicators. First, they work with host-country counterparts in establishing the best indicators to use for evaluating the program or project at the country level or, alternatively, at the regional, district, or city level, as described above. In addition, they may need to report the aggregated results of their work in different countries to the donor. For example, USAID-funded projects report on the performance monitoring indicators in the workplan; in addition, USAID’s new Gender Equality and Female Empowerment Policy requires that one or more of USAID’s gender indicators be reported if the program or activity has gender equality or women’s empowerment as an explicit primary or secondary objective in achieving family planning and reproductive health goals. More information about gender equality and women’s empowerment can be found in the Women and Girls’ Status and Empowerment and Male Engagement sections of this database.