Moving impact evaluations beyond controlled circumstances
The constraints imposed by an intervention can make designing an evaluation quite challenging. If a large-scale programme is rolled out nationally, for instance, it becomes very hard to find a credible comparison group. Many evaluators shy away from programmes that lack a plausible counterfactual, and since the findings of such evaluations are also harder to publish, the incentives for taking them on are weak. Yet the evaluation of a large-scale programme provides crucial information to policymakers. Doesn’t this make it important that evaluators take up the challenge of evaluating these types of programmes? If impact evaluators want to influence policy, they have to move beyond evaluating small programmes over whose design they have full control.
The Millennium Villages Project is an example of an important programme where the evaluators do not have perfect control over the design of the intervention. The project is being implemented in several countries in sub-Saharan Africa and is supported by, among other institutions, the Earth Institute of Columbia University. The first phase of the Millennium Village model focuses, amongst other things, on the distribution of improved seeds and fertilizer, long-lasting insecticide-treated bed nets, basic immunizations, Vitamin A campaigns and community-wide deworming. These interventions are complemented by infrastructural investments in education and health. The project is based on the big-push theory, which hypothesises that interventions across all sectors of the economy, when implemented together, can break the poverty trap.
It’s clear that the complexity and scale of this programme make an evaluation difficult. Evaluators are also not in the driving seat when it comes to the design of the programme. Past evaluations of the Millennium Villages have been surrounded by considerable controversy over the issue of ‘quality’. The debate has been so fierce that it prompted the UK Department for International Development to take the position that it would only co-fund a new Millennium Village in northern Ghana if it were accompanied by a rigorous, high-quality, independent impact evaluation. Such an evaluation would assess whether the Millennium Village has indeed created freedom from poverty traps, as claimed in the project documents.
Impact evaluation of the Millennium Village in northern Ghana
The challenge of evaluating the Millennium Villages in northern Ghana was taken up by researchers from the Institute of Development Studies (IDS) among other institutions. A design for the evaluation of the new Millennium Village has now been published on the IDS website and in the Journal of Development Effectiveness.
The research team responsible for this impact evaluation should be commended for addressing many of the constraints imposed by the intervention in a clear and open way. This is not to say that the evaluation approach is free of important limitations, but most of them stem from the design of the intervention and the data collection process.
So, what are the limitations? Since the programme will be implemented in a small number of localities, a randomised evaluation is hardly feasible, which makes the study potentially vulnerable to bias from unobservable characteristics. The intervention is clustered in 34 villages in two administrative districts (Builsa and West Mamprusi), so there is a risk that shocks such as droughts and floods hit the treatment group without affecting the comparison group; if it only rains in the treatment villages, for example, the impact estimates will be biased. The process through which the intervention site was chosen was also not entirely transparent, which complicates the choice of a comparison group. Finally, the access of nearby localities to services provided by the Millennium Village may lead to spillover effects, which could in turn bias the impact estimates.
Perhaps the biggest concern for this evaluation is that the Earth Institute, Columbia University, conducted the baseline surveys in the treatment group between April and June 2012, while interviews in the comparison group were conducted between July and September 2012. This time lag was planned to be shorter, but because of delays in the data collection process, baseline differences between the treatment and comparison groups may arise from seasonality. The incidence of malaria, for example, may be higher in the rainy season.
Dealing with limitations
There is no perfect method for dealing with these limitations, but the evaluation design for the new Millennium Village is thorough in its attempt to address many of them. Comparison villages were selected by matching them with project villages on ex-ante village characteristics, using nearest-neighbour propensity score matching. The evaluators plan to use a quasi-experimental difference-in-differences approach, combined with further matching of project and comparison households on household and community characteristics. The validity of the difference-in-differences approach will be assessed by testing the assumption of parallel trends for the treatment and comparison groups.
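The logic of matching followed by difference-in-differences can be sketched on toy data. Everything below is invented for illustration — the single matching covariate, the simulated outcomes and the size of the programme effect are not figures from the actual evaluation, which matches on a propensity score estimated from many characteristics:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical village-level data (all numbers illustrative).
n_treat, n_comp = 10, 40
x_treat = rng.normal(0.6, 0.10, n_treat)  # ex-ante characteristic, treated villages
x_comp = rng.normal(0.5, 0.15, n_comp)    # same characteristic, candidate comparisons

# Step 1: nearest-neighbour matching on the ex-ante characteristic.
matches = np.array([np.argmin(np.abs(x_comp - x)) for x in x_treat])

# Simulated baseline and follow-up outcomes: a common trend of 3 for
# everyone, plus a programme effect of 5 in treated villages only.
y0_treat = 50 + 20 * x_treat + rng.normal(0, 1, n_treat)
y1_treat = y0_treat + 3 + 5
y0_comp = 50 + 20 * x_comp + rng.normal(0, 1, n_comp)
y1_comp = y0_comp + 3

# Step 2: difference-in-differences on the matched sample — the change
# in treated villages minus the change in their matched comparisons.
did = np.mean((y1_treat - y0_treat) - (y1_comp[matches] - y0_comp[matches]))
print(round(did, 2))  # 5.0
```

Because the simulated common trend is identical in both groups, the estimator recovers the built-in effect exactly; in real data the matching and the parallel-trends test carry the weight.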
To mitigate the concern of bias from shocks that hit only the treatment group, the research team proposes to collect detailed information on droughts and floods. Weather shocks in northern Ghana are typically localised in small areas, so collecting these data in both the treatment and the comparison group will allow beneficiaries to be compared with non-beneficiaries facing similar climatic shocks. This should be possible because the comparison group consists of a large and dispersed set of villages that are unlikely to all experience the same weather shocks.
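The idea of comparing villages only within the same shock stratum can be illustrated with a small sketch. The village records below are entirely invented; the point is that averaging treated-minus-comparison gaps within each stratum stops a drought that hits one group from masquerading as a programme effect:

```python
import numpy as np

# Hypothetical village records: (treated?, hit by drought?, outcome change).
data = [
    (1, 1, 2.0), (1, 1, 2.5), (1, 0, 8.0), (1, 0, 7.5),
    (0, 1, -3.0), (0, 1, -2.5), (0, 0, 3.0), (0, 0, 3.5),
]

# Compare treated and comparison villages within each shock stratum,
# then average the stratum-level effects.
effects = []
for shock in (0, 1):
    t = [y for d, s, y in data if d == 1 and s == shock]
    c = [y for d, s, y in data if d == 0 and s == shock]
    effects.append(np.mean(t) - np.mean(c))

overall = float(np.mean(effects))
print(overall)  # 4.75
```

A naive pooled comparison would mix drought-hit and drought-free villages; the stratified estimate nets the shock out.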
Spillovers will be assessed by constructing two types of comparison villages: (1) a group of villages sufficiently close to the Millennium Village, and (2) a group of villages relatively far away from it. Spillovers can then be detected with a difference-in-differences approach, combined with propensity score matching, that compares nearby comparison villages with far-away comparison villages. This approach will be complemented with a social network survey to detect spillovers that run through social networks.
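The near-versus-far comparison boils down to a simple contrast, sketched here on invented numbers: if nearby non-beneficiary villages improve more than otherwise similar far-away ones, the gap is evidence of spillovers.

```python
import numpy as np

# Hypothetical outcome changes (follow-up minus baseline) per village;
# all values are illustrative.
near_change = np.array([4.1, 3.8, 4.5, 3.9, 4.2])  # comparison villages near the MV
far_change = np.array([3.0, 3.2, 2.9, 3.1, 2.8])   # comparison villages far away

# After matching, the near-minus-far gap estimates the spillover effect.
spillover_estimate = near_change.mean() - far_change.mean()
print(round(spillover_estimate, 2))  # 1.1
```

A positive and sizeable gap would also warn that using nearby villages as the main comparison group understates the programme's impact.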
Secondary data will be used to assess the size of the potential seasonal bias. If large differences between project and comparison villages are found, the impact of seasonality can be estimated by decomposing the differences between the treatment and comparison groups into differences in observable characteristics and seasonal effects. This is probably the best the researchers can do to mitigate the concerns about seasonality, but we will only find out whether the impact of seasonality can be estimated after the first analysis of the baseline data.
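One way to read that decomposition is in the Oaxaca–Blinder style: the part of the baseline gap explained by observable characteristics is netted out, and the residual is attributed to seasonality. The sketch below assumes that reading, and every number in it — means, covariate and regression coefficient — is hypothetical:

```python
# Hypothetical baseline means: outcome and one observable characteristic
# for treatment (surveyed Apr-Jun) and comparison (surveyed Jul-Sep) villages.
y_treat, y_comp = 12.0, 16.0   # e.g. illness episodes per 100 people
x_treat, x_comp = 0.50, 0.55   # e.g. an exposure-related household share
beta = 20.0                     # assumed coefficient of x on y (from a regression)

# Split the gap into the part explained by observables and a residual
# attributed to seasonality (and anything else unobserved).
total_gap = y_comp - y_treat
explained = beta * (x_comp - x_treat)
seasonal_residual = total_gap - explained
print(round(explained, 2), round(seasonal_residual, 2))  # 1.0 3.0
```

If the residual dominates, as in this toy example, seasonality is a real threat and the follow-up surveys would need to be timed far more carefully.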
Does a ‘big-push’ really work?
One of the major strengths of the evaluation design for the new Millennium Village is that it explicitly aims to test the big-push theory by asking a fundamental question: can an integrated package of interventions bring about a sustainable reduction in poverty in communities in northern Ghana where individual interventions have not made much of a difference? To answer it, the researchers plan to assess whether the individual components of the Millennium Villages Project are more cost-effective than standalone programmes such as the Livelihood Empowerment Against Poverty programme and J-PAL’s Targeting the Ultra Poor programme (both of which are currently being assessed by evaluations supported by 3ie). Moreover, the quantitative research design is complemented by an intensive qualitative component. This should allow the researchers to answer not only ‘What is the impact of the new Millennium Village?’ but also ‘Why did the Millennium Village have this impact?’
Finally and importantly, the evaluation of the Millennium Village also focuses on the sustainability of the results generated by the programme. Impacts will be estimated before the end of the project in 2016, and also five years after the programme has ended. This will allow the research team to see whether local governments have the capacity to maintain the infrastructure provided by the Millennium Village team after external support has ended.
Clearly, the research design for the impact evaluation of the new Millennium Village has some key limitations, and it is important that evaluators discuss these constraints openly. Evaluation under such conditions is often a challenge, but that is no reason to shy away from research questions that matter to policymakers. Researchers will not always be able to address policy-relevant questions under perfectly controlled circumstances, but the impact evaluation design for the new Millennium Village in northern Ghana shows that the task is not impossible. Despite its limitations, this theory-based impact evaluation of the big-push theory in practice is likely to provide important lessons for policymakers.