3ie’s recent systematic review of farmer field schools (FFS) found that these programmes worked as pilots and small-scale programmes. But the few impact evaluations of national-level programmes found no impact. The evidence suggested that problems in recruiting and training appropriate facilitators impeded the scale-up of the experiential learning model of farmer field schools. It is hard to find people who can facilitate learning rather than just lecture in a top-down style.
Evidence-informed development is in trouble, then, if evidence that programmes work as pilots cannot be relied upon as the basis for taking those programmes to scale. How large a problem might this be?
Scalability can be threatened if an impact evaluation has weak external validity. This means that the impact shown with a particular intervention design for a particular group in a specific context may not be achieved at another time, in another place, with a different group, or after any change in how the programme is designed or implemented. Weak external validity may manifest itself in at least four ways.
1. Weaker implementation at scale
Close attention is often paid to the implementation of small-scale pilots. The researchers themselves, or their grad students, directly oversee implementation or even do it themselves. That won’t be the case once the programme is taken to scale by local staff. Scaling up training has its own threats, as we saw with the FFS model. Local staff often need more training and mentoring, but training efforts do not reflect these needs. They may well be implementing the programme without access to the required project inputs. And perhaps the project is yet another job for them on top of existing responsibilities, with weak incentives to pay attention to project implementation.
2. Weak fidelity to programme design
The programme that is taken to scale may not be the same as the pilot. This may happen for budgetary reasons, so that crucial programme components are dropped, or simply through a lack of attention to detail.
3. Pilot in a specific context
The pilot may have taken place in a very specific context. For example, it may have taken place in a region growing a particular crop that is not grown elsewhere. There is high impact, but it cannot be replicated elsewhere. Project success may also require access to water, electricity, markets or any number of other things; if these are not available, the project will not work.
4. The programme has already reached all those who can benefit
If the programme being evaluated has already been rolled out and participation was through self-selection in the first phase, then attempts to expand coverage in the next phase will include groups for whom the intervention is less attractive. Participants in the next phase are most likely to be those who think they will benefit less from the programme.
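The arithmetic behind this point can be sketched with a toy example. The numbers below are entirely hypothetical and are not drawn from any evaluation. The sketch assumes ten households whose benefit from the programme is known in advance, and assumes that those who expect to gain most enrol first.

```python
# Toy illustration of self-selection; all figures are hypothetical.

def average_benefit(benefits):
    """Mean benefit across a group of participants."""
    return sum(benefits) / len(benefits)

# Hypothetical benefit per household, in arbitrary units.
household_benefit = [4, 10, 1, 7, 9, 2, 6, 8, 3, 5]

# Assumption: households self-select, so those who gain most enrol first.
enrolment_order = sorted(household_benefit, reverse=True)

# Phase 1 (evaluated roll-out): the three keenest households.
phase1 = enrolment_order[:3]
# Phase 2 (expansion): the next four households in line.
phase2 = enrolment_order[3:7]

pilot_effect = average_benefit(phase1)               # average among early joiners
expansion_effect = average_benefit(phase2)           # average among newcomers
scaled_effect = average_benefit(enrolment_order[:7]) # average across both phases

print(pilot_effect, expansion_effect, scaled_effect)
```

In this made-up case the first-phase average overstates what the programme delivers once coverage expands, even though nothing about implementation has changed.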
What is to be done? One clear implication is the need to evaluate programmes at scale. Small-scale pilots can be considered as efficacy trials – does the intervention work under ideal conditions? Evaluations at scale are effectiveness studies – does it work under actual field conditions? Research teams have to pay more attention to possible threats to the external validity of their impact evaluations, and be honest about them in their policy recommendations. Finally, policymakers have to read the fine print on the design and implementation of pilots that are going to scale.
Evidence-informed development can lead to better lives. But we need to get it right.