If the answer isn’t 42, how do we find it?
Those of you around my age may be familiar with Douglas Adams' The Hitchhiker's Guide to the Galaxy, in which the answer to the question 'What is the meaning of life, the universe and everything?' turns out to be the number 42. We wish that systematic reviews could be like that. Throw all the evidence into a big number cruncher and out pops a single answer.
This is what statistical meta-analysis does. But are the answers it gives as absurd as Douglas Adams’ 42?
In his recent blog, Cyrus Samii argues that statistical meta-analysis may often be inappropriate (meta-analysis, more broadly defined, applies to any synthesis of evidence). There is often just too little evidence to meaningfully combine in this way. Heterogeneity of context, intervention and estimation method renders any measure of an 'average treatment effect' useless. So, says Cyrus, many of the meta-analyses being produced today – including his own – are probably wrong-headed. They pursue a single-number answer where none exists. Rather than meta-analysis, studies should present the 'best available evidence' for the question at hand.
Now, I don’t disagree with Cyrus at all, but I do think there is more to be said.
First, this argument is not a rejection of meta-analysis and should not be read that way. If we are advocating for the best available method to answer the question at hand, that best method will sometimes be meta-analysis. Whether there is currently sufficient evidence of the right sort to carry out the meta-analysis is then a secondary consideration, albeit an important one.
Second, let us unpack a bit what we mean by 'the right method to answer the right question'. 3ie supported a Campbell review of interventions to improve schooling in low- and middle-income countries by Petrosino et al., which we repackaged with our own analysis in a 3ie working paper. This review lumps together many studies of different interventions in different contexts. But if the question is 'on average, have interventions to get children into school been effective?', then lumping all these studies together in a meta-analysis is the right approach. And the answer is 'yes, they have been effective, as have interventions for improving learning outcomes. But it takes different interventions to get children into school than it does for them to learn once they are there.' These are useful questions to answer, and meta-analysis is the best approach to answering them.
Third, heterogeneity is the friend of meta-analysis, not its enemy. With sufficient observations we can unpack the average treatment effect by intervention type, beneficiary population and so on. For example, which interventions are most effective at getting children into school? Answer: conditional cash transfers and providing resources for teachers, with pre-school and school feeding also looking promising.
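To make the 'lumping versus unpacking' point concrete, here is a minimal sketch of how a meta-analysis pools effect sizes overall and then by subgroup. This is an illustration, not 3ie's or any review's actual method: the studies, effect sizes and standard errors below are made-up numbers, and a real review would also test for heterogeneity and likely use a random-effects model.

```python
# Illustrative fixed-effect (inverse-variance) meta-analysis with a
# subgroup breakdown. All numbers are hypothetical.

def pooled_effect(effects, ses):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / se**2 for se in ses]  # more precise studies weigh more
    est = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = (1.0 / sum(weights)) ** 0.5
    return est, se

# Hypothetical studies: (intervention type, effect size, standard error)
studies = [
    ("cash transfer", 0.25, 0.05),
    ("cash transfer", 0.18, 0.08),
    ("school feeding", 0.10, 0.06),
    ("school feeding", 0.05, 0.07),
]

# 'Lumping': one overall average treatment effect across all studies
overall, overall_se = pooled_effect(
    [e for _, e, _ in studies], [s for _, _, s in studies]
)

# 'Unpacking': pool separately within each intervention type
by_type = {}
for kind, e, s in studies:
    by_type.setdefault(kind, ([], []))
    by_type[kind][0].append(e)
    by_type[kind][1].append(s)
subgroup = {k: pooled_effect(es, ses) for k, (es, ses) in by_type.items()}
```

With enough studies per subgroup, the unpacked estimates answer the more useful question – which interventions work best – rather than just whether the field as a whole 'works' on average.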
One of Cyrus's own reviews uses meta-analysis to show that land reform generally has productivity-enhancing effects, but not in Africa. And one of my favourite recent reviews, by Sarah Baird and colleagues, coded the degree of monitoring and enforcement of conditionality in conditional cash transfer programmes. The review clearly shows that programmes in which conditions are better monitored and enforced have a greater impact on school enrolments.
Having said all that, there are certainly cases in which quantitative synthesis of effect sizes is not appropriate. 3ie promotes synthesis of both factual and counterfactual evidence across the causal chain – as exemplified in our recent review of farmer field schools. But how many studies actually do that?
The most pertinent methods issue raised by Cyrus's blog, I believe, is that qualitative synthesis is generally done so poorly. Many studies present what is essentially an annotated bibliography – that is, a list of studies with a paragraph devoted to each one, perhaps organised into sections. But this is just a presentation of the data; it is not a synthesis. There are well-established methods of qualitative synthesis, including coding and matrices, which are barely applied in the reviews I have seen. These methods allow a thematically organised presentation rather than a study-by-study one.
So, I am all for using the right method for the right question. But there is still too little understanding of, and agreement on, the right methods of qualitative synthesis.