Social science research is facing questions about its credibility, as existing evidence points to a lack of replicability of reported results. In many instances, researchers have failed to reproduce original findings even when using the same data and code, casting doubt on the reliability of published results. In response, the open science movement, which promotes the sharing of data, code, and research materials, has emerged as a key initiative to improve transparency and accountability in research.

At 3ie, we uphold the principle of research credibility through push-button replication (PBR), a process that helps us verify the reliability of findings, promote accountability, and build trust in the evidence we produce. In this blog, we explore 3ie’s PBR work: what it entails, the steps involved, and best practices to support the PBR process. 

Understanding push-button replication

Push-button replication (PBR) at 3ie is a process designed to verify the computational reproducibility of research findings using the original data, code, and methodology provided by the authors. The process ensures that results can be reproduced with the push of a button. To facilitate this, 3ie’s Transparent, Reproducible and Ethical Evidence (TREE) policy encourages 3ie-led studies to undergo PBR prior to publication. PBR plays an important role in ensuring the quality of research outputs and demonstrates 3ie’s commitment to transparency.

3ie’s replication programme

Recognizing the importance of replication in verifying conclusions and reducing errors and biases, 3ie established its Replication Programme in 2012 to promote the transparency and reproducibility of research. By making data and code publicly accessible, 3ie encourages external researchers to reproduce results and confirm the accuracy of studies, while also preserving data for future use.

How is PBR implemented? 

To initiate the process, the PBR team requests the materials needed to replicate the results from the study authors. Once the data and code are obtained, the team identifies the key results to be replicated. It then uses the provided code and data to replicate the study’s main tables and figures, following any replication instructions given by the original authors. The team troubleshoots and rectifies minor technical issues that occur while executing the code, and contacts the authors for clarification if major alterations are required. After the replication run, the generated output is compared with the results in the published report.
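In practice, a PBR run often amounts to executing the authors’ master script while logging everything it produces, so the log can be compared against the published tables and figures. Below is a minimal sketch in Stata; the file names are hypothetical, not drawn from an actual 3ie study.

```stata
* Minimal sketch of a PBR run; all file names are hypothetical
clear all
set more off

* Log the session so the generated output can be compared
* line by line with the published tables and figures
log using "pbr_run.log", replace text

* Execute the authors' master do-file, which should call the
* data-preparation and analysis scripts in the documented order
do "code/00_master.do"

log close
```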

Based on the replication process and the comparison of the results, the PBR team classifies the study into one of the following categories:

  1. Comparable replication: the results are identical or differ only trivially (for example, due to rounding).
  2. Minor differences: small differences in coefficients and/or p-values.
  3. Major differences: meaningful differences in reported outcomes (especially in the key results), or the code does not reproduce the published results.
  4. No access: the original authors decline the request to provide data or code.
  5. Proprietary data: the authors cannot share the data but provide the replication code and documentation.
  6. Incomplete: the PBR researchers are unable to reproduce part of the publication due to missing code and/or data.

If the study is classified as a comparable replication (category 1), the results are published, and the data archived. For studies classified under categories 2 to 6, the PBR team works with the authors to ensure comparable results can be generated.

What are the major requirements for a PBR?

The study authors are expected to organize and share the materials needed to initiate the PBR process. The replication package must contain the following components:

Deidentified raw data files: deidentified data are preferred for the PBR process to protect respondent confidentiality.
Readme file: a readme file with instructions for using the codebook. It can also specify the order in which the do-files for data preparation and analysis should be run.
Codes for data preparation: the code used to clean and prepare the raw data and to generate the cleaned data set.
Codes for analysis: the code used to produce and export the key results, including the tables and figures in the report. A master script (such as a Stata do-file or an R script) that specifies the prerequisites for execution, including data organization, directory setup, and installation of statistical packages, helps organize the PBR process; a sketch of one follows below.
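To make these components concrete, here is a minimal sketch of what a master do-file might look like. The folder layout, file names, and package choice are illustrative assumptions, not 3ie requirements.

```stata
* 00_master.do -- hypothetical master script for a replication package
* Assumed folder layout (illustrative only):
*   data/raw/      deidentified raw data files
*   data/clean/    cleaned data sets produced by 01_prep.do
*   code/          all do-files
*   output/        exported tables and figures

clear all
set more off

* Set the project root once so that all paths below are relative
global root "C:/replication_package"
cd "$root"

* Install the user-written packages the analysis depends on
ssc install estout, replace

* Run data preparation, then analysis, in the documented order
do "code/01_prep.do"       // cleans data/raw and saves to data/clean
do "code/02_analysis.do"   // exports tables and figures to output/
```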

Best practices for PBR

Here are a few best practices that researchers can follow when preparing files for replication:

  1. Preserve all files and code used to process the raw data, and share them as part of the replication package.
  2. Label all variables to help other users understand the data structure and troubleshoot any issues with the data or analysis code.
  3. Document all data processing and analysis steps, using inline comments in the code to record what each step does.
  4. Consider using version control tools to track changes to code and data files over time.
  5. Specify and archive the versions of all statistical software and packages used in the analysis to ensure consistent replication results (see the sketch after this list).
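As an illustration of points 3 and 5, a Stata script can document its steps inline and pin the interpreter version so that results stay stable across releases. This is a minimal sketch; the data set, variable, and winsorizing step are hypothetical, not drawn from an actual 3ie study.

```stata
* Pin Stata's behaviour to the release used for the analysis,
* so later Stata versions interpret the code the same way
version 17

* Document each processing step where it happens; the data set
* and variable below are hypothetical
use "data/clean/households.dta", clear

* Winsorize consumption at the 99th percentile to limit the
* influence of outliers (a step that should also appear in the readme)
summarize consumption, detail
replace consumption = r(p99) if consumption > r(p99) & !missing(consumption)
```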

Read more about our transparency and reproducibility work here.
