Principles of Reproducible Research
What is Reproducible Research
Research is considered to be reproducible when the exact results can be reproduced if given access to the original data, software, or code.
- The same results should be obtained under the same conditions
- It should be possible to recreate the same conditions
Definition
Reproducibility refers to the ability of a researcher to duplicate the results of a prior study using the same materials as were used by the original investigator. That is, a second researcher might use the same raw data to build the same analysis files and implement the same statistical analysis in an attempt to yield the same results. Reproducibility is a minimum necessary condition for a finding to be believable and informative.
— U.S. National Science Foundation (NSF) subcommittee on Replicability in Science
Elements of Reproducible Research
There are four key elements of reproducible research:
- data documentation
- data publication
- code publication
- output publication
Reproducibility Crisis
Baker, M. 1,500 scientists lift the lid on reproducibility. Nature 533, 452–454 (2016)
Factors Contributing to the Reproducibility Crisis
- Not enough documentation on how experiment is conducted and data is generated
- Data used to generate original results unavailable
- Software used to generate original results unavailable
- Difficult to recreate software environment (libraries, versions) used to generate original results
- Difficult to rerun the computational steps
While reproducibility is the minimum requirement and can be solved with “good enough” computational practices, replicability/ robustness/ generalisability of scientific findings are an even greater concern involving research misconduct, questionable research practices (p-hacking, HARKing, cherry-picking), sloppy methods, and other conscious and unconscious biases.

What is the Reproducibility Spectrum

Peng, RD Reproducible Research in Computational Science Science (2011)