Reproducibility and bias in the reporting of experimental findings continue to be important issues for science. This has been reinforced by the publication of a series of articles in Nature and an interesting paper in PLOS Biology analysing in vivo studies.

I have previously touched on the topic of unconscious bias with regard to gender issues, but the failure to understand some of our own biases is not helping the cause of reproducibility either. As scientists we have many biases: our view of the best way to analyse a data set, our expectations about the outcome of an experiment, our selection of the parameters to measure, and our stake in the outcome of the research (for example, for our career progression), to name a few.

Regina Nuzzo (Nature 2015: 526, p182-185) goes into more detail about these biases, and Silberzahn and Uhlmann present some compelling data in the same issue (Nature 2015: 526, p189). One of these biases is ‘p-hacking’, where researchers peek at the results and then decide whether to collect more data, stopping once the answer looks significant. I definitely saw this in the lab when I was a post-doctoral research fellow, and was actually very pleased (and surprised) when I went into industry in the 1980s to find that the level of rigour was much greater than in the academic lab.
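To see why peeking matters, here is a minimal simulation – not from the papers above, just an illustration – of optional stopping. Both groups are drawn from the same population, so every ‘significant’ result is a false positive, yet checking the p-value repeatedly and topping up the sample pushes the error rate well above the nominal 5%. The group sizes and number of peeks are arbitrary choices for the sketch.

```python
# Illustrative simulation of 'p-hacking' by optional stopping. There is no
# real effect, so any p < 0.05 is a false positive.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

def experiment(peek, n_initial=20, n_extra=10, n_peeks=5, alpha=0.05):
    """Run one null experiment; return True if it ever reaches p < alpha."""
    a = rng.normal(size=n_initial)  # 'treatment' and 'control' come from
    b = rng.normal(size=n_initial)  # the same population: no real effect
    for _ in range(n_peeks if peek else 1):
        if ttest_ind(a, b).pvalue < alpha:
            return True  # stop here and declare 'significance'
        if not peek:
            break
        # Not significant yet, so collect more data and test again.
        a = np.concatenate([a, rng.normal(size=n_extra)])
        b = np.concatenate([b, rng.normal(size=n_extra)])
    return False

n_sims = 5000
fixed = sum(experiment(peek=False) for _ in range(n_sims)) / n_sims
hacked = sum(experiment(peek=True) for _ in range(n_sims)) / n_sims
print(f"False-positive rate, fixed n:      {fixed:.3f}")   # close to 0.05
print(f"False-positive rate, with peeking: {hacked:.3f}")  # noticeably higher
```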

In vivo experiments were carried out in a blinded fashion – double blind, in fact: the person injecting the animals did not know what they were injecting (the bottles being labelled A-D and so on), and the behavioural tester did not know which treatment each animal had received. Treatments were given according to randomly generated protocols, the numbers of animals to be used and the analysis were predetermined, and standards were included according to established standard operating procedures.
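As a hypothetical sketch of how such an allocation might be generated – the treatment names, codes and group size below are placeholders, not from any actual study – the key point is that the mapping from coded bottle to treatment is held by a third party until the predetermined analysis is complete:

```python
# Hypothetical sketch of a blinded randomisation protocol of the kind
# described above. Treatments, codes and group sizes are placeholders.
import random

random.seed(42)  # a fixed seed so the protocol itself can be regenerated

treatments = ["vehicle", "low dose", "mid dose", "high dose"]  # placeholders
codes = ["A", "B", "C", "D"]
per_group = 10  # placeholder; ideally set by a sample size calculation

# The blinding key: only a third party holds this until unblinding.
key = dict(zip(codes, random.sample(treatments, k=len(treatments))))

# Random allocation of animal IDs to coded groups.
animals = list(range(1, per_group * len(codes) + 1))
random.shuffle(animals)
allocation = {code: sorted(animals[i * per_group:(i + 1) * per_group])
              for i, code in enumerate(codes)}

for code, ids in allocation.items():
    print(f"Bottle {code}: animals {ids}")  # all the injector and tester see
# 'key' is stored separately and opened only after the planned analysis.
```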

Somewhat naively, I believed that this type of rigour was now commonplace in academic labs too, but the paper in PLOS Biology (Macleod et al, PLOS Biology 2015: e1002273) indicates that this might not be so. The authors looked at the incidence of randomisation, blinding and reporting of conflicts of interest in two datasets. The first set was selected from a random sample of 2000 publications from PubMed. Of the 814 primary research papers that remained after excluding physics and chemistry publications, 146 reported in vivo data where the full text was accessible. Of these, 27 stated that they had used randomisation, four that they had used blinding, and 15 reported whether the authors had potential conflicts of interest. Clearly this was a small sample, so a second set was analysed: the CAMARADES standardised data for disease models, comprising 2671 publications reporting drug efficacy in the eight most frequently represented models. Shockingly (given that this data set spanned publications from 1992-2011), only 24.8% reported randomisation and 29.5% blinding, and less than 1% reported a sample size calculation. This is clearly limited to in vivo studies, but I am sure there are worrying biases in in vitro studies too.
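For readers unfamiliar with the term, a sample size calculation simply works out how many animals are needed per group to have a reasonable chance of detecting an assumed effect before the study begins. A minimal sketch, with a purely illustrative effect size:

```python
# Minimal sketch of a sample size calculation for a two-group comparison.
# The effect size (Cohen's d = 0.8) is purely illustrative; in practice it
# would come from pilot data or the literature.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.8,  # assumed standardised difference between groups
    alpha=0.05,       # acceptable false-positive rate
    power=0.8,        # chance of detecting the effect if it is real
)
print(f"Animals needed per group: {n_per_group:.1f}")  # roughly 26
```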

So why does this matter? This lack of rigour will clearly contribute to the lack of reproducibility that has been observed, especially in preclinical biomedical research (Freedman et al, PLOS Biology 2015: e1002165), and this not only wastes money and provides false hope to patients but also erodes trust in science. Funders have a role to play, including ensuring appropriate training, but so does the whole scientific community. Lack of rigour – failing to blind experiments, to validate reagents and so on – should be just as unacceptable as picking your nose in the lab (assuming this is deemed to be unacceptable behaviour!). Supervisors, experimenters, lab heads and professors, as well as assessors and funders: science needs you to ensure the rigour that will maintain public trust and make efficient use of resources that are not currently in oversupply, and are unlikely to be in the future!

On a more positive note, BBSRC has always been a strong sponsor of the Daphne Jackson Trust. We sponsored the Trust’s conference this year, where the results of its 2015 survey of former Fellows were reported. The survey had a 79% response rate and showed that over 90% of Daphne Jackson Fellows continued working in STEM for the majority of their career post-Fellowship, with over 70% remaining in research-based roles for two years post-Fellowship and 57% for up to five years post-Fellowship. The scheme has been open to men for some years, and I hope more will avail themselves of it – it is a fantastic way of ensuring that talented individuals remain in STEM and that STEM reaps the benefit of previous investments in them.
