In Vitro Research Reproducibility: Keeping Up High Standards

PERSPECTIVE article

Front. Pharmacol., 10 December 2019

Sec. Predictive Toxicology

Volume 10 - 2019 | https://doi.org/10.3389/fphar.2019.01484

Abstract

Concern regarding the reproducibility of observations in life science research has emerged in recent years, particularly in view of unfavorable experiences with preclinical in vivo research. The use of cell-based systems has increasingly replaced in vivo research and the application of in vitro models enjoys an ever-growing popularity. To avoid repeating past mistakes, high standards of reproducibility and reliability must be established and maintained in the field of in vitro biomedical research. Detailed guidance documenting the appropriate handling of cells has been authored, but has been received very differently across the branches of biomedical research. In that regard, we intend to raise awareness of the reproducibility issue among scientists in all branches of contemporary life science research and of their individual responsibility in this matter. We have herein compiled a selection of the most susceptible steps of everyday in vitro cell culture routines that have the potential to influence cell quality, and we recommend practices that, with modest investment of time and resources, minimize the likelihood of poor cell quality impairing reproducibility.

Introduction

A survey published in Nature in 2016 (Baker, 2016) evaluating questionnaires on reproducibility in life science research disclosed the difficulties researchers have reproducing experiments not only from other laboratories, but also from their own. Even more surprising was the fact that awareness of this problem was widespread within the scientific community. The inability to reproduce study results, often inherent in observations from academic laboratories, is usually uncovered only after considerable delay, e.g. when potential therapies that are based on these findings transition from preclinical testing to the far more stringent conditions of clinical trials (Collins and Tabak, 2014). Needless to say, the societal costs associated with this problem are intolerable (Freedman et al., 2015). The controversial matter of insufficient reproducibility was, in fact, communicated openly in oncology and cardiovascular biology (Begley and Ellis, 2012; Errington et al., 2014; Libby, 2015). In toxicology, which may better reflect the background of most readers of this journal, awareness of this problem has emerged only gradually in association with insufficient in vivo reproducibility (Kilkenny et al., 2009; Voelkl et al., 2018). Such disclosures, in concert with studies indicating that in vivo data from rats and mice combined can only predict human clinical toxicology for less than 50% of candidate pharmaceuticals (Olson et al., 2000), prompted several toxicologists to shift from the traditional reliance on pharmacological and toxicological in vivo animal testing towards mechanistic in vitro assays.

In Vitro Models in Life Science Research

A major concern raised by researchers in different fields of biomedicine was how a cell culture model, often not even originating from the organ of interest, could provide information about multilayer processes and pathological outcomes in humans. In this context, it is important to understand that application-oriented fields, such as pharmacology or toxicology, operate to a large extent on the fundamental progress made in biomedical research over the past decades and exploit the wealth of information generated about cellular stress pathways and molecular processes. This paradigm shift was largely shaped by the US National Research Council’s (NRC) strategic plan to modernize methods for testing environmental toxicants (Natl. Res. Counc., 2007). The approach envisions the identification of molecular targets and pathways that are linked with a toxicological outcome and fosters the establishment and validation of high-throughput new approach methods (NAM) for quantitative assessment of target perturbations (Collins et al., 2008; U.S. Environ. Prot. Agency, 2009). A key element in the NRC’s strategy is its distinct focus on the quantitative detection of perturbations of defined molecular events [Key Events, (KE)], cellular stress pathways, and marker signatures that are predictive for a specific outcome (Adeleye et al., 2015). The experimental in vitro model of choice, therefore, needs to express the pathway or mechanism of interest and also needs to allow quantitative determination of the disturbance caused by the stressors. In this regard, the concept of adverse outcome pathways (AOP) was designed as a conceptual framework for the sequential organization of the molecular initiating event (MIE), connected with the adverse outcome (AO) via a series of KEs (Ankley et al., 2010). 
The AOP concept fosters the development or selection of assays allowing a quantitative detection of individual KEs, thereby enabling the definition of threshold levels (Leist et al., 2017; Terron et al., 2018). AOP also represents an organizational tool for identification of additive or synergistic effects that might occur through activation of identical or different KEs by two or more compounds. The stringent demand for precise quantitative and qualitative information required for AOPs illustrates the explicit necessity for experimental models with a high rate of reproducibility and the necessity for increased awareness of the reproducibility problem in all branches of life science research.
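The sequential organization described above can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class `AOP`, the helper `shared_key_events`, and all event names are hypothetical illustrations and are not taken from any published AOP. It shows how two pathways with different MIEs can be compared for shared KEs, which is the situation in which additive or synergistic effects of two compounds might be expected.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AOP:
    """One adverse outcome pathway: MIE -> ordered key events -> AO."""
    name: str
    mie: str                     # molecular initiating event
    key_events: Tuple[str, ...]  # ordered KEs between MIE and AO
    adverse_outcome: str

    def chain(self) -> List[str]:
        """Return the full sequence from MIE to AO."""
        return [self.mie, *self.key_events, self.adverse_outcome]


def shared_key_events(a: AOP, b: AOP) -> List[str]:
    """KEs present in both pathways, in the order they appear in `a`."""
    in_b = set(b.key_events)
    return [ke for ke in a.key_events if ke in in_b]


# Hypothetical pathways; the event names are placeholders for illustration.
pathway_x = AOP(
    name="pathway X",
    mie="binding to target A",
    key_events=("mitochondrial dysfunction", "oxidative stress", "cell death"),
    adverse_outcome="organ toxicity",
)
pathway_y = AOP(
    name="pathway Y",
    mie="binding to target B",
    key_events=("oxidative stress", "cell death"),
    adverse_outcome="organ toxicity",
)

# Compounds acting through X and Y converge on these KEs.
overlap = shared_key_events(pathway_x, pathway_y)
```

In such a representation, each KE would be paired with a quantitative assay and a threshold level; the sketch deliberately leaves those out because they depend on the specific pathway and model.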

Insufficient Reproducibility in Cell Models

A defined assay performed with a defined in vitro model needs to yield identical results— no matter when or where it is performed. As trivial as this statement may appear, its implementation is quite difficult in reality. The Nature survey of 2016 (Baker, 2016) highlighted the degree of inadequate reproducibility in biomedical research and underlined the widespread awareness of the problem within the scientific community. It is, thus, all the more astonishing that systematic comparisons of experimental models applied in different laboratories are rather rare, particularly in the field of in vitro research. In nanotoxicology, in vitro toxicity assays are the most frequently used approaches to assess potential hazardous effects of engineered nanomaterials (Guggenheim et al., 2018). This is mainly due to the fact that researchers early on realized that the immense number of newly developed nanomaterials would make it impossible to perform classical in vivo animal tests due to the amount of time, money, and number of animals required (Hartung and Sabbioni, 2011; Schrurs and Lison, 2012; Guggenheim et al., 2018). Nanomaterials exhibit unique properties due to their small size that make them suitable for many different applications. However, these same particle properties often interfere with experimental test systems (Wörle-Knirsch et al., 2006; Laurent et al., 2012; Bohmer et al., 2018). Insufficient nanoparticle characterization, unidentified interference with test systems, and poor definition of controls for monitoring assay performance led to several contradictory observations in the early days of nanotoxicology research (Hirsch et al., 2011). However, this shaky start allowed nanoparticle toxicology to emerge as one of the fields in which the aspect of adequate reproducibility of in vitro studies gained appropriate attention, and consequently, allowed an open discussion of this topic. 
An exemplary illustration of this transparency is a publication by Elliott and colleagues (Elliott et al., 2017) assessing the reproducibility of MTS-tetrazolium reduction assay results as indicators of cell viability in an international inter-laboratory comparison study with five independent laboratories. Strict standard operating procedures (SOP) were employed using a sophisticated 96-well plate design that allowed detection of up to seven parameters of assay performance, including accuracy of multi-channel pipetting, cell handling/cell growth, and instrument performance (i.e. plate reader issues) (Rösslein et al., 2015). A549 cells were purchased from two independent, credible, accepted commercial sources and both seemingly identical cell cultures were used in all labs. Even under such strict conditions, EC50 values of the two A549 cultures upon CdSO4 treatment differed by a factor of two in all laboratories. In the course of these investigations, the identity of the cell line, as established by cell line authentication, was discovered to be one of the main factors influencing assay results. Short tandem repeat sequencing revealed a partial chromosome deletion in one of the cell cultures. Technical aspects also contributed to result variability. For example, simple cell handling steps, such as PBS washing, were found to change assay outcomes significantly. This example provides a vivid illustration of the impact of seemingly trivial details and the necessity to draw attention to all aspects of in vitro experimentation.
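To make the notion of an EC50 shift between two cultures concrete, the sketch below estimates EC50 values from viability readings and expresses their ratio as a fold difference. It rests on several assumptions: the dose-response values are invented for illustration and are not data from Elliott et al., the function names are hypothetical, and the EC50 is obtained by simple interpolation on a log concentration axis rather than by the curve fitting (e.g. a four-parameter logistic model) that a real analysis would more typically use.

```python
import math


def ec50_by_interpolation(concentrations, viability_percent):
    """Estimate the concentration at which viability crosses 50%.

    Interpolates linearly between the two neighboring data points,
    using log10(concentration) as the x-axis.
    """
    points = sorted(zip(concentrations, viability_percent))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 50.0 >= v_hi and v_lo != v_hi:
            frac = (v_lo - 50.0) / (v_lo - v_hi)
            log_ec50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ec50
    raise ValueError("viability does not cross 50% in the tested range")


def fold_difference(ec50_a, ec50_b):
    """Ratio of the larger to the smaller EC50 (always >= 1)."""
    return max(ec50_a, ec50_b) / min(ec50_a, ec50_b)


# Invented example data: concentrations in arbitrary units, viability in %.
concentrations = [1, 10, 100, 1000]
culture_a = [100, 90, 50, 5]
culture_b = [100, 95, 70, 10]

ec50_a = ec50_by_interpolation(concentrations, culture_a)
ec50_b = ec50_by_interpolation(concentrations, culture_b)
fold = fold_difference(ec50_a, ec50_b)
```

With these invented readings the two cultures differ by roughly a factor of two, although the individual viability values at each concentration look similar at first glance.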

A recent evocative study of the mammary epithelial cell line MCF10A and growth rate inhibition by anti-cancer drugs systematically addressed inter- and intra-study center variations and identified factors contributing to insufficient reproducibility (Niepel et al., 2019). Although the five research centers applied cells and chemicals of the same stock, astonishing center-to-center variations up to 200-fold were observed in growth inhibition rates. Cell seeding, i.e. slight variations in initial cell numbers, was identified as one key source of these variations (for more details see Recommendation 5, Cell density and medium change). Overall, the subtle interplay between experimental methods and a vast array of poorly defined sources of biological variability was found to be the main cause of the observed irreproducibility. For example, two distinct methods were used to quantify cell viability: a) microscopic cell counting as a direct measure of viable cell number and b) detection of intracellular ATP levels as a proxy of viable cells. ATP levels do not necessarily directly correlate to the number of viable cells, so the two methods yielded identical EC50 values for some drugs, but greatly differing values for others. Changes in ATP levels following treatment could be the consequence of cell death, effects on cell proliferation or the alteration of cellular ATP metabolism. Furthermore, the assumption of linearity between cellular ATP levels and cell viability is not justified for many cell types. In several cases, a reduction of ATP levels by almost 50% is tolerated by cells without significant influence on cell viability (Pöltl et al., 2012). In conclusion, while both assays (direct cell counting and ATP measurements) might be quite robust and reproducible per se, they provide different information, e.g. for drugs that alter cellular ATP metabolism, and are thus not interchangeable in these cases. 
As a consequence of the huge number of individual biological factors involved, Niepel and colleagues came to the rather discouraging conclusion that “most examples of irreproducibility are themselves irreproducible” (Niepel et al., 2019).
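The difference between the two readouts can be illustrated with a small calculation. This is a simplified sketch, not the analysis used by Niepel and colleagues: the function names, cell numbers, and ATP-per-cell values are all hypothetical, and the ATP signal is assumed to be simply the number of viable cells multiplied by the average ATP content per cell.

```python
def viability_by_count(treated_cells, control_cells):
    """Viability in % from direct counting of viable cells."""
    return 100.0 * treated_cells / control_cells


def viability_by_atp(treated_cells, control_cells,
                     atp_per_cell_treated, atp_per_cell_control):
    """Apparent viability in % from total ATP signal.

    Assumes signal = viable cells x average ATP per cell (arbitrary units).
    """
    treated_signal = treated_cells * atp_per_cell_treated
    control_signal = control_cells * atp_per_cell_control
    return 100.0 * treated_signal / control_signal


# Scenario 1 (hypothetical): a cytotoxic drug kills half the cells,
# ATP content per surviving cell is unchanged.
count_cytotoxic = viability_by_count(5000, 10000)
atp_cytotoxic = viability_by_atp(5000, 10000, 1.0, 1.0)

# Scenario 2 (hypothetical): a drug halves ATP per cell without killing cells.
count_metabolic = viability_by_count(10000, 10000)
atp_metabolic = viability_by_atp(10000, 10000, 0.5, 1.0)
```

In the first scenario both readouts agree at 50%. In the second, counting reports full viability while the ATP readout reports 50%, so the two assays would place the same drug at very different potencies.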

This spectrum of biological factors further depends on the complexity of the cell model applied. The introduction of 2D co-culture models and 3D cell models was motivated by the ambition to recapitulate the natural in vivo environment of cells in a cell culture dish. In fact, cells in a 3D culture differ morphologically and physiologically from their counterparts in a 2D setup (Baharvand et al., 2006; Edmondson et al., 2014). Introduction of the third dimension in a cell culture model results in additional parameters that could potentially affect reproducibility, including spheroid size and consequently the oxygen and nutrient supplies to cells in different layers within the structure; spatial organization of surface receptors involved in interactions with neighboring cells; activation of signal transduction pathways; and induction of gene expression profiles (Vinci et al., 2012). All of these changes ultimately have the potential to influence cell biology and cellular response towards exogenous stressors. These aspects were exemplified by a study using HCT-116 cells in 3D culture that displayed increased resistance against anti-cancer drugs compared with the 2D model (Karlsson et al., 2012). The results in the 3D model more closely reflected in vivo observations. Nonetheless, the initial euphoria regarding such studies became gradually overshadowed by higher rates of insufficient reproducibility observed in complex cell models. To our knowledge, no systematic comparison of the reproducibility of cells in mono-culture vs. their integration into more complex models (co-culture, 3D, etc.) has yet been published. Rumors from the industry, however, indicate a returning trend towards more elementary cell models with robust readouts that allow both adequate predictivity and high reproducibility. 
This does not mean that complex cell models are inappropriate with respect to their reproducibility per se, but they require far higher investment in their characterization and validation to limit the degree of overall variability compared with less complex models.

Good Cell Culture Practice: Suggestions to Improve Reproducibility

To the best of our knowledge, no specific field of in vitro research faces greater issues of reproducibility than another. No particular cell line, cellular model, or particular assay seems to be favorable in this regard. In contrast, all fields of in vitro toxicology seem to face certain— though not necessarily the same— issues of irreproducibility. Therefore, the question arises which elements in in vitro biomedical research are potential sources of unsatisfactory reproducibility, and can be actively influenced by individual researchers with manageable effort and within the framework of the existing scientific system. Over the course of the past two decades, initiatives to improve the quality of in vitro research have identified critical aspects of in vitro cell culture routines and their influence on reproducibility. The concept of Good Cell Culture Practice (GCCP) (Coecke et al., 2005) was developed and gradually adapted to ongoing scientific progress (Pamies et al., 2018) as a guidance document for in vitro reporting standards. Other initiatives, such as the OECD guidance document for Good In Vitro Method Practices (GIVIMP) (Eskes et al., 2017), defined standards for regulatory testing under Good Laboratory Practice (GLP) rules. Recently, Petersen and colleagues very specifically discussed sources of variability in four distinct nano-bioassays (Petersen et al., 2019). The following selection of approaches to improve reproducibility of in vitro studies was loosely inspired by these initiatives and makes no claim to completeness. It is instead based on the experiences of the authors and communications with colleagues from adjacent scientific disciplines. An explicit emphasis was placed on broadly applicable techniques to improve the reproducibility of results obtained with cellular in vitro models, which can be implemented with relatively little investment and provide major benefits for the individual project and the scientific community as a whole. 
Figure 1 provides a summary of potential sources of variability that might influence in vitro results. The particular aspects marked in yellow in the diagram are discussed in more detail.

Recommendations

Discussion and Outlook

The necessity for enhancement of result reproducibility in the life sciences has been identified and is gradually being internalized by researchers and institutions alike (Prinz et al., 2011; Drucker, 2016). Achieving high rates of reproducibility often stands in contradiction to the discovery of novel scientific insights. Although it is obvious that new findings are only of use if they can be reproduced, Dirnagl recently cautioned that cutting-edge discovery is unavoidably associated with a high rate of false positive results (Dirnagl, 2019). These false positive results are an integral part of exploration and must not be conflated with intentional manipulation of results.

Even with the best of intentions, it must be concluded that the limits of reproducibility in cell culture work are reached when confronted with the question of reference standards, particularly for established and widely distributed cell lines. Simply put, which of the currently available and characterized stocks of common cell lines, like HeLa cells, should be considered as the gold standard? Even if a consensus could be reached for individual cell lines, storage capacity limitations force even large cell banks to passage their cells, which necessarily influences the cells in one way or another over time.

We have written the present article as a condensed introduction of effective, easy-to-implement measures to improve reproducibility of experimental results. In plain summary, the most relevant rules are:

Above all the aspects discussed, one of the most influential factors in any attempt to improve reproducibility is a researcher’s consolidated knowledge about the cell model in use (see Principle 1 of GCCP; Coecke et al., 2005). The in-depth characterization of cellular parameters relevant to a given scientific question is certainly a resource-consuming endeavor, but is a worthwhile investment in the long run, both in the selection of an appropriate cell model and interpretation of the results. As the residence time of scientists in laboratories is often limited, the quality and continuity of their supervision by experienced staff becomes a critical factor in knowledge transfer. The second of the most influential factors contributing to insufficient reproducibility is the lack of generally accepted and mandatory guidelines on the cultivation of individual cell lines. Guidance documents, such as the GCCP or GIVIMP, have been published for quite some time, but so far have not gained the attention they deserve from researchers and publishers alike. The question therefore arises how consensus on a standard protocol for a given cell line can be achieved and how its application can be motivated. This task can only be accomplished by members of the communities regularly using a given cell line. The formulation of new guideline protocols should be fostered by the respective scientific societies. Well established and influential laboratories might play a key role in reaching consensus on a ready-to-use standard protocol for the handling of a given cell line. Application of these protocols should in a next step become mandatory for the acceptance of a new study by the scientific community, unless deviations from the standard protocol can be scientifically justified. 
Such measures will not bring overwhelming scientific merit for the individual scientist, but are inevitable steps to re-establish and maintain confidence of both researchers and the general public in contemporary biomedical in vitro research.

Funding

We acknowledge funding from the NanoScreen Materials Challenge co-funded by the Competence Centre for Materials Science and Technology (CCMX) as well as support by the Doerenkamp-Zbinden foundation, the Land-BW (NEURODEG), the BMBF (NeuriTox) and by the Projects from the European Union’s Horizon 2020 research and innovation program EU-ToxRisk (grant agreement No 681002).

Statements

Acknowledgments

We would like to thank Editage (www.editage.com) for the English language editing. We thank Prof. Dr. Marcel Leist and Dr. Peter Wick for the critical reading of the manuscript.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Keywords

cell culture models, reproducibility, toxicology, good cell culture practice, new approach methods (NAM)

Citation

Hirsch C and Schildknecht S (2019) In Vitro Research Reproducibility: Keeping Up High Standards. Front. Pharmacol. 10:1484. doi: 10.3389/fphar.2019.01484

Received

21 August 2019

Accepted

15 November 2019

Published

10 December 2019

Volume

10 - 2019

Edited by

Sebastian Hoffmann, Seh consulting + services, Germany

Reviewed by

Zhichao Liu, National Center for Toxicological Research (FDA), United States; Albert P. Li, In Vitro ADMET Laboratories, United States

Copyright

© 2019 Hirsch and Schildknecht.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Stefan Schildknecht, Stefan.Schildknecht@uni-konstanz.de

This article was submitted to Predictive Toxicology, a section of the journal Frontiers in Pharmacology

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.