Anticancer drug sensitivity prediction in cell lines from baseline gene expression through recursive feature selection

Zuoli Dong et al. BMC Cancer. 2015.

Abstract

Background: An enduring challenge in personalized medicine is selecting the right drug for each patient. Testing drugs on patients in large clinical trials is one way to assess their efficacy and toxicity, but it is impractical to test the hundreds of drugs currently under development. A preclinical prediction model is therefore highly desirable, because it allows drug response to be predicted across hundreds of cell lines in parallel.

Methods: Recently, two large-scale pharmacogenomic studies screened multiple anticancer drugs on over 1000 cell lines in an effort to elucidate the response mechanisms of anticancer drugs. Building on these data, we used gene expression features and drug sensitivity data from the Cancer Cell Line Encyclopedia (CCLE) to build a predictor based on a Support Vector Machine (SVM) and a recursive feature selection tool. The robustness of the model was assessed by cross-validation and on an independent dataset, the Cancer Genome Project (CGP).

Results: Our model achieved good cross-validation performance for most drugs in CCLE (≥80% accuracy for 10 drugs, ≥75% accuracy for 19 drugs). Independent tests on the eleven drugs common to CCLE and CGP gave satisfactory performance for three of them, namely AZD6244, Erlotinib and PD-0325901, using the expression levels of only twelve, six and seven genes, respectively.

Conclusions: These results suggest that drug response can be effectively predicted from genomic features. Our model could be applied to predict the response to certain drugs and could potentially play a complementary role in personalized medicine.

Figures

Fig. 1

Computational framework. Left panel: cell lines in CCLE were first divided into three groups according to their normalized drug response values. Gene expression features were then selected by SVM-RFE to build an SVM model, with the optimal number of features and the model parameters chosen by 10-fold cross-validation. Right panel: to test the generalization ability of the model, the gene expression profiles of the CGP data set were fed to the model to obtain a label (sensitive or resistant) for each cell line. CGP performance was then measured by comparing the model predictions with the sample classification based on the normalized IC50.
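The feature selection step in this framework can be illustrated with a small sketch of SVM-RFE. It is not the authors' implementation: the linear SVM is trained with a plain subgradient descent written for this example, the helper names (`train_linear_svm`, `svm_rfe`), the hyperparameters and the synthetic data are all assumptions, and one feature is dropped per round using the squared-weight ranking that SVM-RFE is generally described with.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Fit a linear SVM (w, b) by batch subgradient descent on the
    L2-regularised hinge loss. Hyperparameters are illustrative assumptions."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin: hinge term is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_rfe(X, y, n_keep):
    """Recursive feature elimination: retrain on the surviving features and
    drop the one with the smallest squared weight until n_keep remain."""
    active = list(range(len(X[0])))  # original column indices still in play
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_active, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        active.pop(worst)
    return active

# Synthetic stand-in for expression data (hypothetical): column 2 carries
# the class signal, the other three columns are pure noise.
rng = random.Random(0)
y = [1 if i % 2 == 0 else -1 for i in range(40)]  # +1 sensitive, -1 resistant
X = [[rng.gauss(0, 1), rng.gauss(0, 1), 2.0 * yi + rng.gauss(0, 0.2), rng.gauss(0, 1)]
     for yi in y]

selected = svm_rfe(X, y, n_keep=1)
print(selected)
```

In the framework described in the caption, the number of features to keep is not fixed in advance but chosen by 10-fold cross-validation; the sketch leaves that outer loop out to stay short.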

Fig. 2

Sample classification for the drug Panobinostat. All samples are classified into three groups using a threshold of 0.8 standard deviations (SDs) of the normalized activity area.
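A minimal sketch of such a three-way split is given here. The caption does not spell out the normalization or the group names, so the z-score normalization, the use of the population SD, the label "intermediate" for the middle group, and the function name `label_by_sd` are assumptions; a higher activity area is taken to mean a more sensitive cell line.

```python
from statistics import mean, pstdev

def label_by_sd(activity_areas, threshold=0.8):
    """Label each cell line from its z-scored activity area.

    Assumed convention: z >= +threshold is 'sensitive', z <= -threshold is
    'resistant', anything in between is 'intermediate'.
    """
    mu = mean(activity_areas)
    sd = pstdev(activity_areas)  # population SD; an assumption of this sketch
    if sd == 0:
        raise ValueError("all activity areas are identical; cannot normalize")
    labels = []
    for area in activity_areas:
        z = (area - mu) / sd
        if z >= threshold:
            labels.append("sensitive")
        elif z <= -threshold:
            labels.append("resistant")
        else:
            labels.append("intermediate")
    return labels

# Hypothetical activity areas for five cell lines.
labels = label_by_sd([0.0, 1.0, 2.0, 3.0, 4.0])
print(labels)
```

Only the two outer groups would then serve as training classes for a sensitive-versus-resistant classifier.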

Fig. 3

Gene expression of sensitive and resistant cell lines for Panobinostat

Fig. 4

Elimination of batch effects by ComBat. Boxplots showing gene expression distributions before (a) and after (b) ComBat for five cell lines in CCLE and CGP.

Fig. 5

Prediction accuracy and number of selected features for four drugs. Prediction accuracy is shown at different numbers of top-ranked selected features for AZD6244, Erlotinib, Sorafenib and AZD0530. The optimal feature numbers are highlighted in red.

Fig. 6

Cross-validation results for CCLE drugs. For each drug in CCLE, model accuracy was obtained through 10-fold cross-validation. The bar plot shows the accuracy values for all drugs in CCLE.

Fig. 7

Independent tests of the CCLE model for AZD6244, Erlotinib and PD-0325901. Boxplots and ROC curves (the bottom curve indicates drug response, measured as the area over the dose–response curve, i.e., activity area) were built to evaluate the SVM model. (a) For AZD6244, the t test p-value is 3.316e-12 and the area under the curve is 0.668. (b) For Erlotinib, the t test p-value is 0.01885 and the area under the curve is 0.57. (c) For PD-0325901, the t test p-value is 5.851e-14 and the area under the curve is 0.70.
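The two summary statistics quoted in this caption can be sketched as follows. The caption does not say which t test variant was used or how the groups were formed, so the Welch (unequal variance) statistic, the function names and the toy numbers are assumptions. The area under the ROC curve is computed through its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one.

```python
from math import sqrt
from statistics import mean, variance

def auc_from_scores(scores_pos, scores_neg):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly; ties count as half a correct pair."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed variant;
    converting it to a p-value needs a t distribution, omitted here)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

# Hypothetical decision scores for truly sensitive vs. resistant cell lines.
auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
# Hypothetical measured responses in the two predicted groups.
t_stat = welch_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(auc, t_stat)
```

An area of 0.5 corresponds to random ranking, which is a useful yardstick when reading the values of 0.57 to 0.70 reported above.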
