Applications of proteochemometrics - from species extrapolation to cell line sensitivity modelling

Background

Proteochemometrics (PCM) is a predictive bioactivity modelling method that simultaneously models the bioactivity of multiple ligands against multiple targets. PCM permits exploration of the selectivity and promiscuity of ligands on biomolecular systems of different complexity, ranging from proteins to cell-line models [1, 2]. The suitability of PCM for predicting compound polypharmacology has been validated both retrospectively and prospectively through experimental validation [1, 2]. In practice, each ligand-target interaction is encoded by concatenating the ligand and target descriptor vectors, and the resulting vectors are used to train a single machine learning model. Including both chemical and target information enables interpolation and extrapolation in both the chemical and the biological space. PCM can therefore predict compound bioactivities on targets not present in the training set [3].
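The encoding step described above can be illustrated with a minimal Python sketch. It is not taken from the original work: the ligand identifiers, the descriptor values, and the bioactivity values are hypothetical placeholders, and real PCM models use far richer descriptors (for example fingerprints and amino acid descriptors).

```python
# Hypothetical ligand descriptors (two made-up features per ligand).
ligands = {
    "lig_A": [0.1, 0.9],
    "lig_B": [0.8, 0.2],
}

# Hypothetical target descriptors (a one-hot code per isoform, for illustration only).
targets = {
    "COX-1": [1.0, 0.0],
    "COX-2": [0.0, 1.0],
}

# Made-up bioactivity values (e.g. pIC50) for each ligand-target pair.
bioactivity = {
    ("lig_A", "COX-1"): 7.2,
    ("lig_B", "COX-1"): 5.1,
    ("lig_A", "COX-2"): 5.4,
    ("lig_B", "COX-2"): 6.8,
}


def encode_pair(ligand_id, target_id):
    """Concatenate the ligand and target descriptor vectors into one PCM row."""
    return ligands[ligand_id] + targets[target_id]


# One row per ligand-target interaction; a single model is trained on all rows.
X = [encode_pair(lig, tgt) for (lig, tgt) in bioactivity]
y = list(bioactivity.values())

for row, value in zip(X, y):
    print(row, value)
```

Because every row carries target information, the same ligand appears once per target with a different descriptor suffix, which is what lets a single model learn selectivity across targets.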

Results

In this contribution, we present a methodological advancement in the field [4], namely how Bayesian inference (Gaussian Processes) can be successfully applied in the context of PCM for (i) the prediction of compound bioactivity along with an error estimate for each prediction; (ii) the determination of the applicability domain of a PCM model; and (iii) the inclusion of the experimental uncertainty of bioactivity measurements. We illustrate how PCM can be used in medicinal chemistry to concomitantly optimize compound selectivity and potency in two application scenarios: (a) modelling isoform-selective cyclooxygenase inhibition; and (b) large-scale cancer cell line drug sensitivity prediction, where we benchmark the predictive signal of basal gene expression, gene copy-number variation, exome sequencing, and protein abundance data. We present the R package Chemically Aware Model Builder (camb) [5], which is able to perform the above-mentioned modelling tasks. camb is an open source platform for the generation of Structure-Activity and Structure-Property models. The functionalities of camb include: (i) standardisation of chemical structure representation, (ii) calculation of 905 one-dimensional descriptors and 14 fingerprints for small molecules, (iii) calculation of 8 types of amino acid descriptors, (iv) calculation of 13 whole protein sequence descriptors, and (v) training, validation and visualization of predictive models.
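To make the three uses of Gaussian Processes listed above more concrete, the following is a minimal, self-contained Python sketch of Gaussian Process regression on PCM-style rows. It is an illustration under stated assumptions, not the implementation of [4] or of camb: the squared-exponential kernel with unit length scale, the toy descriptor rows, the bioactivity values, and the per-measurement noise variances are all hypothetical. The predictive standard deviation serves as the error estimate, its growth away from the training data is one possible proxy for the applicability domain, and the experimental uncertainty enters through the diagonal noise term.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two descriptor vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def forward_sub(L, b):
    """Solve L z = b for lower-triangular L."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z


def backward_sub(L, z):
    """Solve L^T x = z for lower-triangular L."""
    n = len(z)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (z[i] - s) / L[i][i]
    return x


def gp_fit(X, y, noise_var):
    """Fit a GP; noise_var[i] is the assumed experimental variance of y[i]."""
    n = len(X)
    y_mean = sum(y) / n  # constant prior mean, estimated from the data
    K = [[rbf(X[i], X[j]) + (noise_var[i] if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = backward_sub(L, forward_sub(L, [v - y_mean for v in y]))
    return {"X": X, "L": L, "alpha": alpha, "y_mean": y_mean}


def gp_predict(model, x_new):
    """Return the predictive mean and standard deviation at x_new."""
    k_star = [rbf(x, x_new) for x in model["X"]]
    mean = model["y_mean"] + sum(k * a for k, a in zip(k_star, model["alpha"]))
    v = forward_sub(model["L"], k_star)
    var = rbf(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, math.sqrt(max(var, 0.0))


# Hypothetical PCM rows: two ligand features followed by two target features.
X = [
    [0.1, 0.9, 1.0, 0.0],
    [0.8, 0.2, 1.0, 0.0],
    [0.1, 0.9, 0.0, 1.0],
    [0.8, 0.2, 0.0, 1.0],
]
y = [7.2, 5.1, 5.4, 6.8]              # made-up bioactivities
noise_var = [0.09, 0.09, 0.25, 0.25]  # made-up experimental variances

model = gp_fit(X, y, noise_var)
mean_in, sd_in = gp_predict(model, X[0])                     # close to training data
mean_out, sd_out = gp_predict(model, [10.0, 10.0, 10.0, 10.0])  # far from training data
print(mean_in, sd_in)
print(mean_out, sd_out)
```

In this sketch, a query far from all training rows reverts to the mean of the training bioactivities with a standard deviation close to the prior value, which signals that the prediction lies outside the region the model has seen.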

Conclusions

Overall, the application of PCM in these two case scenarios leads us to conclude that PCM is a suitable technique, on these data, for modelling the activity of ligands exhibiting diverse bioactivity profiles across a panel of targets, which can range from protein binding sites (a) to cancer cell lines (b). The camb package constitutes a platform encompassing all steps for the generation of predictive models from chemical structures and their associated bioactivities or properties, which should improve reproducibility and simplify the generation of predictive bioactivity and property models.

References

  1. van Westen GJP, Wegner JK, Ijzerman AP, van Vlijmen HWT, Bender A: Proteochemometric Modeling as a Tool to Design Selective Compounds and for Extrapolating to Novel Targets. Med Chem Commun. 2011, 2: 16-30. 10.1039/c0md00165a.
  2. Cortes-Ciriano I, Ain QU, Subramanian V, Lenselink EB, Mendez-Lucio O, Ijzerman AP, Wohlfahrt G, Prusis P, Malliavin TE, van Westen GJP, Bender A: Polypharmacology Modelling Using Proteochemometrics (PCM): Recent Methodological Developments, Applications to Target Families, and Future Prospects. Med Chem Commun.
  3. van Westen GJP, Wegner JK, Geluykens P, Kwanten L, Vereycken I, Peeters A, Ijzerman AP, van Vlijmen HWT, Bender A: Which Compound to Select in Lead Optimization? Prospectively Validated Proteochemometric Models Guide Preclinical Development. PLoS ONE. 2011, 6: e27518. 10.1371/journal.pone.0027518.
  4. Cortes-Ciriano I, van Westen GJP, Lenselink EB, Murrell DS, Bender A, Malliavin TE: Proteochemometric Modelling in a Bayesian framework. J Cheminf. 2014, 6: 35. 10.1186/1758-2946-6-35.
  5. Murrell DS, Cortes-Ciriano I, van Westen GJP, Stott IP, Bender A, Malliavin TE, Glen RC: Chemically Aware Model Builder (camb): An R package for property and bioactivity modeling of small molecules. [http://www.github.com/cambDI/camb]

Author information

Authors and Affiliations

  1. Département de Biologie Structurale et Chimie, Institut Pasteur, Unité de Bioinformatique Structurale; CNRS UMR 3825, 25, rue du Dr Roux, 75015, Paris, France
    Isidro Cortes-Ciriano & Therese E Malliavin
  2. ChEMBL Group, European Molecular Biology Laboratory European Bioinformatics Institute, Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK
    Gerard JP van Westen
  3. Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
    Daniel S Murrell & Andreas Bender
  4. Division of Medicinal Chemistry, Leiden Academic Center for Drug Research, Leiden, The Netherlands
    Eelke B Lenselink

Authors

  1. Isidro Cortes-Ciriano
  2. Gerard JP van Westen
  3. Daniel S Murrell
  4. Eelke B Lenselink
  5. Andreas Bender
  6. Therese E Malliavin

About this article

Cite this article

Cortes-Ciriano, I., van Westen, G.J., Murrell, D.S. et al. Applications of proteochemometrics - from species extrapolation to cell line sensitivity modelling. BMC Bioinformatics 16 (Suppl 3), A4 (2015). https://doi.org/10.1186/1471-2105-16-S3-A4
