A.-C. Camproux - Academia.edu
Papers by A.-C. Camproux
Journal of Molecular Biology, Jun 1, 2004
Understanding and predicting protein structures depend on the complexity and the accuracy of the models used to represent them. We have recently set up a Hidden Markov Model to optimally compress protein three-dimensional conformations into a one-dimensional series of letters of a structural alphabet. Such a model learns simultaneously the shape of representative structural letters describing the local conformation and the logic of their connections, i.e. the transition matrix between the letters. Here, we move one step further and report some evidence that such a model of protein local architecture also captures some accurate amino acid features. All the letters have specific and distinct amino acid distributions. Moreover, we show that words of amino acids can have significant propensities for some letters. Perspectives point towards the prediction of the series of letters describing the structure of a protein from its amino acid sequence.
The original publication is available at www.springerlink.com. A reduced amino acid alphabet for understanding and designing protein adaptation to mutation.
Translational psychiatry, Jun 20, 2017
Early identification of Alzheimer's disease (AD) risk factors would aid development of interventions to delay the onset of dementia, but current biomarkers are invasive and/or costly to assess. Validated plasma biomarkers would circumvent these challenges. We previously identified the kinase DYRK1A in plasma. To validate DYRK1A as a biomarker for AD diagnosis, we assessed the levels of DYRK1A and the related markers brain-derived neurotrophic factor (BDNF) and homocysteine in two unrelated AD patient cohorts with age-matched controls. Receiver-operating characteristic curves and logistic regression analyses showed that combined assessment of DYRK1A, BDNF and homocysteine has a sensitivity of 0.952, a specificity of 0.889 and an accuracy of 0.933 in testing for AD. The blood levels of these markers provide a diagnosis assessment profile. Combined assessment of these three markers outperforms most of the previous markers and could become a useful substitute to the current panel of...
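For readers unfamiliar with the three figures quoted in this abstract, the following minimal sketch shows how sensitivity, specificity and accuracy are derived from a confusion matrix. The counts are hypothetical: they were chosen only so that the ratios match the reported values, and the abstract does not state the actual cohort sizes. The function name `diagnostic_metrics` is also an assumption made for illustration.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # fraction of patients correctly flagged
    specificity = tn / (tn + fp)                # fraction of controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all subjects classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts, NOT taken from the study: 21 patients and 9 controls.
sens, spec, acc = diagnostic_metrics(tp=20, fn=1, tn=8, fp=1)
print(round(sens, 3), round(spec, 3), round(acc, 3))
```

Because accuracy pools both groups, it depends on the patient-to-control ratio, whereas sensitivity and specificity do not.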
Nucleic Acids Research, 2015
Predicting a protein pocket's ability to bind drug-like molecules with high affinity, i.e. druggability, is of major interest in the target identification phase of drug discovery. Therefore, pocket druggability investigations represent a key step of compound clinical progression projects. Currently, computational druggability prediction models are attached to one unique pocket estimation method despite pocket estimation uncertainties. In this paper, we propose 'PockDrug-Server' to predict pocket druggability, efficient on both (i) estimated pockets guided by the ligand proximity (extracted by proximity to a ligand from a holo protein structure) and (ii) estimated pockets based solely on protein structure information (based on amino atoms that form the surface of potential binding cavities). PockDrug-Server provides consistent druggability results using different pocket estimation methods. It is robust with respect to pocket boundary and estimation uncertainties, thus efficient using apo pockets that are challenging to estimate. It clearly distinguishes druggable from less druggable pockets using different estimation methods and outperformed recent druggability models for apo pockets. It can be carried out from one or a set of apo/holo proteins using different pocket estimation methods proposed by our web server or from any pocket previously estimated by the user. PockDrug-Server is publicly available at: http://pockdrug.rpbs.univ-paris-diderot.fr.
The American journal of physiology, 1994
Luteinizing hormone (LH) is released by the pituitary in discrete pulses. In the monkey, the appearance of LH pulses in the plasma is invariably associated with sharp increases (i.e., volleys) in the frequency of the hypothalamic pulse generator electrical activity, so that continuous monitoring of this activity by telemetry provides a unique means to study the temporal structure of the mechanism generating the pulses. To assess whether the times of occurrence and durations of previous volleys exert significant influence on the timing of the next volley, we used a class of periodic counting process models that specify the stochastic intensity of the process as the product of two factors: 1) a periodic baseline intensity and 2) a stochastic regression function with covariates representing the influence of the past. This approach allows the characterization of circadian modulation and memory range of the process underlying hypothalamic pulse generator activity, as illustrated by fittin...
Understanding and predicting protein structures depends on the complexity and the accuracy of the models used to represent them. We have set up a Hidden Markov Model to optimally compress the three-dimensional (3D) conformation of proteins into a structural alphabet, i.e. a library of exhaustive and representative states (describing short fragments) learnt simultaneously with their connection logic. The discretization of protein backbone local conformation as a series of states results in a simplification of protein 3D coordinates into a unique unidimensional (1D) representation. We present some evidence that such an approach can constitute a very relevant way to analyse protein architecture, in particular for protein structure comparison or prediction.
Plant biotechnology journal, 2014
RNA-dependent RNA polymerase 6 (RDR6) and suppressor of gene silencing 3 (SGS3) act together in post-transcriptional transgene silencing mediated by small interfering RNAs (siRNAs) and in biogenesis of various endogenous siRNAs including the tasiARFs, known regulators of auxin responses and plant development. Legumes, the third major crop family worldwide, have been widely improved through transgenic approaches. Here, we isolated rdr6 and sgs3 mutants in the model legume Medicago truncatula. Two sgs3 and one rdr6 alleles led to strong developmental defects and impaired biogenesis of tasiARFs. In contrast, the rdr6.1 homozygous plants produced sufficient amounts of tasiARFs to ensure proper development. High throughput sequencing of small RNAs from this specific mutant identified 354 potential MtRDR6 substrates, for which siRNA production was significantly reduced in the mutant. Among them, we found a large variety of novel phased loci corresponding to protein-encoding genes or transp...
Theoretical Chemistry Accounts: Theory, Computation, and Modeling (Theoretica Chimica Acta), 2001
The prediction of loop conformations is one of the challenging problems of homology modeling, due to the large sequence variability associated with these parts of protein structures. In the present study, we introduce a search procedure that evolves in a structural alphabet space deduced from a hidden Markov model to simplify the structural information. It uses a Bayesian criterion to predict, from the amino acid sequence of a loop region, its corresponding word in the structural alphabet space.
Theoretical Chemistry Accounts: Theory, Computation, and Modeling (Theoretica Chimica Acta), 1999
Hidden Markov models were used to identify recurrent short 3D structural building blocks (SSBBs) describing protein backbones. Polypeptide chains were broken down into successive short segments defined by their inter-alpha-carbon distances. Fitting the model to a database of nonredundant proteins identified 12 distinct SSBBs and described the preferred pathways by which SSBBs were assembled to form the 3D structure of the proteins. Protein backbones were labelled in terms of these SSBBs. The observed SSBB preferences for fragments located between regular secondary structures suggested that they depended more on the following regular structure than on the preceding one. Extraction of repeated series of SSBBs between regular secondary structures showed some structural specificity within different connection types. These results confirm that SSBBs can be used as building blocks for analyzing protein structures, and can yield new information on the structures of the coils flanking secondary structures.
Protein Engineering Design and Selection, 1999
The hidden Markov model (HMM) was used to identify recurrent short 3D structural building blocks (SBBs) describing protein backbones, independently of any a priori knowledge. Polypeptide chains are decomposed into a series of short segments defined by their inter-α-carbon distances. Basically, the model takes into account the sequentiality of the observed segments and assumes that each one corresponds to one of several possible SBBs. Fitting the model to a database of non-redundant proteins allowed us to decode proteins in terms of 12 distinct SBBs with different roles in protein structure. Some SBBs correspond to classical regular secondary structures. Others correspond to a significant subdivision of their bounding regions previously considered to be a single pattern. The major contribution of the HMM is that this model implicitly takes into account the sequential connections between SBBs and thus describes the most probable pathways by which the blocks are connected to form the framework of the protein structures. Validation of the SBBs code was performed by extracting SBB series repeated in recoding proteins and examining their structural similarities. Preliminary results on the sequence specificity of SBBs suggest promising perspectives for the prediction of SBBs or series of SBBs from the protein sequences.
PLoS Computational Biology, 2010
Protein-protein interactions (PPIs) may represent one of the next major classes of therapeutic targets. So far, only a minute fraction of the estimated 650,000 PPIs that comprise the human interactome are known with a tiny number of complexes being drugged. Such intricate biological systems cannot be cost-efficiently tackled using conventional high-throughput screening methods. Rather, time has come for designing new strategies that will maximize the chance for hit identification through a rationalization of the PPI inhibitor chemical space and the design of PPI-focused compound libraries (global or target-specific). Here, we train machine-learning-based models, mainly decision trees, using a dataset of known PPI inhibitors and of regular drugs in order to determine a global physico-chemical profile for putative PPI inhibitors. This statistical analysis unravels two important molecular descriptors for PPI inhibitors characterizing specific molecular shapes and the presence of a privileged number of aromatic bonds. The best model has been transposed into a computer program, PPI-HitProfiler, that can output from any drug-like compound collection a focused chemical library enriched in putative PPI inhibitors. Our PPI inhibitor profiler is challenged on the experimental screening results of 11 different PPIs among which the p53/MDM2 interaction screened within our own CDithem platform, that in addition to the validation of our concept led to the identification of 4 novel p53/MDM2 inhibitors. Collectively, our tool shows a robust behavior on the 11 experimental datasets by correctly profiling 70% of the experimentally identified hits while removing 52% of the inactive compounds from the initial compound collections. We strongly believe that this new tool can be used as a global PPI inhibitor profiler prior to screening assays to reduce the size of the compound collections to be experimentally screened while keeping most of the true PPI inhibitors. PPI-HitProfiler is freely available on request from our CDithem platform website, www.CDithem.com.
Nucleic Acids Research, 2011
The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at
Nucleic Acids Research, 2004
SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search.
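The idea of extracting exact words from 1D structural-alphabet strings can be illustrated with a small sketch. It deliberately swaps the suffix tree mentioned in the abstract for a plain fixed-length word index, which is simpler to read but does not handle variable word lengths. The encoded strings, the protein names and the functions `build_word_index` and `shared_words` are all hypothetical and are not part of SA-Search.

```python
from collections import defaultdict


def build_word_index(sequences, k):
    """Map every word of length k to the (protein, position) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index


def shared_words(index):
    """Keep only the words that occur in more than one protein."""
    return {
        word: hits
        for word, hits in index.items()
        if len({name for name, _ in hits}) > 1
    }


# Hypothetical structural-alphabet encodings of two proteins.
encoded = {"protA": "AABCDDE", "protB": "XBCDDY"}
common = shared_words(build_word_index(encoded, k=4))
print(common)
```

Each shared word points to fragments in different proteins that were assigned the same series of structural letters, which is what makes them candidates for local 3D similarity.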
Nucleic Acids Research, 2004
SCit is a web server providing services for protein side chain conformation analysis and side chain positioning. Specific services use the dependence of the side chain conformations on the local backbone conformation, which is described using a structural alphabet that describes the conformation of fragments of four-residue length in a limited library of structural prototypes. Based on this concept, SCit uses sets of rotameric conformations dependent on the local backbone conformation of each protein for side chain positioning and the identification of side chains with unlikely conformations. The SCit web server is accessible at http://bioserv.rpbs.jussieu.fr/SCit.
Medicine, 2000
From January 1996 to January 1997, 321 patients with an average age of 46 +/- 16 years and chronically infected with hepatitis C virus (HCV) were prospectively enrolled in a study designed to determine the prevalence of extrahepatic manifestations associated with HCV infection in a large cohort of HCV patients, to identify associations between clinical and biologic manifestations, and to compare the results obtained in human immunodeficiency virus (HIV)-positive versus HIV-negative subsets. In a cross-sectional study, clinical extrahepatic manifestations, viral coinfections with HIV and/or hepatitis B virus, connective tissue diseases, and a wide panel of autoantibodies were assessed. Thirty-eight percent (122/321) of patients presented at least 1 clinical extrahepatic manifestation including arthralgia (60/321, 19%), skin manifestations (55/321, 17%), xerostomia (40/321, 12%), xerophthalmia (32/321, 10%), and sensory neuropathy (28/321, 9%). Main biologic abnormalities were mixed cryoglobulins (110/196, 56%), thrombocytopenia (50/291, 17%), and the presence of the following autoantibodies: antinuclear (123/302, 41%), rheumatoid factor (107/280, 38%), anticardiolipin (79/298, 27%), antithyroglobulin (36/287, 13%) and antismooth muscle cell (27/288, 9%). At least 1 autoantibody was present in 210/302 (70%) of sera. By multivariate logistic regression analysis, 4 parameters were significantly associated with cryoglobulin positivity: systemic vasculitis (p = 0.01, odds ratio [OR] = 17.3), HIV positivity (p = 0.0006, OR = 10.2), rheumatoid factor positivity (p = 0.01, OR = 2.8), and sicca syndrome (p = 0.03, OR = 0.27). A definite connective tissue disease was noted in 44 patients (14%), mainly symptomatic mixed cryoglobulinemia and systemic vasculitis. HIV coinfection (23%) was associated with 3 parameters: anticardiolipin (p = 0.003, OR = 4.18), thrombocytopenia (p = 0.01, OR = 3.56), and arthralgia or myalgia (p = 0.017, OR = 0.23). HIV-positive patients presented more severe histologic lesions (p = 0.0004). Extrahepatic clinical manifestations in HCV patients involve primarily the skin and joints. The most frequent immunologic abnormalities include mixed cryoglobulins, rheumatoid factor, antinuclear, anticardiolipin, and antithyroglobulin antibodies. Cryoglobulin positivity is associated with systemic vasculitis and rheumatoid factor and HIV positivity. HIV coinfection is associated with arthralgia or myalgia, anticardiolipin antibodies, and thrombocytopenia.
Journal of Molecular Biology, 2004
Understanding and predicting protein structures depends on the complexity and the accuracy of the models used to represent them. We have set up a hidden Markov model that discretizes protein backbone conformation as a series of overlapping fragments (states) of four residues length. This approach learns simultaneously the geometry of the states and their connections. We obtain, using a statistical criterion, an optimal systematic decomposition of the conformational variability of the protein peptidic chain in 27 states with strong connection logic. This result is stable over different protein sets. Our model fits well with previous knowledge related to protein architecture organisation and seems able to grab some subtle details of protein organisation, such as helix sub-level organisation schemes. Taking into account the dependence between the states results in a description of local protein structure of low complexity. On average, the model makes use of only 8.3 states among 27 to describe each position of a protein structure. Although we use short fragments, the learning process on entire protein conformations captures the logic of the assembly on a larger scale. Using such a model, the structure of proteins can be reconstructed with an average accuracy close to 1.1 Å root-mean-square deviation and for a complexity of only 3. Finally, we also observe that sequence specificity increases with the number of states of the structural alphabet. Such models can constitute a very relevant approach to the analysis of protein architecture, in particular for protein structure prediction.
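To make the notion of encoding a backbone as a series of hidden states more concrete, here is a toy sketch of Viterbi decoding, the standard algorithm for recovering the most probable state path of a hidden Markov model. Everything in it is an assumption made for illustration: the model has two letters instead of 27, each fragment is summarised by a single distance with a Gaussian emission, and the start, transition and emission parameters are invented rather than learnt. The abstract does not say which decoding procedure the authors use.

```python
import math


def log_gauss(x, mean, sd):
    """Log-density of a 1D Gaussian, used as the emission model of each state."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi(obs, states, log_start, log_trans, emit):
    """Return the most probable state string for a series of observations."""
    # best[t][s] is the log-probability of the best path ending in state s at step t.
    best = [{s: log_start[s] + log_gauss(obs[0], *emit[s]) for s in states}]
    back = []
    for x in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_gauss(x, *emit[s])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))


# Hypothetical two-letter alphabet: "H" (compact fragment) and "E" (extended fragment).
states = ["H", "E"]
log_start = {"H": math.log(0.5), "E": math.log(0.5)}
log_trans = {
    "H": {"H": math.log(0.9), "E": math.log(0.1)},
    "E": {"H": math.log(0.1), "E": math.log(0.9)},
}
emit = {"H": (5.5, 0.5), "E": (10.0, 0.5)}  # (mean, sd) of one fragment distance, invented

# Invented per-fragment distances for six consecutive fragments.
distances = [5.4, 5.6, 5.5, 9.8, 10.1, 10.0]
letters = viterbi(distances, states, log_start, log_trans, emit)
print(letters)
```

The transition terms are what distinguish this from assigning each fragment to its nearest prototype independently: a letter is chosen only if it is also a plausible successor of the previous one.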
The Journal of Heart and Lung Transplantation, 2000
Objective: Review the acute and late results of percutaneous transluminal coronary angioplasty (PTCA) in heart transplant recipients and examine the factors predictive of restenosis.
Journal of Molecular Biology, Jun 1, 2004
Understanding and predicting protein structures depend on the complexity and the accuracy of the ... more Understanding and predicting protein structures depend on the complexity and the accuracy of the models used to represent them. We have recently set up a Hidden Markov Model to optimally compress protein three-dimensional conformations into a one-dimensional series of letters of a structural alphabet. Such a model learns simultaneously the shape of representative structural letters describing the local conformation and the logic of their connections, i.e. the transition matrix between the letters. Here, we move one step further and report some evidence that such a model of protein local architecture also captures some accurate amino acid features. All the letters have specific and distinct amino acid distributions. Moreover, we show that words of amino acids can have significant propensities for some letters. Perspectives point towards the prediction of the series of letters describing the structure of a protein from its amino acid sequence.
The original publication is available at www.springerlink.com A reduced amino acid alphabet for u... more The original publication is available at www.springerlink.com A reduced amino acid alphabet for understanding and designing protein adaptation to mutation.
Translational psychiatry, Jun 20, 2017
Early identification of Alzheimer's disease (AD) risk factors would aid development of interv... more Early identification of Alzheimer's disease (AD) risk factors would aid development of interventions to delay the onset of dementia, but current biomarkers are invasive and/or costly to assess. Validated plasma biomarkers would circumvent these challenges. We previously identified the kinase DYRK1A in plasma. To validate DYRK1A as a biomarker for AD diagnosis, we assessed the levels of DYRK1A and the related markers brain-derived neurotrophic factor (BDNF) and homocysteine in two unrelated AD patient cohorts with age-matched controls. Receiver-operating characteristic curves and logistic regression analyses showed that combined assessment of DYRK1A, BDNF and homocysteine has a sensitivity of 0.952, a specificity of 0.889 and an accuracy of 0.933 in testing for AD. The blood levels of these markers provide a diagnosis assessment profile. Combined assessment of these three markers outperforms most of the previous markers and could become a useful substitute to the current panel of...
Nucleic Acids Research, 2015
Predicting protein pocket's ability to bind drug-like molecules with high affinity, i.e. druggabi... more Predicting protein pocket's ability to bind drug-like molecules with high affinity, i.e. druggability, is of major interest in the target identification phase of drug discovery. Therefore, pocket druggability investigations represent a key step of compound clinical progression projects. Currently computational druggability prediction models are attached to one unique pocket estimation method despite pocket estimation uncertainties. In this paper, we propose 'PockDrug-Server' to predict pocket druggability, efficient on both (i) estimated pockets guided by the ligand proximity (extracted by proximity to a ligand from a holo protein structure) and (ii) estimated pockets based solely on protein structure information (based on amino atoms that form the surface of potential binding cavities). PockDrug-Server provides consistent druggability results using different pocket estimation methods. It is robust with respect to pocket boundary and estimation uncertainties, thus efficient using apo pockets that are challenging to estimate. It clearly distinguishes druggable from less druggable pockets using different estimation methods and outperformed recent druggability models for apo pockets. It can be carried out from one or a set of apo/holo proteins using different pocket estimation methods proposed by our web server or from any pocket previously estimated by the user. PockDrug-Server is publicly available at: http://pockdrug.rpbs.univ-paris-diderot.fr.
The American journal of physiology, 1994
Luteinizing hormone (LH) is released by the pituitary in discrete pulses. In the monkey, the appe... more Luteinizing hormone (LH) is released by the pituitary in discrete pulses. In the monkey, the appearance of LH pulses in the plasma is invariably associated with sharp increases (i.e, volleys) in the frequency of the hypothalamic pulse generator electrical activity, so that continuous monitoring of this activity by telemetry provides a unique means to study the temporal structure of the mechanism generating the pulses. To assess whether the times of occurrence and durations of previous volleys exert significant influence on the timing of the next volley, we used a class of periodic counting process models that specify the stochastic intensity of the process as the product of two factors: 1) a periodic baseline intensity and 2) a stochastic regression function with covariates representing the influence of the past. This approach allows the characterization of circadian modulation and memory range of the process underlying hypothalamic pulse generator activity, as illustrated by fittin...
Understanding and predicting protein structures depends on the complexity and the accuracy of the... more Understanding and predicting protein structures depends on the complexity and the accuracy of the models used to represent them. We have setup a Hidden Markov Model to optimally compress three dimensional (3D) conformation of protein into a structural alphabet, i.e. a library of exhaustive and representative states (describing short fragments) learnt simultaneously with connection logic. The discretization of protein backbone local conformation as a series of states results in a simplification of protein 3D coordinates into a unique unidimensional (1D) representation. We present some evidence that such approach can constitute a very relevant way to the analysis of protein architecture in particular for protein structure comparison or prediction.
Plant biotechnology journal, 2014
RNA-dependent RNA polymerase 6 (RDR6) and suppressor of gene silencing 3 (SGS3) act together in p... more RNA-dependent RNA polymerase 6 (RDR6) and suppressor of gene silencing 3 (SGS3) act together in post-transcriptional transgene silencing mediated by small interfering RNAs (siRNAs) and in biogenesis of various endogenous siRNAs including the tasiARFs, known regulators of auxin responses and plant development. Legumes, the third major crop family worldwide, has been widely improved through transgenic approaches. Here, we isolated rdr6 and sgs3 mutants in the model legume Medicago truncatula. Two sgs3 and one rdr6 alleles led to strong developmental defects and impaired biogenesis of tasiARFs. In contrast, the rdr6.1 homozygous plants produced sufficient amounts of tasiARFs to ensure proper development. High throughput sequencing of small RNAs from this specific mutant identified 354 potential MtRDR6 substrates, for which siRNA production was significantly reduced in the mutant. Among them, we found a large variety of novel phased loci corresponding to protein-encoding genes or transp...
Theoretical Chemistry Accounts: Theory, Computation, and Modeling (Theoretica Chimica Acta), 2001
The prediction of loop conformations is one of the challenging problems of homology modeling, due... more The prediction of loop conformations is one of the challenging problems of homology modeling, due to the large sequence variability associated with these parts of protein structures. In the present study, we introduce a search procedure that evolves in a structural alphabet space deduced from a hidden Markov model to simplify the structural information. It uses a Bayesian criterion to predict, from the amino acid sequence of a loop region, its corresponding word in the structural alphabet space.
Theoretical Chemistry Accounts: Theory, Computation, and Modeling (Theoretica Chimica Acta), 1999
Hidden Markov models were used to identify recurrent short 3D structural building blocks (SSBBs) ... more Hidden Markov models were used to identify recurrent short 3D structural building blocks (SSBBs) describing protein backbones. Polypeptide chains were broken down into successive short segments de®ned by their inter-alpha-carbon distances. Fitting the model to a database of nonredundant proteins identi®ed 12 distinct SSBBs and described the preferred pathways by which SSBBs were assembled to form the 3D structure of the proteins. Protein backbones were labelled in terms of these SSBBs. The observed SSBB preferences for fragments located between regular secondary structures suggested that they depended more on the following regular structure than on the preceding one. Extraction of repeated series of SSBBs between regular secondary structures showed some structural speci®city within dierent connection types. These results con®rm that SSBBs can be used as building blocks for analyzing protein structures, and can yield new information on the structures of the coils¯anking secondary structures.
Protein Engineering Design and Selection, 1999
The hidden Markov model (HMM) was used to identify recurrent short 3D structural building blocks ... more The hidden Markov model (HMM) was used to identify recurrent short 3D structural building blocks (SBBs) describing protein backbones, independently of any a priori knowledge. Polypeptide chains are decomposed into a series of short segments defined by their inter-α-carbon distances. Basically, the model takes into account the sequentiality of the observed segments and assumes that each one corresponds to one of several possible SBBs. Fitting the model to a database of non-redundant proteins allowed us to decode proteins in terms of 12 distinct SBBs with different roles in protein structure. Some SBBs correspond to classical regular secondary structures. Others correspond to a significant subdivision of their bounding regions previously considered to be a single pattern. The major contribution of the HMM is that this model implicitly takes into account the sequential connections between SBBs and thus describes the most probable pathways by which the blocks are connected to form the framework of the protein structures. Validation of the SBBs code was performed by extracting SBB series repeated in recoding proteins and examining their structural similarities. Preliminary results on the sequence specificity of SBBs suggest promising perspectives for the prediction of SBBs or series of SBBs from the protein sequences.
PLoS Computational Biology, 2010
Protein-protein interactions (PPIs) may represent one of the next major classes of therapeutic targets. So far, only a minute fraction of the estimated 650,000 PPIs that comprise the human interactome are known, and only a tiny number of complexes have been drugged. Such intricate biological systems cannot be cost-efficiently tackled using conventional high-throughput screening methods. Rather, the time has come for designing new strategies that will maximize the chance for hit identification through a rationalization of the PPI inhibitor chemical space and the design of PPI-focused compound libraries (global or target-specific). Here, we train machine-learning-based models, mainly decision trees, using a dataset of known PPI inhibitors and of regular drugs in order to determine a global physico-chemical profile for putative PPI inhibitors. This statistical analysis unravels two important molecular descriptors for PPI inhibitors characterizing specific molecular shapes and the presence of a privileged number of aromatic bonds. The best model has been transposed into a computer program, PPI-HitProfiler, that can output from any drug-like compound collection a focused chemical library enriched in putative PPI inhibitors. Our PPI inhibitor profiler is challenged on the experimental screening results of 11 different PPIs, including the p53/MDM2 interaction screened within our own CDithem platform; in addition to validating our concept, this screen led to the identification of 4 novel p53/MDM2 inhibitors. Collectively, our tool shows a robust behavior on the 11 experimental datasets by correctly profiling 70% of the experimentally identified hits while removing 52% of the inactive compounds from the initial compound collections. We strongly believe that this new tool can be used as a global PPI inhibitor profiler prior to screening assays to reduce the size of the compound collections to be experimentally screened while keeping most of the true PPI inhibitors. PPI-HitProfiler is freely available on request from our CDithem platform website, www.CDithem.com.
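To make the two reported figures concrete, the sketch below works out what keeping 70% of hits and removing 52% of inactives would mean for a single library. The library composition (10 hits among 1,000 compounds) and the function name `filter_summary` are hypothetical, and the two rates are aggregate values over 11 datasets, so any one screen may behave differently.

```python
def filter_summary(n_hits, n_inactives, hit_recall=0.70, inactive_removal=0.52):
    """Expected effect of a pre-screening filter on one compound library.

    hit_recall and inactive_removal default to the aggregate figures
    quoted in the abstract; the library sizes are supplied by the caller.
    """
    kept_hits = n_hits * hit_recall
    kept_inactives = n_inactives * (1.0 - inactive_removal)
    rate_before = n_hits / (n_hits + n_inactives)
    rate_after = kept_hits / (kept_hits + kept_inactives)
    return {
        "library_size_after": kept_hits + kept_inactives,
        "hit_rate_before": rate_before,
        "hit_rate_after": rate_after,
        "enrichment": rate_after / rate_before,
    }

# Hypothetical library: 10 true hits among 1,000 compounds.
for name, value in filter_summary(10, 990).items():
    print(f"{name}: {value:.4f}")
```

When hits are rare, the enrichment factor approaches 0.70 / 0.48, roughly 1.46, while the library to be screened shrinks to about half its original size.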
Nucleic Acids Research, 2011
The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at
Nucleic Acids Research, 2004
SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search.
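The idea of searching for exact words in 1D structural-alphabet strings can be sketched as follows. This example deliberately swaps the suffix tree mentioned in the abstract for a plain dictionary index of fixed-length words, which is simpler but less general; the letter strings and the function names `index_words` and `shared_words` are hypothetical.

```python
from collections import defaultdict

def index_words(sa_string, k):
    """Map every word of length k in a structural-alphabet string
    to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(sa_string) - k + 1):
        index[sa_string[i:i + k]].append(i)
    return index

def shared_words(query, target, k):
    """Return the words of length k present in both strings, with
    their start positions in the query and in the target."""
    query_index = index_words(query, k)
    target_index = index_words(target, k)
    return {
        word: (positions, target_index[word])
        for word, positions in query_index.items()
        if word in target_index
    }

# Hypothetical encodings of two protein fragments.
print(shared_words("AABCDA", "XBCDAY", k=3))
```

A suffix tree would give the same exact matches without fixing the word length in advance, which is presumably why the tool uses one.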
Nucleic Acids Research, 2004
SCit is a web server providing services for protein side chain conformation analysis and side chain positioning. Specific services use the dependence of the side chain conformations on the local backbone conformation, which is described using a structural alphabet that describes the conformation of fragments of four-residue length in a limited library of structural prototypes. Based on this concept, SCit uses sets of rotameric conformations dependent on the local backbone conformation of each protein for side chain positioning and the identification of side chains with unlikely conformations. The SCit web server is accessible at http://bioserv.rpbs.jussieu.fr/SCit.
Medicine, 2000
From January 1996 to January 1997, 321 patients with an average age of 46 +/- 16 years and chronically infected with hepatitis C virus (HCV) were prospectively enrolled in a study designed to determine the prevalence of extrahepatic manifestations associated with HCV infection in a large cohort of HCV patients, to identify associations between clinical and biologic manifestations, and to compare the results obtained in human immunodeficiency virus (HIV)-positive versus HIV-negative subsets. In a cross-sectional study, clinical extrahepatic manifestations, viral coinfections with HIV and/or hepatitis B virus, connective tissue diseases, and a wide panel of autoantibodies were assessed. Thirty-eight percent (122/321) of patients presented at least 1 clinical extrahepatic manifestation including arthralgia (60/321, 19%), skin manifestations (55/321, 17%), xerostomia (40/321, 12%), xerophthalmia (32/321, 10%), and sensory neuropathy (28/321, 9%). Main biologic abnormalities were mixed cryoglobulins (110/196, 56%), thrombocytopenia (50/291, 17%), and the presence of the following autoantibodies: antinuclear (123/302, 41%), rheumatoid factor (107/280, 38%), anticardiolipin (79/298, 27%), antithyroglobulin (36/287, 13%) and antismooth muscle cell (27/288, 9%). At least 1 autoantibody was present in 210/302 (70%) of sera. By multivariate logistic regression analysis, 4 parameters were significantly associated with cryoglobulin positivity: systemic vasculitis (p = 0.01, odds ratio [OR] = 17.3), HIV positivity (p = 0.0006, OR = 10.2), rheumatoid factor positivity (p = 0.01, OR = 2.8), and sicca syndrome (p = 0.03, OR = 0.27). A definite connective tissue disease was noted in 44 patients (14%), mainly symptomatic mixed cryoglobulinemia and systemic vasculitis. HIV coinfection (23%) was associated with 3 parameters: anticardiolipin (p = 0.003, OR = 4.18), thrombocytopenia (p = 0.01, OR = 3.56), and arthralgia or myalgia (p = 0.017, OR = 0.23). HIV-positive patients presented more severe histologic lesions (p = 0.0004). Extrahepatic clinical manifestations in HCV patients involve primarily the skin and joints. The most frequent immunologic abnormalities include mixed cryoglobulins, rheumatoid factor, antinuclear, anticardiolipin, and antithyroglobulin antibodies. Cryoglobulin positivity is associated with systemic vasculitis and rheumatoid factor and HIV positivity. HIV coinfection is associated with arthralgia or myalgia, anticardiolipin antibodies, and thrombocytopenia.
Journal of Molecular Biology, 2004
Understanding and predicting protein structures depends on the complexity and the accuracy of the models used to represent them. We have set up a hidden Markov model that discretizes protein backbone conformation as series of overlapping fragments (states) of four residues length. This approach learns simultaneously the geometry of the states and their connections. We obtain, using a statistical criterion, an optimal systematic decomposition of the conformational variability of the protein peptidic chain in 27 states with strong connection logic. This result is stable over different protein sets. Our model fits previous knowledge related to protein architecture organisation well and seems able to grab some subtle details of protein organisation, such as helix sub-level organisation schemes. Taking into account the dependence between the states results in a description of local protein structure of low complexity. On average, the model makes use of only 8.3 states among 27 to describe each position of a protein structure. Although we use short fragments, the learning process on entire protein conformations captures the logic of the assembly on a larger scale. Using such a model, the structure of proteins can be reconstructed with an average accuracy close to 1.1 Å root-mean-square deviation and for a complexity of only 3. Finally, we also observe that sequence specificity increases with the number of states of the structural alphabet. Such models can constitute a very relevant approach to the analysis of protein architecture in particular for protein structure prediction.
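Once the states and their transition probabilities are learned, a structure can be encoded as the most probable series of states. The sketch below shows standard Viterbi decoding on a toy model with two states and discrete observations. It is not the published 27-state model, which works on continuous fragment geometry, and every name and probability in it is hypothetical.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable state path for an observation sequence,
    working in log space to avoid numerical underflow."""
    # best[t][s]: log-probability of the best path ending in state s at time t.
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t][s]: best predecessor of state s at time t + 1
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

def logs(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(v) for j, v in row.items()} for k, row in table.items()}

# Toy two-state model (hypothetical numbers): states tend to persist.
states = ["H", "C"]
log_start = {s: math.log(0.5) for s in states}
log_trans = logs({"H": {"H": 0.9, "C": 0.1}, "C": {"H": 0.1, "C": 0.9}})
log_emit = logs({"H": {"h": 0.8, "c": 0.2}, "C": {"h": 0.2, "c": 0.8}})
print("".join(viterbi("hhhccc", states, log_start, log_trans, log_emit)))
```

The transition matrix is what lets the decoder favour state series that follow the learned connection logic rather than choosing each position independently.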
The Journal of Heart and Lung Transplantation, 2000
Objective: Review the acute and late results of percutaneous transluminal coronary angioplasty (PTCA) in heart transplant recipients and examine the factors predictive of restenosis.