Codon arrangement modulates MHC-I peptides presentation
Related papers
BMC Bioinformatics, 2009
The major histocompatibility complex (MHC) molecule plays a central role in controlling the adaptive immune response to infections. MHC class I molecules present peptides derived from intracellular proteins to cytotoxic T cells, whereas MHC class II molecules stimulate cellular and humoral immunity through presentation of extracellularly derived peptides to helper T cells. Identification of which peptides will bind a given MHC molecule is thus of great importance for the understanding of host-pathogen interactions, and large efforts have been placed in developing algorithms capable of predicting this binding event.
MHCSeqNet: a deep neural network model for universal MHC binding prediction
BMC Bioinformatics
Background: Immunotherapy is an emerging approach in cancer treatment that activates the host immune system to destroy cancer cells expressing unique peptide signatures (neoepitopes). Administrations of cancer-specific neoepitopes in the form of synthetic peptide vaccine have been proven effective in both mouse models and human patients. Because only a tiny fraction of cancer-specific neoepitopes actually elicits immune response, selection of potent, immunogenic neoepitopes remains a challenging step in cancer vaccine development. A basic approach for immunogenicity prediction is based on the premise that effective neoepitope should bind with the Major Histocompatibility Complex (MHC) with high affinity. Results: In this study, we developed MHCSeqNet, an open-source deep learning model, which not only outperforms state-of-the-art predictors on both MHC binding affinity and MHC ligand peptidome datasets but also exhibits promising generalization to unseen MHC class I alleles. MHCSeqNet employed neural network architectures developed for natural language processing to model amino acid sequence representations of MHC allele and epitope peptide as sentences with amino acids as individual words. This consideration allows MHCSeqNet to accept new MHC alleles as well as peptides of any length. Conclusions: The improved performance and the flexibility offered by MHCSeqNet should make it a valuable tool for screening effective neoepitopes in cancer vaccine development.
OETMAP: a new feature encoding scheme for MHC class I binding prediction
Molecular and Cellular Biochemistry
Understanding T cell epitopes is critical for vaccine development. Cytotoxic T cells are activated when they recognize specific peptides bound to major histocompatibility complex (MHC) class I molecules; this is the major step in initiating an immune response. Knowledge of MHC specificity informs diagnosis and treatment of pathogen infections as well as peptide vaccine development. So far, a number of methods have been developed to predict MHC/peptide binding. In this article, a novel amino acid feature encoding scheme is proposed to predict MHC/peptide complexes. In the proposed method, we combine orthonormal encoding (OE) with Taylor’s Venn-diagram and use linear support vector machines as the classifier in the tests. We also compare our method to current feature encoding schemes. The tests were carried out on three comparatively large human leukocyte antigen (HLA)-A and HLA-B allele peptide-binding datasets extracted from the Immune Epitope Database and Analysis Resource. On all three datasets, an IC50 cutoff criterion was used to separate binder from non-binder peptides. Experimental results show that our amino acid encoding scheme leads to better classification performance than other amino acid encoding schemes on a standalone classifier.
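The orthonormal encoding and IC50 labelling mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Taylor's Venn-diagram features are left out, the function names are hypothetical, the example peptide is arbitrary, and the 500 nM cutoff is only an assumed default, since the abstract does not state the value used.

```python
# Minimal sketch of orthonormal (one-hot) peptide encoding plus IC50 labelling.
# Assumptions: 20 standard residues only; cutoff_nm=500.0 is an assumed default.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes, alphabetical


def orthonormal_encode(peptide):
    """Return a flat 0/1 list of length 20 * len(peptide)."""
    vector = []
    for residue in peptide:
        index = AMINO_ACIDS.index(residue)  # ValueError on non-standard residues
        one_hot = [0] * len(AMINO_ACIDS)
        one_hot[index] = 1
        vector.extend(one_hot)
    return vector


def is_binder(ic50_nm, cutoff_nm=500.0):
    """Label a peptide as a binder when its IC50 is at or below the cutoff."""
    return ic50_nm <= cutoff_nm


if __name__ == "__main__":
    features = orthonormal_encode("GILGFVFTL")  # arbitrary 9-residue example
    print(len(features), sum(features))
    print(is_binder(50.0), is_binder(5000.0))
```

Each residue contributes exactly one non-zero entry, so a nonamer becomes a 180-dimensional vector that a linear classifier can consume directly.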
Tissue Antigens, 2003
We have generated Artificial Neural Networks (ANN) capable of performing sensitive, quantitative predictions of peptide binding to the MHC class I molecule, HLA-A*0204. We have shown that such quantitative ANN are superior to conventional classification ANN, that have been trained to predict binding vs non-binding peptides. Furthermore, quantitative ANN allowed a straightforward application of a 'Query by Committee' (QBC) principle whereby particularly information-rich peptides could be identified and subsequently tested experimentally. Iterative training based on QBC-selected peptides considerably increased the sensitivity without compromising the efficiency of the prediction. This suggests a general, rational and unbiased approach to the development of high quality predictions of epitopes restricted to this and other HLA molecules. Due to their quantitative nature, such predictions will cover a wide range of MHC-binding affinities of immunological interest, and they can be readily integrated with predictions of other events involved in generating immunogenic epitopes. These predictions have the capacity to perform rapid proteome-wide searches for epitopes. Finally, it is an example of an iterative feedback loop whereby advanced, computational bioinformatics optimize experimental strategy, and vice versa. Proteomes are extremely diverse and can be used to ascertain the identity of any organism. This is true even at the level of oligopeptides. Indeed, the immune system has chosen peptides as one of its prime targets. It follows that proteomes can be translated into immunogens once it is known how the immune system generates and handles peptides (1). One of the most selective events is that of peptide binding to MHC. It is therefore important to establish accurate descriptions and predictions of peptide binding to the most common MHC haplotypes.
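As an illustration of the Query by Committee idea described in this abstract, the sketch below ranks candidate peptides by how much a committee of predictors disagrees about them. Using the variance of the predictions as the disagreement measure is an assumption, as are the toy scoring functions and the name `select_by_committee`; the abstract does not specify how the authors measured information content.

```python
# Minimal Query-by-Committee sketch: pick the peptides on which a committee
# of quantitative predictors disagrees the most.
# Assumption: disagreement = population variance of the committee's outputs.
from statistics import pvariance


def select_by_committee(peptides, committee, n_select):
    """Return the n_select peptides with the highest prediction variance."""
    scored = [
        (pvariance([model(peptide) for model in committee]), peptide)
        for peptide in peptides
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [peptide for _, peptide in scored[:n_select]]


# Toy stand-ins for trained networks (hypothetical, for illustration only).
def toy_model_a(peptide):
    return peptide.count("L") / len(peptide)


def toy_model_b(peptide):
    return peptide.count("V") / len(peptide)


def toy_model_c(peptide):
    return 0.5


if __name__ == "__main__":
    committee = [toy_model_a, toy_model_b, toy_model_c]
    candidates = ["LLLLLLLLL", "AAAAAAAAA", "LVLVLVLVA"]
    print(select_by_committee(candidates, committee, 2))
```

In an iterative loop, the selected peptides would be measured experimentally and added to the training set before the committee is retrained.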
Generating Rules for Predicting MHC Class I Binding Peptide using ANN and Knowledge-based GA
International Journal of Digital Content: Technology and its Applications, 2009
Cytotoxic T cells recognize specific peptides bound to major histocompatibility complex (MHC) class I molecule. Accurate prediction for the binding peptides could be of much use for the design of efficient peptide vaccines, which substantially reduce the cost of synthesizing and testing candidate binders. In this paper, we demonstrated that a machine learning approach can be successfully applied to extract rules to predict MHC class I binding peptides. We introduce a new method using a feed-forward neural network and genetic algorithm, and show that the proposed method outperforms other methods in both quantity and quality of the prediction rules. In order to verify the rules generated by our method, we compared them with the known-rules available in the HLA FactBook. Our method successfully identified most of the known-rules, and found some new additional rules for HLA-A*0204 and HLA-B*2706. We also found new rules for HLA-A*3301 for which no rules have ever been reported before.
Reliable prediction of T-cell epitopes using neural networks with novel sequence representations
Protein Science, 2003
In this paper we describe an improved neural network method to predict T-cell class I epitopes. A novel input representation has been developed consisting of a combination of sparse encoding, Blosum encoding, and input derived from hidden Markov models. We demonstrate that the combination of several neural networks derived using different sequence-encoding schemes has a performance superior to neural networks derived using a single sequence-encoding scheme. The new method is shown to have a performance that is substantially higher than that of other methods. By use of mutual information calculations we show that peptides that bind to the HLA A*0204 complex display signal of higher order sequence correlations. Neural networks are ideally suited to integrate such higher order correlations when predicting the binding affinity. It is this feature combined with the use of several neural networks derived from different and novel sequence-encoding schemes and the ability of the neural network to be trained on data consisting of continuous binding affinities that gives the new method an improved performance. The difference in predictive performance between the neural network methods and that of the matrix-driven methods is found to be most significant for peptides that bind strongly to the HLA molecule, confirming that the signal of higher order sequence correlation is most strongly present in high-binding peptides. Finally, we use the method to predict T-cell epitopes for the genome of hepatitis C virus and discuss possible applications of the prediction method to guide the process of rational vaccine design.
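The mutual information calculation referred to in this abstract can be sketched for a pair of peptide positions as follows. This is a plain plug-in estimate from observed frequencies; the function name and the two-residue toy peptides are hypothetical, and any small-sample correction the authors may have applied is not reproduced.

```python
# Minimal sketch: mutual information (in bits) between two positions of a
# set of equal-length peptides, estimated from raw frequencies.
from collections import Counter
from math import log2


def mutual_information(peptides, i, j):
    """Plug-in estimate of I(position i; position j) in bits."""
    n = len(peptides)
    pair_counts = Counter((p[i], p[j]) for p in peptides)
    i_counts = Counter(p[i] for p in peptides)
    j_counts = Counter(p[j] for p in peptides)
    total = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = i_counts[a] / n
        p_b = j_counts[b] / n
        total += p_ab * log2(p_ab / (p_a * p_b))
    return total


if __name__ == "__main__":
    coupled = ["AL", "AL", "VK", "VK"]      # position 1 is determined by position 0
    independent = ["AL", "AK", "VL", "VK"]  # all combinations equally frequent
    print(mutual_information(coupled, 0, 1))
    print(mutual_information(independent, 0, 1))
```

A value near zero means the two positions vary independently, whereas larger values indicate the kind of higher order correlation that a position-independent matrix method cannot represent.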
CapsNet-MHC predicts peptide-MHC class I binding based on capsule neural networks
Communications Biology
The Major Histocompatibility Complex (MHC) binds peptides derived from pathogens and presents them to killer T cells on the cell surface. Developing computational methods for accurate, fast, and explainable peptide-MHC binding prediction can facilitate immunotherapies and vaccine development. Various deep learning-based methods rely on separate feature extraction from the peptide and MHC sequences and ignore their pairwise binding information. This paper develops a capsule neural network-based method to efficiently capture the peptide-MHC complex features to predict the peptide-MHC class I binding. Various evaluations confirmed that our method outperforms the alternative methods, and that it provides accurate predictions when less data is available. Moreover, to provide precise insights into the results, we explored the essential features that contributed to the prediction. Since the simulation results demonstrated consistency with the experimental studies, we concluded that ...
2011
A fundamental step of an adaptive immune response to a pathogen or vaccine is the binding of short peptides (also called epitopes) to major histocompatibility complex (MHC) molecules. Various prediction algorithms are used to capture the MHC peptide binding preference, allowing rapid scans of entire pathogen proteomes for peptides likely to bind MHC and saving cost, effort, and time. However, the number of known binders/non-binders (BNB) to a specific MHC molecule is limited in many cases, which still poses a computational challenge for prediction. The training data should be adequate to predict BNB using any machine learning approach. In this study, a variable learning rate has been demonstrated for training an artificial neural network and predicting BNB for small datasets. The approach can be used for large datasets as well. Datasets for different MHC class I alleles for SARS Corona virus (Tor2 Replicase polyprotein 1ab) have been used for training and prediction of BNB. A total of 90 datasets (nine different MHC class I alleles with tenfold cross validation) have been retrieved from the IEDB database for BNB. For the fixed learning rate approach, the best value of AROC is 0.65, and in most cases it is 0.5, which shows poor prediction. With the variable learning rate, the value of AROC is between 0.806 and 1.0 for 76 of the 90 datasets, between 0.7 and 0.8 for 7 datasets, and between 0.5 and 0.7 for the remaining 7 datasets, which indicates very good performance in most cases. Keywords Variable learning rate • Artificial neural network • SARS Corona virus • MHC class I binder/non-binder • Epitope prediction • Vaccine designing • T-cell immune response
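To make the AROC figures quoted in this abstract concrete, the sketch below computes the area under the ROC curve as the fraction of binder/non-binder pairs that a predictor ranks correctly, counting ties as half. The function name and the toy scores are hypothetical; this is the standard rank-based definition and is not taken from the paper's own code.

```python
# Minimal sketch: area under the ROC curve (AROC) via pairwise comparison.
# Labels: 1 = binder, 0 = non-binder. Ties between scores count as 0.5.


def aroc(scores, labels):
    """Fraction of (binder, non-binder) pairs ranked in the right order."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(aroc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # perfect separation
    print(aroc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))  # constant predictor
```

A constant or random predictor scores 0.5 under this definition, which is why the abstract reads an AROC of 0.5 as poor prediction.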
Predicting epitopes recognized by cytotoxic T cells has been a long standing challenge within the field of immuno- and bioinformatics. While reliable predictions of peptide binding are available for most Major Histocompatibility Complex class I (MHCI) alleles, prediction models of T cell receptor (TCR) interactions with MHC class I-peptide complexes remain poor due to the limited amount of available training data. Recent next generation sequencing projects have however generated a considerable amount of data relating TCR sequences with their cognate HLA-peptide complex target. Here, we utilize such data to train a sequence-based predictor of the interaction between TCRs and peptides presented by the most common human MHCI allele, HLA-A*02:01. Our model is based on convolutional neural networks, which are especially designed to meet the challenges posed by the large length variations of TCRs. We show that such a sequence-based model allows for the identification of TCRs binding a giv...
Machine Learning Techniques: Approach for Mapping of MHC Class Binding Nonamers
Machine learning techniques play a major role in the field of immunoinformatics for DNA-binding domain analysis. Functional analysis of the binding ability of DNA-binding domain protein antigen peptides to major histocompatibility complex (MHC) class molecules is important in vaccine development. The variable length of each binding peptide complicates this prediction. Such predictions can be used to select epitopes for use in rational vaccine design and to increase the understanding of the roles of the immune system in infectious diseases. Antigenic epitopes of the DNA-binding domain protein from Human papilloma virus-31 are important determinants for protection of hosts from viral infection. This study shows their active part in host immune reactions and the involvement of MHC class I and MHC class II in the response to almost all antigens. We used PSSM and SVM algorithms for antigen design, which represented predicted binders as MHCII-IAb, MHCII-IAd, MHCII-IAg7, and MHCII-RT1.B nonamers from the viral DNA-binding domain crystal structure. These peptide nonamers are from a set of aligned peptides known to bind to a given MHC molecule, which serves as the predictor of MHC-peptide binding. Analysis shows potential drug targets to identify active sites against diseases.
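The PSSM approach named in this abstract can be sketched as a log-odds matrix built from aligned nonamers known to bind, which then scores new candidates. This is a generic illustration only: the uniform background frequency, the pseudocount of 1, the function names, and the toy peptides are all assumptions and do not come from the study.

```python
# Minimal sketch of a position-specific scoring matrix (PSSM) for nonamers.
# Assumptions: uniform background (1/20), additive pseudocount, toy binders.
from math import log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(binders, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per peptide position."""
    length = len(binders[0])
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for position in range(length):
        column = [peptide[position] for peptide in binders]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score_peptide(pssm, peptide):
    """Sum the per-position log-odds scores of a candidate peptide."""
    return sum(pssm[i][aa] for i, aa in enumerate(peptide))


if __name__ == "__main__":
    toy_binders = ["ALAKAAAAV", "ALAKAAAAL", "GLAKAAAAV"]  # hypothetical
    pssm = build_pssm(toy_binders)
    print(score_peptide(pssm, "ALAKAAAAV"))
    print(score_peptide(pssm, "WWWWWWWWW"))
```

Candidates that resemble the aligned binders at each position receive higher scores than unrelated sequences, which is what makes the matrix usable as a simple binding predictor.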