Derivation of HLA types from shotgun sequence datasets

René L Warren et al. Genome Med. 2012.

Abstract

The human leukocyte antigen (HLA) is key to many aspects of human physiology and medicine. All current sequence-based HLA typing methodologies are targeted approaches that require the amplification of specific HLA gene segments. Whole genome, exome and transcriptome shotgun sequencing can generate prodigious data, but because of the complexity of the HLA loci, these data have not been immediately informative regarding HLA genotype. We describe HLAminer, a computational method for identifying HLA alleles directly from shotgun sequence datasets (http://www.bcgsc.ca/platform/bioinfo/software/hlaminer). This approach circumvents the additional time and cost of generating HLA-specific data and capitalizes on the increasing accessibility and affordability of massively parallel sequencing.

Figures

Figure 1

Computational predictions of HLA-I from shotgun data by targeted assembly (left) or read alignment (right). For targeted assembly, NGS reads whose first fifteen 5' bases match one of the HLA CDS (RNA-Seq) or genomic (WGS/exon capture) sequences are recruited and assembled de novo with TASR. The resulting sequence contigs are aligned against a sequence database of all predicted HLA CDS (RNA-Seq) or genomic sequences (WGS/exon capture), tracking the best HLA hit(s). Reciprocal best alignments are considered in the same manner. Putative allele assignments from shotgun datasets (HLAminer) are informed by contig length, depth of coverage and similarity to reference sequences, when applicable. The probability of each prediction being correct is estimated by determining the probability of that prediction being observed by chance.
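
The read-recruitment step in the Figure 1 caption can be illustrated with a minimal Python sketch. It is not the HLAminer or TASR implementation: the function names, the exact-match seed lookup, and the toy sequences are assumptions made for illustration, and the sketch ignores reverse-complement matches and sequencing errors. It only shows the idea of keeping reads whose first fifteen 5' bases occur in a reference sequence.

```python
SEED_LEN = 15  # the caption's "first fifteen 5' bases"


def build_seed_index(references, seed_len=SEED_LEN):
    """Collect every seed_len-mer found in the reference sequences.

    `references` maps a (hypothetical) allele name to its sequence.
    """
    seeds = set()
    for seq in references.values():
        seq = seq.upper()
        for i in range(len(seq) - seed_len + 1):
            seeds.add(seq[i:i + seed_len])
    return seeds


def recruit_reads(reads, seeds, seed_len=SEED_LEN):
    """Keep reads whose first seed_len bases exactly match a reference seed."""
    return [
        read for read in reads
        if len(read) >= seed_len and read[:seed_len].upper() in seeds
    ]


# Toy data only; these are not real HLA sequences.
toy_references = {"toy_allele": "ACGTACGTACGTACGTACGTTTGACCA"}
toy_reads = ["ACGTACGTACGTACGTTT", "TTTTTTTTTTTTTTTTTTTT", "ACGT"]

recruited = recruit_reads(toy_reads, build_seed_index(toy_references))
```

In this toy run only the first read is recruited: the second has no matching seed and the third is shorter than the seed length. Recruited reads would then be passed to a de novo assembler, which is outside the scope of the sketch.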

Figure 2

HLAminer performance. HLA allele group and protein coding allele predictions derived from targeted read assembly (black symbols) or direct read alignment (grey symbols) of simulated 100-nucleotide RNA-Seq, WGS and exon capture (ExCap) datasets were compared to original, spiked-in, HLA sequences and performance metrics evaluated (ambiguity, sensitivity and specificity represented by circle, triangle and square symbols, respectively). HLAminer predictions were also obtained from targeted assembly of colorectal cancer (CRC; blue symbols), lymphoma (DLBCL; red, orange and yellow symbols), 1000 Genomes (1KG; green symbols) and ovarian cancer (OV; violet and magenta symbols) patient tumor (T) and/or matched normal (N) shotgun datasets and compared to PCR-based HLA types to calculate performance metrics.
