VDJviz: a versatile browser for immunogenomics data

Dmitriy V Bagaev et al. BMC Genomics. 2016.

Abstract

Background: The repertoire of T- and B-cell receptor sequences encodes the antigen specificity of the adaptive immune system, determines its present state and guides its ability to mount an effective response against antigens encountered in the future. High-throughput sequencing of immune repertoires (Rep-Seq) is a promising technique that allows millions of antigen receptors of an individual to be profiled in a single experiment. While a substantial number of tools for mapping and assembling Rep-Seq data have been published recently, the field still lacks an intuitive and flexible tool that can be used by researchers with little or no computational background for in-depth analysis of immune repertoire profiles.

Results: Here we report VDJviz, a web tool that can be used to browse, analyze and perform quality control of Rep-Seq results generated by various pre-processing software. Using a set of real data examples, we show that VDJviz can be used to explore key repertoire characteristics such as spectratype, repertoire clonality and V-(D)-J recombination patterns, and to identify shared clonotypes. We also demonstrate the utility of VDJviz in detecting critical Rep-Seq biases such as artificial repertoire diversity and cross-sample contamination.

Conclusions: VDJviz is a versatile and lightweight tool that can be easily employed by biologists, immunologists and immunogeneticists for routine analysis and quality control of Rep-Seq data. The software is freely available for non-commercial purposes and can be downloaded from https://github.com/antigenomics/vdjviz.

Keywords: B-cell; Browser; High-throughput sequencing; Immunology; Repertoire sequencing; T-cell.

Figures

Fig. 1

Spectratypes of 6- and 64-year-old donors. Each histogram bin corresponds to a CDR3 nucleotide sequence length; clonotypes are weighted by their frequency, and the fraction contributed by the top 10 most abundant clonotypes is shown with colored bars. Note that bars at CDR3 lengths that are not a multiple of 3 represent out-of-frame clonotypes
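
The spectratype described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not VDJviz code: the function names and the `(cdr3_nt, frequency)` input layout are assumptions made for the example.

```python
from collections import defaultdict


def spectratype(clonotypes):
    """Sum clonotype frequencies per CDR3 nucleotide length.

    `clonotypes` is an iterable of (cdr3_nt, frequency) pairs; this input
    layout is an assumption of the sketch, not a VDJviz file format.
    """
    hist = defaultdict(float)
    for cdr3_nt, freq in clonotypes:
        hist[len(cdr3_nt)] += freq
    return dict(hist)


def out_of_frame_lengths(hist):
    """Return histogram bins whose length is not a multiple of 3."""
    return sorted(length for length in hist if length % 3 != 0)


# Toy data: two in-frame clonotypes and one out-of-frame clonotype.
clones = [("TGTGCCAGC", 0.5), ("TGTGCCAGCAGC", 0.25), ("TGTGCCAGCA", 0.25)]
hist = spectratype(clones)
```

Bins returned by `out_of_frame_lengths` correspond to the bars that the caption flags as out-of-frame clonotypes.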

Fig. 2

Variable segment spectratype and Variable-Joining segment usage chord diagram. a Distribution of CDR3 nucleotide sequence lengths weighted by clonotype frequency. The most enriched Variable segments are explicitly shown. b Chord diagram of Variable-Joining junction abundance. Segment lengths are scaled according to the abundance of a specific segment, and arc widths are scaled by the abundance of the corresponding Variable-Joining junctions

Fig. 3

VDJviz clonality plot, a nested pie chart divided into the following regions: singletons (clonotypes represented by a single read), doubletons (2 reads) and high order (3 or more reads). High order clonotypes are divided into five quantiles (top 20 % of unique high order clonotypes and so on). The top ten clonotypes of the first quantile are explicitly shown. The size of each segment is the cumulative frequency of all clonotypes that fall into the corresponding frequency category
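
The partition behind this plot can be illustrated with a short Python sketch. It assumes a non-empty list of per-clonotype read counts and splits the high order clonotypes into five equal-sized rank groups; the function name, the `Q1`..`Q5` labels and the exact quantile boundaries are assumptions of the example and may differ from what VDJviz does.

```python
def clonality_summary(read_counts):
    """Cumulative frequency of singletons, doubletons and five quantiles
    of high order clonotypes (3 or more reads).

    `read_counts` is a non-empty list of per-clonotype read counts
    (a hypothetical input layout chosen for this sketch).
    """
    total = sum(read_counts)
    summary = {
        "singleton": sum(c for c in read_counts if c == 1) / total,
        "doubleton": sum(c for c in read_counts if c == 2) / total,
    }
    # Rank high order clonotypes from most to least abundant.
    high = sorted((c for c in read_counts if c >= 3), reverse=True)
    n = len(high)
    for q in range(5):
        start, end = q * n // 5, (q + 1) * n // 5
        summary[f"Q{q + 1}"] = sum(high[start:end]) / total
    return summary


summary = clonality_summary([1, 1, 2, 10, 8, 6, 5, 3])
```

Because every clonotype falls into exactly one category, the values in `summary` add up to 1, which is what lets them be drawn as pie segments.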

Fig. 4

VDJviz interactive rarefaction plot (diversity vs sampling depth) for T-cell repertoires from two replicate PBMC samples processed with various error correction strategies including quality filtering (q20 and q35 thresholds), elimination of clonotypes encountered only in one of the samples (“intersection”) and frequency-based error correction (“freq”). Solid lines show rarefaction curves computed using observed clonotype frequencies, dashed lines represent their extrapolations. Note that the expected sample diversity is ~90 clonotypes according to UMI-corrected data
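
For readers unfamiliar with rarefaction, the interpolated (solid-line) part of such a curve can be sketched with the classical hypergeometric formula for the expected number of distinct clonotypes in a random subsample drawn without replacement. This is a generic textbook estimator offered as an assumption; the source does not state which estimator VDJviz uses, and extrapolation beyond the observed depth is not covered here.

```python
from math import comb


def expected_diversity(read_counts, depth):
    """Expected number of distinct clonotypes among `depth` reads drawn
    without replacement from a sample with the given per-clonotype
    read counts (classical rarefaction, interpolation only).
    """
    total = sum(read_counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must be between 0 and the total read count")
    denom = comb(total, depth)
    # A clonotype with c reads is missed with probability
    # C(total - c, depth) / C(total, depth).
    return sum(1 - comb(total - c, depth) / denom for c in read_counts)


counts = [2, 1, 1]
curve = [expected_diversity(counts, m) for m in range(sum(counts) + 1)]
```

Plotting `curve` against the sampling depth gives a rarefaction curve that starts at zero and reaches the observed number of clonotypes at full depth.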

Fig. 5

VDJviz clonotype browser interface snapshots showing clonotypes matching a given CDR3 amino acid sequence in a single sample (a) and across multiple samples (b). a A trace of erroneous variants for one of the top clonotypes from the sample2_q35 dataset described in example #4. b Matching the CDR3 nucleotide sequence of a cancer clonotype in post-treatment samples. The panel shows the presence of minimal residual disease in the corresponding patient (D29_17), as well as cross-sample contamination in two other patients

Fig. 6

A snapshot of clonotype sharing (Join sample tab of VDJviz) across multiple samples. Clonotypes of 41 healthy donors of various ages were matched by their CDR3 amino acid sequence selecting the ones that were present in at least 10 repertoires
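
The matching step described in this caption amounts to counting, for each CDR3 amino acid sequence, the number of repertoires that contain it and keeping sequences above a threshold. The Python sketch below illustrates the idea; the function name, the dictionary input layout and the toy sequences are assumptions, not part of VDJviz.

```python
from collections import Counter


def shared_clonotypes(samples, min_samples):
    """Return {cdr3_aa: number of samples containing it} for sequences
    found in at least `min_samples` samples.

    `samples` maps a sample name to an iterable of CDR3 amino acid
    sequences (a hypothetical layout chosen for this sketch).
    """
    incidence = Counter()
    for cdr3_list in samples.values():
        # Count each sequence once per sample, regardless of abundance.
        incidence.update(set(cdr3_list))
    return {cdr3: n for cdr3, n in incidence.items() if n >= min_samples}


samples = {
    "donor_A": ["CASSL", "CASSF", "CASSL"],
    "donor_B": ["CASSL"],
    "donor_C": ["CASSL", "CASSF", "CASSY"],
}
shared = shared_clonotypes(samples, min_samples=2)
```

With 41 donors and `min_samples=10`, the same call would reproduce the selection criterion stated in the caption.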
