Divergent human populations show extensive shared IGK rearrangements in peripheral blood B cells

Automated analysis of high-throughput B cell sequencing data reveals a high frequency of novel immunoglobulin V gene segment alleles

PNAS

Individual variation in germline and expressed B-cell immunoglobulin (Ig) repertoires has been associated with aging, disease susceptibility, and differential response to infection and vaccination. Repertoire properties can now be studied at large scale through next-generation sequencing of rearranged Ig genes. Accurate analysis of these repertoire-sequencing (Rep-Seq) data requires identifying the germline variable (V), diversity (D), and joining (J) gene segments used by each Ig sequence. Current V(D)J assignment methods work by aligning sequences to a database of known germline V(D)J segment alleles. However, existing databases are likely to be incomplete, and novel polymorphisms are hard to differentiate from the frequent occurrence of somatic hypermutations in Ig sequences. Here we develop a Tool for Ig Genotype Elucidation via Rep-Seq (TIgGER). TIgGER analyzes mutation patterns in Rep-Seq data to identify novel V segment alleles, and also constructs a personalized germline database containing the specific set of alleles carried by a subject. This information is then used to improve the initial V segment assignments from existing tools, like IMGT/HighV-QUEST. The application of TIgGER to Rep-Seq data from seven subjects identified 11 novel V segment alleles, including at least one in every subject examined. These novel alleles constituted 13% of the total number of unique alleles in these subjects, and impacted 3% of V(D)J segment assignments. These results reinforce the highly polymorphic nature of human Ig V genes, and suggest that many novel alleles remain to be discovered. The integration of TIgGER into Rep-Seq processing pipelines will increase the accuracy of V segment assignments, thus improving B-cell repertoire analyses.
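The abstract above says only that mutation patterns are used to separate novel alleles from somatic hypermutation. As a rough illustration of that idea, and not TIgGER's actual procedure, the Python sketch below flags positions where most lightly mutated reads disagree with the assigned germline allele. The function name, the equal-length pre-aligned input, and both thresholds are assumptions made for this example.

```python
from collections import Counter


def candidate_polymorphisms(germline, reads, max_mutations=2, min_fraction=0.8):
    """Return {position: base} for positions that differ from the germline
    in most lightly mutated reads. Thresholds are hypothetical."""
    # Keep only reads with few mismatches, where somatic hypermutation
    # is an unlikely explanation for a recurring difference.
    light = []
    for read in reads:
        if len(read) != len(germline):
            continue  # this sketch assumes pre-aligned, equal-length reads
        mismatches = sum(1 for g, r in zip(germline, read) if g != r)
        if mismatches <= max_mutations:
            light.append(read)

    candidates = {}
    if not light:
        return candidates

    for pos, germline_base in enumerate(germline):
        counts = Counter(read[pos] for read in light)
        base, n = counts.most_common(1)[0]
        # A non-germline base shared by most lightly mutated reads looks
        # more like an inherited polymorphism than a somatic mutation.
        if base != germline_base and n / len(light) >= min_fraction:
            candidates[pos] = base
    return candidates
```

A position reported here would still need the kind of independent validation the abstracts on this page describe, such as sequencing of non-B cell DNA.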

Polymorphisms in immunoglobulin heavy chain variable genes and their upstream regions

Germline variations in immunoglobulin genes influence the repertoire of B cell receptors and antibodies, and such polymorphisms may impact disease susceptibility. However, the knowledge of the genomic variation of the immunoglobulin loci is scarce. Here, we report 25 novel germline IGHV alleles as inferred from rearranged naïve B cell cDNA repertoires of 98 individuals. Thirteen novel alleles were selected for validation, out of which ten were successfully confirmed by targeted amplification and Sanger sequencing of non-B cell DNA. Moreover, we detected a high degree of variability upstream of the V-region in the 5’UTR, leader 1, and leader 2 sequences, and found that identical V-region alleles can differ in upstream sequences. Thus, we have identified a large genetic variation not only in the V-region but also in the upstream sequences of IGHV genes. Our findings challenge current approaches used for annotating immunoglobulin repertoire sequencing data.

Tracing CLL-biased stereotyped immunoglobulin gene rearrangements in normal B cell subsets using a high-throughput immunogenetic approach

Molecular Medicine

Background: The B cell receptor immunoglobulin (BcR IG) repertoire of chronic lymphocytic leukemia (CLL) is characterized by the expression of quasi-identical BcR IG. These are observed in approximately 30% of patients, defined as stereotyped receptors and subdivided into subsets based on specific VH CDR3 aa motifs and phylogenetically related IGHV genes. Although relevant to CLL ontogeny, the distribution of CLL-biased stereotyped immunoglobulin rearrangements (CBS-IG) in normal B cells has not so far been specifically addressed using modern sequencing technologies. Here, we have investigated the presence of CBS-IG in splenic B cell subpopulations (s-BCS) and in CD5+ and CD5− B cells from the spleen and peripheral blood (PB). Methods: Fractionation of splenic B cells into 9 different B cell subsets and that of spleen and PB into CD5+ and CD5− cells were carried out by FACS sorting. cDNA sequences of BcR IG gene rearrangements were obtained by NGS. Identification of amino acidic motifs ty...

Molecular mechanisms and selective influences that shape the kappa gene repertoire of IgM+ B cells

Journal of Clinical Investigation, 1997

To analyze the human kappa chain repertoire and the influences that shape it, a single cell PCR technique was used that amplified VκJκ rearrangements from genomic DNA of individual human B cells. More than 350 productive and 250 nonproductive VκJκ rearrangements were sequenced. Nearly every functional Vκ gene segment was used in rearrangements, although six Vκ gene segments, A27, L2, L6, L12a, A17, and O12/O2, were used preferentially. Of these, A27, L2, L6, and L12a showed evidence of positive selection based on the variable region and not CDR3, whereas A17 was overrepresented because of a rearrangement bias based on molecular mechanisms. Utilization of Jκ segments was also nonrandom, with Jκ1 and Jκ2 being overrepresented and Jκ3 and Jκ5 underrepresented in the nonproductive repertoire, implying a molecular basis for the bias. In B cells with two VκJκ rearrangements, marked differences were noted in the Vκ segments used for the initial and subsequent rearrangements, whereas Jκ segments were used comparably. Junctional diversity was generated by n-nucleotide addition in 60% and by exonuclease trimming in 75% of the VκJκ rearrangements analyzed. Despite this large degree of diversity, a strict CDR3 length was maintained in both productive and nonproductive rearrangements. More than 23% of the productive rearrangements, but only 7% of the nonproductive rearrangements, contained somatic hypermutations. Mutations were significantly more frequent in Vκ sequences derived from CD5− as compared with CD5+ B cells. These results document that the gene segment utilization within the Vκ repertoire is biased by both intrinsic molecular processes as well as selection after light chain expression. Moreover, IgM+ memory cells with highly mutated kappa genes reside within the CD5− but not the CD5+ B cell compartment.
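The productive versus nonproductive split used in this abstract is commonly decided by reading frame and stop codons. The Python sketch below shows one simplified way to make that call. It assumes the input is the rearranged coding sequence starting in frame at its first base, which is an assumption of this example rather than the paper's protocol; real pipelines usually check the frame against conserved anchor residues.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_productive(coding_seq):
    """Return True if the sequence is in frame and has no stop codon.

    Simplified rule: the length must be a non-zero multiple of three
    and no codon read from the first base may be a stop codon.
    """
    seq = coding_seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False  # empty input or a frameshift
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return not any(codon in STOP_CODONS for codon in codons)
```

Nonproductive rearrangements are useful in analyses like this one because they are not subject to selection on the expressed protein, so biases seen in them point to the recombination mechanism itself.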

Human B-cell isotype switching origins of IgE

The Journal of allergy and clinical immunology, 2015

B cells expressing IgE contribute to immunity against parasites and venoms and are the source of antigen specificity in allergic patients, yet the developmental pathways producing these B cells in human subjects remain a subject of debate. Much of our knowledge of IgE lineage development derives from model studies in mice rather than from human subjects. We evaluate models for isotype switching to IgE in human subjects using immunoglobulin heavy chain (IGH) mutational lineage data. We analyzed IGH repertoires in 9 allergic and 24 healthy adults using high-throughput DNA sequencing of 15,843,270 IGH rearrangements to identify clonal lineages of B cells containing members expressing IgE. Somatic mutations in IGH inherited from common ancestors within the clonal lineage are used to infer the relationships between B cells. Data from 613,641 multi-isotype B-cell clonal lineages, of which 592 include an IgE member, are consistent with indirect switching to IgE from IgG- or IgA-expressing ...

Characterizing Features of Human Circulating B Cells Carrying CLL-Like Stereotyped Immunoglobulin Rearrangements

Frontiers in Oncology

Chronic Lymphocytic Leukemia (CLL) is characterized by the accumulation of monoclonal CD5+ B cells with low surface immunoglobulins (IG). About 40% of CLL clones utilize quasi-identical B cell receptors, defined as stereotyped BCR. CLL-like stereotyped-IG rearrangements are present in normal B cells as a part of the public IG repertoire. In this study, we collected details on the representation and features of CLL-like stereotyped-IG in the IGH repertoire of B-cell subpopulations purified from the peripheral blood of nine healthy donors. The B-cell subpopulations were also fractioned according to the expression of surface CD5 molecules and IG light chain, IGκ and IGλ. IG rearrangements, obtained by high throughput sequencing, were scanned for the presence of CLL-like stereotyped-IG. CLL-like stereotyped-IG did not accumulate preferentially in the CD5+ B cells, nor in specific B-cell subpopulations or the CD5+ cell fraction thereof, and their distribution was not restricted to a sing...

Resolving haplotype variation and complex genetic architecture in the human immunoglobulin kappa chain locus in individuals of diverse ancestry

Immunoglobulins (IGs), critical components of the human immune system, are composed of heavy and light protein chains encoded at three genomic loci. The IG Kappa (IGK) chain locus consists of two large, inverted segmental duplications. The complexity of IG loci has hindered effective use of standard high-throughput methods for characterizing genetic variation within these regions. To overcome these limitations, we leverage long-read sequencing to create haplotype-resolved IGK assemblies in an ancestrally diverse cohort (n=36), representing the first comprehensive description of IGK haplotype variation at population scale. We identify extensive locus polymorphism, including novel single nucleotide variants (SNVs) and a common novel ∼24.7 Kbp structural variant harboring a functional IGKV gene. Among 47 functional IGKV genes, we identify 141 alleles, 64 (45.4%) of which were not previously curated. We report inter-population differences in allele frequencies for 14 of the IGKV genes,...

Different B cell subpopulations show distinct patterns in their IgH repertoire metrics

Background: Several human B-cell subpopulations are recognized in the peripheral blood, which play distinct roles in the humoral immune response. These cells undergo developmental and maturational changes involving VDJ recombination, somatic hypermutation and class switch recombination, altogether shaping their immunoglobulin heavy chain (IgH) repertoire. Methods: Here, we sequenced the IgH repertoire of naïve, marginal zone, switched and plasma cells from 10 healthy adults along with matched unsorted and in silico separated CD19+ bulk B cells. We used advanced bioinformatic analysis and machine learning to thoroughly examine and compare these repertoires. Results: We show that sorted B cell subpopulations are characterised by distinct repertoire characteristics on both the individual sequence and the repertoire level. Sorted subpopulations shared similar repertoire characteristics with their corresponding in silico separated subsets. Furthermore, certain IgH repertoire characterist...

Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data

Bioinformatics, 2015

Advances in high-throughput sequencing technologies now allow for large-scale characterization of B cell immunoglobulin (Ig) repertoires. The high germline and somatic diversity of the Ig repertoire presents challenges for biologically meaningful analysis, which requires specialized computational methods. We have developed a suite of utilities, Change-O, which provides tools for advanced analyses of large-scale Ig repertoire sequencing data. Change-O includes tools for determining the complete set of Ig variable region gene segment alleles carried by an individual (including novel alleles), partitioning of Ig sequences into clonal populations, creating lineage trees, inferring somatic hypermutation targeting models, measuring repertoire diversity, quantifying selection pressure, and calculating sequence chemical properties. All Change-O tools utilize a common data format, which enables the seamless integration of multiple analyses into a single workflow. Availability and implementation: Change-O is freely available for non-commercial use and may be downloaded from http://clip.med.yale.edu/changeo.
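One of the analyses listed above, partitioning Ig sequences into clonal populations, can be sketched in Python as follows. This is not Change-O's API or its exact algorithm. It is a minimal illustration of a widely used heuristic: sequences are grouped by V gene, J gene, and junction length, then linked when their junctions differ at no more than a chosen fraction of positions. The record layout, the function names, and the 0.15 threshold are all assumptions made for this example.

```python
from collections import defaultdict


def hamming_fraction(a, b):
    """Fraction of mismatched positions between two equal-length strings."""
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def assign_clones(records, threshold=0.15):
    """records: list of (v_gene, j_gene, junction) tuples.
    Returns a list of integer clone ids, one per record, in input order."""
    # Only sequences sharing V gene, J gene, and junction length are compared.
    groups = defaultdict(list)
    for idx, (v_gene, j_gene, junction) in enumerate(records):
        groups[(v_gene, j_gene, len(junction))].append(idx)

    parent = list(range(len(records)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Single linkage: join any pair whose junctions are close enough.
    for members in groups.values():
        for pos, a in enumerate(members):
            for b in members[pos + 1:]:
                if hamming_fraction(records[a][2], records[b][2]) <= threshold:
                    parent[find(a)] = find(b)

    # Relabel roots as consecutive ids in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(records))]
```

The pairwise comparison inside each group is quadratic, which is acceptable for a sketch but would need a faster strategy on repertoire-scale data.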