T cell fate and clonality inference from single-cell transcriptomes

Nature Methods volume 13, pages 329–332 (2016)

Abstract

We developed TraCeR, a computational method to reconstruct full-length, paired T cell receptor (TCR) sequences from T lymphocyte single-cell RNA sequence data. TraCeR links T cell specificity with functional response by revealing clonal relationships between cells alongside their transcriptional profiles. We found that T cell clonotypes in a mouse Salmonella infection model span early activated CD4+ T cells as well as mature effector and memory cells.

Accession codes

Primary accessions

ArrayExpress

Acknowledgements

We thank V. Svensson, T. Hagai, J. Henriksson and other members of the Teichmann laboratory along with G. Lythe for helpful discussions. We thank the Wellcome Trust Sanger Institute Sequencing Facility for performing Illumina sequencing and the Wellcome Trust Sanger Institute Research Support Facility for care of the mice used in these studies. This work was supported by the European Research Council (grant ThSWITCH, number 260507, to S.A.T.) and the Lister Institute for Preventative Medicine (S.A.T.).

Author information

Author notes

  1. Michael J T Stubbington and Tapio Lönnberg: These authors contributed equally to this work.

Authors and Affiliations

  1. European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
    Michael J T Stubbington, Tapio Lönnberg, Valentina Proserpio & Sarah A Teichmann
  2. Wellcome Trust Sanger Institute, Cambridge, UK
    Simon Clare, Anneliese O Speak, Gordon Dougan & Sarah A Teichmann

Authors

  1. Michael J T Stubbington
  2. Tapio Lönnberg
  3. Valentina Proserpio
  4. Simon Clare
  5. Anneliese O Speak
  6. Gordon Dougan
  7. Sarah A Teichmann

Contributions

M.J.T.S. conceived the project, designed the computational method, wrote the software, designed PCR sequencing primers, analyzed data, generated figures and wrote the manuscript. T.L. and S.C. designed and performed the Salmonella experiments. T.L. performed cell collection and purification, generated scRNA-seq libraries, performed gene expression analyses, analyzed data, generated figures and wrote the manuscript. V.P. performed PCR-based TCR-sequencing experiments. A.O.S. designed the cell-sorting strategy, performed the sorting and generated figures. S.A.T. and G.D. supervised work and wrote the manuscript.

Corresponding author

Correspondence to Sarah A Teichmann.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Method for reconstructing TCR sequences from single-cell RNA-seq data.

(a) Overview of data-processing steps for TCR sequence reconstruction. Single-cell RNA-sequencing was performed on individual T lymphocytes to produce a pool of paired-end sequencing reads for each cell. These reads were used to quantify gene expression within each cell. In addition, sequencing reads that are derived from TCR mRNA are extracted and assembled into long contiguous TCR sequences. TCR contigs are filtered and analysed with IgBlast to determine the gene segments used and the junctional nucleotides. (b) Example of a combinatorial recombinome entry used as alignment reference for extraction of TCR-derived reads. Each TCR locus is represented by a fasta file containing entries comprising every possible combination of V and J genes for that locus. V–J combinations contain the sequence of the appropriate constant gene along with stretches of N nucleotides to represent V leader and variable junctional regions.
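
The combinatorial recombinome described in (b) can be sketched in a few lines of Python. This is a minimal illustration rather than the TraCeR implementation: the function name `build_recombinome`, the FASTA header format, the padding lengths `leader_n` and `junction_n`, and the toy sequences are all assumptions made for the example.

```python
from itertools import product


def build_recombinome(locus, v_genes, j_genes, constant_seq,
                      leader_n=20, junction_n=7):
    """Return FASTA text with one entry per V-J combination for a locus.

    v_genes and j_genes map gene names to nucleotide sequences.
    leader_n and junction_n are illustrative padding lengths (assumed).
    """
    entries = []
    for (v_name, v_seq), (j_name, j_seq) in product(v_genes.items(),
                                                    j_genes.items()):
        sequence = ("N" * leader_n      # placeholder for the V leader
                    + v_seq
                    + "N" * junction_n  # placeholder for junctional bases
                    + j_seq
                    + constant_seq)     # constant gene for this locus
        entries.append(f">{locus}|{v_name}_{j_name}\n{sequence}")
    return "\n".join(entries) + "\n"


# Toy gene names and sequences, invented purely for illustration.
toy_v = {"V1": "ACGTAC", "V2": "GGATCC"}
toy_j = {"J1": "TTGACA", "J2": "CCATGG", "J3": "AAGCTT"}
fasta = build_recombinome("TRA", toy_v, toy_j, constant_seq="ATATCC")
```

With two V genes and three J genes, the sketch emits six entries, one for each V–J pair. Reads from a cell would then be aligned against a file like this to pull out the TCR-derived reads before assembly.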

Supplementary Figure 2 FACS strategy for cells used in this study.

(a) Uninfected control and day 14 cells. (b) Day 49 cells. In both cases, cells were sorted to be CD4+TCRB+NK1.1−CD44hiCD62Llo. Additionally, cells from the mouse at day 49 were sorted to be CD127hi.

Supplementary Figure 3 Validation of RNA-seq TCR reconstruction.

(a) Schematic illustrating the approach for targeted PCR amplification and sequencing of recombined TCR genes. (b) Numbers of concordant and discordant events from the comparison between RNA-seq and PCR. Concordant events include 39 occasions where no sequence was detected by either method for a particular locus. (c) Discordance between PCR and RNA-seq TCR sequences due to sequencing error. The TCR identifiers shown were found in the same cell by RNA-seq and PCR. They differ solely by two G residues within the long homopolymeric G tract in the junctional region. (d) Expression levels of concordant and discordant recombinant sequences as determined by RNA-seq (upper) or by targeted PCR (lower). Expression levels of TCR sequences were calculated as transcripts per million (TPM) from RNA-seq data or as numbers of reads from PCR data. P values were calculated using the Mann-Whitney U test because the data are not normally distributed. (e) Number of cells with zero, one, two or three recombinants for each TCR locus from combined RNA-seq and PCR results. Counts are shown either for all recombinants (‘All’) or for productive recombinants only (‘Prod’).
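
The rank-based comparison in (d) can be illustrated with a small, dependency-free sketch of the Mann-Whitney U statistic. This is not the authors' analysis code. The function name `mann_whitney_u` and the toy values are assumptions, and the p value uses a plain normal approximation without tie or continuity corrections, so an established statistics library would be preferable for real data.

```python
import math


def mann_whitney_u(x, y):
    """Return (U_x, U_y, approximate two-sided p value) for two samples."""
    # U_x counts the (x, y) pairs in which the x value is larger;
    # tied pairs contribute one half.
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0
              for a in x for b in y)
    u_y = len(x) * len(y) - u_x

    # Normal approximation to the null distribution of U
    # (no tie correction, no continuity correction).
    mean_u = len(x) * len(y) / 2
    sd_u = math.sqrt(len(x) * len(y) * (len(x) + len(y) + 1) / 12)
    z = (min(u_x, u_y) - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, u_y, p_value


# Toy expression values, invented purely for illustration.
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0, 7.0]
u_a, u_b, p = mann_whitney_u(group_a, group_b)
```

Because the statistic depends only on the ordering of values across the two groups, it makes no assumption that expression levels are normally distributed.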

Supplementary Figure 4 Sensitivity analysis of RNA-seq reconstruction.

All single-cell datasets from day 14 mouse 1 were randomly subsampled three independent times to contain decreasing total read numbers followed by TCR reconstruction. Points representing each TCR sequence found in the full datasets are plotted according to their expression levels and the minimum total read depth required for detection in at least (a) three, (b) two or (c) one out of three subsamples. For clarity, points are jittered about the y-axis.
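
The quantity plotted in each panel, the minimum read depth at which a sequence is detected in at least k of three subsamples, can be computed as in the sketch below. The data layout (`detections` mapping a read depth to one set of detected sequence identifiers per subsample), the function name and the toy values are assumptions for illustration. The subsampling and TCR reconstruction steps themselves are not shown.

```python
def min_depth_for_detection(detections, sequence, min_replicates):
    """Smallest read depth at which `sequence` is detected in at least
    `min_replicates` subsamples, or None if no tested depth qualifies.

    detections maps a total read depth to a list of sets, one set of
    detected TCR sequence identifiers per independent subsample.
    """
    for depth in sorted(detections):
        hits = sum(sequence in replicate for replicate in detections[depth])
        if hits >= min_replicates:
            return depth
    return None


# Toy detection results, invented purely for illustration.
toy_detections = {
    100_000: [{"seqA"}, set(), set()],
    500_000: [{"seqA"}, {"seqA", "seqB"}, set()],
    1_000_000: [{"seqA", "seqB"}, {"seqA", "seqB"}, {"seqA"}],
}
depth_any = min_depth_for_detection(toy_detections, "seqA", 1)
depth_all = min_depth_for_detection(toy_detections, "seqA", 3)
```

Calling the function with `min_replicates` set to 3, 2 and 1 corresponds to panels (a), (b) and (c) respectively.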

Supplementary Figure 6 Clonotype network graph from uninfected mouse.

Each node in the graph represents an individual splenic CD4+ T lymphocyte. Identifiers within the nodes indicate the reconstructed TCR sequences that were detected for each cell. Dark coloured identifiers are productive, light coloured are non-productive. The lack of edges between nodes in this graph indicates that no nodes share TCR sequences.

Supplementary Figure 7 Clonotype network graph from day 14, mouse 1.

Each node in the graph represents an individual splenic CD4+ T lymphocyte. Identifiers within the nodes indicate the reconstructed TCR sequences that were detected for each cell. Dark coloured identifiers are productive, light coloured are non-productive. Red edges between the nodes indicate shared TCRα sequences whilst blue edges indicate shared TCRβ sequences. Edge thickness is proportional to the number of shared sequences. For clarity, the nodes without edges are not displayed.
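
The graph construction described in this caption can be sketched as follows: cells are nodes, and an edge joins two cells that share at least one TCRα or TCRβ sequence, with the number of shared sequences available as an edge weight. This is a minimal illustration under assumed names (`clonotype_edges`, the `"A"`/`"B"` chain keys) and invented toy identifiers, not the code used to draw the figures.

```python
from itertools import combinations


def clonotype_edges(cells):
    """Find pairs of cells that share TCR sequences.

    cells maps a cell name to {"A": set of alpha identifiers,
                               "B": set of beta identifiers}.
    Returns {(cell_1, cell_2): {"A": n_shared_alpha, "B": n_shared_beta}}
    for every pair sharing at least one sequence.
    """
    edges = {}
    for c1, c2 in combinations(sorted(cells), 2):
        shared = {chain: len(cells[c1][chain] & cells[c2][chain])
                  for chain in ("A", "B")}
        if any(shared.values()):
            edges[(c1, c2)] = shared
    return edges


# Toy cells and sequence identifiers, invented purely for illustration.
toy_cells = {
    "cell1": {"A": {"a1"}, "B": {"b1"}},
    "cell2": {"A": {"a1"}, "B": {"b1", "b2"}},
    "cell3": {"A": {"a3"}, "B": {"b2"}},
    "cell4": {"A": {"a4"}, "B": set()},
}
edges = clonotype_edges(toy_cells)
# Only cells with at least one edge would be drawn.
displayed = {cell for pair in edges for cell in pair}
```

In this toy input, cell4 shares nothing with any other cell, so it has no edges and would be omitted from the plot.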

Supplementary Figure 8 Clonotype network graph from day 14, mouse 2.

Each node in the graph represents an individual splenic CD4+ T lymphocyte. Identifiers within the nodes indicate the reconstructed TCR sequences that were detected for each cell. Dark coloured identifiers are productive, light coloured are non-productive. Red edges between the nodes indicate shared TCRα sequences whilst blue edges indicate shared TCRβ sequences. Edge thickness is proportional to the number of shared sequences. For clarity, the nodes without edges are not displayed.

Supplementary Figure 9 Clonotype network graph from day 49 mouse.

Each node in the graph represents an individual splenic CD4+ T lymphocyte. Identifiers within the nodes indicate the reconstructed TCR sequences that were detected for each cell. Dark coloured identifiers are productive, light coloured are non-productive. Red edges between the nodes indicate shared TCRα sequences whilst blue edges indicate shared TCRβ sequences. Edge thickness is proportional to the number of shared sequences. For clarity, the nodes without edges are not displayed.

Supplementary Figure 11 Distribution of shared TCR sequences within the Fluidigm C1 integrated fluidics circuit (IFC) and 96-well plate.

Transcripts per million (TPM) expression values of the two most highly-shared TCR sequences are shown within the C1 IFC capture sites, harvest sites and the resulting 96-well plate that contained the associated single cells.

Supplementary Figure 12 Clonotype distribution in gene-expression space.

All clonotypes from day 14 mouse 2 are shown as purple points on top of all other cells within the gene expression space.

Supplementary Figure 13 Clonotype distribution in gene-expression space.

All clonotypes from day 14 mouse 2 are shown as purple points on top of all other cells within the gene expression space.

Supplementary Figure 14 Clonotype distribution in gene-expression space.

All clonotypes from the day 49 mouse are shown as turquoise points on top of all other cells within the gene expression space.

Supplementary Figure 15 Comparison of methods for calculating TCR productivity.

TCR productivity was assessed (a) from full-length sequences generated from IMGT reference sequences or (b) from the assembled contigs generated from sequencing reads. See Online Methods for more detail.
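
As a rough illustration of what a productivity check involves, the sketch below treats a recombinant as productive when its coding sequence stays in frame and contains no stop codon. This simplified criterion, the function name `is_productive` and the toy sequences are assumptions. The procedure actually used is described in the Online Methods and may differ.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_productive(coding_seq):
    """Return True if the sequence is in frame and lacks stop codons.

    coding_seq is assumed to begin at the first base of the start codon
    and to end on a codon boundary within the constant region.
    """
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        return False  # the junction has shifted the reading frame
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return not any(codon in STOP_CODONS for codon in codons)


# Toy sequences, invented purely for illustration.
in_frame = is_productive("ATGGCCTGCAAA")     # ATG GCC TGC AAA
with_stop = is_productive("ATGGCCTGAAAA")    # ATG GCC TGA AAA
frameshift = is_productive("ATGGCCTGCAA")    # 11 bases, out of frame
```

The same test could be applied either to a sequence rebuilt from reference gene segments, as in (a), or directly to an assembled contig, as in (b).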

Cite this article

Stubbington, M., Lönnberg, T., Proserpio, V. et al. T cell fate and clonality inference from single-cell transcriptomes. Nat Methods 13, 329–332 (2016). https://doi.org/10.1038/nmeth.3800
