G&T-seq: parallel sequencing of single-cell genomes and transcriptomes

Acknowledgements

We thank the Wellcome Trust Sanger Institute (UK) sequencing pipelines and F. Yang of the Cytogenetics Core Facility. This work was supported by the UK Wellcome Trust (to T.V. and C.P.P.) and funding from the Belgian Research Foundation Flanders (FWO) and the University of Leuven (KU Leuven, Belgium) to T.V. (FWO–G.0687.12; KU Leuven SymBioSys, PFV/10/016). N.V.d.A. is supported by an FWO scholarship (FWO–1.1.H28.12). W.H. and C.P.P. are funded by the UK Medical Research Council. L.M.S. was funded by the EU Seventh Framework Programme (FP7/2007-2013) under grant 262055. M.Z.-G. and the work in the lab are funded by the UK Wellcome Trust. M.G. is supported by a UK Mary Gray Studentship from St. John's College, Cambridge, UK. N.S. was supported by the New Zealand Woolf-Fisher Trust. F.J.L. is supported by a UK Wellcome Trust Senior Investigator award. M.J.T. is supported by a Wellcome Trust Sanger Institute Clinical Ph.D. Fellowship (UK). Y.I.L. was supported by a University of Oxford Nuffield Department of Medicine Prize Studentship, UK. Trisomy 21 iPSCs were obtained from the Harvard Stem Cell Institute (Cambridge, Massachusetts, USA), and control iPSCs were a gift from Y. Takashima (Cambridge Stem Cell Institute, Cambridge, UK).

Author information

Author notes

Yang I Li & Harold P Swerdlow
Present addresses: Department of Genetics, Stanford University, Stanford, California, USA (Y.I.L.); New York Genome Center, New York, New York, USA (H.P.S.).
Wilfried Haerty and Parveen Kumar: These authors contributed equally to this work.
Chris P Ponting and Thierry Voet: These authors jointly directed this work.

Authors and Affiliations

Sanger Institute–EBI Single-Cell Genomics Centre, Wellcome Trust Sanger Institute, Hinxton, UK
Iain C Macaulay, Chris P Ponting & Thierry Voet
MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK
Wilfried Haerty, Yang I Li, Tim Xiaoming Hu & Chris P Ponting
Department of Human Genetics, University of Leuven, Leuven, Belgium
Parveen Kumar, Niels Van der Aa & Thierry Voet
Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
Mabel J Teng
Department of Physiology, Development and Neuroscience, Downing Site, University of Cambridge, Cambridge, UK
Mubeen Goolam & Magdalena Zernicka-Goetz
Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK
Nathalie Saurat & Frederick J Livesey
Sequencing R&D, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
Paul Coupland, Lesley M Shirley, Miriam Smith, Peter D Ellis, Michael A Quail & Harold P Swerdlow
Cytogenetics Core Facility, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
Ruby Banerjee

Authors

Iain C Macaulay
Wilfried Haerty
Parveen Kumar
Yang I Li
Tim Xiaoming Hu
Mabel J Teng
Mubeen Goolam
Nathalie Saurat
Paul Coupland
Lesley M Shirley
Miriam Smith
Niels Van der Aa
Ruby Banerjee
Peter D Ellis
Michael A Quail
Harold P Swerdlow
Magdalena Zernicka-Goetz
Frederick J Livesey
Chris P Ponting
Thierry Voet

Contributions

I.C.M. developed the method, performed experiments, analyzed data and wrote the paper. W.H., P.K., Y.I.L. and T.X.H. analyzed data and prepared figures and text for the paper. M.J.T. performed experiments and assisted with method development. N.V.d.A. provided cells and assisted with method development. M.G. and M.Z.-G. provided mouse blastomeres. N.S. and F.J.L. provided iPSC-derived neurons. P.C., L.M.S., M.S., P.D.E., M.A.Q. and H.P.S. assisted with library preparation for targeted, HiSeq X and PacBio sequencing. R.B. performed cytogenetic analysis of cell lines. C.P.P. and T.V. acquired funding, oversaw the research, designed the method, analyzed data and wrote the paper. All authors read and approved the manuscript for submission.

Corresponding authors

Correspondence to Iain C Macaulay, Chris P Ponting or Thierry Voet.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Performance of G&T-seq whole-genome amplification in HCC38 and HCC38-BL cells.

(a) Copy-number concordance between bulk DNA sequencing of HCC38(-BL) cells and single-cell or multicell G&T-seq following MDA or PicoPlex WGA. For reference, single-cell DNA copy-number concordances obtained with conventional MDA and PicoPlex are shown. (b) Heat map of the genome-wide DNA copy number (LogR) in single cells and in multicell controls isolated from HCC38 and HCC38-BL cells and amplified using MDA. For reference, the copy-number profile derived from bulk HCC38 DNA (not subjected to WGA) is shown on the left. (c) Lorenz curve illustrating the relationship between the cumulative fraction of the genome covered (x-axis) and the cumulative fraction of mapped bases (y-axis). (d) Normalized read count as a function of %GC content. The distributions are shown for all HCC38 G&T-seq samples amplified with MDA (purple) and PicoPlex (green). For comparison, the distributions for bulk (no WGA, blue), conventional single-cell MDA (black) and conventional single-cell PicoPlex (orange) are shown.
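
The Lorenz curve in panel (c) can be illustrated with a minimal sketch. It assumes the input is a list of per-bin (or per-base) read depths; the function name `lorenz_curve` and the toy depths are hypothetical and are not taken from the authors' pipeline.

```python
from itertools import accumulate


def lorenz_curve(depths):
    """Return (xs, ys) for a coverage Lorenz curve.

    xs: cumulative fraction of the genome (bins sorted by increasing depth)
    ys: cumulative fraction of mapped bases contained in those bins
    """
    ordered = sorted(depths)
    total = sum(ordered)
    if total == 0:
        raise ValueError("no mapped bases in input")
    n = len(ordered)
    cumulative = list(accumulate(ordered))
    # Prepend the origin so the curve starts at (0, 0).
    xs = [0.0] + [(i + 1) / n for i in range(n)]
    ys = [0.0] + [c / total for c in cumulative]
    return xs, ys


# Toy example (assumed depths): uneven coverage bows the curve below the diagonal.
xs, ys = lorenz_curve([0, 1, 1, 2, 6])
```

Perfectly uniform coverage would place every point on the diagonal; the further the curve sags below it, the more unevenly the mapped bases are distributed across the genome.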

Supplementary Figure 2 Performance of G&T-seq whole-transcriptome amplification in HCC38 and HCC38-BL cells.

(a) Transcript detection following G&T-seq of HCC38 and HCC38-BL single cells. The number of expressed genes (y-axis) in HCC38 single cells (red lines) and HCC38-BL single cells (blue lines) versus TPM (x-axis). At TPM > 1 (dashed line), between 4,000 and 11,000 transcripts were detected per cell, with substantially more transcripts detected in HCC38 cells. (b) Principal-component analysis of HCC38 and HCC38-BL single-cell transcriptomes. Cells in which genomic aneuploidies were detected are highlighted. (c) Heat map displaying Spearman correlation of 8,237 protein-coding genes expressed in at least 32 samples with TPM > 1. (d) Expanded heat map showing the top 200 differentially expressed genes between HCC38 and HCC38-BL cells. The TPM of each gene is ‘normalized’ by the median of the TPM of this gene across all samples and is presented as the log2-fold difference from this median.
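
The median normalization described for panel (d) could be sketched as follows. The pseudocount is an assumption added here to avoid taking the logarithm of zero; the caption does not say how zero TPM values were handled, and the function name is hypothetical.

```python
import math
from statistics import median


def log2_fold_from_median(tpms, pseudocount=1.0):
    """Express one gene's TPM in each sample as log2 fold difference from its median.

    tpms: TPM values of a single gene across all samples.
    pseudocount: assumed offset (not from the caption) so zero TPMs stay finite.
    """
    med = median(tpms)
    return [math.log2((t + pseudocount) / (med + pseudocount)) for t in tpms]


# Toy example: a gene with TPM 1, 3 and 7 in three samples (median 3).
values = log2_fold_from_median([1, 3, 7])
```

A sample at the median maps to 0, and each unit above or below corresponds to a twofold difference from the (pseudocount-adjusted) median.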

Supplementary Figure 3 Sequence coverage over transcript length and intronic and gene flanking regions in single-cell G&T-seq transcriptome data.

Read coverage in (a) 2 kb, (b) 10 kb and (c) 15 kb transcripts is shown. Numbers indicate the distance from the poly(A) tail in the exonic region only. Regions upstream of the transcription start site (TSS) and transcription termination site (TTS), as well as intronic regions, are also shown.

Supplementary Figure 4 Comparison of RNA-seq data generated with the G&T-seq and conventional Smart-seq2 protocols.

In this comparison, 28 single cells (8 HCC38 and 20 HCC38-BL single cells) were used for G&T-seq, and 20 single cells (14 HCC38 and 6 HCC38-BL single cells) were used for conventional Smart-seq2. Importantly, these cells came from the same cultures, were isolated at the same time, were processed (when possible) with the same batches of reagents, and were eventually sequenced together. (a) Transcript detection following G&T-seq or conventional Smart-seq2 amplification of HCC38 and HCC38-BL single cells. The number of transcripts detected at TPM > 1 is displayed. (b) Detection of ERCC transcripts relative to ERCC input amount; the plot shows the averaged normalized read count across all single-cell samples in a G&T-seq experiment versus the number of molecules of each ERCC sequence that was spiked in. (c) Detection of ERCC transcripts relative to ERCC input amount in a parallel Smart-seq2 experiment. (d) Sequence coverage over transcript length and intronic and gene flanking regions in single-cell G&T-seq and Smart-seq2 transcriptome data. Read coverage in 2 kb transcripts is shown. Numbers indicate the distance from the poly(A) tail in the exonic region only. Regions upstream of the transcription start site (TSS) and transcription termination site (TTS), as well as intronic regions, are also shown. (e) Transcript detection in bins of transcript GC content for HCC38 and HCC38-BL single-cell transcriptomes generated by G&T-seq and Smart-seq2 (SS2). The upper panel shows the proportion of genes detected in each bin, and the lower panel displays the proportion of GC content in each bin.

Supplementary Figure 5 Interphase FISH to detect trisomy 11 in a subset of HCC38-BL cells.

Chromosomes 11 and 3 were hybridized with a centromeric probe (labeled with FITC and Texas Red, respectively). The majority of HCC38-BL cells had disomy 11 (a), whereas trisomy 11 was observed in 2 out of 100 HCC38-BL cells analyzed (b).

Supplementary Figure 6 Relationship between chromosomal copy number and chromosome-wide expression in a mouse embryo at the eight-cell stage.

Reversine-treated mouse embryo at the eight-cell stage (embryo A) containing sister cells with reciprocal aneuploidies. (a) The genome-wide copy-number profile is shown for all eight cells in the embryo (numbered 1–8). Cell 1 failed QC at the genome level. Reciprocal aneuploidies were observed for cells 4 and 5 at chromosomes 2, 5 and 16. (b) Genome-wide expression binned per chromosome in the control (n = 16 cells, untreated and shown in blue) and reversine-treated (n = 8 cells, shown in red) embryos (RPKM of the latter are relative to the median-centered control RPKMs). The expected expression dosage resulting from the aneuploidies for chromosomes 2, 5 and 16 in the blastomeres (cells 4 and 5) was detected in the correct cell’s transcriptome. Cells displaying concordantly higher and lower overall expression per chromosome are highlighted with a black asterisk. Cell 1, which also failed DNA-seq QC, is highlighted with a red asterisk. For all box plots, the lower and upper boundaries of the box represent, respectively, the 25th and 75th percentiles, with the bar being equal to the median. The whiskers represent the 5th and 95th percentiles.

Supplementary Figure 7 Relationship between chromosomal copy number and chromosome-wide expression in a mouse embryo at the eight-cell stage.

Reversine-treated mouse embryo at the eight-cell stage (embryo B) containing sister cells with reciprocal and nonreciprocal aneuploidies. (a) The genome-wide copy-number profile is shown for all eight cells in the embryo (numbered 1–8). A complex pattern of aneuploidy was observed in cell 2 (gain of chromosomes 4 and 16 and loss of chromosomes 5, 14, 18 and 19). Cell 3 had a gain in chromosome 15, and cell 5 had a gain in chromosome 8, while cell 8 gained an X-chromosome. (b) Genome-wide expression binned per chromosome comparing the cells from embryo B (reversine-treated, shown in red, n = 8) with those from control embryos (n = 16 cells, untreated and shown in blue). Cells displaying concordantly higher and lower overall expression per chromosome are highlighted with an asterisk. For all box plots, the lower and upper boundaries of the box represent, respectively, the 25th and 75th percentiles, with the bar being equal to the median. The whiskers represent the 5th and 95th percentiles.

Supplementary Figure 8 Relationship between chromosomal copy number and chromosome-wide expression in a mouse embryo at the eight-cell stage.

Reversine-treated mouse embryo at the eight-cell stage (embryo C) containing sister cells with reciprocal and nonreciprocal aneuploidies. (a) The genome-wide copy-number profile is shown for all eight cells in the embryo (numbered 1–8). Cell 1 failed QC following DNA-seq. Cell 2 had a loss of chromosome 11, whereas cells 3 and 6 showed reciprocal gains and losses at chromosomes 13 and 14. Cell 8 had lost a copy of chromosome 13. (b) Genome-wide expression binned per chromosome comparing the cells from embryo C (reversine-treated, shown in red, n = 8) with those from control embryos (n = 16 cells, untreated and shown in blue). Cells displaying concordantly higher and lower overall expression per chromosome are highlighted with a black asterisk. Cell 1, which failed DNA-seq QC, is highlighted with a red asterisk. For all box plots, the lower and upper boundaries of the box represent, respectively, the 25th and 75th percentiles, with the bar being equal to the median. The whiskers represent the 5th and 95th percentiles.

Supplementary Figure 9 Relationship between chromosomal copy number and chromosome-wide expression in a mouse embryo at the eight-cell stage.

Reversine-treated mouse embryo at the eight-cell stage (embryo E) containing sister cells with reciprocal and nonreciprocal aneuploidies. (a) The genome-wide copy-number profile is shown for all eight cells in the embryo (numbered 1–8). Cells 1 and 4 had reciprocal aneuploidies for chromosomes 4, 7, 8, 10, 18 and 19, with cell 1 having an additional nonreciprocal loss of chromosome 6. Cell 2 had a gain at chromosomes 15 and X. Cell 3 had a gain of chromosome 1 and losses of chromosomes 4 and X. Cell 5 had a loss of chromosomes 9 and 17. Cell 6 had a gain of chromosomes 6, 8 and 9 and losses of chromosomes 15 and X. Cells 7 and 8 had a loss of chromosome X. (b) Genome-wide expression binned per chromosome comparing the cells from embryo E (reversine-treated, shown in red) with those from control embryos (n = 16 cells, untreated and shown in blue). Cells displaying concordantly higher and lower overall expression per chromosome are highlighted with an asterisk. For all box plots, the lower and upper boundaries of the box represent, respectively, the 25th and 75th percentiles, with the bar being equal to the median. The whiskers represent the 5th and 95th percentiles.

Supplementary Figure 10 Relationship between chromosomal-arm copy number and chromosome-arm-wide expression in iPSC-derived neurons.

MA plot comparing the log2 ratio in mRNA expression levels between p and q chromosomal arms (M) to the average expression across the chromosome arms (A) for all cells containing trisomy 21. The acrocentric chromosomes 13, 14, 15, 21 and 22 and chromosome Y have been excluded. The values for chromosome 20 are shown in green for cells without evidence of gain or loss of the chromosomal arms, and cells with genomic evidence for loss of 20p and gain of 20q are shown in purple. Numbers indicate cell identifiers.
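
The M and A values of such a plot can be computed as in the sketch below. It uses the conventional MA-plot definition, in which A is the mean of the two log2 expression values; the caption does not state the exact formula for A, so that choice, the function name and the toy numbers are assumptions.

```python
import math


def ma_values(p_expr, q_expr):
    """Return (M, A) for one chromosome in one cell.

    p_expr, q_expr: summed (or mean) expression of the p and q arms; must be > 0.
    M: log2 ratio of p-arm to q-arm expression.
    A: mean of the log2 expression of the two arms (conventional definition, assumed here).
    """
    if p_expr <= 0 or q_expr <= 0:
        raise ValueError("arm expression must be positive")
    m = math.log2(p_expr / q_expr)
    a = 0.5 * (math.log2(p_expr) + math.log2(q_expr))
    return m, a


# Toy example: a p arm expressed fourfold higher than the q arm.
m, a = ma_values(8.0, 2.0)
```

Under this definition, a cell that loses one arm and gains the other, as described for 20p and 20q, would shift along the M axis away from cells with balanced arms.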

Supplementary Figure 11 Detection of a coding interchromosomal fusion in the genome and transcriptome of a single cell.

(a) Identification of a fusion transcript in the RNA-seq data from a single HCC38 cell (cell 63). A subset of the reads mapping to a fusion between exon 6 of MTAP (gene locus on chromosome 9) and exon 3 of PCDH7 (gene locus on chromosome 4) are shown. (b) Sequencing of single-cell cDNA using the PacBio RSII revealed that the full-length MTAP-PCDH7 fusion transcript consisted of exons 1–6 of MTAP and exons 3, 4 and 6 of PCDH7. Six mapped reads following single-molecule PacBio cDNA sequencing of a single cell are shown. (c) Illumina HiSeq X DNA sequence reads crossing the causative interchromosomal fusion between chromosomes 4 and 9 in the genome of the same single cell (HCC38 cell 63). A subset of the reads mapping across the genomic fusion are shown; the breakpoint itself is located at a distance of 3,208 bases downstream of exon 6 of MTAP and 105,180 bases upstream from exon 3 of PCDH7.

Supplementary Figure 12 Confirmation of MTAP-PCDH7 expression and detection of the associated genomic fusion by qPCR.

TaqMan primer and probe sets were designed to detect (a) the MTAP-PCDH7 fusion transcript and (b) the genomic breakpoint that fuses chromosomes 4 and 9. Examples of consensus reads mapping across both breakpoints are shown, with the MTAP side colored red and the PCDH7 side colored blue. Primer/probe sets were specifically designed to span the breakpoints in both cases. (c) Detection of the MTAP-PCDH7 fusion transcript in cDNA from G&T-seq of HCC38 and HCC38-BL cells. (d) Detection of the MTAP-PCDH7 genomic fusion in MDA-amplified DNA from G&T-seq of HCC38 and HCC38-BL cells. (e) Venn diagram showing the overlap of detection of the fusion transcript and associated genomic rearrangement in parallel from the same single cells.

Cite this article

Macaulay, I., Haerty, W., Kumar, P. et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat Methods 12, 519–522 (2015). https://doi.org/10.1038/nmeth.3370

Received: 18 November 2014
Accepted: 27 March 2015
Published: 27 April 2015
Issue Date: June 2015
DOI: https://doi.org/10.1038/nmeth.3370