David Buck - Academia.edu
Papers by David Buck
β-III spectrin is present in the brain and is known to be important in the function of the cerebellum. Heterozygous mutations in SPTBN2, the gene encoding β-III spectrin, cause Spinocerebellar Ataxia Type 5 (SCA5), an adult-onset, slowly progressive, autosomal-dominant pure cerebellar ataxia. SCA5 is sometimes known as "Lincoln ataxia," because the largest known family is descended from relatives of the United States President Abraham Lincoln. Using targeted capture and next-generation sequencing, we identified a homozygous stop codon in SPTBN2 in a consanguineous family in which childhood developmental ataxia co-segregates with cognitive impairment. The cognitive impairment could result from mutations in a
Nature communications, Oct 15, 2018
Barrett's oesophagus is a precursor of oesophageal adenocarcinoma. In this common condition, squamous epithelium in the oesophagus is replaced by columnar epithelium in response to acid reflux. Barrett's oesophagus is highly heterogeneous and its relationships to normal tissues are unclear. Here we investigate the cellular complexity of Barrett's oesophagus and the upper gastrointestinal tract using RNA-sequencing of single cells from multiple biopsies from six patients with Barrett's oesophagus and two patients without oesophageal pathology. We find that cell populations in Barrett's oesophagus, marked by LEFTY1 and OLFM4, exhibit a profound transcriptional overlap with oesophageal submucosal gland cells, but not with gastric or duodenal cells. Additionally, SPINK4 and ITLN1 mark cells that precede morphologically identifiable goblet cells in colon and Barrett's oesophagus, potentially aiding the identification of metaplasia. Our findings reveal striking tra...
Diabetes, Jul 24, 2017
To identify novel coding association signals and facilitate characterization of mechanisms influencing glycemic traits and type 2 diabetes risk, we analyzed 109,215 variants derived from exome array genotyping together with an additional 390,225 variants from exome sequence in up to 39,339 normoglycemic individuals from five ancestry groups. We identified a novel association between the coding variant (p.Pro50Thr) in AKT2 and fasting insulin, a gene in which rare fully penetrant mutations are causal for monogenic glycemic disorders. The low-frequency allele is associated with a 12% increase in fasting plasma insulin (FI) levels. This variant is present at 1.1% frequency in Finns but virtually absent in individuals from other ancestries. Carriers of the FI-increasing allele had increased 2-hour insulin values, decreased insulin sensitivity, and increased risk of type 2 diabetes (odds ratio = 1.05). In cellular studies, the AKT2-Thr50 protein exhibited a partial loss of function. We ext...
Bladder Cancer, 2015
Background: Germline mutations in DNA damage signalling and repair genes predispose individuals to cancer. Rare germline variants may also increase cancer risk and be predictive of outcomes following cancer treatments, but require high-throughput sequencing (HTS) for detection in large cohorts. Objective: To use a dual indexing system on a HTS platform to detect novel variants in CtIP (RBBP8) which may be associated with clinical outcomes following radiotherapy treatment for bladder cancer. Methods: All exons and flanking introns of CtIP were amplified from germline DNA from bladder cancer patients using seven primer pairs by automated long-range PCR. Amplicons were pooled, fragmented and ligated to adaptor sequences. One of 96 tag sequences was introduced at each end by PCR. Sequencing was performed on a single flow cell of an Illumina MiSeq. Reads were mapped by Stampy and variants called by Platypus. For phasing experiments, target regions were amplified and cloned for Sanger sequencing. Results: Of 201 samples, 160 were successfully amplified. Eleven CtIP variants were called, within the exons and 15 bp adjacent intronic DNA, including eight known variants from the 1000 Genomes project, plus three previously unreported variants now confirmed by Sanger sequencing. In two individuals, phasing experiments showed two variants of interest to be on separate alleles, likely to result in stronger impairment of gene function. Conclusions: We have demonstrated proof of principle for dual indexing on 160 samples on one MiSeq flow cell sequencing surface, and show that for the CtIP gene multiplexing of up to 720 samples would provide sufficient coverage to achieve >98% detection power for rare germline variation, reducing HTS costs substantially.
F1000Research, 2015
The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large scale sequencing tools. The MinION™ Access Programme (MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, where the reproducibility of the performance of the MinION was evaluated at multiple sites. Five laboratories on two continents generated data using a control strain of Escherichia coli K-12, preparing and sequencing samples according to a revised ONT protocol. Here, we provide the details of the protocol used, along with a preliminary analysis of the c...
Nature genetics, Jan 18, 2015
To assess factors influencing the success of whole-genome sequencing for mainstream clinical diagnosis, we sequenced 217 individuals from 156 independent cases or families across a broad spectrum of disorders in whom previous screening had identified no pathogenic variants. We quantified the number of candidate variants identified using different strategies for variant calling, filtering, annotation and prioritization. We found that jointly calling variants across samples, filtering against both local and external databases, deploying multiple annotation tools and using familial transmission above biological plausibility contributed to accuracy. Overall, we identified disease-causing variants in 21% of cases, with the proportion increasing to 34% (23/68) for mendelian disorders and 57% (8/14) in family trios. We also discovered 32 potentially clinically actionable variants in 18 genes unrelated to the referral disorder, although only 4 were ultimately considered reportable. Our resu...
Rapid and accurate detection of antibiotic resistance in pathogens is an urgent need, affecting both patient care and population-scale control. Microbial genome sequencing promises much, but many barriers exist to its routine deployment. Here, we address these challenges, using a de Bruijn graph comparison of clinical isolate and curated knowledge-base to identify species and predict resistance profile, including minor populations. This is implemented in a package, Mykrobe predictor, for S. aureus and M. tuberculosis, running in under three minutes on a laptop from raw data. For S. aureus, we train and validate in 495/471 samples respectively, finding error rates comparable to gold-standard phenotypic methods, with sensitivity/specificity of 99.3%/99.5% across 12 drugs. For M. tuberculosis, we identify species and predict resistance with specificity of 98.5% (training/validating on 1920/1609 samples). Sensitivity of 82.6% is limited by current understanding of genetic mechanisms. We...
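The comparison described in the abstract above can be illustrated with a much-simplified sketch. It is not the Mykrobe predictor implementation: it reduces the de Bruijn graph idea to k-mer set membership, and the function names, the value of k, the coverage threshold and the toy sequences are all hypothetical. It checks which allele's k-mers are present in a sample's reads, and reports a mixed call when both alleles are fully covered, which loosely mirrors the detection of minor populations.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of seq; each k-mer is a node in a de Bruijn graph."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_sample_kmers(reads, k):
    """Count k-mer occurrences across all reads of one sample."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def covered_fraction(allele_seq, sample_counts, k):
    """Fraction of the allele's distinct k-mers seen at least once in the sample."""
    allele_kmers = set(kmers(allele_seq, k))
    seen = sum(1 for km in allele_kmers if sample_counts[km] > 0)
    return seen / len(allele_kmers)


def call_allele(susceptible, resistant, sample_counts, k, min_fraction=1.0):
    """Toy genotype call; min_fraction is an assumed, illustrative threshold."""
    s_ok = covered_fraction(susceptible, sample_counts, k) >= min_fraction
    r_ok = covered_fraction(resistant, sample_counts, k) >= min_fraction
    if s_ok and r_ok:
        return "mixed"
    if r_ok:
        return "resistant"
    if s_ok:
        return "susceptible"
    return "unknown"


# Hypothetical toy alleles differing by a single base (G -> C).
K = 5
SUSCEPTIBLE = "ACGTACGTTAGC"
RESISTANT = "ACGTACCTTAGC"

resistant_reads = ["ACGTACCTT", "TACCTTAGC"]
mixed_reads = resistant_reads + ["ACGTACGTT", "TACGTTAGC"]

resistant_call = call_allele(
    SUSCEPTIBLE, RESISTANT, count_sample_kmers(resistant_reads, K), K)
mixed_call = call_allele(
    SUSCEPTIBLE, RESISTANT, count_sample_kmers(mixed_reads, K), K)
print(resistant_call, mixed_call)
```

A real tool would also need to handle sequencing errors, reverse complements and coverage depth, none of which this sketch attempts.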
BMJ open, 2012
To investigate the prospects of newly available benchtop sequencers to provide rapid whole-genome data in routine clinical practice. Next-generation sequencing has the potential to resolve uncertainties surrounding the route and timing of person-to-person transmission of healthcare-associated infection, which has been a major impediment to optimal management. The authors used Illumina MiSeq benchtop sequencing to undertake case studies investigating potential outbreaks of methicillin-resistant Staphylococcus aureus (MRSA) and Clostridium difficile. Isolates were obtained from potential outbreaks associated with three UK hospitals. Isolates were sequenced from a cluster of eight MRSA carriers and an associated bacteraemia case in an intensive care unit, another MRSA cluster of six cases and two clusters of C difficile. Additionally, all C difficile isolates from cases over 6 weeks in a single hospital were rapidly sequenced and compared with local strain sequences obtained in the pre...
Biochemical and biophysical research communications, Jan 19, 2014
Genome-wide association studies (GWAS) have identified over 70 loci associated with type 2 diabetes (T2D). Most genetic variants associated with T2D are common variants with modest effects on T2D and are shared with major ancestry groups. To what extent the genetic component of T2D can be explained by common variants relies upon the shape of the genetic architecture of T2D. Fine mapping utilizing populations with different patterns of linkage disequilibrium and functional annotation derived from experiments in relevant tissues are mandatory to track down causal variants responsible for the pathogenesis of T2D.
Proceedings of the National Academy of Sciences, 2013
Significance: Harvey rat sarcoma viral oncogene homolog (HRAS) occupies an important place in medical history, because it was the first gene in which acquired mutations that led to activation of a normal protein were associated with cancer, making it the prototype of the now canonical oncogene mechanism. Here, we explore what happens when similar HRAS mutations occur in male germ cells, an issue of practical importance because the mutations cause a serious congenital disorder, Costello syndrome, if transmitted to offspring. We provide evidence that the mutant germ cells are positively selected, leading to an increased burden of the mutations as men age. Although there are many parallels between this germline process and classical oncogenesis, there are interesting differences of detail, which are explored in this paper.
Nature Genetics, 2012
Adaptor protein-2 (AP2), a central component of clathrin-coated vesicles (CCVs), is pivotal in clathrin-mediated endocytosis which internalises plasma membrane constituents such as G protein-coupled receptors (GPCRs) 1-3. AP2, a heterotetramer of alpha, beta, mu and sigma subunits, links clathrin to vesicle membranes and binds to tyrosine-based and dileucine-based motifs of membrane-associated cargo proteins 1,4. Here, we show that AP2 sigma subunit (AP2S1) missense mutations, which all involved the Arg15 residue (Arg15Cys, Arg15His and Arg15Leu) that forms key contacts with dileucine-based motifs of CCV cargo proteins 4, result in familial hypocalciuric hypercalcemia type 3 (FHH3), an extracellular-calcium homeostasis disorder
Nature Communications, 2014
Bladder cancers are a leading cause of death from malignancy. Molecular markers might predict disease progression and behaviour more accurately than the available prognostic factors. Here we use whole-genome sequencing to identify somatic mutations and chromosomal changes in 14 bladder cancers of different grades and stages. As well as detecting the known bladder cancer driver mutations, we report the identification of recurrent protein-inactivating mutations in CDKN1A and FAT1. The former are not mutually exclusive with TP53 mutations or MDM2 amplification, showing that CDKN1A dysfunction is not simply an alternative mechanism for p53 pathway inactivation. We find strong positive associations between higher tumour stage/grade and greater clonal diversity, the number of somatic mutations and the burden of copy number changes. In principle, the identification of sub-clones with greater diversity and/or mutation burden within early-stage or low-grade tumours could identify lesions wit...
The Journal of Clinical Endocrinology & Metabolism, 2012
Context: Genetic abnormalities, such as those of multiple endocrine neoplasia type 1 (MEN1) and Cyclin D1 (CCND1) genes, occur in <50% of nonhereditary (sporadic) parathyroid adenomas. Objective: To identify genetic abnormalities in nonhereditary parathyroid adenomas by whole-exome sequence analysis. Design: Whole-exome sequence analysis was performed on parathyroid adenomas and leukocyte DNA samples from 16 postmenopausal women without a family history of parathyroid tumors or MEN1 and in whom primary hyperparathyroidism due to single-gland disease was cured by surgery. Somatic variants confirmed in this discovery set were assessed in 24 other parathyroid adenomas. Results: Over 90% of targeted exons were captured and represented by more than 10 base reads. Analysis identified 212 somatic variants (median eight per tumor; range, 2-110), with the majority being heterozygous nonsynonymous single-nucleotide variants that predicted missense amino acid substitutions. Somatic MEN1 mutations occurred in six of 16 (~35%) parathyroid adenomas, in association with loss of heterozygosity on chromosome 11. However, no other gene was mutated in more than one tumor. Mutations in several genes that may represent low-frequency driver mutations were identified, including a protection of telomeres 1 (POT1) mutation that resulted in exon skipping and disruption to the single-stranded DNA-binding domain, which may contribute to increased genomic instability and the observed high mutation rate in one tumor. Conclusions: Parathyroid adenomas typically harbor few somatic variants, consistent with their low proliferation rates. MEN1 mutation represents the major driver in sporadic parathyroid tumorigenesis although multiple low-frequency driver mutations likely account for tumors not harboring somatic MEN1 mutations.
Genome Research, 2011
New sequencing technologies can address diverse biomedical questions but are limited by a minimum required DNA input of typically 1 μg. We describe how sequencing libraries can be reproducibly created from 20 pg of input DNA using a modified transpososome-mediated fragmentation technique. Resulting libraries incorporate in-line bar-coding, which facilitates sample multiplexes that can be sequenced using Illumina platforms with the manufacturer's sequencing primer. We demonstrate this technique by providing deep coverage sequence of the Escherichia coli K-12 genome that shows equivalent target coverage to a 1-μg input library prepared using standard Illumina methods. Reducing template quantity does, however, increase the proportion of duplicate reads and enriches coverage in low-GC regions. This finding was confirmed with exhaustive resequencing of a mouse library constructed from 20 pg of gDNA input (about seven haploid genomes) resulting in ∼0.4-fold statistical coverage of uni...
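The "about seven haploid genomes" figure in the abstract above can be checked with a short back-of-the-envelope calculation. The sketch below is not taken from the paper: it assumes an average mass of roughly 650 g/mol per base pair, a mouse haploid genome of about 2.7 Gb and an E. coli K-12 genome of about 4.64 Mb, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
BP_MASS_G_PER_MOL = 650.0       # assumed average mass of one base pair
PG_PER_G = 1e12                 # picograms per gram

MOUSE_HAPLOID_BP = 2.7e9        # assumed approximate mouse haploid genome size
ECOLI_K12_BP = 4.64e6           # assumed approximate E. coli K-12 genome size


def genome_mass_pg(genome_size_bp):
    """Mass in picograms of one copy of a genome of the given length."""
    return genome_size_bp * BP_MASS_G_PER_MOL / AVOGADRO * PG_PER_G


def genome_equivalents(input_pg, genome_size_bp):
    """Number of genome copies contained in input_pg of DNA."""
    return input_pg / genome_mass_pg(genome_size_bp)


mouse_copies = genome_equivalents(20.0, MOUSE_HAPLOID_BP)
ecoli_copies = genome_equivalents(20.0, ECOLI_K12_BP)
print(f"mouse: {mouse_copies:.1f} haploid genomes in 20 pg")
print(f"E. coli K-12: {ecoli_copies:.0f} genomes in 20 pg")
```

Under these assumptions, 20 pg of mouse gDNA corresponds to just under seven haploid genomes, consistent with the abstract, while the same mass of E. coli DNA holds on the order of a few thousand genome copies.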
Genome Research, 2021
Thymic epithelial cells (TEC) control the selection of a T cell repertoire reactive to pathogens but tolerant of self. This process is known to involve the promiscuous expression of virtually the entire protein-coding gene repertoire, but the extent to which TEC recapitulate peripheral isoforms, and the mechanisms by which they do so, remain largely unknown. We performed the first assembly-based transcriptomic census of transcript structures and splicing factor (SF) expression in mouse medullary TEC (mTEC) and 21 peripheral tissues. Mature mTEC expressed 60.1% of all protein-coding transcripts, more than was detected in any of the peripheral tissues. However, for genes with tissue-restricted expression, mTEC produced fewer isoforms than did the relevant peripheral tissues. Analysis of exon inclusion revealed an absence of brain-specific microexons in mTEC. We did not find unusual numbers of novel transcripts in TEC, and we show that Aire, the facilitator of promiscuous gene expressi...
Science, 2021
Patterns and bottlenecks: A year into the severe acute respiratory syndrome coronavirus 2 pandemic, we are experiencing waves of new variants emerging. Some of these variants have worrying functional implications, such as increased transmissibility or antibody treatment escape. Lythgoe et al. have undertaken in-depth sequencing of more than 1000 hospital patients' isolates to find out how the virus is mutating within individuals. Overall, there seem to be consistent and reproducible patterns of within-host virus diversity. The authors observed only one or two variants in most samples, but a few carried many variants. Although the evidence indicates strong purifying selection, including in the spike protein responsible for viral entry, the authors also saw evidence for transmission clusters associated with households and other possible superspreader events. After transmission, most variants fizzled out, but occasionally some initiated ongoing transmission and wider dissemination. ...
Driven by the necessity to survive environmental pathogens, the human immune system has evolved exceptional diversity and plasticity, to which several factors contribute including inheritable structural polymorphism of the underlying genes. Characterizing this variation is challenging due to the complexity of these loci, which contain extensive regions of paralogy, segmental duplication and high copy-number repeats, but recent progress in long-read sequencing and optical mapping techniques suggests this problem may now be tractable. Here we assess this by using long-read sequencing platforms from PacBio and Oxford Nanopore, supplemented with short-read sequencing and Bionano optical mapping, to sequence DNA extracted from CD14+ monocytes and peripheral blood mononuclear cells from a single European individual identified as HV31. We use this data to build a de novo assembly of eight genomic regions encoding four key components of the immune system, namely the human leukocyte antigen,...
P1151 Objective: To investigate the relatedness of atypical meticillin resistant isolates of Staphylococcus aureus in an intensive care unit setting using a rapid turnaround bench top sequencer. Methods: 7 cases over a two week period were found to be colonised with S. aureus on routine screening using MRSA selective agar; however the isolates had an oxacillin MIC of < 2 µg/ml on routine E-strip testing suggesting that they were meticillin susceptible. These were sent to a reference laboratory and were shown to be spa type t5973 and mecA positive by PCR. No further cases were detected on repeated screening of all patients on the unit. Two months later a case grew similar isolates from a blood culture and a screening swab. These were also t5973 and mecA positive. These isolates were tetracycline resistant on routine testing whereas the earlier isolates were susceptible. The Illumina MiSeq platform was used to sequence and assess the relationship between these 2 later isolates to t...
b-III spectrin is present in the brain and is known to be important in the function of the cerebe... more b-III spectrin is present in the brain and is known to be important in the function of the cerebellum. Heterozygous mutations in SPTBN2, the gene encoding b-III spectrin, cause Spinocerebellar Ataxia Type 5 (SCA5), an adult-onset, slowly progressive, autosomal-dominant pure cerebellar ataxia. SCA5 is sometimes known as ‘‘Lincoln ataxia,’ ’ because the largest known family is descended from relatives of the United States President Abraham Lincoln. Using targeted capture and next-generation sequencing, we identified a homozygous stop codon in SPTBN2 in a consanguineous family in which childhood developmental ataxia co-segregates with cognitive impairment. The cognitive impairment could result from mutations in a
The MIT Faculty has made this article openly available. Please share how this access benefits you... more The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters.
Nature communications, Oct 15, 2018
Barrett's oesophagus is a precursor of oesophageal adenocarcinoma. In this common condition, ... more Barrett's oesophagus is a precursor of oesophageal adenocarcinoma. In this common condition, squamous epithelium in the oesophagus is replaced by columnar epithelium in response to acid reflux. Barrett's oesophagus is highly heterogeneous and its relationships to normal tissues are unclear. Here we investigate the cellular complexity of Barrett's oesophagus and the upper gastrointestinal tract using RNA-sequencing of single cells from multiple biopsies from six patients with Barrett's oesophagus and two patients without oesophageal pathology. We find that cell populations in Barrett's oesophagus, marked by LEFTY1 and OLFM4, exhibit a profound transcriptional overlap with oesophageal submucosal gland cells, but not with gastric or duodenal cells. Additionally, SPINK4 and ITLN1 mark cells that precede morphologically identifiable goblet cells in colon and Barrett's oesophagus, potentially aiding the identification of metaplasia. Our findings reveal striking tra...
Diabetes, Jul 24, 2017
To identify novel coding association signals and facilitate characterization of mechanisms influe... more To identify novel coding association signals and facilitate characterization of mechanisms influencing glycemic traits and type 2 diabetes risk, we analyzed 109,215 variants derived from exome array genotyping together with an additional 390,225 variants from exome sequence in up to 39,339 normoglycemic individuals from five ancestry groups. We identified a novel association between the coding variant (p.Pro50Thr) in AKT2 and fasting insulin, a gene in which rare fully penetrant mutations are causal for monogenic glycemic disorders. The low-frequency allele is associated with a 12% increase in fasting plasma insulin (FI) levels. This variant is present at 1.1% frequency in Finns but virtually absent in individuals from other ancestries. Carriers of the FI-increasing allele had increased 2-hour insulin values, decreased insulin sensitivity, and increased risk of type 2 diabetes (odds ratio=1.05). In cellular studies, the AKT2-Thr50 protein exhibited a partial loss of function. We ext...
Bladder Cancer, 2015
Background: Germline mutations in DNA damage signalling and repair genes predispose individuals t... more Background: Germline mutations in DNA damage signalling and repair genes predispose individuals to cancer. Rare germline variants may also increase cancer risk and be predictive of outcomes following cancer treatments, but require high-throughput sequencing (HTS) for detection in large cohorts. Objective: To use a dual indexing system on a HTS platform to detect novel variants in CtIP (RBBP8) which may be associated with clinical outcomes following radiotherapy treatment for bladder cancer. Methods: All exons and flanking introns of CtIP were amplified from germline DNA from bladder cancer patients using seven primer pairs by automated long-range PCR. Amplicons were pooled, fragmented and ligated to adaptor sequences. One of 96 tag sequences was introduced at each end by PCR. Sequencing was performed on a single flow cell of an Illumina MiSeq. Reads were mapped by Stampy and variants called by Platypus. For phasing experiments, target regions were amplified and cloned for Sanger sequencing. Results: Of 201 samples, 160 were successfully amplified. Eleven CtIP variants were called, within the exons and 15 bp adjacent intronic DNA, including eight known variants from the 1000 Genomes project, plus three previously unreported variants now confirmed by Sanger sequencing. In two individuals, phasing experiments showed two variants of interest to be on separate alleles, likely to result in stronger impairment of gene function. Conclusions: We have demonstrated proof of principle for dual indexing on 160 samples on one MiSeq flow cell sequencing surface, and show that for the CtIP gene multiplexing of up to 720 samples would provide sufficient coverage to achieve >98% detection power for rare germline variation, reducing HTS costs substantially.
F1000Research, 2015
The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing c... more The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large scale sequencing tools. The MinION™ Access Programme (MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, where the reproducibility of the performance of the MinION was evaluated at multiple sites. Five laboratories on two continents generated data using a control strain of Escherichia coli K-12, preparing and sequencing samples according to a revised ONT protocol. Here, we provide the details of the protocol used, along with a preliminary analysis of the c...
Nature genetics, Jan 18, 2015
To assess factors influencing the success of whole-genome sequencing for mainstream clinical diag... more To assess factors influencing the success of whole-genome sequencing for mainstream clinical diagnosis, we sequenced 217 individuals from 156 independent cases or families across a broad spectrum of disorders in whom previous screening had identified no pathogenic variants. We quantified the number of candidate variants identified using different strategies for variant calling, filtering, annotation and prioritization. We found that jointly calling variants across samples, filtering against both local and external databases, deploying multiple annotation tools and using familial transmission above biological plausibility contributed to accuracy. Overall, we identified disease-causing variants in 21% of cases, with the proportion increasing to 34% (23/68) for mendelian disorders and 57% (8/14) in family trios. We also discovered 32 potentially clinically actionable variants in 18 genes unrelated to the referral disorder, although only 4 were ultimately considered reportable. Our resu...
Rapid and accurate detection of antibiotic resistance in pathogens is an urgent need, affecting b... more Rapid and accurate detection of antibiotic resistance in pathogens is an urgent need, affecting both patient care and population-scale control. Microbial genome sequencing promises much, but many barriers exist to its routine deployment. Here, we address these challenges, using a de Bruijn graph comparison of clinical isolate and curated knowledge-base to identify species and predict resistance profile, including minor populations. This is implemented in a package, Mykrobe predictor, for S. aureus and M. tuberculosis, running in under three minutes on a laptop from raw data. For S. aureus, we train and validate in 495/471 samples respectively, finding error rates comparable to gold-standard phenotypic methods, with sensitivity/specificity of 99.3%/99.5% across 12 drugs. For M. tuberculosis, we identify species and predict resistance with specificity of 98.5% (training/validating on 1920/1609 samples). Sensitivity of 82.6% is limited by current understanding of genetic mechanisms. We...
BMJ open, 2012
To investigate the prospects of newly available benchtop sequencers to provide rapid whole-genome... more To investigate the prospects of newly available benchtop sequencers to provide rapid whole-genome data in routine clinical practice. Next-generation sequencing has the potential to resolve uncertainties surrounding the route and timing of person-to-person transmission of healthcare-associated infection, which has been a major impediment to optimal management. The authors used Illumina MiSeq benchtop sequencing to undertake case studies investigating potential outbreaks of methicillin-resistant Staphylococcus aureus (MRSA) and Clostridium difficile. Isolates were obtained from potential outbreaks associated with three UK hospitals. Isolates were sequenced from a cluster of eight MRSA carriers and an associated bacteraemia case in an intensive care unit, another MRSA cluster of six cases and two clusters of C difficile. Additionally, all C difficile isolates from cases over 6 weeks in a single hospital were rapidly sequenced and compared with local strain sequences obtained in the pre...
Biochemical and biophysical research communications, Jan 19, 2014
Genome-wide association studies (GWAS) have identified over 70 loci associated with type 2 diabet... more Genome-wide association studies (GWAS) have identified over 70 loci associated with type 2 diabetes (T2D). Most genetic variants associated with T2D are common variants with modest effects on T2D and are shared with major ancestry groups. To what extent the genetic component of T2D can be explained by common variants relies upon the shape of the genetic architecture of T2D. Fine mapping utilizing populations with different patterns of linkage disequilibrium and functional annotation derived from experiments in relevant tissues are mandatory to track down causal variants responsible for the pathogenesis of T2D.
Proceedings of the National Academy of Sciences, 2013
Significance: Harvey rat sarcoma viral oncogene homolog (HRAS) occupies an important place in medical history, because it was the first gene in which acquired mutations that led to activation of a normal protein were associated with cancer, making it the prototype of the now canonical oncogene mechanism. Here, we explore what happens when similar HRAS mutations occur in male germ cells, an issue of practical importance because the mutations cause a serious congenital disorder, Costello syndrome, if transmitted to offspring. We provide evidence that the mutant germ cells are positively selected, leading to an increased burden of the mutations as men age. Although there are many parallels between this germline process and classical oncogenesis, there are interesting differences of detail, which are explored in this paper.
Nature Genetics, 2012
Adaptor protein-2 (AP2), a central component of clathrin-coated vesicles (CCVs), is pivotal in clathrin-mediated endocytosis which internalises plasma membrane constituents such as G protein-coupled receptors (GPCRs) 1-3. AP2, a heterotetramer of alpha, beta, mu and sigma subunits, links clathrin to vesicle membranes and binds to tyrosine-based and dileucine-based motifs of membrane-associated cargo proteins 1,4. Here, we show that AP2 sigma subunit (AP2S1) missense mutations, which all involved the Arg15 residue (Arg15Cys, Arg15His and Arg15Leu) that forms key contacts with dileucine-based motifs of CCV cargo proteins 4, result in familial hypocalciuric hypercalcemia type 3 (FHH3), an extracellular-calcium homeostasis disorder
Nature Communications, 2014
Bladder cancers are a leading cause of death from malignancy. Molecular markers might predict disease progression and behaviour more accurately than the available prognostic factors. Here we use whole-genome sequencing to identify somatic mutations and chromosomal changes in 14 bladder cancers of different grades and stages. As well as detecting the known bladder cancer driver mutations, we report the identification of recurrent protein-inactivating mutations in CDKN1A and FAT1. The former are not mutually exclusive with TP53 mutations or MDM2 amplification, showing that CDKN1A dysfunction is not simply an alternative mechanism for p53 pathway inactivation. We find strong positive associations between higher tumour stage/grade and greater clonal diversity, the number of somatic mutations and the burden of copy number changes. In principle, the identification of sub-clones with greater diversity and/or mutation burden within early-stage or low-grade tumours could identify lesions wit...
The Journal of Clinical Endocrinology & Metabolism, 2012
Context: Genetic abnormalities, such as those of multiple endocrine neoplasia type 1 (MEN1) and Cyclin D1 (CCND1) genes, occur in <50% of nonhereditary (sporadic) parathyroid adenomas. Objective: To identify genetic abnormalities in nonhereditary parathyroid adenomas by whole-exome sequence analysis. Design: Whole-exome sequence analysis was performed on parathyroid adenomas and leukocyte DNA samples from 16 postmenopausal women without a family history of parathyroid tumors or MEN1 and in whom primary hyperparathyroidism due to single-gland disease was cured by surgery. Somatic variants confirmed in this discovery set were assessed in 24 other parathyroid adenomas. Results: Over 90% of targeted exons were captured and represented by more than 10 base reads. Analysis identified 212 somatic variants (median eight per tumor; range, 2-110), with the majority being heterozygous nonsynonymous single-nucleotide variants that predicted missense amino acid substitutions. Somatic MEN1 mutations occurred in six of 16 (~35%) parathyroid adenomas, in association with loss of heterozygosity on chromosome 11. However, no other gene was mutated in more than one tumor. Mutations in several genes that may represent low-frequency driver mutations were identified, including a protection of telomeres 1 (POT1) mutation that resulted in exon skipping and disruption to the single-stranded DNA-binding domain, which may contribute to increased genomic instability and the observed high mutation rate in one tumor. Conclusions: Parathyroid adenomas typically harbor few somatic variants, consistent with their low proliferation rates. MEN1 mutation represents the major driver in sporadic parathyroid tumorigenesis although multiple low-frequency driver mutations likely account for tumors not harboring somatic MEN1 mutations.
Genome Research, 2011
New sequencing technologies can address diverse biomedical questions but are limited by a minimum required DNA input of typically 1 μg. We describe how sequencing libraries can be reproducibly created from 20 pg of input DNA using a modified transpososome-mediated fragmentation technique. Resulting libraries incorporate in-line bar-coding, which facilitates sample multiplexes that can be sequenced using Illumina platforms with the manufacturer's sequencing primer. We demonstrate this technique by providing deep coverage sequence of the Escherichia coli K-12 genome that shows equivalent target coverage to a 1-μg input library prepared using standard Illumina methods. Reducing template quantity does, however, increase the proportion of duplicate reads and enriches coverage in low-GC regions. This finding was confirmed with exhaustive resequencing of a mouse library constructed from 20 pg of gDNA input (about seven haploid genomes) resulting in ∼0.4-fold statistical coverage of uni...
The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large scale sequencing tools. The MinION™ Access Programme (MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, where the reproducibility of the performance of the MinION was evaluated at multiple sites. Five
Genome Research, 2021
Thymic epithelial cells (TEC) control the selection of a T cell repertoire reactive to pathogens but tolerant of self. This process is known to involve the promiscuous expression of virtually the entire protein-coding gene repertoire, but the extent to which TEC recapitulate peripheral isoforms, and the mechanisms by which they do so, remain largely unknown. We performed the first assembly-based transcriptomic census of transcript structures and splicing factor (SF) expression in mouse medullary TEC (mTEC) and 21 peripheral tissues. Mature mTEC expressed 60.1% of all protein-coding transcripts, more than was detected in any of the peripheral tissues. However, for genes with tissue-restricted expression, mTEC produced fewer isoforms than did the relevant peripheral tissues. Analysis of exon inclusion revealed an absence of brain-specific microexons in mTEC. We did not find unusual numbers of novel transcripts in TEC, and we show that Aire, the facilitator of promiscuous gene expressi...
Science, 2021
Patterns and bottlenecks: A year into the severe acute respiratory syndrome coronavirus 2 pandemic, we are experiencing waves of new variants emerging. Some of these variants have worrying functional implications, such as increased transmissibility or antibody treatment escape. Lythgoe et al. have undertaken in-depth sequencing of more than 1000 hospital patients' isolates to find out how the virus is mutating within individuals. Overall, there seem to be consistent and reproducible patterns of within-host virus diversity. The authors observed only one or two variants in most samples, but a few carried many variants. Although the evidence indicates strong purifying selection, including in the spike protein responsible for viral entry, the authors also saw evidence for transmission clusters associated with households and other possible superspreader events. After transmission, most variants fizzled out, but occasionally some initiated ongoing transmission and wider dissemination. ...
Driven by the necessity to survive environmental pathogens, the human immune system has evolved exceptional diversity and plasticity, to which several factors contribute including inheritable structural polymorphism of the underlying genes. Characterizing this variation is challenging due to the complexity of these loci, which contain extensive regions of paralogy, segmental duplication and high copy-number repeats, but recent progress in long-read sequencing and optical mapping techniques suggests this problem may now be tractable. Here we assess this by using long-read sequencing platforms from PacBio and Oxford Nanopore, supplemented with short-read sequencing and Bionano optical mapping, to sequence DNA extracted from CD14+ monocytes and peripheral blood mononuclear cells from a single European individual identified as HV31. We use this data to build a de novo assembly of eight genomic regions encoding four key components of the immune system, namely the human leukocyte antigen,...
P1151 Objective: To investigate the relatedness of atypical meticillin resistant isolates of Staphylococcus aureus in an intensive care unit setting using a rapid turnaround bench top sequencer. Methods: 7 cases over a two week period were found to be colonised with S. aureus on routine screening using MRSA selective agar; however the isolates had an oxacillin MIC of < 2 µg/ml on routine E-strip testing suggesting that they were meticillin susceptible. These were sent to a reference laboratory and were shown to be spa type t5973 and mecA positive by PCR. No further cases were detected on repeated screening of all patients on the unit. Two months later a case grew similar isolates from a blood culture and a screening swab. These were also t5973 and mecA positive. These isolates were tetracycline resistant on routine testing whereas the earlier isolates were susceptible. The Illumina MiSeq platform was used to sequence and assess the relationship between these 2 later isolates to t...