A Mismatch EndoNuclease Array-Based Methodology (MENA) for Identifying Known SNPs or Novel Point Mutations

Rapid identification of disease-causing mutations using copy number analysis within linkage intervals

Human Mutation, 2007

Communicated by Jing Cheng. SNP and comparative genome hybridization arrays (aCGH) are powerful techniques for identifying genome rearrangements, deletions, and duplications. We hypothesized that current array-based detection of copy number variation (CNV) could complement parametric linkage analysis and allow the rapid identification of functional mutations in families with inherited disorders. Herein, we demonstrate the utility of this technique by rapidly identifying a disease-causing microdeletion within the PARK2 gene in a family with autosomal recessive Parkinsonism.

High-throughput detection of mutations responsible for childhood hearing loss using resequencing microarrays

BMC Biotechnology, 2010

Background: Despite current knowledge of mutations in 45 genes that can cause nonsyndromic sensorineural hearing loss (SNHL), no unified clinical test has been developed that can comprehensively detect mutations in multiple genes. We therefore designed Affymetrix resequencing microarrays capable of resequencing 13 genes mutated in SNHL (GJB2, GJB6, CDH23, KCNE1, KCNQ1, MYO7A, OTOF, PDS, MYO6, SLC26A5, TMIE, TMPRSS3, USH1C). We present results from hearing loss arrays developed in two different research facilities and highlight some of the approaches we adopted to enhance the applicability of resequencing arrays in a clinical setting. Results: We leveraged sequence and intensity pattern features responsible for diminished coverage and accuracy and developed a novel algorithm, sPROFILER, which resolved >80% of no-calls from GSEQ and allowed 99.6% (range: 99.2-99.8%) of sequence to be called, while maintaining overall accuracy at >99.8% based upon dideoxy sequencing comparison. Conclusions: Together, these findings provide insight into critical issues for disease-centered resequencing protocols suitable for clinical application and support the use of array-based resequencing technology as a valuable molecular diagnostic tool for pediatric SNHL and other genetic diseases with substantial genetic heterogeneity.

High-throughput discovery of rare human nucleotide polymorphisms by Ecotilling

Nucleic Acids Research, 2006

Human individuals differ from one another at only ~0.1% of nucleotide positions, but these single nucleotide differences account for most heritable phenotypic variation. Large-scale efforts to discover and genotype human variation have been limited to common polymorphisms. However, these efforts overlook rare nucleotide changes that may contribute to phenotypic diversity and genetic disorders, including cancer. Thus, there is an increasing need for high-throughput methods to robustly detect rare nucleotide differences. Toward this end, we have adapted the mismatch discovery method known as Ecotilling for the discovery of human single nucleotide polymorphisms. To increase throughput and reduce costs, we developed a universal primer strategy and implemented algorithms for automated band detection. Ecotilling was validated by screening 90 human DNA samples for nucleotide changes in 5 gene targets and by comparing results to public resequencing data. To increase throughput for discovery of rare alleles, we pooled samples 8-fold and found Ecotilling to be efficient relative to resequencing, with a false negative rate of 5% and a false discovery rate of 4%. We identified 28 new rare alleles, including some that are predicted to damage protein function. The detection of rare damaging mutations has implications for models of human disease.
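
The two error rates quoted in this abstract can be made concrete with a short sketch. The function names and the counts below are hypothetical and chosen only for illustration; they are not data from the study. The pooling helper assumes diploid samples and a single heterozygous carrier per pool.

```python
def false_negative_rate(true_pos: int, false_neg: int) -> float:
    """Fraction of real variants that the screen missed: FN / (TP + FN)."""
    return false_neg / (true_pos + false_neg)


def false_discovery_rate(true_pos: int, false_pos: int) -> float:
    """Fraction of reported variants that are not real: FP / (TP + FP)."""
    return false_pos / (true_pos + false_pos)


def pooled_allele_fraction(pool_size: int, carrier_alleles: int = 1) -> float:
    """Fraction of chromosomes in a pool that carry the rare allele.

    Assumes diploid samples, so a pool of n individuals holds 2 * n chromosomes.
    """
    return carrier_alleles / (2 * pool_size)


# Hypothetical counts, for illustration only.
fnr = false_negative_rate(true_pos=95, false_neg=5)
fdr = false_discovery_rate(true_pos=96, false_pos=4)
fraction = pooled_allele_fraction(pool_size=8)
```

Under these assumptions, a single heterozygous carrier in an 8-fold pool contributes 1 of 16 chromosomes, which is the signal level a pooled screen has to detect.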

Computational identification of candidate loci for recessively inherited mutation using high-throughput SNP arrays

Bioinformatics, 2007

Motivation: Single nucleotide polymorphisms (SNPs) are one of the most abundant genetic variations in the human genome. Recently, several platforms for high-throughput SNP analysis have become available, capable of measuring thousands of SNPs across the genome. Tools for analysing and visualising these large genetic datasets in a biologically relevant manner are rare. This hinders effective use of the SNP-array data in research on complex diseases, such as cancer. Results: We describe a computational framework to analyse and visualise SNP-array data, and link the results to relevant databases. Our major objective is to develop methods for identifying DNA regions that likely harbour recessive mutations. Thus, the algorithms are designed to have high sensitivity and the identified regions are ranked using a scoring algorithm. We have also developed annotation tools that automatically query gene IDs, exon counts, microarray probe IDs etc. In our case study we apply the methods for identifying candidate regions for recessively inherited colorectal cancer predisposition and suggest directions for wet-lab experiments. Availability: R-package implementation is available at
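
One common way to look for regions that may harbour recessive mutations is to scan SNP-array calls for long runs of consecutive homozygous genotypes and rank them by length. The sketch below illustrates that general idea only; it is not the scoring algorithm of the package described in this abstract. The genotype coding ("AA", "BB", "AB") and the function names are assumptions.

```python
from typing import List, Tuple


def homozygous_runs(genotypes: List[str], min_snps: int = 3) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) indices of maximal runs of homozygous calls.

    Any call other than "AA" or "BB" (heterozygous or no-call) ends the current run.
    """
    runs = []
    start = None
    for i, call in enumerate(genotypes):
        if call in ("AA", "BB"):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs


def rank_runs(runs: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Rank runs by number of SNPs, longest first (a deliberately simple score)."""
    return sorted(runs, key=lambda r: r[1] - r[0] + 1, reverse=True)


calls = ["AA", "BB", "AB", "AA", "AA", "BB", "AA", "AB", "BB"]
candidates = rank_runs(homozygous_runs(calls, min_snps=1))
```

A real analysis would also weigh physical distance, allele frequencies, and genotyping error, none of which this sketch models.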

The discovery of human genetic variations and their use as disease markers: past, present and future

Journal of Human Genetics, 2010

The field of human genetic variations has progressed rapidly over the past few years. It has added much information and deepened our knowledge and understanding of the diversity of genetic variations in the human genome. This significant progress has been driven mainly by the development of microarray and next generation sequencing technologies. The array-based methods have been widely used for large-scale copy number variation (CNV) detection in the human genome. The arrival of next generation sequencing technologies, which enabled the completion of several whole genome resequencing studies, has also resulted in a massive discovery of genetic variations. These studies have identified several hundred thousand short indels and a total of thousands of CNVs and other structural variations in the human genome. The discovery of these 'newer' types of genetic variations, indels, CNVs and copy neutral variations (inversions and translocations) has also widened the scope of genetic markers in human genetic and disease gene mapping studies. The aim of this review article is to summarize the latest developments in the discovery of human genetic variations and address the issue of inadequate coverage of genetic variations in the current genome-wide association studies, which mainly focus on common SNPs. Finally, we also discuss the future directions in the field and their impacts on next generation genome-wide association studies.

Whole-Exome Sequencing Efficiently Detects Rare Mutations in Autosomal Recessive Nonsyndromic Hearing Loss

PLoS ONE, 2012

Identification of the pathogenic mutations underlying autosomal recessive nonsyndromic hearing loss (ARNSHL) is difficult, since causative mutations in 39 different genes have so far been reported. After excluding mutations in the most common ARNSHL gene, GJB2, via Sanger sequencing, we performed whole-exome sequencing (WES) in 30 individuals from 20 unrelated multiplex consanguineous families with ARNSHL. Agilent SureSelect Human All Exon 50 Mb kits and an Illumina HiSeq 2000 instrument were used. An average of 93%, 84% and 73% of bases were covered to 1X, 10X and 20X within the ARNSHL-related coding RefSeq exons, respectively. Uncovered regions with WES included those that are not targeted by the exome capture kit and regions with high GC content. Twelve homozygous mutations in known deafness genes, of which eight are novel, were identified in 12 families: .R785Sfs*50. Each mutation was within a homozygous run documented via WES. Sanger sequencing confirmed co-segregation of the mutation with deafness in each family. Four rare heterozygous variants, predicted to be pathogenic, in known deafness genes were detected in 12 families where homozygous causative variants were already identified. Six heterozygous variants with characteristics similar to those of the variants mentioned above were present in 15 ethnically-matched individuals with normal hearing. Our results show that rare causative mutations in known ARNSHL genes can be reliably identified via WES. The excess of heterozygous variants should be considered during the search for causative mutations in ARNSHL genes, especially in small-sized families.
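
Coverage figures such as "covered to 1X, 10X and 20X" are fractions of target bases whose read depth meets each threshold. A minimal sketch of that calculation follows, assuming per-base depths are already available as a list of integers; the function name and the example depths are hypothetical.

```python
from typing import Dict, Sequence


def coverage_fractions(
    depths: Sequence[int], thresholds: Sequence[int] = (1, 10, 20)
) -> Dict[int, float]:
    """Fraction of positions whose depth is at least each threshold."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    total = len(depths)
    return {t: sum(1 for d in depths if d >= t) / total for t in thresholds}


# Hypothetical per-base depths over a tiny target region.
example_depths = [0, 5, 12, 25, 30]
fractions = coverage_fractions(example_depths)
```

Averaging these fractions across samples would give summary values of the kind reported in the abstract.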

DIAMUND: Direct Comparison of Genomes to Detect Mutations

Human Mutation, 2014

DNA sequencing has become a powerful method to discover the genetic basis of disease. Standard, widely used protocols for analysis usually begin by comparing each individual to the human reference genome. When applied to a set of related individuals, this approach reveals millions of differences, most of which are shared among the individuals and unrelated to the disease being investigated. We have developed a novel algorithm for variant detection, one that compares DNA sequences directly to one another, without aligning them to the reference genome. When used to find de novo mutations in exome sequences from family trios, or to compare normal and diseased samples from the same individual, the new method, direct alignment for mutation discovery (DIAMUND), produces a dramatically smaller list of candidate mutations than previous methods, without losing sensitivity to detect the true cause of a genetic disease. We demonstrate our results on several example cases, including two family trios in which it correctly found the disease-causing variant while excluding thousands of harmless variants that standard methods had identified.
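
The abstract does not spell out the algorithm, so the following is only an illustration of how read sets can be compared directly without a reference genome: collect the k-mers in each sample and keep those seen in the child but in neither parent. The function names, the choice of k, and the toy reads are all assumptions. The sketch ignores reverse complements, sequencing errors, and coverage thresholds, all of which matter with real data.

```python
from typing import Iterable, Set


def kmers(reads: Iterable[str], k: int) -> Set[str]:
    """Collect every length-k substring found in a set of reads."""
    found = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            found.add(read[i:i + k])
    return found


def child_specific_kmers(
    child: Iterable[str], mother: Iterable[str], father: Iterable[str], k: int
) -> Set[str]:
    """k-mers present in the child's reads but absent from both parents' reads."""
    return kmers(child, k) - kmers(mother, k) - kmers(father, k)


# Toy trio: the child differs from both parents at one base (A -> T).
mother_reads = ["ACGTACGT"]
father_reads = ["ACGTACGT"]
child_reads = ["ACGTTCGT"]
candidates = child_specific_kmers(child_reads, mother_reads, father_reads, k=4)
```

In this toy trio, every child-specific k-mer overlaps the single changed base, which is why a set difference of this kind can shrink a candidate list without consulting a reference.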

Discovery of novel variants in genotyping arrays improves genotype retention and reduces ascertainment bias

BMC Genomics, 2012

Background: High-density genotyping arrays that measure hybridization of genomic DNA fragments to allele-specific oligonucleotide probes are widely used to genotype single nucleotide polymorphisms (SNPs) in genetic studies, including human genome-wide association studies. Hybridization intensities are converted to genotype calls by clustering algorithms that assign each sample to a genotype class at each SNP. Data for SNP probes that do not conform to the expected pattern of clustering are often discarded, contributing to ascertainment bias and resulting in lost information, as much as 50% in a recent genome-wide association study in dogs. Results: We identified atypical patterns of hybridization intensities that were highly reproducible and demonstrated that these patterns represent genetic variants that were not accounted for in the design of the array platform. We characterized variable intensity oligonucleotide (VINO) probes that display such patterns and are found in all hybridization-based genotyping platforms, including those developed for human, dog, cattle, and mouse. When recognized and properly interpreted, VINOs recovered a substantial fraction of discarded probes and counteracted SNP ascertainment bias. We developed software (MouseDivGeno) that identifies VINOs and improves the accuracy of genotype calling. MouseDivGeno produced highly concordant genotype calls when compared with other methods but it uniquely identified more than 786,000 VINOs in 351 mouse samples. We used whole-genome sequence from 14 mouse strains to confirm the presence of novel variants explaining 28,000 VINOs in those strains. We also identified VINOs in human HapMap 3 samples, many of which were specific to an African population. Incorporating VINOs in phylogenetic analyses substantially improved the accuracy of a Mus species tree and local haplotype assignment in laboratory mouse strains.
Conclusion: The problems of ascertainment bias and missing information due to genotyping errors are widely recognized as limiting factors in genetic studies. We have conducted the first formal analysis of the effect of novel variants on genotyping arrays, and we have shown that these variants account for a large portion of miscalled and uncalled genotypes. Genetic studies will benefit from substantial improvements in the accuracy of their results by incorporating VINOs in their analyses.