Michael Nothnagel - Academia.edu
Papers by Michael Nothnagel
Meta-Analysis of Genome-Wide Association Studies and Network Analysis-Based Integration with Gene Expression Data Identify New Suggestive Loci and Unravel a Wnt-Centric Network Associated with Dupuytren's Disease
PloS one, 2016
Dupuytren's disease, a fibromatosis of the connective tissue in the palm, is a common complex disease with a strong genetic component. To date, nine genetic loci have been found to be associated with the disease. Six of these loci contain genes that code for Wnt signalling proteins. In spite of this striking first insight into the genetic factors in Dupuytren's disease, much of the inherited risk in Dupuytren's disease still needs to be discovered. The already identified loci jointly explain ~1% of the heritability in this disease. To further elucidate the genetic basis of Dupuytren's disease, we performed a genome-wide meta-analysis combining three genome-wide association study (GWAS) data sets, comprising 1,580 cases and 4,480 controls. We corroborated all nine previously identified loci, six of these with genome-wide significance (p-value < 5 × 10⁻⁸). In addition, we identified 14 new suggestive loci (p-value < 10⁻⁵). Intriguingly, several of these new loci contain genes as...
The minor allele of the PPARγ2 Pro12Ala polymorphism is associated with lower postprandial TAG and insulin levels in non-obese healthy men
British Journal of Nutrition, 2007
Human Molecular Genetics, Aug 1, 2010
The availability of high-density panels of genetic polymorphisms has led to the discovery of extended regions of apparent autozygosity in the human genome. At the genotype level, these regions present as sizeable stretches, or 'runs', of homozygosity (ROH). Here, we investigated both the genomic and the geographic distribution of ROHs in a large European sample of individuals originating from 23 subpopulations. The genomic ROH distribution was found to be characterized by a pattern of highly significant non-uniformity that was virtually identical in all subpopulations studied. Some 77 chromosomal regions contained ROHs at considerable frequency, thereby forming 'ROH islands' that were not explicable by high linkage disequilibrium alone. At the geographic level, the number and cumulative length of ROHs followed a prominent South to North gradient in agreement with expectations from European population history. The individual ROH length, in contrast, showed only minor and unsystematic geographic variation. While our findings are thus consistent with a larger effective population size in Southern than in Northern Europe, combined with a higher historic population density and mobility, they also indicate that the patterns of meiotic recombination in humans must have been very similar throughout the continent. Extending previous reports of a strong correlation between geography and identity-by-state, our data show that the genomic identity-by-descent patterns of Europeans are also clinal. As a consequence, the planning, design and interpretation of ROH-based genetic studies must take sample origin into account in order for such studies to be sensible and valid.
Journal of Molecular Medicine, 2008
C-C chemokine receptors have been suggested to play an important role in sarcoidosis pathogenesis. Previous investigation of the C-C chemokine receptor 5 (CCR5) gene revealed the association of the HHC haplotype with "persistent lung involvement" in two European sarcoidosis populations. Based on this finding, we investigated a possible association of the HHC haplotype and its marker alleles in an extended German sarcoidosis sample that comprised 995 German sarcoidosis families including individuals with the chronic and acute form of the disease, further refined to patients with and without Löfgren's syndrome. We genotyped this sample and 538 healthy control subjects for 8 single nucleotide polymorphisms (SNPs) that define the HHC haplotype in the CCR5 genomic region. Analysis of 3 sarcoidosis phenotypes (chronic, acute and Löfgren's syndrome) revealed that the HHC haplotype was not associated with chronic sarcoidosis although a substantial overlap can be assumed between the chronic form examined in our study and "persistent parenchymal lung involvement", the phenotype for which an association was previously established. However, 2 marker alleles in the putative CCR5 promoter, which are part of the HHC haplotype, are associated with Löfgren's syndrome. Strikingly, the association is restricted to females. This finding is consistent with recently described sex-specific manifestations of Löfgren's syndrome and with previous functional studies suggesting an estrogen-dependent CCR5 expression. The female-specific association of SNPs in the putative CCR5 promoter region with Löfgren's syndrome raises the possibility that the dysregulated, sex-specific modification of CCR5 expression could contribute to the increased risk of women to develop the disease.
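As an illustration of what a 'run' of homozygosity means at the genotype level, the following is a minimal sketch only, not the detection procedure used in the study. It assumes genotypes coded as minor-allele counts (0, 1, 2) along one chromosome; the function name `find_roh` and the `min_snps` threshold are hypothetical, and real ROH callers additionally consider physical length, marker density, missing calls and genotyping error.

```python
from typing import List, Tuple


def find_roh(genotypes: List[int], min_snps: int = 3) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) marker indices of homozygous runs.

    genotypes: minor-allele counts per marker (0 or 2 = homozygous,
               1 = heterozygous), ordered along the chromosome.
    min_snps:  hypothetical minimum number of consecutive homozygous
               markers for a stretch to count as a run.
    """
    runs = []
    start = None  # index where the current homozygous stretch began
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i
        else:
            # A heterozygous call ends the current stretch.
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a stretch that extends to the last marker.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs


example = [0, 2, 2, 1, 0, 0, 1, 2, 2, 2, 2]
print(find_roh(example, min_snps=3))  # [(0, 2), (7, 10)]
```

Counting and summing such runs per individual gives the "number and cumulative length of ROHs" summarised in the abstract.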
Human Genetics, 2009
Genome-wide association studies have contributed significantly to the genetic dissection of complex diseases. In order to increase the power of existing marker sets even further, methods have been proposed to predict individual genotypes at un-typed loci from other marker sets by imputation, usually employing HapMap data as a reference. Although various imputation algorithms have been used in practice already, a comprehensive evaluation and comparison of these approaches, using genome-wide SNP data from one and the same population, is still lacking. We therefore investigated four publicly available programs for genotype imputation (BEAGLE, IMPUTE, MACH, and PLINK) using data from 449 German individuals genotyped in our laboratory for three genome-wide SNP sets [Affymetrix 5.0 (500 k), Affymetrix 6.0 (1,000 k), and Illumina 550 k]. We observed that HapMap-based imputation in a northern European population is powerful and reliable, even in highly variable genomic regions such as the extended MHC on chromosome 6p21. However, while genotype predictions were found to be highly accurate with all four programs, the number of SNPs for which imputation was actually carried out ('imputation efficacy') varied substantially. BEAGLE, IMPUTE, and MACH yielded nearly identical trade-offs between imputation accuracy and efficacy whereas PLINK performed consistently worse. We nevertheless recommend either MACH or BEAGLE for practical use because these two programs are more user-friendly and generally require less memory than IMPUTE.
Spatial autocorrelation analysis of Y-STR genotypes
A genome-wide linkage analysis in 181 German sarcoidosis families using clustered bi-allelic markers
Chest, 2010
Sarcoidosis (SA) is a systemic granulomatous inflammatory disorder with complex etiology and strong clustering in families. Genome-wide association studies have been successful in the identification of common risk variants for the disease. To reveal susceptibility variants with low frequencies but strong effects, we performed a genome-wide linkage scan in a large sample of SA families. We genotyped 528 members of 181 German SA families for 3,882 single nucleotide polymorphism assays from the SNPlex System Human Linkage Mapping Set 4K. Nonparametric linkage analysis revealed one region of suggestive linkage on chromosome 12p13.31 at 20 cM (logarithm of odds [LOD] = 2.53; local P value = .0003) and another linkage peak of nearly suggestive linkage on 9q33.1 at 134 cM (LOD = 2.12; local P value = .0009). The latter has previously been reported to show suggestive evidence for linkage in a sample of 229 African American SA families. Analysis of acute and chronically affected families revealed a subphenotype-specific linkage pattern and an additional, nearly suggestive linkage peak on chromosome 16p13.11 at 38 cM (LOD = 2.09; local P value = .001), which was confined to acute SA. Our results suggest that the respective regions might harbor yet-unidentified, possibly subphenotype-specific risk factors for the disease (eg, with immune-related functions like the tumor necrosis factor receptor 1). Their importance for SA pathogenesis remains to be confirmed, and they should be investigated in detail with an emphasis on rare variants. Subphenotype-specific risk factors might serve for prognosis of the clinical course of the disease.
Statistical gene mapping of traits in humans - hypertension as a complex trait: Is it amenable to genetic analysis?
Semin Nephrol, 2002
Increased Probability of Co-Occurrence of Two Rare Diseases in Consanguineous Families and Resolution of a Complex Phenotype by Next Generation Sequencing
PLOS ONE, 2016
Massively parallel sequencing of whole genomes and exomes has facilitated a direct assessment of causative genetic variation, now enabling the identification of genetic factors involved in rare diseases (RD) with Mendelian inheritance patterns on an almost routine basis. Here, we describe the illustrative case of a single consanguineous family where this strategy suffered from the difficulty of distinguishing between two etiologically distinct disorders, namely the co-occurrence of hereditary hypophosphatemic rickets (HRR) and congenital myopathies (CM), by their phenotypic manifestation alone. We used parametric linkage analysis, homozygosity mapping and whole-exome sequencing to identify mutations underlying HRR and CM. We also present an approximate approach for assessing the probability of co-occurrence of two unlinked recessive RD in a single family as a function of the degree of consanguinity and the frequency of the disease-causing alleles. Linkage analysis and homozygosity mapping yielded elusive results when assuming a single RD, but whole-exome sequencing helped to identify two mutations in two genes, namely SLC34A3 and SEPN1, that segregated independently in this family and that have previously been linked to two etiologically different diseases. We assess the increase in chance co-occurrence of rare diseases due to consanguinity, i.e. under circumstances that generally favor linkage mapping of recessive disease, and show that this probability can increase by several orders of magnitude. We conclude that such potential co-occurrence represents an underestimated risk when analyzing rare or undefined diseases in consanguineous families and should be given more consideration in the clinical and genetic evaluation.
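To give a feel for the size of this effect, here is a small worked sketch. It is not the approximation presented in the paper. It relies on the textbook result that an individual with inbreeding coefficient F is homozygous for an allele of frequency q with probability Fq + (1 - F)q², and it assumes that the two loci are unlinked and therefore behave independently. The function names, the allele frequency of 0.001 and F = 1/16 (offspring of first cousins) are illustrative choices only.

```python
def p_recessive(q: float, F: float) -> float:
    """Probability of being homozygous for a disease allele of frequency q,
    given inbreeding coefficient F (textbook formula, full penetrance assumed)."""
    return F * q + (1.0 - F) * q * q


def p_two_diseases(q1: float, q2: float, F: float) -> float:
    """Probability of carrying both recessive genotypes, assuming the two
    loci are unlinked so that the per-locus probabilities multiply."""
    return p_recessive(q1, F) * p_recessive(q2, F)


q = 0.001          # illustrative disease-allele frequency at both loci
F_cousins = 1 / 16  # inbreeding coefficient for offspring of first cousins

outbred = p_two_diseases(q, q, 0.0)
inbred = p_two_diseases(q, q, F_cousins)
print(f"outbred: {outbred:.3e}")
print(f"first-cousin offspring: {inbred:.3e}")
print(f"fold increase: {inbred / outbred:.0f}")
```

Under these assumptions the joint probability rises by more than three orders of magnitude, in line with the abstract's qualitative statement. The rarer the alleles, the larger the relative increase.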
Multidimensional scaling (MDS) analysis of Y-STR genotypes
Benign infantile seizures and paroxysmal dyskinesia caused by an SCN8A mutation
Annals of neurology, Jan 17, 2015
Benign familial infantile seizures (BFIS), paroxysmal kinesigenic dyskinesia (PKD), and their combination - known as infantile convulsions and paroxysmal choreoathetosis (ICCA) - are related autosomal dominant diseases. PRRT2 (proline-rich transmembrane protein 2 gene) has been identified as the major gene in all three conditions, found to be mutated in 80-90% of familial and 30-35% of sporadic cases. We searched for the genetic defect in PRRT2-negative, unrelated families with BFIS or ICCA using whole exome or targeted gene panel sequencing, and performed a detailed clinico-neurophysiological workup. In three families with a total of 16 affected members, we identified the same, co-segregating heterozygous missense mutation (c.4447G>A; p.E1483K) in SCN8A, encoding a voltage-gated sodium channel. A founder effect was excluded by linkage analysis. All individuals but one had normal cognitive and motor milestones, neuroimaging and interictal neurological status. Fifteen affected mem...
Power and Sample Size Calculations for
Genetic models for multifactorial diseases
Medizinische Genetik
Nature genetics, Jan 19, 2015
Alcohol misuse is the leading cause of cirrhosis and the second most common indication for liver transplantation in the Western world. We performed a genome-wide association study for alcohol-related cirrhosis in individuals of European descent (712 cases and 1,426 controls) with subsequent validation in two independent European cohorts (1,148 cases and 922 controls). We identified variants in the MBOAT7 (P = 1.03 × 10⁻⁹) and TM6SF2 (P = 7.89 × 10⁻¹⁰) genes as new risk loci and confirmed rs738409 in PNPLA3 as an important risk locus for alcohol-related cirrhosis (P = 1.54 × 10⁻⁴⁸) at a genome-wide level of significance. These three loci have a role in lipid processing, suggesting that lipid turnover is important in the pathogenesis of alcohol-related cirrhosis.
CoNCoS: Copy number estimation in cancer with controlled support
Journal of Bioinformatics and Computational Biology, 2015
Somatic copy number (CN) alterations are major drivers of tumorigenesis and growth. Although next-generation sequencing (NGS) technologies enable a deep genomic analysis of cancers, the analysis of the data remains subject to biases and multiple sources of error, including varying local read coverage. The currently existing algorithms for NGS-based detection of CN aberrations do not incorporate information on the local coverage quality. We have developed a new algorithm, copy number estimation with controlled support (CoNCoS), that increases the accuracy of CN estimation in paired tumor/normal exome sequencing data sets by assessing and optimizing the support for a site-specific CN estimate. We show by simulations and in a benchmarking study against single nucleotide polymorphism (SNP) microarray data that our approach outperforms the commonly used methods CNAnorm and VarScan2. Our algorithm is suitable to increase the accuracy of somatic CN analysis by a support-optimized estimation approach.
Family-Based Benchmarking of Copy Number Variation Detection Software
PloS one, 2015
The analysis of structural variants, in particular of copy-number variations (CNVs), has proven valuable in unraveling the genetic basis of human diseases. Hence, a large number of algorithms have been developed for the detection of CNVs in SNP array signal intensity data. Using the European and African HapMap trio data, we undertook a comparative evaluation of six commonly used CNV detection software tools, namely Affymetrix Power Tools (APT), QuantiSNP, PennCNV, GLAD, R-gada and VEGA, and assessed their level of pair-wise prediction concordance. The tool-specific CNV prediction accuracy was assessed in silico by way of intra-familial validation. Software tools differed greatly in terms of the number and length of the CNVs predicted as well as the number of markers included in a CNV. All software tools predicted substantially more deletions than duplications. Intra-familial validation revealed consistently low levels of prediction accuracy as measured by the proportion of validated...
Mutations Causing Complex Disease May under Certain Circumstances Be Protective in an Epidemiological Sense
PloS one, 2015
Guided by the practice of classical epidemiology, research into the genetic basis of complex disease has usually taken for granted the dictum that causative mutations are invariably over-represented among clinically affected as compared to unaffected individuals. However, we show that this supposition is not true and that a mutation contributing to the etiology of a complex disease can, under certain circumstances, be depleted among patients. Populations with defined disease prevalence were repeatedly simulated under a Wright-Fisher model, assuming various types of population history and genotype-phenotype relationship. For each simulation, the resulting mutation-specific population frequencies and odds ratios (ORs) were evaluated. In addition, the relationship between mutation frequency and OR was studied using real data from the NIH GWAS catalogue of reported phenotype associations of single-nucleotide polymorphisms (SNPs). While rare diseases (prevalence <1%) were found to be ...
NOD1 gene polymorphisms in relation to aggressive periodontitis
Innate immunity, 2009
NOD proteins are part of innate immunity mechanisms. They play a role in epithelial barrier functions and inflammatory responses to bacteria. Single nucleotide polymorphisms (SNPs) in the NOD1 gene have proven to be associated with inflammatory bowel disease (IBD) and asthma. This study investigated SNPs in the NOD1 gene in relation to aggressive periodontitis (AgP), a multifactorial, inflammatory disease of the supporting tissues of the teeth. Five SNPs in the NOD1 gene (4 intronic and 1 exonic) were tested for association in a total of 415 AgP patients and 874 controls, both of Northern European ancestry. The frequencies of the rare SNP alleles ranged between 21% and 26% among cases and between 20% and 27% among controls, and were not statistically different between cases and controls. Two SNPs were in strong linkage disequilibrium (r² = 0.97 in cases and 0.94 in controls). The overall haplotype distributions did not differ between cases and controls. We observed 8 haplotypes with a frequency of >...
Human genomics, 2009
Genotype imputation for single nucleotide polymorphisms (SNPs) has been shown to be a powerful means to include genetic markers in exploratory genetic association studies without having to genotype them, and is becoming a standard procedure. A number of different software programs are available. In our experience, user-friendliness is often the deciding factor in the choice of software to solve a particular task. We therefore evaluated the usability of three publicly available imputation programs: BEAGLE, IMPUTE and MACH. We found all three programs to perform well with HapMap reference data, with little effort needed for data preparation and subsequent association analysis. Each of them has different strengths and weaknesses, however, and none is optimal for all situations.
Forensic Science International: Genetics, 2015
Short tandem repeat (STR) markers are widely and continuously used in forensic applications. However, past research has demonstrated substantial allelic association between STR markers on both autosomes and the X chromosome, leading to partially redundant information that these markers can provide. Here, we quantify the allelic association between Y-chromosomal STR markers that are part of established forensic panels, separately for three different continental groups. We further propose a sequential marker selection procedure that is based on Shannon's equivocation and that accounts for allelic association between STR markers, leading to a maximal gain in independent information. In application to three real-world data sets, we demonstrate the procedure's superior performance when compared to single-locus diversity selection strategies, resulting in the optimal marker set for a given data set in the majority of marker subsets. Noting the inferior performance of the established Y-STR marker panels in a retrospective investigation, we suggest that future forensic marker selection should be guided, besides by other technical selection criteria, by an equivocation-based approach to obtain maximally discriminatory marker sets at minimal cost.
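The idea of selecting markers sequentially by the information they add beyond the markers already chosen can be sketched as follows. This is an assumption-laden illustration, not the published procedure: it reads "equivocation" as the conditional Shannon entropy H(X | S) = H(S, X) - H(S), estimates entropies from raw haplotype counts without any small-sample correction, and uses hypothetical names (`joint_entropy`, `greedy_select`) and toy data.

```python
from collections import Counter
from math import log2
from typing import List, Sequence, Tuple


def joint_entropy(haps: Sequence[Tuple[int, ...]], cols: List[int]) -> float:
    """Empirical Shannon entropy (bits) of the haplotypes restricted to cols."""
    n = len(haps)
    counts = Counter(tuple(h[c] for c in cols) for h in haps)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def greedy_select(haps: Sequence[Tuple[int, ...]], k: int) -> List[int]:
    """Pick k marker indices, each time adding the marker with the largest
    conditional entropy given the markers selected so far."""
    selected: List[int] = []
    remaining = list(range(len(haps[0])))
    for _ in range(k):
        base = joint_entropy(haps, selected)
        best = max(
            remaining,
            key=lambda m: joint_entropy(haps, selected + [m]) - base,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy haplotypes (repeat counts at three hypothetical markers).
# Markers 0 and 1 are perfectly associated; marker 2 is less diverse on
# its own but independent of the other two.
haps = [
    (14, 11, 21), (14, 11, 22),
    (15, 12, 21), (15, 12, 22),
    (16, 13, 21), (16, 13, 22),
]
print(greedy_select(haps, 2))  # [0, 2]
```

A selection based on single-locus diversity alone would rank markers 0 and 1 highest, although marker 1 adds nothing once marker 0 is typed. The conditional criterion picks marker 2 as the second marker instead.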
Meta-Analysis of Genome-Wide Association Studies and Network Analysis-Based Integration with Gene Expression Data Identify New Suggestive Loci and Unravel a Wnt-Centric Network Associated with Dupuytren's Disease
PloS one, 2016
Dupuytren´s disease, a fibromatosis of the connective tissue in the palm, is a common complex dis... more Dupuytren´s disease, a fibromatosis of the connective tissue in the palm, is a common complex disease with a strong genetic component. Up to date nine genetic loci have been found to be associated with the disease. Six of these loci contain genes that code for Wnt signalling proteins. In spite of this striking first insight into the genetic factors in Dupuytren´s disease, much of the inherited risk in Dupuytren´s disease still needs to be discovered. The already identified loci jointly explain ~1% of the heritability in this disease. To further elucidate the genetic basis of Dupuytren´s disease, we performed a genome-wide meta-analysis combining three genome-wide association study (GWAS) data sets, comprising 1,580 cases and 4,480 controls. We corroborated all nine previously identified loci, six of these with genome-wide significance (p-value < 5x10-8). In addition, we identified 14 new suggestive loci (p-value < 10-5). Intriguingly, several of these new loci contain genes as...
The minor allele of the PPARγ2 Prol2Ala polymorphism is associated with lower postprandial TAG and insulin levels in non-obese healthy men
British Journal of Nutrition, 2007
Human Molecular Genetics, Aug 1, 2010
The availability of high-density panels of genetic polymorphisms has led to the discovery of exte... more The availability of high-density panels of genetic polymorphisms has led to the discovery of extended regions of apparent autozygosity in the human genome. At the genotype level, these regions present as sizeable stretches, or 'runs', of homozygosity (ROH). Here, we investigated both the genomic and the geographic distribution of ROHs in a large European sample of individuals originating from 23 subpopulations. The genomic ROH distribution was found to be characterized by a pattern of highly significant non-uniformity that was virtually identical in all subpopulations studied. Some 77 chromosomal regions contained ROHs at considerable frequency, thereby forming 'ROH islands' that were not explicable by high linkage disequilibrium alone. At the geographic level, the number and cumulative length of ROHs followed a prominent South to North gradient in agreement with expectations from European population history. The individual ROH length, in contrast, showed only minor and unsystematic geographic variation. While our findings are thus consistent with a larger effective population size in Southern than in Northern Europe, combined with a higher historic population density and mobility, they also indicate that the patterns of meiotic recombination in humans must have been very similar throughout the continent. Extending previous reports of a strong correlation between geography and identity-by-state, our data show that the genomic identity-by-descent patterns of Europeans are also clinal. As a consequence, the planning, design and interpretation of ROH-based genetic studies must take sample origin into account in order for such studies to be sensible and valid.
Journal of Molecular Medicine Jmm, 2008
C-C chemokine receptors have been suggested to play an important role in sarcoidosis pathogenesis... more C-C chemokine receptors have been suggested to play an important role in sarcoidosis pathogenesis. Previous investigation of the C-C chemokine receptor 5 (CCR5) gene revealed the association of the HHC haplotype with "persistent lung involvement" in two European sarcoidosis populations. Based on this finding, we investigated a possible association of the HHC haplotype and its marker alleles in an extended German sarcoidosis sample that comprised 995 German sarcoidosis families including individuals with the chronic and acute form of the disease, further refined to patients with and without Löfgren's syndrome. We genotyped this sample and 538 healthy control subjects for 8 single nucleotide polymorphisms (SNPs) that define the HHC haplotype in the CCR5 genomic region. Analysis of 3 sarcoidosis phenotypes (chronic, acute and Löfgren's syndrome) revealed that the HHC haplotype was not associated with chronic sarcoidosis although a substantial overlap can be assumed between the chronic form examined in our study and "persistent parenchymal lung involvement", the phenotype for which an association was previously established. However, 2 marker alleles in the putative CCR5 promoter, which are part of the HHC haplotype, are associated with Löfgren's syndrome. Strikingly, the association is restricted to females. This finding is consistent with recently described sex-specific manifestations of Löfgren's syndrome and with previous functional studies suggesting an estrogen-dependent CCR5 expression. The female-specific association of SNPs in the putative CCR5 promoter region with Löfgren's syndrome raises the possibility that the dysregulated, sexspecific modification of CCR5 expression could contribute to the increased risk of women to develop the disease.
Human Genetics, 2009
Genome-wide association studies have contributed significantly to the genetic dissection of compl... more Genome-wide association studies have contributed significantly to the genetic dissection of complex diseases. In order to increase the power of existing marker sets even further, methods have been proposed to predict individual genotypes at un-typed loci from other marker sets by imputation, usually employing HapMap data as a reference. Although various imputation algorithms have been used in practice already, a comprehensive evaluation and comparison of these approaches, using genome-wide SNP data from one and the same population is still lacking. We therefore investigated four publicly available programs for genotype imputation (BEAGLE, IMPUTE, MACH, and PLINK) using data from 449 German individuals genotyped in our laboratory for three genome-wide SNP sets [Affymetrix 5.0 (500 k), Affymetrix 6.0 (1,000 k), and Illumina 550 k]. We observed that HapMap-based imputation in a northern European population is powerful and reliable, even in highly variable genomic regions such as the extended MHC on chromosome 6p21. However, while genotype predictions were found to be highly accurate with all four programs, the number of SNPs for which imputation was actually carried out ('imputation efficacy') varied substantially. BEAGLE, IMPUTE, and MACH yielded nearly identical trade-offs between imputation accuracy and efficacy whereas PLINK performed consistently poorer. We nevertheless recommend either MACH or BEAGLE for practical use because these two programs are more userfriendly and generally require less memory than IMPUTE.
Spatial autocorrelation analysis of Y-STR genotypes
A genome-wide linkage analysis in 181 German sarcoidosis families using clustered bi-allelic markers
Chest, 2010
Sarcoidosis (SA) is a systemic granulomatous inflammatory disorder with complex etiology and stro... more Sarcoidosis (SA) is a systemic granulomatous inflammatory disorder with complex etiology and strong clustering in families. Genome-wide association studies have been successful in the identification of common risk variants for the disease. To reveal susceptibility variants with low frequencies but strong effects, we performed a genome-wide linkage scan in a large sample of SA families. We genotyped 528 members of 181 German SA families for 3,882 single nucleotide polymorphism assays from the SNPlex System Human Linkage Mapping Set 4K. Nonparametric linkage analysis revealed one region of suggestive linkage on chromosome 12p13.31 at 20 cM (logarithm of odds [LOD] = 2.53; local P value = .0003) and another linkage peak of nearly suggestive linkage on 9q33.1 at 134 cM (LOD = 2.12; local P value = .0009). The latter has been reported to show suggestive evidence for linkage in a sample of 229 African American SA families previously. Analysis of acute and chronically affected families revealed a subphenotype-specific linkage pattern and an additional, nearly suggestive linkage peak on chromosome 16p13.11 at 38 cM (LOD = 2.09; local P value = .001), which was confined to acute SA. Our results propose that the respective regions might harbor yet-unidentified, possibly subphenotype-specific risk factors for the disease (eg, with immune-related functions like the tumor necrosis factor receptor 1). They should be proved to be important for SA pathogenesis and investigated in detail with an emphasis on rare variants. Subphenotype-specific risk factors might serve for prognosis of the clinical course of the disease.
Statistical gene mapping of traits in humans[mdash ]hypertension as a complex trait: Is it amenable to genetic analysis?
Semin Nephrol, 2002
Increased Probability of Co-Occurrence of Two Rare Diseases in Consanguineous Families and Resolution of a Complex Phenotype by Next Generation Sequencing
PLOS ONE, 2016
Massively parallel sequencing of whole genomes and exomes has facilitated a direct assessment of ... more Massively parallel sequencing of whole genomes and exomes has facilitated a direct assessment of causative genetic variation, now enabling the identification of genetic factors involved in rare diseases (RD) with Mendelian inheritance patterns on an almost routine basis. Here, we describe the illustrative case of a single consanguineous family where this strategy suffered from the difficulty to distinguish between two etiologically distinct disorders, namely the co-occurrence of hereditary hypophosphatemic rickets (HRR) and congenital myopathies (CM), by their phenotypic manifestation alone. We used parametric linkage analysis, homozygosity mapping and whole exome-sequencing to identify mutations underlying HRR and CM. We also present an approximate approach for assessing the probability of co-occurrence of two unlinked recessive RD in a single family as a function of the degree of consanguinity and the frequency of the disease-causing alleles. Linkage analysis and homozygosity mapping yielded elusive results when assuming a single RD, but whole-exome sequencing helped to identify two mutations in two genes, namely SLC34A3 and SEPN1, that segregated independently in this family and that have previously been linked to two etiologically different diseases. We assess the increase in chance co-occurrence of rare diseases due to consanguinity, i.e. under circumstances that generally favor linkage mapping of recessive disease, and show that this probability can increase by several orders of magnitudes. We conclude that such potential co-occurrence represents an underestimated risk when analyzing rare or undefined diseases in consanguineous families and should be given more consideration in the clinical and genetic evaluation.
Multidimensional scaling (MDS) analysis of Y-STR genotypes
Benign infantile seizures and paroxysmal dyskinesia caused by an SCN8A mutation
Annals of neurology, Jan 17, 2015
Benign familial infantile seizures (BFIS), paroxysmal kinesigenic dyskinesia (PKD), and their com... more Benign familial infantile seizures (BFIS), paroxysmal kinesigenic dyskinesia (PKD), and their combination - known as infantile convulsions and paroxysmal choreoathetosis (ICCA) - are related autosomal dominant diseases. PRRT2 (proline-rich transmembrane protein 2 gene) has been identified as the major gene in all three conditions, found to be mutated in 80-90% of familial and 30-35% of sporadic cases. We searched for the genetic defect in PRRT2-negative, unrelated families with BFIS or ICCA using whole exome or targeted gene panel sequencing, and performed a detailed clinico-neurophysiological workup. In three families with a total of 16 affected members, we identified the same, co-segregating heterozygous missense mutation (c.4447G>A; p.E1483K) in SCN8A, encoding a voltage-gated sodium channel. A founder effect was excluded by linkage analysis. All individuals but one had normal cognitive and motor milestones, neuroimaging and interictal neurological status. Fifteen affected mem...
Power and Sample Size Calculations for
Genetic models for multifactorial diseases
Medizinische Genetik
Nature genetics, Jan 19, 2015
Alcohol misuse is the leading cause of cirrhosis and the second most common indication for liver transplantation in the Western world. We performed a genome-wide association study for alcohol-related cirrhosis in individuals of European descent (712 cases and 1,426 controls) with subsequent validation in two independent European cohorts (1,148 cases and 922 controls). We identified variants in the MBOAT7 (P = 1.03 × 10^-9) and TM6SF2 (P = 7.89 × 10^-10) genes as new risk loci and confirmed rs738409 in PNPLA3 as an important risk locus for alcohol-related cirrhosis (P = 1.54 × 10^-48) at a genome-wide level of significance. These three loci have a role in lipid processing, suggesting that lipid turnover is important in the pathogenesis of alcohol-related cirrhosis.
CoNCoS: Copy number estimation in cancer with controlled support
Journal of Bioinformatics and Computational Biology, 2015
Somatic copy number (CN) alterations are major drivers of tumorigenesis and growth. Although next-generation sequencing (NGS) technologies enable a deep genomic analysis of cancers, the analysis of the data remains subject to biases and multiple sources of error, including varying local read coverage. The currently existing algorithms for NGS-based detection of CN aberrations do not incorporate information on the local coverage quality. We have developed a new algorithm, copy number estimation with controlled support (CoNCoS), that increases the accuracy of CN estimation in paired tumor/normal exome sequencing data sets by assessing and optimizing the support for a site-specific CN estimate. We show by simulations and in a benchmarking study against single nucleotide polymorphism (SNP) microarray data that our approach outperforms the commonly used methods CNAnorm and VarScan2. Our algorithm is suitable to increase the accuracy of somatic CN analysis by a support-optimized estimation approach.
Family-Based Benchmarking of Copy Number Variation Detection Software
PloS one, 2015
The analysis of structural variants, in particular of copy-number variations (CNVs), has proven valuable in unraveling the genetic basis of human diseases. Hence, a large number of algorithms have been developed for the detection of CNVs in SNP array signal intensity data. Using the European and African HapMap trio data, we undertook a comparative evaluation of six commonly used CNV detection software tools, namely Affymetrix Power Tools (APT), QuantiSNP, PennCNV, GLAD, R-gada and VEGA, and assessed their level of pair-wise prediction concordance. The tool-specific CNV prediction accuracy was assessed in silico by way of intra-familial validation. Software tools differed greatly in terms of the number and length of the CNVs predicted as well as the number of markers included in a CNV. All software tools predicted substantially more deletions than duplications. Intra-familial validation revealed consistently low levels of prediction accuracy as measured by the proportion of validated...
Mutations Causing Complex Disease May under Certain Circumstances Be Protective in an Epidemiological Sense
PloS one, 2015
Guided by the practice of classical epidemiology, research into the genetic basis of complex disease has usually taken for granted the dictum that causative mutations are invariably over-represented among clinically affected as compared to unaffected individuals. However, we show that this supposition is not true and that a mutation contributing to the etiology of a complex disease can, under certain circumstances, be depleted among patients. Populations with defined disease prevalence were repeatedly simulated under a Wright-Fisher model, assuming various types of population history and genotype-phenotype relationship. For each simulation, the resulting mutation-specific population frequencies and odds ratios (ORs) were evaluated. In addition, the relationship between mutation frequency and OR was studied using real data from the NIH GWAS catalogue of reported phenotype associations of single-nucleotide polymorphisms (SNPs). While rare diseases (prevalence <1%) were found to be ...
NOD1 gene polymorphisms in relation to aggressive periodontitis
Innate immunity, 2009
NOD proteins are part of innate immunity mechanisms. They play a role in epithelial barrier functions and inflammatory responses to bacteria. Single nucleotide polymorphisms (SNPs) in the NOD1 gene have proven to be associated with inflammatory bowel disease (IBD) and asthma. We investigated SNPs in the NOD1 gene in relation to aggressive periodontitis (AgP), a multifactorial, inflammatory disease of the supporting tissues of the teeth. Five SNPs in the NOD1 gene (4 intronic and 1 exonic) were tested for association in a total of 415 AgP patients and 874 controls, both of Northern European ancestry. The frequencies of the rare SNP alleles ranged between 21% and 26% among cases and between 20% and 27% among controls, and were not statistically different between cases and controls. Two SNPs were in strong linkage disequilibrium (r^2 = 0.97 in cases and 0.94 in controls). The overall haplotype distributions did not differ between cases and controls. We observed 8 haplotypes with a frequency of >...
Human genomics, 2009
Genotype imputation for single nucleotide polymorphisms (SNPs) has been shown to be a powerful means to include genetic markers in exploratory genetic association studies without having to genotype them, and is becoming a standard procedure. A number of different software programs are available. In our experience, user-friendliness is often the deciding factor in the choice of software to solve a particular task. We therefore evaluated the usability of three publicly available imputation programs: BEAGLE, IMPUTE and MACH. We found all three programs to perform well with HapMap reference data, with little effort needed for data preparation and subsequent association analysis. Each of them has different strengths and weaknesses, however, and none is optimal for all situations.
Forensic Science International: Genetics, 2015
Short tandem repeat (STR) markers are widely and continuously used in forensic applications. However, past research has demonstrated substantial allelic association between STR markers on both autosomes and the X chromosome, so that these markers provide partially redundant information. Here, we quantify the allelic association between Y-chromosomal STR markers that are part of established forensic panels, separately for three different continental groups. We further propose a sequential marker selection procedure that is based on Shannon's equivocation and that accounts for allelic association between STR markers, leading to a maximal gain in independent information. In application to three real-world data sets, we demonstrate the procedure's superior performance when compared to single-locus diversity selection strategies, resulting in the optimal marker set for a given data set in the majority of marker subsets. Noting the inferior performance of the established Y-STR marker panels in a retrospective investigation, we suggest that future forensic marker selection should be guided, besides by other technical selection criteria, by an equivocation-based approach to obtain maximally discriminatory marker sets at minimal cost.
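The idea of sequential, association-aware marker selection can be sketched as a greedy search. This is one plausible reading of an equivocation-based criterion, not the published procedure: at each step the sketch adds the marker with the largest conditional entropy given the markers already chosen, which is the same as the largest gain in joint haplotype entropy. The function names and the toy haplotypes are hypothetical.

```python
from collections import Counter
from math import log2


def joint_entropy(haplotypes, markers):
    """Shannon entropy (bits) of the haplotypes restricted to the given marker indices."""
    counts = Counter(tuple(h[m] for m in markers) for h in haplotypes)
    n = len(haplotypes)
    return -sum(c / n * log2(c / n) for c in counts.values())


def greedy_select(haplotypes, k):
    """Greedily pick up to k markers, each maximizing the gain in joint entropy.

    The gain H(selected + [m]) - H(selected) equals the conditional entropy
    of marker m given the selected markers, so a marker that is fully
    predicted by earlier picks contributes nothing. Ties go to the lowest index.
    """
    n_markers = len(haplotypes[0])
    selected = []
    current = 0.0  # entropy of the empty marker set
    for _ in range(min(k, n_markers)):
        best, best_gain = None, -1.0
        for m in range(n_markers):
            if m in selected:
                continue
            gain = joint_entropy(haplotypes, selected + [m]) - current
            if gain > best_gain:
                best, best_gain = m, gain
        selected.append(best)
        current += best_gain
    return selected


if __name__ == "__main__":
    # Toy data (made up): markers 0 and 1 are perfectly associated,
    # marker 2 varies independently of both.
    toy = [(10, 20, 5), (10, 20, 6), (11, 21, 5), (11, 21, 6)]
    print(greedy_select(toy, 2))
```

In the toy data, all three markers have the same single-locus diversity, so a diversity-only ranking cannot tell them apart. The greedy search skips marker 1 in the second step because it adds no information beyond marker 0.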