Meghna Singh - Academia.edu
Papers by Meghna Singh
BMC Research Notes
Fast, specific identification and surveillance of pathogens is the cornerstone of any outbreak response system, especially in the case of emerging infectious diseases and viral epidemics. This process is generally tedious and time-consuming, thus making it ineffective in traditional settings. The added complexity in these situations is the non-availability of pure isolates of pathogens as they are present as mixed genomes or hologenomes. Next-generation sequencing approaches offer an attractive solution in this scenario as they provide adequate depth of sequencing at fast and affordable costs, apart from making it possible to decipher complex interactions between genomes at a scale that was not possible before. The widespread application of next-generation sequencing in this field has been limited by the non-availability of an efficient computational pipeline to systematically analyze data to delineate pathogen genomes from mixed population of genomes or hologenomes. We applied next-g...
Exome sequencing reveals a novel mutation, p.L325H, in the KRT5 gene associated with autosomal dominant Epidermolysis Bullosa Simplex Koebner type in a large family from western India
We report a large, non-consanguineous family comprising five generations of individuals residing in Gujarat, India affected with localized Epidermolysis Bullosa Simplex (EBS) Koebner type. We analyzed 14 individuals including 9 affected individuals from this family. Exome sequencing in two cases suggested a novel non-synonymous variation, p.L325H, in the KRT5 gene. The present analysis also reports the first causative mutation of EBS Koebner type from India. Human Genome Variation (2014) 1, 14007; doi:10.1038/hgv.2014.7; published online 4 September 2014.
Epidermolysis bullosa simplex (EBS) is a rare and polymorphic skin disorder characterized by skin fragility, which causes blisters and further erosion owing to mechanical pressure or friction. The pathophysiology of this disorder has been traced...
PLoS ONE, 2015
The tubercle complex consists of closely related Mycobacterium species which appear to be variants of a single species. Comparative genome analysis of different strains could provide useful clues and insights into the genetic diversity of the species. We integrated genome assemblies of 96 strains from the Mycobacterium tuberculosis complex (MTBC), which included 8 Indian clinical isolates sequenced and assembled in this study, to understand its pangenome architecture. We predicted genes for all 96 strains and clustered their respective CDSs into homologous gene clusters (HGCs) to reveal the hard-core, soft-core and accessory genome components of MTBC. The hard-core (HGCs shared amongst 100% of the strains) comprised 2,066 gene clusters, whereas the soft-core (HGCs shared amongst at least 95% of the strains) comprised 3,374 gene clusters. The change in the core and accessory genome components when observed as a function of their size revealed that MTBC has an open pangenome. We...
These authors contributed equally to this work.
Zebrafish, 2013
Zebrafish (Danio rerio) is a popular vertebrate model organism largely deployed using outbred laboratory animals. The nonisogenic nature of the zebrafish as a model system offers the opportunity to understand natural variations and their effect in modulating phenotype. In an effort to better characterize the range of natural variation in this model system and to complement the zebrafish reference genome project, the whole genome sequence of a wild zebrafish at 39-fold genome coverage was determined. Comparative analysis with the zebrafish reference genome revealed approximately 5.2 million single nucleotide variations and over 1.6 million insertion-deletion variations. This dataset thus represents a new catalog of genetic variations in the zebrafish genome. Further analysis revealed selective enrichment for variations in genes involved in immune function and response to the environment, suggesting genome-level adaptations to environmental niches. We also show that human disease gene orthologs in the sequenced wild zebrafish genome show a lower ratio of nonsynonymous to synonymous single nucleotide variations.
PLoS ONE, 2012
DNA methylation is crucial for gene regulation and maintenance of genomic stability. Rat has been a key model system in understanding mammalian systemic physiology; however, a detailed rat methylome remains uncharacterized to date. Here, we present the first high-resolution methylome of rat liver generated using the Methylated DNA immunoprecipitation and high-throughput sequencing (MeDIP-Seq) approach. We observed that within the DNA/RNA repeat elements, simple repeats harbor the highest degree of methylation. Promoter hypomethylation and exon hypermethylation were common features in both RefSeq genes and expressed genes (as evaluated by a proteomic approach). We also found that although CpG islands were generally hypomethylated, about 6% of them were methylated and a large proportion (37%) of methylated islands fell within the exons. Notably, we observed significant differences in methylation of terminal exons (UTRs), with methylation being more pronounced in coding/partially coding exons than in non-coding exons. Further, events like alternate exon splicing (cassette exon) and intron retention were marked by DNA methylation, and these regions are retained in the final transcript. Thus, we suggest that DNA methylation could play a crucial role in marking coding regions, thereby regulating alternative splicing. Apart from generating the first high-resolution methylome map of rat liver tissue, the present study provides several critical insights into methylome organization and extends our understanding of the interplay between epigenome, gene expression and genome stability.
Human Mutation, 2012
Whole genome sequencing of personal genomes has revealed a large repertoire of genomic variations and has provided a rich template for identification of common and rare variants in genomes in addition to understanding the genetic basis of diseases. The widespread application of personal genome sequencing in clinical settings for predictive and preventive medicine has been limited due to the lack of comprehensive computational analysis pipelines. We have used next-generation sequencing technology to sequence the whole genome of a self-declared healthy male of Indian origin. We have generated around 28X of the reference human genome with over 99% coverage. Analysis revealed over 3 million single nucleotide variations and about 490,000 small insertion-deletion events including several novel variants. Using this dataset as a template, we designed a comprehensive computational analysis pipeline for the systematic analysis and annotation of functionally relevant variants in the genome. Th...
Database, 2014
These authors contributed equally to this work.
BMC Research Notes, 2012
Fast, specific identification and surveillance of pathogens is the cornerstone of any outbreak response system, especially in the case of emerging infectious diseases and viral epidemics. This process is generally tedious and time-consuming, thus making it ineffective in traditional settings. The added complexity in these situations is the non-availability of pure isolates of pathogens as they are present as mixed genomes or hologenomes. Next-generation sequencing approaches offer an attractive solution in this scenario as they provide adequate depth of sequencing at fast and affordable costs, apart from making it possible to decipher complex interactions between genomes at a scale that was not possible before. The widespread application of next-generation sequencing in this field has been limited by the non-availability of an efficient computational pipeline to systematically analyze data to delineate pathogen genomes from mixed population of genomes or hologenomes. We applied next-generation sequencing on a sample containing a mixed population of genomes from an epidemic with appropriate processing and enrichment. The data was analyzed using an extensive computational pipeline involving mapping to reference genome sets and de novo assembly. In-depth analysis of the data generated revealed the presence of sequences corresponding to Japanese encephalitis virus. The genome of the virus was also independently assembled de novo. The presence of the virus was, in addition, verified using standard molecular biology techniques. Our approach can accurately identify causative pathogens from cell culture hologenome samples containing a mixed population of genomes and in principle can be applied to patient hologenome samples without any background information.
This methodology could be widely applied to identify and isolate pathogen genomes and understand their genomic variability during outbreaks.
Genome Announcements, 2013
We describe the genome sequencing and analysis of a clinical isolate of Mycobacterium tuberculosis East African Indian (EAI) strain OSDD271 from India.