Liuyang Wang | Duke University

Papers by Liuyang Wang

A Bayesian Approach to Inferring Rates of Selfing and Locus-Specific Mutation

Genetics, Jan 14, 2015

We present a Bayesian method for characterizing the mating system of populations reproducing through a mixture of self-fertilization and random outcrossing. Our method uses patterns of genetic variation across the genome as a basis for inference about reproduction under pure hermaphroditism, gynodioecy, and a model developed to describe the self-fertilizing killifish Kryptolebias marmoratus. We extend the standard coalescence model to accommodate these mating systems, accounting explicitly for multilocus identity disequilibrium, inbreeding depression, and variation in fertility among mating types. We incorporate the Ewens Sampling Formula (ESF) under the infinite-alleles model of mutation to obtain a novel expression for the likelihood of mating system parameters. Our Markov chain Monte Carlo (MCMC) algorithm assigns locus-specific mutation rates, drawn from a common mutation rate distribution that is itself estimated from the data using a Dirichlet Process Prior model. Our sampler ...
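As a point of reference for the ESF mentioned in this abstract, the sketch below evaluates the standard Ewens Sampling Formula for a single locus under the infinite-alleles model. It is not the paper's extended likelihood, which additionally accounts for selfing; the function name `ewens_probability` and the parameter name `theta` (the scaled mutation rate) are assumptions chosen for illustration.

```python
from collections import Counter
from math import factorial, prod


def ewens_probability(allele_counts, theta):
    """Probability of an allelic configuration under the standard ESF.

    allele_counts: copies observed of each distinct allele, e.g. [3, 1, 1].
    theta: scaled mutation rate (hypothetical parameter name).
    """
    n = sum(allele_counts)
    # a[j] = number of distinct alleles that appear exactly j times
    a = Counter(allele_counts)
    # Rising factorial: theta * (theta + 1) * ... * (theta + n - 1)
    rising = prod(theta + i for i in range(n))
    # Product over j of (theta / j)^a_j / a_j!
    term = prod((theta / j) ** a_j / factorial(a_j) for j, a_j in a.items())
    return factorial(n) / rising * term


# The three possible configurations for a sample of size 3
for config in ([3], [2, 1], [1, 1, 1]):
    print(config, ewens_probability(config, theta=1.0))
```

Summing the probabilities over all configurations of a given sample size should give 1, which is a convenient sanity check on any implementation.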

Introductory Editorial–RNA: An Expanding View of Function and Evolution

Evolutionary Bioinformatics, 2016

CPAG: software for leveraging pleiotropy in GWAS to reveal similarity between human traits links plasma fatty acids and intestinal inflammation

Genome Biology, 2015

Meta-analyses of genome-wide association studies (GWAS) have demonstrated that the same genetic variants can be associated with multiple diseases and other complex traits. We present software called CPAG (Cross-Phenotype Analysis of GWAS) to look for similarities between 700 traits, build trees with informative clusters, and highlight underlying pathways. Clusters are consistent with pre-defined groups and literature-based validation but also reveal novel connections. We report similarity between plasma palmitoleic acid and Crohn's disease and find that specific fatty acids exacerbate enterocolitis in zebrafish. CPAG will become increasingly powerful as more genetic variants are uncovered, leading to a deeper understanding of complex traits. CPAG is freely available at www.sourceforge.net/projects/CPAG/.

Expression Divergence of Duplicate Genes in the Protein Kinase Superfamily in Pacific Oyster

Evolutionary Bioinformatics, 2015

Gene duplication has been proposed to serve as the engine of evolutionary innovation. It is well recognized that eukaryotic genomes contain a large number of duplicated genes that evolve new functions or expression patterns. However, in mollusks, the evolutionary mechanisms underlying the divergence and the functional maintenance of duplicate genes remain poorly understood. In the present study, we performed a comprehensive analysis of duplicate genes in the protein kinase superfamily using whole genome and transcriptome data for the Pacific oyster. A total of 64 duplicated gene pairs were identified based on a phylogenetic approach and the reciprocal best BLAST method. By analyzing gene expression from RNA-seq data from 69 different developmental and stimuli-induced conditions (nine tissues, 38 developmental stages, eight dry treatments, seven heat treatments, and seven salty treatments), we found that expression patterns were significantly correlated for a number of duplicate gene pairs, suggesting the conservation of regulatory mechanisms following divergence. Our analysis also identified a subset of duplicate gene pairs with very high expression divergence, indicating that these gene pairs may have been subjected to transcriptional subfunctionalization or neofunctionalization after the initial duplication events. Further analysis revealed a significant correlation between expression and sequence divergence (as revealed by synonymous or nonsynonymous substitution rates) under certain conditions. Taken together, these results provide evidence for duplicate gene sequence and expression divergence in the Pacific oyster, accompanying its adaptation to harsh environments. Our results provide new insights into the evolution of duplicate genes and their expression levels in the Pacific oyster.

De Novo Transcriptome Assembly and Development of Novel Microsatellite Markers for the Traditional Chinese Medicinal Herb, Veratrilla baillonii Franch (Gentianaceae)

Evolutionary Bioinformatics, 2015

Veratrilla baillonii Franch is an important Chinese medicinal herb for treating liver-related diseases, which has been over-collected in recent decades. However, effective conservation and related population genetic studies have been hindered by the lack of genome sequences and genetic markers for natural populations. We have conducted RNA-seq on V. baillonii. We performed de novo assembly of these data to characterize the V. baillonii transcriptome, resulting in 133,019 contigs with size ≥200 bp. These contigs were annotated using the NCBI nonredundant database and Gene Ontology (GO) terms. From these contigs, we developed novel microsatellite simple sequence repeat (SSR) markers, identifying a total of 40,885 SSRs. SSRs with repeat motifs of 1-4 bp (mono-, di-, tri-, and tetranucleotides) accounted for 99.8% of all SSRs, with mononucleotide repeats most common, followed by dinucleotide (16.2%) and trinucleotide repeats (14.7%). We selected 151 SSRs for experimental validation, of which 74 were confirmed by polymerase chain reaction. Fourteen SSRs were determined to be polymorphic by screening 40 individuals from six distant populations. The number of alleles per locus ranged from two to four, and the expected heterozygosity varied from 0.2637 to 0.8571, suggesting that these SSR markers are highly polymorphic and effective for further genetic analysis in natural populations. In addition, we explored the genetic structure of V. baillonii using five SSRs in four geographic populations and found that the identified genotypes were clustered into two phylogenetic clades: the Mekong River clade and Jinsha River clade. This result indicates that these two regions may harbor highly divergent genetic lineages and enriched genetic diversity. The de novo transcriptome sequences and new SSR markers discovered by this study provide an initial step for understanding the population genetics of V. baillonii, and a valuable resource for effective conservation management.
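The expected heterozygosity reported for each SSR locus is conventionally computed from allele frequencies as one minus the sum of squared frequencies. The sketch below shows that calculation with an optional small-sample correction of n/(n-1); whether the study applied this correction is not stated in the abstract, so the `unbiased` flag and the function name are assumptions, and the allele data are invented purely for illustration.

```python
from collections import Counter


def expected_heterozygosity(alleles, unbiased=True):
    """Expected heterozygosity at one locus.

    alleles: list of sampled gene copies, e.g. ["A", "A", "B", "C"].
    unbiased: apply the n / (n - 1) small-sample correction (assumed option).
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("at least one gene copy is required")
    freqs = [count / n for count in Counter(alleles).values()]
    # Probability that two copies drawn with replacement differ
    he = 1.0 - sum(p * p for p in freqs)
    if unbiased and n > 1:
        he *= n / (n - 1)
    return he


# Hypothetical sample: four alleles, two copies each (not data from the study)
sample = list("AABBCCDD")
print(expected_heterozygosity(sample, unbiased=False))
print(expected_heterozygosity(sample, unbiased=True))
```

Without the correction, a locus with at most four alleles cannot exceed 0.75, so values above that bound imply some form of sample-size adjustment.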

RNA-seq Reveals Complicated Transcriptomic Responses to Drought Stress in a Nonmodel Tropic Plant, Bombax ceiba L

Evolutionary Bioinformatics, 2015

High-throughput transcriptome sequencing provides an unbiased approach for understanding the genetic basis and gene functions in response to different conditions. Here we sequenced RNA-seq libraries derived from a Bombax ceiba L. system under a controlled experiment. As a known medicinal and ornamental plant, B. ceiba grows mainly in hot-dry monsoon rainforests in Southeast Asia and Australia. Due to the specific growth environment, it has evolved a unique system that enables a physiologic response to drought stress. To date, few studies have characterized the genome-wide features of drought endurance in B. ceiba. In this study, we first attempted to characterize and identify the most differentially expressed genes and associated functional pathways under drought treatment and normal conditions. Using RNA-seq technology, we generated the first transcriptome of B. ceiba and identified 59 differentially expressed genes with greater than 1,000-fold changes between the two conditions. The set of upregulated genes implicates interplay among various pathways: plant growth, ubiquitin-mediated proteolysis, polysaccharide hydrolysis, oxidative phosphorylation and photosynthesis, etc. In contrast, genes associated with stem growth, cell division, fruit ripening senescence, disease resistance, and proline synthesis are repressed. Notably, key genes with high RPKM levels under drought are AUX1, JAZ, and psbS, which are known to regulate the growth of plants, the resistance against abiotic stress, and the photosynthesis process. Furthermore, 16,656 microsatellite markers and 3,071 single-nucleotide polymorphisms (SNPs) were predicted by in silico methods. The identification and functional annotation of differentially expressed genes, microsatellites, and SNPs represent a major step forward and would serve as a valuable resource for understanding the complexity underlying drought endurance and adaptation in B. ceiba.

Comparative Analyses of Clinical and Environmental Populations of Cryptococcus neoformans in Botswana

Molecular Ecology, Jan 6, 2015

Cryptococcus neoformans var. grubii (Cng) is the most common cause of fungal meningitis and its prevalence is highest in sub-Saharan Africa. Patients become infected by inhaling airborne spores or desiccated yeast cells from the environment, where the fungus thrives in avian droppings, trees, and soil. To investigate the prevalence and population structure of Cng in southern Africa, we analyzed isolates from 77 environmental samples and 64 patients. We detected significant genetic diversity among isolates and strong evidence of geographic structure at the local level. High proportions of isolates with the rare MATa allele were observed in both clinical and environmental isolates; however, the mating type alleles were unevenly distributed among different subpopulations. Nearly equal proportions of the MATa and MATα mating types were observed among all clinical isolates and in one environmental subpopulation from the eastern part of Botswana. As previously reported, there was evidence...

Refugial isolation and range expansions drive the genetic structure of Oxyria sinensis (Polygonaceae) in the Himalaya-Hengduan Mountains

Scientific Reports, 2015

The formation of the Mekong-Salween Divide and climatic oscillations in the Pleistocene were the main drivers for the contemporary diversity and genetic structure of plants in the Himalaya-Hengduan Mountains (HHM). To identify the relative roles of the two historical events in shaping the population history of plants in the HHM, we investigated the phylogeographic pattern of Oxyria sinensis, a perennial plant endemic to the HHM. Sixteen chloroplast haplotypes were identified and were clustered into three phylogenetic clades. The age of the major clades was estimated to be in the Pleistocene, falling into several Pleistocene glacial stages and postdating the formation of the Mekong-Salween Divide. Range expansions occurred at least twice in the early and middle Pleistocene, but the spatial genetic distribution has rarely changed since the Last Glacial Maximum. Our results suggest that temporary mountain glaciers may act as barriers in promoting the lineage divergence in O. sinensis and that subseque...

Measures of linkage disequilibrium among neighbouring SNPs indicate asymmetries across the house mouse hybrid zone

Molecular Ecology, 2011

Theory predicts that naturally occurring hybrid zones between genetically distinct taxa can move over space and time as a result of selection and/or demographic processes, with certain types of hybrid zones being more or less likely to move. Determining whether a hybrid zone is stationary or moving has important implications for understanding evolutionary processes affecting interactions in hybrid populations. However, direct observations of hybrid zone movement are difficult to make unless the zone is moving rapidly. Here, evidence for movement in the house mouse Mus musculus domesticus × Mus musculus musculus hybrid zone is provided using measures of LD and haplotype structure among neighbouring SNP markers from across the genome. Local populations of mice across two transects in Germany and the Czech Republic were sampled, and a total of 1301 mice were genotyped at 1401 markers from the nuclear genome. Empirical measures of LD provide evidence for extinction and (re)colonization in single populations and, together with simulations, suggest hybrid zone movement because of either geography-dependent asymmetrical dispersal or selection favouring one subspecies over the other.
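The abstract does not say which LD statistics were used, so the following is only a generic illustration of one common pairwise measure, r², computed between two neighbouring biallelic SNPs. It assumes phased haplotypes with alleles coded 0/1, which is a simplification relative to the genotype data described above; the function name `ld_r_squared` is hypothetical.

```python
def ld_r_squared(haplotypes):
    """Pairwise LD (r^2) between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, alleles coded 0 or 1 at SNP A and SNP B.
    Assumes phase is known, which real genotype data may not provide.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


# Perfectly associated alleles versus independently assorting alleles
print(ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)]))
print(ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

Comparing such values between populations on either side of a hybrid zone is one way asymmetries of the kind described in the abstract could show up.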

Confirmation of natural hybrids between Gentiana straminea and G. siphonantha (Gentianaceae) based on molecular evidence

Frontiers of Biology in China, 2008

A few individuals with intermediate morphology always appeared in the sympatric distributions of Gentiana straminea and G. siphonantha. These intermediate individuals were hypothesized to be the hybrids of the two species after a careful evaluation of their ...

Linkage disequilibrium approaches for detecting hybrid zone movement

Evolution of the House Mouse, 2012

Genome-wide architecture of reproductive isolation in a naturally occurring hybrid zone between Mus musculus musculus and M. m. domesticus

Molecular Ecology, 2012

Studies of a hybrid zone between two house mouse subspecies (Mus musculus musculus and M. m. domesticus) along with studies using laboratory crosses reveal a large role for the X chromosome and multiple autosomal regions in reproductive isolation as a consequence of disrupted epistasis in hybrids. One limitation of previous work has been that most of the identified genomic regions have been large. The goal here is to detect and characterize precise genomic regions underlying reproductive isolation. We surveyed 1401 markers evenly spaced across the genome in 679 mice collected from two different transects. Comparisons between transects provide a means for identifying common patterns that likely reflect intrinsic incompatibilities. We used a genomic cline approach to identify patterns that correspond to epistasis. From both transects, we identified contiguous regions on the X chromosome in which markers were inferred to be involved in epistatic interactions. We then searched for autosomal regions showing the same patterns and found they constitute about 5% of autosomal markers. We discovered substantial overlap between these candidate regions underlying reproductive isolation and QTL for hybrid sterility identified in laboratory crosses. Analysis of gene content in these regions suggests a key role for several mechanisms, including the regulation of transcription, sexual conflict and sexual selection operating at both the postmating prezygotic and postzygotic stages of reproductive isolation. Taken together, these results indicate that speciation in two recently diverged (c. 0.5 Ma) house mouse subspecies is complex, involving many genes dispersed throughout the genome and associated with distinct functions.

History and evolution of alpine plants endemic to the Qinghai-Tibetan Plateau: Aconitum gymnandrum (Ranunculaceae)

Molecular Ecology, 2009

Here, we report a survey of chloroplast DNA (cpDNA) and nuclear ribosomal internal transcribed spacer (ITS) DNA variation aimed at exploring the phylogeographical history of the QTP alpine endemic Aconitum gymnandrum. We sequenced three cpDNA fragments (rpl20-rps12 intergenic spacer, the trnV intron and psbA-trnH spacer) and also the nuclear (ITS) region in 245 individuals from 23 populations sampled throughout the species' range. Two distinct lineages, with eastern and western geographical distributions respectively, were identified from a phylogenetic analysis of ITS sequence variation. Based on a fast substitution rate, these were estimated to have diverged from each other in the early Pleistocene approximately 1.45 Ma. The analysis of cpDNA variation identified nine chlorotypes that clustered into two major clades that were broadly congruent in geographical distribution with the two ITS lineages. The east-west split of cpDNA divergence was supported by an AMOVA which partitioned approximately half of the total variance between these two groups of populations. Analysis of the spatial distribution of chlorotypes showed that each clade was subdivided into two groups of populations such that a total of four population groups existed in the species. It is suggested that these different groups derive from four independent glacial refugia that existed during the Last Glacial Maximum (LGM), and that three of these refugia were located at high altitude on the QTP platform itself at that time. Coalescent simulation of chlorotype genealogies supported both an early Pleistocene origin of the two main cpDNA clades and also the 'four-refugia' hypothesis during the LGM. Two previous phylogeographical studies of QTP alpine plants indicated that such plants retreated to refugia at the eastern/south-eastern plateau edge during the LGM and/or previous glacial maxima. However, the results for A. gymnandrum suggest that at least some of these cold-tolerant species may have also survived centrally on the QTP platform throughout the Quaternary.

The uncharacterized gene 1700093K21Rik and flanking regions are correlated with reproductive isolation in the house mouse, Mus musculus

Mammalian Genome, 2014

Reproductive barriers exist between the house mouse subspecies, Mus musculus musculus and M. m. domesticus, members of the Mus musculus species complex, primarily as a result of hybrid male infertility, and a hybrid zone exists where their ranges intersect in Europe. Using single nucleotide polymorphisms (SNPs) diagnostic for the two taxa, the extent of introgression across the genome was previously compared in these hybrid populations. Sixty-nine of 1316 autosomal SNPs exhibited reduced introgression in two hybrid zone transects suggesting maladaptive interactions among certain loci. One of these markers is within a region on chromosome 11 that, in other studies, has been associated with hybrid male sterility of these subspecies. We assessed sequence variation in a 20 Mb region on chromosome 11 flanking this marker, and observed its inclusion within a roughly 150 kb stretch of DNA showing elevated sequence differentiation between the two subspecies. Four genes are associated with this genomic subregion, with two entirely encompassed. One of the two genes, the uncharacterized 1700093K21Rik gene, displays distinguishing features consistent with a potential role in reproductive isolation between these subspecies. Along with its expression specifically within spermatogenic cells, we present various sequence analyses that demonstrate a high rate of molecular evolution of this gene, as well as identify a subspecies amino acid variant resulting in a structural difference. Taken together, the data suggest a role for this gene in reproductive isolation.

Repeated Range Expansion and Glacial Endurance of Potentilla glabra (Rosaceae) in the Qinghai-Tibetan Plateau

Journal of Integrative Plant Biology, 2009

To date, little is known about how alpine species occurring in the Qinghai-Tibetan Plateau (QTP) responded to past climatic oscillations. Here, by using variations of the chloroplast trnT-L, we examined the genetic distribution pattern of 101 individuals of Potentilla glabra, comprising both the interior QTP and the plateau edge. Phylogenetic and network analyses of 31 recovered haplotypes identified three tentative clades (A, B and C). Analysis of molecular variance (AMOVA) revealed that most of the genetic variability was found within populations (0.693), while differentiation between populations was also distinct (F_ST = 0.307). Two independent range expansions within clades A and B occurring at approximately 316 and 201 thousand years ago (kya) were recovered from the hierarchical mismatch analysis, and these two expansions were also confirmed by Fu's F_S values and 'g' tests. However, distant distributions of clade C and private haplotypes from clades A and B suggest that they had survived the Last Glacial Maximum (LGM) and previous glaciations in situ since their origins. Our findings, based on the limited samples available, support the view that multiple refugia of a few cold-enduring species had been maintained in the QTP platform during the LGM and/or previous glacial stages.

Allopatric divergence and phylogeographic structure of the plateau zokor (Eospalax baileyi), a fossorial rodent endemic to the Qinghai-Tibetan Plateau

Journal of Biogeography, 2010

Phylogeographic analyses suggest that a deciduous species (Ostryopsis davidiana Decne., Betulaceae) survived in northern China during the Last Glacial Maximum

Journal of Biogeography, 2009

Genetic variation in the endangered Anisodus tanguticus (Solanaceae), an alpine perennial endemic to the Qinghai-Tibetan Plateau

Genetica, 2007

We used random amplified polymorphic DNA markers (RAPDs) to assess genetic variation between and within populations of Anisodus tanguticus (Solanaceae), an endangered perennial endemic to the Qinghai-Tibetan Plateau with important medicinal value. We recorded a total of 92 amplified bands, using 12 RAPD primers, 76 of which (P = 82.61%) were polymorphic, and calculated values of H_t and H_sp of 0.3015 and 0.4459, respectively, suggesting a remarkably high rate of genetic variation at the species level. The average within-population diversity also appeared to be high, with P, H_e and H_pop values of 55.11%, 0.1948 and 0.2918, respectively. Analyses of molecular variance (AMOVA) showed that within- and among-population genetic variation accounted for 67.02% and 32.98% of the total genetic variation, respectively. In addition, Nei's coefficient of differentiation (G_ST) was found to be high (0.35), confirming the relatively high level of genetic differentiation among the populations. These differentiation coefficients are higher than mean corresponding coefficients for outbreeding species, but lower than reported coefficients for some rare species from this region. The genetic structure of A. tanguticus has probably been shaped by its breeding attributes, biogeographic history and human impact due to collection for medicinal purposes. The observed genetic variations suggest that as many populations as possible should be considered in any planned in situ or ex situ conservation programs for this species.
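The reported G_ST can be checked against the diversity values in the abstract using Nei's definition, G_ST = (H_T - H_S) / H_T. The sketch below assumes that the value 0.3015 is the total gene diversity H_T and that 0.1948 is the mean within-population diversity H_S; the abstract does not state this mapping explicitly, and the function name `nei_gst` is hypothetical.

```python
def nei_gst(h_total, h_within):
    """Nei's coefficient of gene differentiation, (H_T - H_S) / H_T."""
    if h_total <= 0:
        raise ValueError("total diversity must be positive")
    return (h_total - h_within) / h_total


# Values taken from the abstract, under the assumed mapping described above
gst = nei_gst(h_total=0.3015, h_within=0.1948)
print(round(gst, 2))  # 0.35
```

Under that reading, the result agrees with the G_ST of 0.35 quoted in the abstract and is close to the 32.98% among-population component from the AMOVA.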

Evolutionary history of an alpine shrub Hippophae tibetana (Elaeagnaceae): allopatric divergence and regional expansion

Biological Journal of the Linnean Society, 2011

Increasing evidence suggests that geological or climatic events in the past promoted allopatric speciation of alpine plants in the Qinghai-Tibetan Plateau and adjacent region. However, few studies have been undertaken to examine whether such allopatric divergences also occurred within a morphologically uniform species. In the present study, we report the evolutionary history of an alpine shrub species, Hippophae tibetana, based on examining chloroplast DNA (cpDNA) and nuclear ribosomal internal transcribed spacer (ITS) DNA variations. We sequenced two cpDNA fragments (trnL-F and trnS-G) and the nuclear ITS region in 183 individuals collected from 21 natural populations. Ten chlorotypes and 17 ITS types were identified. Phylogenetic analyses of both chlorotypes and ITS sequence variations suggested two distinct lineages distributed in the eastern and western regions, respectively. On the basis of fast and slow plant substitution rates, these two lineages were estimated to have diverged from each other between 1 and 4 million years ago, during the period of the major glaciations and orogenic processes. In addition, ITS has undergone accelerated evolution in two populations in the southern Himalaya isolated by the high mountains, with a surprising accumulation of private variations. The east-west split was also supported by an analysis of molecular variance, which partitioned around 91% of the total cpDNA variance between these two groups of populations. A single chlorotype was found for most populations in the eastern or western region, suggesting a recent postglacial expansion within each region. Star-phylogeny and mismatch analyses of all chlorotypes within the eastern group of populations suggested an earlier regional expansion before the Last Glacial Maximum (LGM). The local fixation of different chlorotypes in multiple populations suggested that more than one refugium remained in the eastern or western region. Coalescent tests rejected the hypothesis that all current populations originated from a single refugium during the LGM. Instead, they supported the hypothesis that the two lineages diverged before the late Pleistocene. These findings, taken together, suggest that this species experienced long allopatric divergence and recent regional range expansions in response to orogenic processes and climate changes. The evolutionary history of this shrub species highlights the importance of geographical isolation to the intraspecific divergence of alpine plants occurring on the world's roof.

Strong Incongruence between the ITS Phylogeny and Generic Delimitation in the Nemosenecio-Sinosenecio-Tephroseris Assemblage (Asteraceae: Senecioneae)

Research paper thumbnail of A Bayesian Approach to Inferring Rates of Selfing and Locus-Specific Mutation

Genetics, Jan 14, 2015

We present a Bayesian method for characterizing the mating system of populations reproducing thro... more We present a Bayesian method for characterizing the mating system of populations reproducing through a mixture of self-fertilization and random outcrossing. Our method uses patterns of genetic variation across the genome as a basis for inference about reproduction under pure hermaphroditism, gynodioecy, and a model developed to describe the self-fertilizing killifish Kryptolebias marmoratus. We extend the standard coalescence model to accommodate these mating systems, accounting explicitly for multilocus identity disequilibrium, inbreeding depression, and variation in fertility among mating types. We incorporate the Ewens Sampling Formula (ESF) under the infinite-alleles model of mutation to obtain a novel expression for the likelihood of mating system parameters. Our Markov chain Monte Carlo (MCMC) algorithm assigns locus-specific mutation rates, drawn from a common mutation rate distribution that is itself estimated from the data using a Dirichlet Process Prior model. Our sampler ...

Research paper thumbnail of Introductory Editorial–RNA: An Expanding View of Function and Evolution

Evolutionary Bioinformatics, 2016

Research paper thumbnail of CPAG: software for leveraging pleiotropy in GWAS to reveal similarity between human traits links plasma fatty acids and intestinal inflammation

Genome biology, 2015

Meta-analyses of genome-wide association studies (GWAS) have demonstrated that the same genetic v... more Meta-analyses of genome-wide association studies (GWAS) have demonstrated that the same genetic variants can be associated with multiple diseases and other complex traits. We present software called CPAG (Cross-Phenotype Analysis of GWAS) to look for similarities between 700 traits, build trees with informative clusters, and highlight underlying pathways. Clusters are consistent with pre-defined groups and literature-based validation but also reveal novel connections. We report similarity between plasma palmitoleic acid and Crohn's disease and find that specific fatty acids exacerbate enterocolitis in zebrafish. CPAG will become increasingly powerful as more genetic variants are uncovered, leading to a deeper understanding of complex traits. CPAG is freely available at www.sourceforge.net/projects/CPAG/.

Research paper thumbnail of Expression Divergence of Duplicate Genes in the Protein Kinase Superfamily in Pacific Oyster

Evolutionary Bioinformatics, 2015

Gene duplication has been proposed to serve as the engine of evolutionary innovation. It is well ... more Gene duplication has been proposed to serve as the engine of evolutionary innovation. It is well recognized that eukaryotic genomes contain a large number of duplicated genes that evolve new functions or expression patterns. However, in mollusks, the evolutionary mechanisms underlying the divergence and the functional maintenance of duplicate genes remain little understood. In the present study, we performed a comprehensive analysis of duplicate genes in the protein kinase superfamily using whole genome and transcriptome data for the Pacific oyster. A total of 64 duplicated gene pairs were identified based on a phylogenetic approach and the reciprocal best BLAST method. By analyzing gene expression from RNA-seq data from 69 different developmental and stimuli-induced conditions (nine tissues, 38 developmental stages, eight dry treatments, seven heat treatments, and seven salty treatments), we found that expression patterns were significantly correlated for a number of duplicate gene pairs, suggesting the conservation of regulatory mechanisms following divergence. Our analysis also identified a subset of duplicate gene pairs with very high expression divergence, indicating that these gene pairs may have been subjected to transcriptional subfunctionalization or neofunctionalization after the initial duplication events. Further analysis revealed a significant correlation between expression and sequence divergence (as revealed by synonymous or nonsynonymous substitution rates) under certain conditions. Taken together, these results provide evidence for duplicate gene sequence and expression divergence in the Pacific oyster, accompanying its adaptation to harsh environments. Our results provide new insights into the evolution of duplicate genes and their expression levels in the Pacific oyster.

Research paper thumbnail of De Novo Transcriptome Assembly and Development of Novel Microsatellite Markers for the Traditional Chinese Medicinal Herb, Veratrilla baillonii Franch (Gentianaceae)

Evolutionary Bioinformatics, 2015

Veratrilla baillonii Franch is an important Chinese medicinal herb for treating liver-related dis... more Veratrilla baillonii Franch is an important Chinese medicinal herb for treating liver-related diseases, which has been over-collected in the recent decades. However, the effective conservation and related population genetic study has been hindered because of the lack of genome sequences and genetic markers in the natural population. We have conducted RNA-seq on V. baillonii. We performed de novo assembly of these data to characterize the V. baillonii transcriptome, resulting in 133,019 contigs with size 200 bp. These contigs were annotated using the NCBI nonredundant database and Gene Ontology (GO) terms. From these contigs, we developed novel microsatellite simple sequence repeat (SSR) markers, identifying a total of 40,885 SSRs. SSRs with repeat motifs of 1-4 bp (mono-, di-, tri-, and tetranucleotides) accounted for 99.8% of all SSRs, with mononucleotide repeats most common, followed by dinucleotide (16.2%) and trinucleotide repeats (14.7%). We selected 151 SSRs for experimental validation, of which 74 were confirmed by polymerase chain reaction. Fourteen SSRs were determined to be polymorphic by screening 40 individuals from six distant populations. The number of alleles per locus ranged from two to four, and the expected heterozygosity varied from 0.2637 to 0.8571, suggesting that these SSR markers are highly polymorphic and effective for further genetic analysis in the nature population. In addition, we explored the genetic structure of V. baillonii using five SSRs in four geographic populations and found that the identified genotypes were clustered into two phylogenetic clades: the Mekong River clade and Jinsha River clade. This result indicates that these two regions may harbor highly divergent genetic lineages and enriched genetic diversity. 
The de novo transcriptome sequences and new SSR markers discovered by this study provide an initial step for understanding the population genetics of V. baillonii, and a valuable resource for effective conservation management.
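
As an illustration of the expected heterozygosity statistic reported in the abstract above, the following is a minimal sketch of the textbook definition, He = 1 - sum(p_i^2), together with Nei's small-sample correction. The function names and the input frequencies are hypothetical, and the abstract does not state which estimator the authors used.

```python
from typing import Sequence


def expected_heterozygosity(freqs: Sequence[float]) -> float:
    """Return He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def unbiased_expected_heterozygosity(freqs: Sequence[float], n_individuals: int) -> float:
    """Apply the 2n / (2n - 1) small-sample correction for n diploid individuals."""
    gene_copies = 2 * n_individuals
    if gene_copies < 2:
        raise ValueError("need at least one diploid individual")
    return gene_copies / (gene_copies - 1) * expected_heterozygosity(freqs)


if __name__ == "__main__":
    # Hypothetical locus with two equally frequent alleles scored in 10 individuals.
    freqs = [0.5, 0.5]
    print(expected_heterozygosity(freqs))
    print(unbiased_expected_heterozygosity(freqs, n_individuals=10))
```

The uncorrected value cannot exceed 1 - 1/k for a locus with k alleles, so the correction matters mainly when few individuals are sampled.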

RNA-seq Reveals Complicated Transcriptomic Responses to Drought Stress in a Nonmodel Tropic Plant, Bombax ceiba L

Evolutionary Bioinformatics, 2015

High-throughput transcriptome sequencing provides an unbiased approach for understanding the genetic basis and gene functions in response to different conditions. Here we sequenced RNA-seq libraries derived from a Bombax ceiba L. system under a controlled experiment. As a known medicinal and ornamental plant, B. ceiba grows mainly in hot-dry monsoon rainforests in Southeast Asia and Australia. Due to the specific growth environment, it has evolved a unique system that enables a physiologic response to drought stress. To date, few studies have characterized the genome-wide features of drought endurance in B. ceiba. In this study, we first attempted to characterize and identify the most differentially expressed genes and associated functional pathways under drought treatment and normal conditions. Using RNA-seq technology, we generated the first transcriptome of B. ceiba and identified 59 differentially expressed genes with greater than 1,000-fold changes between the two conditions. The set of upregulated genes implicates interplay among various pathways, including plant growth, ubiquitin-mediated proteolysis, polysaccharide hydrolysis, oxidative phosphorylation, and photosynthesis. In contrast, genes associated with stem growth, cell division, fruit ripening senescence, disease resistance, and proline synthesis are repressed. Notably, key genes with high RPKM levels under drought are AUX1, JAZ, and psbS, which are known to regulate the growth of plants, the resistance against abiotic stress, and the photosynthesis process. Furthermore, 16,656 microsatellite markers and 3,071 single-nucleotide polymorphisms (SNPs) were predicted by in silico methods. 
The identification and functional annotation of differentially expressed genes, microsatellites, and SNPs represent a major step forward and would serve as a valuable resource for understanding the complexity underlying drought endurance and adaptation in B. ceiba.

Comparative Analyses of Clinical and Environmental Populations of Cryptococcus neoformans in Botswana

Molecular Ecology, Jan 6, 2015

Cryptococcus neoformans var. grubii (Cng) is the most common cause of fungal meningitis and its prevalence is highest in sub-Saharan Africa. Patients become infected by inhaling airborne spores or desiccated yeast cells from the environment, where the fungus thrives in avian droppings, trees, and soil. To investigate the prevalence and population structure of Cng in southern Africa, we analyzed isolates from 77 environmental samples and 64 patients. We detected significant genetic diversity among isolates and strong evidence of geographic structure at the local level. High proportions of isolates with the rare MATa allele were observed in both clinical and environmental isolates; however, the mating type alleles were unevenly distributed among different subpopulations. Nearly equal proportions of the MATa and MATα mating types were observed among all clinical isolates and in one environmental subpopulation from the eastern part of Botswana. As previously reported, there was evidence...

Refugial isolation and range expansions drive the genetic structure of Oxyria sinensis (Polygonaceae) in the Himalaya-Hengduan Mountains

Scientific Reports, 2015

The formation of the Mekong-Salween Divide and climatic oscillations in the Pleistocene were the main drivers for the contemporary diversity and genetic structure of plants in the Himalaya-Hengduan Mountains (HHM). To identify the relative roles of the two historical events in shaping the population history of plants in the HHM, we investigated the phylogeographic pattern of Oxyria sinensis, a perennial plant endemic to the HHM. Sixteen chloroplast haplotypes were identified and were clustered into three phylogenetic clades. The age of the major clades was estimated to be in the Pleistocene, falling into several Pleistocene glacial stages and postdating the formation of the Mekong-Salween Divide. Range expansions occurred at least twice in the early and middle Pleistocene, but the spatial genetic distribution has rarely changed since the Last Glacial Maximum. Our results suggest that temporary mountain glaciers may act as barriers in promoting the lineage divergence in O. sinensis and that subseque...

Measures of linkage disequilibrium among neighbouring SNPs indicate asymmetries across the house mouse hybrid zone

Molecular Ecology, 2011

Theory predicts that naturally occurring hybrid zones between genetically distinct taxa can move over space and time as a result of selection and/or demographic processes, with certain types of hybrid zones being more or less likely to move. Determining whether a hybrid zone is stationary or moving has important implications for understanding evolutionary processes affecting interactions in hybrid populations. However, direct observations of hybrid zone movement are difficult to make unless the zone is moving rapidly. Here, evidence for movement in the house mouse Mus musculus domesticus × Mus musculus musculus hybrid zone is provided using measures of LD and haplotype structure among neighbouring SNP markers from across the genome. Local populations of mice across two transects in Germany and the Czech Republic were sampled, and a total of 1301 mice were genotyped at 1401 markers from the nuclear genome. Empirical measures of LD provide evidence for extinction and (re)colonization in single populations and, together with simulations, suggest hybrid zone movement because of either geography-dependent asymmetrical dispersal or selection favouring one subspecies over the other.
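
For readers unfamiliar with pairwise LD, the following is a minimal sketch of two standard measures for a pair of biallelic markers, D = p_AB - p_A * p_B and r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)). The abstract does not say which LD statistics the study used, so the choice of measures, the function name, and the frequencies below are assumptions made for illustration only.

```python
from typing import Tuple


def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (D, r^2) from the AB haplotype frequency and the A and B allele frequencies."""
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        # r^2 is undefined when either marker is monomorphic.
        raise ValueError("both markers must be polymorphic")
    return d, d * d / denom


if __name__ == "__main__":
    # Hypothetical frequencies: A and B always travel together (complete LD).
    print(linkage_disequilibrium(p_ab=0.5, p_a=0.5, p_b=0.5))
    # Hypothetical frequencies: alleles associate at random (no LD).
    print(linkage_disequilibrium(p_ab=0.25, p_a=0.5, p_b=0.5))
```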

Confirmation of natural hybrids between Gentiana straminea and G. siphonantha (Gentianaceae) based on molecular evidence

Frontiers of Biology in China, 2008

A few individuals with intermediate morphology always appeared in the sympatric distributions of Gentiana straminea and G. siphonantha. These intermediate individuals were hypothesized to be the hybrids of two species after a careful evaluation of their ...

Linkage disequilibrium approaches for detecting hybrid zone movement

Evolution of the House Mouse, 2012

Genome-wide architecture of reproductive isolation in a naturally occurring hybrid zone between Mus musculus musculus and M. m. domesticus

Molecular Ecology, 2012

Studies of a hybrid zone between two house mouse subspecies (Mus musculus musculus and M. m. domesticus) along with studies using laboratory crosses reveal a large role for the X chromosome and multiple autosomal regions in reproductive isolation as a consequence of disrupted epistasis in hybrids. One limitation of previous work has been that most of the identified genomic regions have been large. The goal here is to detect and characterize precise genomic regions underlying reproductive isolation. We surveyed 1401 markers evenly spaced across the genome in 679 mice collected from two different transects. Comparisons between transects provide a means for identifying common patterns that likely reflect intrinsic incompatibilities. We used a genomic cline approach to identify patterns that correspond to epistasis. From both transects, we identified contiguous regions on the X chromosome in which markers were inferred to be involved in epistatic interactions. We then searched for autosomal regions showing the same patterns and found they constitute about 5% of autosomal markers. We discovered substantial overlap between these candidate regions underlying reproductive isolation and QTL for hybrid sterility identified in laboratory crosses. Analysis of gene content in these regions suggests a key role for several mechanisms, including the regulation of transcription, sexual conflict and sexual selection operating at both the postmating prezygotic and postzygotic stages of reproductive isolation. Taken together, these results indicate that speciation in two recently diverged (c. 0.5 Ma) house mouse subspecies is complex, involving many genes dispersed throughout the genome and associated with distinct functions.

History and evolution of alpine plants endemic to the Qinghai-Tibetan Plateau: Aconitum gymnandrum (Ranunculaceae)

Molecular Ecology, 2009

Here, we report a survey of chloroplast DNA (cpDNA) and nuclear ribosomal internal transcribed spacer (ITS) DNA variation aimed at exploring the phylogeographical history of the QTP alpine endemic Aconitum gymnandrum. We sequenced three cpDNA fragments (rpl20-rps12 intergenic spacer, the trnV intron and psbA-trnH spacer) and also the nuclear (ITS) region in 245 individuals from 23 populations sampled throughout the species' range. Two distinct lineages, with eastern and western geographical distributions respectively, were identified from a phylogenetic analysis of ITS sequence variation. Based on a fast substitution rate, these were estimated to have diverged from each other in the early Pleistocene approximately 1.45 Ma. The analysis of cpDNA variation identified nine chlorotypes that clustered into two major clades that were broadly congruent in geographical distribution with the two ITS lineages. The east-west split of cpDNA divergence was supported by an AMOVA which partitioned approximately half of the total variance between these two groups of populations. Analysis of the spatial distribution of chlorotypes showed that each clade was subdivided into two groups of populations such that a total of four population groups existed in the species. It is suggested that these different groups derive from four independent glacial refugia that existed during the Last Glacial Maximum (LGM), and that three of these refugia were located at high altitude on the QTP platform itself at that time. Coalescent simulation of chlorotype genealogies supported both an early Pleistocene origin of the two main cpDNA clades and also the 'four-refugia' hypothesis during the LGM. Two previous phylogeographical studies of QTP alpine plants indicated that such plants retreated to refugia at the eastern/south-eastern plateau edge during the LGM and/or previous glacial maxima. 
However, the results for A. gymnandrum suggest that at least some of these cold-tolerant species may have also survived centrally on the QTP platform throughout the Quaternary.

The uncharacterized gene 1700093K21Rik and flanking regions are correlated with reproductive isolation in the house mouse, Mus musculus

Mammalian Genome, 2014

Reproductive barriers exist between the house mouse subspecies, Mus musculus musculus and M. m. domesticus, members of the Mus musculus species complex, primarily as a result of hybrid male infertility, and a hybrid zone exists where their ranges intersect in Europe. Using single nucleotide polymorphisms (SNPs) diagnostic for the two taxa, the extent of introgression across the genome was previously compared in these hybrid populations. Sixty-nine of 1316 autosomal SNPs exhibited reduced introgression in two hybrid zone transects suggesting maladaptive interactions among certain loci. One of these markers is within a region on chromosome 11 that, in other studies, has been associated with hybrid male sterility of these subspecies. We assessed sequence variation in a 20 Mb region on chromosome 11 flanking this marker, and observed its inclusion within a roughly 150 kb stretch of DNA showing elevated sequence differentiation between the two subspecies. Four genes are associated with this genomic subregion, with two entirely encompassed. One of the two genes, the uncharacterized 1700093K21Rik gene, displays distinguishing features consistent with a potential role in reproductive isolation between these subspecies. Along with its expression specifically within spermatogenic cells, we present various sequence analyses that demonstrate a high rate of molecular evolution of this gene, as well as identify a subspecies amino acid variant resulting in a structural difference. Taken together, the data suggest a role for this gene in reproductive isolation.

Repeated Range Expansion and Glacial Endurance of Potentilla glabra (Rosaceae) in the Qinghai-Tibetan Plateau

Journal of Integrative Plant Biology, 2009

To date, little is known about how alpine species occurring in the Qinghai-Tibetan Plateau (QTP) responded to past climatic oscillations. Here, using variation in the chloroplast trnT-L region, we examined the genetic distribution pattern of 101 individuals of Potentilla glabra sampled from both the interior QTP and the plateau edge. Phylogenetic and network analyses of 31 recovered haplotypes identified three tentative clades (A, B and C). Analysis of molecular variance (AMOVA) revealed that most of the genetic variability was found within populations (0.693), while differentiation between populations was also distinct (FST = 0.307). Two independent range expansions within clades A and B, occurring approximately 316 and 201 thousand years ago (kya), were recovered from the hierarchical mismatch analysis, and these two expansions were also confirmed by Fu's FS values and 'g' tests. However, the distant distributions of clade C and of private haplotypes from clades A and B suggest that they had survived the Last Glacial Maximum (LGM) and previous glaciations in situ since their origins. Our findings, based on the limited samples available, support the view that multiple refugia of a few cold-enduring species were maintained on the QTP platform during the LGM and/or previous glacial stages.

Allopatric divergence and phylogeographic structure of the plateau zokor (Eospalax baileyi), a fossorial rodent endemic to the Qinghai-Tibetan Plateau

Journal of Biogeography, 2010

Phylogeographic analyses suggest that a deciduous species (Ostryopsis davidiana Decne., Betulaceae) survived in northern China during the Last Glacial Maximum

Journal of Biogeography, 2009

Genetic variation in the endangered Anisodus tanguticus (Solanaceae), an alpine perennial endemic to the Qinghai-Tibetan Plateau

Genetica, 2007

We used random amplified polymorphic DNA markers (RAPDs) to assess genetic variation between and within populations of Anisodus tanguticus (Solanaceae), an endangered perennial endemic to the Qinghai-Tibetan Plateau with important medicinal value. We recorded a total of 92 amplified bands, using 12 RAPD primers, 76 of which (P = 82.61%) were polymorphic, and calculated values of Ht and Hsp of 0.3015 and 0.4459, respectively, suggesting a remarkably high rate of genetic variation at the species level. The average within-population diversity also appeared to be high, with P, He and Hpop values of 55.11%, 0.1948 and 0.2918, respectively. Analyses of molecular variance (AMOVA) showed that within- and between-population genetic variation accounted for 67.02% and 32.98% of the total genetic variation, respectively. In addition, Nei's coefficient of differentiation (GST) was found to be high (0.35), confirming the relatively high level of genetic differentiation among the populations. These differentiation coefficients are higher than mean corresponding coefficients for outbreeding species, but lower than reported coefficients for some rare species from this region. The genetic structure of A. tanguticus has probably been shaped by its breeding attributes, biogeographic history and human impact due to collection for medicinal purposes. The observed genetic variations suggest that as many populations as possible should be considered in any planned in situ or ex situ conservation programs for this species.
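
The percentage of polymorphic bands quoted in the abstract above can be reproduced with simple arithmetic. The sketch below uses a hypothetical helper name and takes only the band counts and AMOVA percentages from the abstract as inputs; it is a consistency check, not the authors' analysis pipeline.

```python
def percent_polymorphic(polymorphic_bands: int, total_bands: int) -> float:
    """Return the percentage of scored bands that are polymorphic."""
    if total_bands <= 0:
        raise ValueError("total_bands must be positive")
    return 100.0 * polymorphic_bands / total_bands


if __name__ == "__main__":
    # Counts taken from the abstract: 76 polymorphic bands out of 92 scored.
    print(round(percent_polymorphic(76, 92), 2))
    # The two AMOVA variance components reported in the abstract should total 100%.
    print(abs((67.02 + 32.98) - 100.0) < 1e-9)
```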

Evolutionary history of an alpine shrub Hippophae tibetana (Elaeagnaceae): allopatric divergence and regional expansion

Biological Journal of the Linnean Society, 2011

Increasing evidence suggests that geological or climatic events in the past promoted allopatric speciation of alpine plants in the Qinghai-Tibetan Plateau and adjacent region. However, few studies have been undertaken to examine whether such allopatric divergences also occurred within a morphologically uniform species. In the present study, we report the evolutionary history of an alpine shrub species, Hippophae tibetana, based on examining chloroplast DNA (cpDNA) and nuclear ribosomal internal transcribed spacer (ITS) DNA variations. We sequenced two cpDNA fragments (trnL-F and trnS-G) and the nuclear ITS region in 183 individuals collected from 21 natural populations. Ten chlorotypes and 17 ITS types were identified. Phylogenetic analyses of both chlorotypes and ITS sequence variations suggested two distinct lineages distributed in the eastern and western regions, respectively. On the basis of fast and slow plant substitution rates, these two lineages were estimated to have diverged from each other between 1 and 4 million years ago, during the period of the major glaciations and orogenic processes. In addition, ITS has undergone accelerated evolution in two populations in the southern Himalaya isolated by high mountains, with a surprising accumulation of private variations. The east-west split was also supported by an analysis of molecular variance, which partitioned around 91% of the total cpDNA variance between these two groups of populations. A single chlorotype was found for most populations in the eastern or western region, suggesting a recent postglacial expansion within each region. Star-phylogeny and mismatch analyses of all chlorotypes within the eastern group of populations suggested an earlier regional expansion before the Last Glacial Maximum (LGM). 
The local fixation of different chlorotypes in multiple populations suggested that more than one refugium remained in the eastern or western region. Coalescent tests rejected the hypothesis that all current populations originated from a single refugium during the LGM. Instead, they supported the hypothesis that the two lineages diverged before the late Pleistocene. These findings, taken together, suggest that this species experienced long allopatric divergence and recent regional range expansions in response to orogenic processes and climate changes. The evolutionary history of this shrub species highlights the importance of geographical isolation to the intraspecific divergence of alpine plants occurring on the world's roof.

Strong Incongruence between the ITS Phylogeny and Generic Delimitation in the Nemosenecio-Sinosenecio-Tephroseris Assemblage (Asteraceae: Senecioneae)