The Use of “Genotyping-by-Sequencing” to Recover Shared Genealogy in Genetically Diverse Eucalyptus Populations

Parentage Reconstruction in Eucalyptus nitens Using SNPs and Microsatellite Markers: A Comparative Analysis of Marker Data Power and Robustness

PLOS ONE, 2015

Pedigree reconstruction using molecular markers enables efficient management of inbreeding in open-pollinated breeding strategies, replacing expensive and time-consuming controlled pollination. This is particularly useful in preferentially outcrossed, insect-pollinated eucalypts, which are known to suffer considerable inbreeding depression from related matings. A single nucleotide polymorphism (SNP) marker panel consisting of 106 markers was selected for pedigree reconstruction from the recently developed high-density Eucalyptus Infinium SNP chip (EuCHIP60K). The performance of this SNP panel for pedigree reconstruction in open-pollinated progenies of two Eucalyptus nitens seed orchards was compared with that of two microsatellite panels with 13 and 16 markers, respectively. The SNP marker panel out-performed one of the microsatellite panels in the resolution power to reconstruct pedigrees and out-performed both panels with respect to data quality. Parentage of all but one offspring in each clonal seed orchard was correctly matched to the expected seed parent using the SNP marker panel, whereas parentage assignment was supported for fewer than a third of the expected seed parents using the 13-microsatellite panel. The 16-microsatellite panel supported all but one of the recorded seed parents, one better than the SNP panel, although there was still a considerable level of missing and inconsistent data. SNP marker data was considerably superior to microsatellite data in accuracy, reproducibility and robustness. Although microsatellite and SNP data provide equivalent resolution for pedigree reconstruction, microsatellite analysis requires more time and experience to deal with the uncertainties of allele calling and faces challenges for data transferability across labs and over time.
While microsatellite analysis will continue to be useful for some breeding tasks due to the high information content, existing infrastructure and low operating costs, the multi-species SNP resource available with the EuCHIP60K opens a whole new array of opportunities for high-throughput, genome-wide or targeted genotyping in species of Eucalyptus.
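
As a rough illustration of how biallelic SNP data can support or reject a recorded seed parent, the sketch below counts opposite-homozygote mismatches between an offspring and each candidate parent. It is a minimal exclusion check under stated assumptions, not the method used in the paper: the 0/1/2 genotype coding, the function names and the `max_mismatches` tolerance are all hypothetical choices made for this example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Assumed coding: number of copies of the alternate allele (0, 1 or 2);
# None marks a missing genotype call.
Genotype = Optional[int]


def opposite_homozygote_mismatches(
    offspring: Sequence[Genotype], parent: Sequence[Genotype]
) -> Tuple[int, int]:
    """Return (mismatches, loci compared) for one offspring-parent pair.

    A true parent must share at least one allele with its offspring, so a
    locus where one is 0 and the other is 2 is a Mendelian incompatibility.
    """
    mismatches = 0
    compared = 0
    for o, p in zip(offspring, parent):
        if o is None or p is None:
            continue  # skip loci with missing data
        compared += 1
        if {o, p} == {0, 2}:
            mismatches += 1
    return mismatches, compared


def rank_candidate_parents(
    offspring: Sequence[Genotype],
    candidates: Dict[str, Sequence[Genotype]],
    max_mismatches: int = 1,  # hypothetical tolerance for genotyping error
) -> List[Tuple[str, int, int]]:
    """Keep candidates within the mismatch tolerance, best match first."""
    kept = []
    for name, genotype in candidates.items():
        m, n = opposite_homozygote_mismatches(offspring, genotype)
        if m <= max_mismatches:
            kept.append((name, m, n))
    # Fewest mismatches first, then most loci compared, then name.
    return sorted(kept, key=lambda r: (r[1], -r[2], r[0]))
```

A likelihood-based assignment would use allele frequencies as well; the exclusion count is shown only because it makes the role of data quality (missing and inconsistent calls) easy to see.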

Speciation in the presence of gene flow: population genomics of closely related and diverging Eucalyptus species

Heredity, 2018

Speciation is a complex process that is fundamental to the origins of biological diversity. While there has been considerable progress in our understanding of speciation, there are still many unanswered questions, especially regarding barriers to gene flow in diverging populations. Eucalyptus is an appropriate system for investigating speciation mechanisms since it comprises species that are rapidly evolving across heterogeneous environments. We examined patterns of genetic variation within and among six closely related Eucalyptus species in subgenus Eucalyptus section Eucalyptus in south-eastern Australia (commonly known as the "green ashes"). We used reduced representation genome sequencing to genotype samples from populations across altitudinal and latitudinal gradients. We found one species, Eucalyptus cunninghamii, to be highly genetically differentiated from the others, and a population of mallees from Mount Banks to be genetically distinct and therefore likely to be...

A case study of Eucalyptus globulus fingerprinting for breeding

Annals of Forest Science, 2011

Introduction: Tree genetic improvement programs usually lack pedigree information. Since molecular markers can be used to estimate the level of genetic similarity between individuals, we genotyped a sample of a Portuguese Eucalyptus globulus breeding population (a reference population of 125 individuals) with 16 microsatellites (SSR). Materials and methods: Using genotypes from the reference population, we developed a simulation approach to recurrently generate (10^5 replicates) virtual offspring with different relatedness: selfed, half-sib, full-sib and unrelated individuals. Four commonly used pairwise similarity coefficients were tested on these groups of simulated offspring. Significant deficits in heterozygosity were found for some markers in the reference population, likely due to the presence of null alleles. Therefore, the impact of null alleles on the relatedness estimates was also studied. We conservatively assumed that all homozygotes in the reference population were carriers of null alleles. Results: All estimators were unbiased, but one of them was better adjusted to our data set, even when null alleles were considered. The estimator's accuracy and precision were validated with individuals of known pedigree obtained from controlled crosses made with the same reference population's parents. Additionally, a clustering algorithm based on the estimator of choice was constructed in order to infer the relatedness among 24 E. globulus elite individuals. We detected four putatively related pairs of elite individuals (six pairs considering the presence of null alleles). Conclusions: This work demonstrates that, in the absence of pedigree information, our approach could be useful to identify relatives and minimize consanguinity in breeding populations.
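
The simulation idea in this abstract can be sketched as follows. This is only an illustration under assumptions: the way each relationship class is built (for example, "selfed" as two offspring from selfing one parent), the function names, and the simple allele-sharing score are hypothetical stand-ins, and the score is not one of the four coefficients the authors tested.

```python
import random
from typing import List, Tuple

Locus = Tuple[int, int]   # two allele sizes at one microsatellite locus
Genotype = List[Locus]    # one entry per locus


def gamete(parent: Genotype, rng: random.Random) -> List[int]:
    """Draw one allele per locus, assuming unlinked loci."""
    return [rng.choice(locus) for locus in parent]


def cross(mother: Genotype, father: Genotype, rng: random.Random) -> Genotype:
    """Build a virtual offspring from one gamete of each parent."""
    pairs = zip(gamete(mother, rng), gamete(father, rng))
    return [tuple(sorted(pair)) for pair in pairs]


def simulate_pair(kind: str, population: List[Genotype], rng: random.Random):
    """Return two virtual offspring with the requested (assumed) relationship."""
    if kind == "selfed":
        (p,) = rng.sample(population, 1)
        return cross(p, p, rng), cross(p, p, rng)
    if kind == "full-sib":
        m, f = rng.sample(population, 2)
        return cross(m, f, rng), cross(m, f, rng)
    if kind == "half-sib":
        m, f1, f2 = rng.sample(population, 3)
        return cross(m, f1, rng), cross(m, f2, rng)
    if kind == "unrelated":
        m1, f1, m2, f2 = rng.sample(population, 4)
        return cross(m1, f1, rng), cross(m2, f2, rng)
    raise ValueError(f"unknown relationship: {kind}")


def allele_sharing(a: Genotype, b: Genotype) -> float:
    """Mean proportion of alleles shared per locus (0.0 to 1.0)."""
    total = 0.0
    for locus_a, locus_b in zip(a, b):
        remaining = list(locus_b)
        shared = 0
        for allele in locus_a:
            if allele in remaining:
                remaining.remove(allele)  # count each allele copy once
                shared += 1
        total += shared / 2
    return total / len(a)


if __name__ == "__main__":
    rng = random.Random(42)
    # Toy reference population: 6 individuals, 3 loci, made-up allele sizes.
    population = [
        [(rng.choice(range(100, 120, 2)), rng.choice(range(100, 120, 2)))
         for _ in range(3)]
        for _ in range(6)
    ]
    for kind in ("selfed", "full-sib", "half-sib", "unrelated"):
        scores = [allele_sharing(*simulate_pair(kind, population, rng))
                  for _ in range(1000)]
        print(kind, round(sum(scores) / len(scores), 3))
```

Comparing the score distributions across the four classes is what lets an estimator be judged for bias and precision; handling null alleles would need an extra step that recodes homozygotes, which is left out here.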

SSRs, SNPs and DArTs comparison on estimation of relatedness and genetic parameters' precision from a small half-sib sample population of Eucalyptus grandis

Simple sequence repeats (SSR) are the most widely used molecular markers for relatedness inference due to their multi-allelic nature and high informativeness. However, there is a growing trend toward using high-throughput and inter-specific transferable single-nucleotide polymorphisms (SNP) and Diversity Arrays Technology (DArT) in forest genetics owing to their wide genome coverage. We compared the efficiency of 15 SSRs, 181 SNPs and 2816 DArTs to estimate the relatedness coefficients, and their effects on genetic parameters' precision, in a relatively small data set of an open-pollinated progeny trial of Eucalyptus grandis (Hill ex Maiden) with limited relationship from the pedigree. Both simulations and real data of Eucalyptus grandis were used to study the statistical performance of three relatedness estimators based on co-dominant markers. Relatedness estimates in pairs of individuals belonging to the same family (related) were higher for DArTs than for

Population genetic analysis and phylogeny reconstruction in Eucalyptus (Myrtaceae) using high-throughput, genome-wide genotyping

Molecular Phylogenetics …, 2011

A set of over 8000 Diversity Arrays Technology (DArT) markers was tested for its utility in high-resolution population and phylogenetic studies across a range of Eucalyptus taxa. Small-scale population studies of Eucalyptus camaldulensis, Eucalyptus cladocalyx, Eucalyptus globulus, Eucalyptus grandis, Eucalyptus nitens, Eucalyptus pilularis and Eucalyptus urophylla demonstrated the potential of genome-wide genotyping with DArT markers to differentiate species, to identify interspecific hybrids and to resolve biogeographic disjunctions within species. The population genetic studies resolved geographically partitioned clusters in E. camaldulensis, E. cladocalyx, E. globulus and E. urophylla that were congruent with previous molecular studies. A phylogenetic study of 94 eucalypt species provided results that were largely congruent with traditional taxonomy and ITS-based phylogenies, but provided more resolution within major clades than had been obtained previously. Ascertainment bias (the bias introduced in a phylogeny from using markers developed in a small sample of the taxa that are being studied) was not detected. DArT offers an unprecedented level of resolution for population genetic, phylogenetic and evolutionary studies across the full range of Eucalyptus species.

Determination of inter- and intra-species genetic relationships among six Eucalyptus species based on inter-simple sequence repeats (ISSR)

Tree Physiology, 2005

Eucalyptus is the most economically important hardwood plantation tree cultivated in tropical and subtropical countries. Inter-simple sequence repeat (ISSR) markers were used to evaluate genetic relationships within and between individuals of six Eucalyptus species. A total of 583 loci (265 to 1535 bp) were amplified from 149 individuals belonging to the six Eucalyptus species using seven ISSR primers (two to three nucleotide repeats anchored with one or two nucleotides at the 3′ or 5′ region). The ISSR fragments indicated significant polymorphism and genetic diversity among the individuals. Cluster analysis and principal component analysis revealed the occurrence of wide genetic diversity among populations of E. tereticornis Sm., E. camaldulensis Dehnh. and E. urophylla S.T. Blake and narrow genetic diversity among populations of E. citriodora Hook. and E. grandis W. Hill ex Maiden. Genetic diversity was high in E. tereticornis Sm. (47.27%) and low in E. citriodora (18.64%). Maximum Nei's genetic identity (0.897) was observed between E. camaldulensis and E. tereticornis species, whereas maximum genetic diversity (0.286) was found between individuals of E. citriodora and E. grandis.

The significance of single nucleotide polymorphisms (SNPs) in Eucalyptus globulus breeding programs

Australian Forestry, 2011

Eucalyptus globulus (Labill.) is the most widely planted eucalypt for pulpwood in temperate regions of the world. Breeding to improve pulp properties of this species has been hampered by the long time between planting and pulp trait assessment and the high cost of estimating pulp traits. Identifying and employing allelic variants that associate with superior pulp yield and quality has the potential to assist breeding programs. Before this strategy can deliver benefits, detailed knowledge of population structure, nucleotide diversity, haplotype diversity and linkage disequilibrium (LD) must be collected. To address this, 20 wood quality candidate genes were sequenced in 8 to 28 Eucalyptus globulus individuals. Relative to other tree species where such studies have been conducted, single nucleotide polymorphism (SNP) frequencies were high. Decay of linkage disequilibrium was rapid at all loci tested, with linkage rarely extending beyond 500 base pairs. Regions within many candidate genes exhibited significant positive or negative selection signatures, indicative of purifying or balancing selection, respectively. Our findings have implications for association mapping in Eucalyptus species. The potential for E. globulus pedigree reconstruction and for whole-genome association approaches in eucalypts in general is discussed.

Single-step genomic BLUP enables joint analysis of disconnected breeding programs: an example with Eucalyptus globulus Labill

G3 Genes|Genomes|Genetics

Single-step GBLUP (HBLUP) efficiently combines genomic, pedigree, and phenotypic information for holistic genetic analyses of disjunct breeding populations. We combined data from two independent multigenerational Eucalyptus globulus breeding populations to provide direct comparisons across the programs and indirect predictions in environments where pedigreed families had not been evaluated. Despite few known pedigree connections between the programs, genomic relationships provided the connectivity required to create a unified relationship matrix, H, which was used to compare pedigree-based and HBLUP models. Stem volume data from 48 sites spread across three regions of southern Australia and wood quality data across 20 sites provided comparisons of model accuracy. Genotyping proved valuable for correcting pedigree errors and HBLUP more precisely defines relationships within and among populations, with relationships among the genotyped individuals used to connect the pedigrees of the ...

Comparative genetic linkage maps of Eucalyptus grandis, Eucalyptus globulus and their F1 hybrid based on a double pseudo-backcross mapping approach

TAG Theoretical and Applied Genetics, 2003

Comparative genetic mapping in interspecific pedigrees presents a powerful approach to study genetic differentiation, genome evolution and reproductive isolation in diverging species. We used this approach for genetic analysis of an F1 hybrid of two Eucalyptus tree species, Eucalyptus grandis (W. Hill ex Maiden) and Eucalyptus globulus (Labill.). This wide interspecific cross is characterized by hybrid inviability and hybrid abnormality. Approximately 20% of loci in the genome of the F1 hybrid are expected to be hemizygous due to a difference in genome size between E. grandis (640 Mbp) and E. globulus (530 Mbp). We investigated the extent of colinearity between the two genomes and the distribution of hemizygous loci in the F1 hybrid using high-throughput, semi-automated AFLP marker analysis. Two pseudo-backcross families (backcrosses of an F1 individual to non-parental individuals of the parental species) were each genotyped with more than 800 AFLP markers. This allowed construction of de novo comparative genetic linkage maps of the F1 hybrid and the two backcross parents. All shared AFLP marker loci in the three single-tree parental maps were found to be colinear and little evidence was found for gross chromosomal rearrangements. Our results suggest that hemizygous AFLP loci are dispersed throughout the E. grandis chromosomes of the F1 hybrid.

Accounting for population structure in genomic predictions of Eucalyptus globulus

G3 Genes|Genomes|Genetics

Genetic groups have been widely adopted in tree breeding to account for provenance effects within pedigree-derived relationship matrices. However, provenances or genetic groups have not yet been incorporated into single-step genomic BLUP (“HBLUP”) analyses of tree populations. To quantify the impact of accounting for population structure in Eucalyptus globulus, we used HBLUP to compare breeding value predictions from models excluding base population effects and models including either fixed genetic groups or the marker-derived proxies, also known as metafounders. Full-sib families from 2 separate breeding populations were evaluated across 13 sites in the “Green Triangle” region of Australia. Gamma matrices (Γ) describing similarities among metafounders reflected the geographic distribution of populations and the origins of 2 land races were identified. Diagonal elements of Γ provided population diversity or allelic covariation estimates between 0.24 and 0.56. Genetic group solutions...