A systematic genome-wide analysis of zebrafish protein-coding gene function

Nature. 2013 Apr 25;496(7446):494-7.

doi: 10.1038/nature11992. Epub 2013 Apr 17.

Ross N W Kettleborough, Elisabeth M Busch-Nentwich, Steven A Harvey, Christopher M Dooley, Ewart de Bruijn, Freek van Eeden, Ian Sealy, Richard J White, Colin Herd, Isaac J Nijman, Fruzsina Fényes, Selina Mehroke, Catherine Scahill, Richard Gibbons, Neha Wali, Samantha Carruthers, Amanda Hall, Jennifer Yen, Edwin Cuppen, Derek L Stemple

Ross N W Kettleborough et al. Nature. 2013.

Abstract

Since the publication of the human reference genome, the identities of specific genes associated with human diseases are being discovered at a rapid rate. A central problem is that the biological activity of these genes is often unclear. Detailed investigations in model vertebrate organisms, typically mice, have been essential for understanding the activities of many orthologues of these disease-associated genes. Although gene-targeting approaches and phenotype analysis have led to a detailed understanding of nearly 6,000 protein-coding genes, this number falls considerably short of the more than 22,000 mouse protein-coding genes. Similarly, in zebrafish genetics, one-by-one gene studies using positional cloning, insertional mutagenesis, antisense morpholino oligonucleotides, targeted re-sequencing, and zinc finger and TAL endonucleases have made substantial contributions to our understanding of the biological activity of vertebrate genes, but again the number of genes studied falls well short of the more than 26,000 zebrafish protein-coding genes. Importantly, for both mice and zebrafish, none of these strategies are particularly suited to the rapid generation of knockouts in thousands of genes and the assessment of their biological activity. Here we describe an active project that aims to identify and phenotype the disruptive mutations in every zebrafish protein-coding gene, using a well-annotated zebrafish reference genome sequence, high-throughput sequencing and efficient chemical mutagenesis. So far we have identified potentially disruptive mutations in more than 38% of all known zebrafish protein-coding genes. We have developed a multi-allelic phenotyping scheme to efficiently assess the effects of each allele during embryogenesis and have analysed the phenotypic consequences of over 1,000 alleles. All mutant alleles and data are available to the community and our phenotyping scheme is adaptable to phenotypic analysis beyond embryogenesis.

Figures

Figure 1. Exome sequencing

a, ENU-mutagenised G0 males are outcrossed to create a population of F1s. Genomic DNA is taken from F1s, which are either outcrossed or cryopreserved as sperm samples. b, F1 genomic DNA is then subjected to exome sequencing. Illumina libraries are made and hybridised to the 120mer biotinylated RNA whole exome baits. Streptavidin coated magnetic beads capture genomic DNA hybridised to the RNA baits, and all other DNA is discarded. Exome-enriched DNA fragments are sequenced. Blue represents exonic and red non-coding genomic DNA.

Figure 2. Mutation detection

a, The cumulative detection of nonsense and essential splice alleles. Because each mutagenised library displayed a different rate of mutagenesis, the order in which exomes were sequenced was randomised. b, The detection of non-synonymous mutations. Sequencing 808 exomes resulted in the identification of 85,338 non-synonymous alleles in 19,655 genes, corresponding to 75% of all protein-coding genes.
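As a rough consistency check on panel b, the quoted figures can be combined arithmetically. The sketch below is illustrative only: the variable names are invented here, and it assumes the quoted 75% is rounded to the nearest percent.

```python
# Figures quoted in the Figure 2b caption.
exomes = 808
nonsyn_alleles = 85_338
genes_hit = 19_655
fraction_of_genes = 0.75  # assumption: rounded to the nearest percent

# Total number of protein-coding genes implied by "19,655 genes = 75%".
implied_total_genes = genes_hit / fraction_of_genes

# Average number of non-synonymous alleles identified per sequenced exome.
alleles_per_exome = nonsyn_alleles / exomes

print(f"implied protein-coding genes: {implied_total_genes:,.0f}")
print(f"non-synonymous alleles per exome: {alleles_per_exome:.1f}")
```

The implied total of roughly 26,200 genes agrees with the "more than 26,000" zebrafish protein-coding genes stated in the abstract.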

Figure 3. Phenotypic analysis of alleles

a, F1 individuals were outcrossed to produce an F2 family. The induced disruptive alleles for one family are shown. b, F2s were incrossed and genotyped. c, First round: phenotypically wild-type embryos were collected from each clutch at 5 dpf and genotyped for the mutations heterozygous in both parents. The number of homozygous mutant F3 embryos was assessed using a chi-squared test (p-value cut-off < 0.05). Mutations homozygous in fewer than 25% of embryos, the proportion expected from an incross of two heterozygous parents, were suspected to cause a phenotype. Here, there were no homozygous embryos in the phenotypically wild-type set for the alleles sa365, sa371 and sa379. d, e, Second round: phenotypes present within each incross were genotyped for putative phenotypic mutations. d, slc22a7b^sa365 shows a pigment phenotype. e, mphosph10^sa371 shows a small head and pericardiac oedema phenotype. lamc1^sa379 mutants are shown in Fig. 4l, n.
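The first-round test in panel c can be sketched as a chi-squared goodness-of-fit test against the 25% homozygous fraction expected under Mendelian segregation. This is a minimal illustration, not the authors' pipeline: the function name, the clutch size of 48 and the two-category layout (homozygous versus all other genotypes) are assumptions made here, and the caption does not specify how the test was actually set up.

```python
import math


def homozygote_depletion_test(n_homozygous, n_total,
                              expected_fraction=0.25, alpha=0.05):
    """Chi-squared goodness-of-fit test with 1 degree of freedom.

    Compares the observed number of homozygous mutant embryos among
    phenotypically wild-type embryos with the Mendelian expectation.
    Returns (chi2, p_value, depleted).
    """
    if n_total <= 0 or not 0 <= n_homozygous <= n_total:
        raise ValueError("counts must satisfy 0 <= n_homozygous <= n_total, n_total > 0")

    expected_hom = n_total * expected_fraction
    expected_other = n_total - expected_hom
    observed_other = n_total - n_homozygous

    chi2 = ((n_homozygous - expected_hom) ** 2 / expected_hom
            + (observed_other - expected_other) ** 2 / expected_other)

    # Upper-tail probability of a chi-squared variable with 1 degree of
    # freedom: P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    # Flag only a significant shortfall of homozygotes, not an excess.
    depleted = n_homozygous < expected_hom and p_value < alpha
    return chi2, p_value, depleted


# Hypothetical clutch: 48 wild-type-looking embryos, none homozygous.
chi2, p_value, depleted = homozygote_depletion_test(0, 48)
print(f"chi2={chi2:.2f}, p={p_value:.2g}, suspected phenotype: {depleted}")
```

With no homozygous embryos among 48, as in the hypothetical call above, the statistic is 16 and the p-value falls well below the 0.05 cut-off, so the allele would be carried into the second round.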

Figure 4. Confirmation of causality through complementation crosses

Four examples are depicted: polymerase I polypeptide a (polr1a) (a-d), midasin homologue (yeast) (mdn1) (e-h), titin a (ttna) (i-k) and laminin subunit gamma-1 (lamc1) (l-n). In each, heterozygous carriers of two independent alleles in the same gene were used to generate compound heterozygote offspring. Where possible, incrosses of individual alleles are shown as well. In all images, non-phenotypic siblings are above and phenotypic homozygous mutant or compound heterozygous embryos below. a-c, At 48 hpf, embryos homozygous for sa1376 or sa2745, or compound heterozygous for sa1376 and sa2745, have small eyes, a hydrocephalic hindbrain and pericardiac oedema. d, sa2745 disrupts a splice donor site through a G>A transition at the first intronic nucleotide 3′ of coding nucleotide 985. Allele sa1376 is a C>T transition producing a premature STOP codon at amino acid (aa) 1487. e-g, At 96 hpf, embryos homozygous for sa1349 or sa6631, or compound heterozygous for sa1349 and sa6631, have smaller heads with malformed jaws and mild pericardiac oedema. h, sa1349 and sa6631 produce premature STOP codons at aa 4597 (T>A transversion) and aa 5333 (G>A transition), respectively. i, j, At 48 hpf, embryos homozygous for sa787 or compound heterozygous for sa787 and sa2492 are growth retarded, paralysed and have pericardiac oedema. k, Alleles sa787 and sa2492 produce premature STOP codons at aa 24946 (C>T transition) and aa 27471 (C>T transition), respectively. l, m, At 24 hpf, embryos homozygous for sa379 or compound heterozygous for sa379 and m466 are shorter, with an undifferentiated notochord and brain and eye malformations. n, Allele m466 is a G>A transition producing a premature STOP codon at aa 13. Allele sa379 disrupts a splice acceptor site through a G>A transition one nucleotide 5′ of coding nucleotide 975.
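The caption labels each base change as a transition or a transversion. A transition exchanges a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T); any other single-base substitution is a transversion. The short sketch below applies that rule to the changes listed in the caption; the function name and the "REF>ALT" string format are illustrative choices made here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(change):
    """Classify a single-base change written as 'REF>ALT', e.g. 'G>A'."""
    parts = change.upper().split(">")
    if len(parts) != 2:
        raise ValueError(f"expected 'REF>ALT', got {change!r}")
    ref, alt = (part.strip() for part in parts)
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a single-base substitution: {change!r}")
    # Both bases in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Base changes as given in the Figure 4 caption.
caption_changes = {
    "sa2745": "G>A", "sa1376": "C>T",
    "sa1349": "T>A", "sa6631": "G>A",
    "sa787": "C>T", "sa2492": "C>T",
    "m466": "G>A", "sa379": "G>A",
}

for allele, change in caption_changes.items():
    print(allele, change, classify_substitution(change))
```

Run on the caption's alleles, only sa1349 (T>A) is classified as a transversion, matching the caption's labels.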
