Accurate characterization of the IFITM locus using MiSeq and PacBio sequencing shows genetic variation in Galliformes

Irene Bassano et al. BMC Genomics. 2017.

Abstract

Background: Interferon-inducible transmembrane (IFITM) proteins are effectors of the immune system widely characterized for their role in restricting infection by diverse enveloped and non-enveloped viruses. The chicken IFITM (chIFITM) genes are clustered on chromosome 5, and to date four genes have been annotated, namely chIFITM1, chIFITM3, chIFITM5 and chIFITM10. However, due to poor assembly of this locus in the Gallus gallus v4 genome, accurate characterization has so far proven problematic. Recently, a new chicken reference genome assembly, Gallus gallus v5, was generated using Sanger, 454, Illumina and PacBio sequencing technologies, revealing considerable differences in the chIFITM locus compared with previous genome releases.

Methods: We re-sequenced the locus using both Illumina MiSeq and PacBio RS II sequencing technologies and mapped RNA-seq data from the European Nucleotide Archive (ENA) to this finalized chIFITM locus. Using SureSelect capture probes designed against the finalized chIFITM locus, we sequenced the locus of a different chicken breed, namely a White Leghorn, and of a turkey.

Results: We confirmed the Gallus gallus v5 consensus except for two insertions of 5 and 1 base pairs within the chIFITM3 and B4GALNT4 genes, respectively, and a single base pair deletion within the B4GALNT4 gene. The pull-down revealed a single amino acid substitution, A63V, in the CIL domain of IFITM2 compared to Red Junglefowl, and 13, 13 and 11 differences between chickens and turkeys in IFITM1, 2 and 3, respectively. RNA-seq shows chIFITM2 and chIFITM3 expression in numerous tissue types of different chicken breeds and avian cell lines, while expression of the putative chIFITM1 is limited to the testis, caecum and ileum tissues.

Conclusions: Locus resequencing using these capture probes and RNA-seq based expression analysis will allow the further characterization of genetic diversity within Galliformes.

Keywords: Chicken IFITM; Genetic characterization; Illumina MiSeq; PacBio RSII; RNA-seq.

Figures

Fig. 1

Locus comparison between the PacBio consensus sequence (contig 2) and a portion of chromosome 5 of the two versions of the chicken genome. a: The 203 kb BAC reference sequence contained in the PacBio contig 2 (middle) is compared with chromosome 5 of Gallus gallus v4 (top) or v5 (bottom) using ACT, the Artemis Comparison Tool. The annotation files for Gallus gallus v4 and PacBio contig 2 have been compressed to allow visualization of the whole BAC; for Gallus gallus v5 the annotation was drawn manually, only to show the location of the locus. b: The chIFITM locus (circled in a) is enlarged to show only the locus and its flanking genes (a 40 kb region extracted from the 203 kb total). Gaps in Gallus gallus v4 are visible as white bars (N nucleotides), while these are absent in the comparison with the more complete Gallus gallus v5. The graph does not show differences at the nucleotide level, but only an overall view of the locus. c: Dot plot comparison of the assembled PacBio contig 2 versus Gallus gallus v5, showing differences in the 40 kb region that are not visible when using ACT. The region enlarged in the right dot plot is a stretch within the intronic region of the chIFITM3 gene that differs from chicken genome assembly v5. d: Clustal Omega alignment of the PacBio contig 2 consensus sequence and the chicken genome v5 (portion of the IFITM3 gene corresponding to the gap seen in panel c). The gap is highlighted in yellow

Fig. 2

Artemis coverage and stack view of Illumina MiSeq reads mapped against the PacBio consensus sequence (contig 2). a Overall coverage and GC content of the Illumina MiSeq BAC reads (203 kb region) mapped against the PacBio contig 2. This reference was built using the annotation of Gallus gallus v4 as a scaffold. The chIFITM genes are located between positions 138150 and 177724 in the 203 kb region. b Stack view of the Illumina MiSeq reads showing the chIFITM locus

Fig. 3

a Artemis coverage and stack view of the IFITM locus in DF1 cells following pull-down of the IFITM locus using SureSelect probes and sequencing with PacBio. The figure shows an intact locus and successful mapping of the IFITM locus against the Gallus gallus sequence reference, despite two gaps observed within the B4GALNT4 and IFITM3 genes. b Artemis coverage and stack view of the IFITM locus in DF1 cells following pull-down of the IFITM locus using SureSelect probes and sequencing with PacBio. These reads were instead mapped against the new PacBio contig 2 sequence reference. As in the mapping above, two gaps (one partial) are observed within the B4GALNT4 and IFITM3 genes, although more reads cross the gaps, allowing full coverage. c Artemis coverage and stack view of the IFITM locus in turkey breast tissue following pull-down of the IFITM locus using SureSelect probes and sequencing with Illumina MiSeq. The graph shows successful mapping of MiSeq reads despite the use of chicken probes to pull down the locus in turkey tissue. The white bars represent actual gaps in the turkey reference as published on both Ensembl and NCBI; the probes will not map to these regions, as gaps are shown in the reference as “NNN”

Fig. 4

Clustal Omega alignment of the amino acid sequences of the IFITM proteins derived from the consensus sequences of DF1 and turkey samples following targeted SureSelect pull-down. The amino acid sequences are compared to the Gallus gallus v5 sequences. Domain structures are represented as follows: IM1 and IM2, intramembrane domains 1 and 2; CIL, conserved intracellular loop. These have yet to be defined for chIFITM5

Fig. 5

Read alignment views in Artemis showing RNA-seq data from the different studies. Top panel: the “coverage view”, showing a separate plot for each BAM file mapped to our PacBio contig 2 (40 kb region). The coverage shows only data relating to constitutive expression levels of chIFITMs in immune-relevant tissues and cell lines (lung, trachea, spleen, liver, DF1, CEF, HD11, DT40). Bottom panel: the “stack view” (paired reads: blue; single reads and/or reads with an unmapped pair: black; reads spanning the same region: green), showing read depth across each chIFITM transcript in more detail. All features were annotated manually by BLASTing the sequences from the latest version of the chicken genome. Cyan: CDS region; grey: mRNA; white: gene (overlapping with mRNA features)

Fig. 6

RNA-seq data alignment of reads from the immune-relevant tissues and cell lines in treated conditions: infection with IBDV, ALV-J, ILVV, LPS, H5N5/H5N1 or heat stress-induced conditions. The graph shows that in these conditions too, levels of chIFITM are lower compared to chIFITM2 and chIFITM3. Top panel: overall coverage. Bottom panel: stack view of each chIFITM transcript
