Early Mesozoic Coexistence of Amniotes and Hepadnaviridae

Figure 4

An evolutionary scenario for the emergence of the oncogenic X gene.

(A) The genome of the HBV ancestor contained neither an X nor an _X_-like ORF, given that avian and crocodilian eHBVs lack an _X_-like ORF, even though this ORF is thought to be expressed in some closely related avihepadnaviruses [20]. X (+2 frame) and _X_-like (+3 frame) are encoded in different reading frames relative to the part of pol they overlap with (+1 frame), which strongly suggests that these ORFs are non-homologous and that the _X_-like protein emerged in the ancestor of avihepadnaviruses. Independently, the X protein arose in orthohepadnaviruses via overprinting (4.) after a segmental duplication (1.), a partial deletion (2.), and a frameshifting mutation (3.) in one region of the HBV genome. Only the part of the HBV genome between the ribonuclease H (RNH) and the terminal protein (TP) domains of the pol ORF is shown, including structural elements such as direct repeats (DR; purple vertical lines) and the RNA encapsidation signal (ε; orange box).
(B) Translated sequence alignment of the _X_-like ORF sensu Chang et al. [20] indicates the presence of multiple internal stop codons in avian and crocodilian eHBVs, resulting in potential translation products <30 aa. Stop codon positions (asterisks) are highlighted with grey boxes if they are conserved between eHBVs; start codon positions for the longest possible ORF are highlighted by circles. Even when assuming that nonconventional start codons are used, as suggested for DHBV [20], potential eHBV X-like proteins would comprise just a portion of the DHBV X-like protein.
(C) Sequence similarity between the translated 5′ end region of preC/C (including in-frame aa sites upstream of the start codon) and the translated central region of the preC/C ORF might be a remnant of an ancient segmental duplication of the first two thirds of the preC/C ORF. Amino acid residues with dark grey background are conserved between the start and the middle part of the preC/C ORF and thus constitute a potentially duplicated amino acid motif.
(D) Schematic illustration of the proposed evolutionary steps of X ORF emergence [(1.) to (4.)] described in (A) that potentially led to the extant genome organization of orthohepadnaviruses. Black rectangles illustrate the location of the duplicated amino acid motif shown in (C).
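
The reading-frame argument in (A) and the stop-codon argument in (B) can be illustrated with a minimal sketch. It assumes the standard genetic code and uses a made-up 12-nt toy sequence, not any hepadnaviral sequence; the function names `translate` and `longest_open_stretch` are hypothetical helpers introduced only for this illustration. The same nucleotides read in the +1, +2, and +3 frames give unrelated peptides, and an internal stop codon caps the length of any possible translation product.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq: str, frame: int) -> str:
    """Translate seq in forward frame +1, +2, or +3; '*' marks a stop codon."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    start = frame - 1
    # Step through complete codons only; trailing 1-2 nucleotides are ignored.
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def longest_open_stretch(protein: str) -> int:
    """Length (in aa) of the longest stop-free stretch in a translated frame."""
    return max(len(part) for part in protein.split("*"))


# Toy sequence (an assumption for illustration only).
toy = "ATGGCCTAAGGC"
for frame in (1, 2, 3):
    peptide = translate(toy, frame)
    print(f"+{frame}: {peptide} (longest open stretch: {longest_open_stretch(peptide)} aa)")
```

Applied to a real alignment, a check like `longest_open_stretch` is one simple way to see why a frame interrupted by several stop codons can only yield short products, whichever start codon is assumed.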

doi: https://doi.org/10.1371/journal.pgen.1004559.g004