Genome-wide detection and characterization of positive selection in human populations

Endogenous retroviruses (ERVs) are the proviral phase of exogenous retroviruses that have become integrated into a host germ line, and they can play an important role in the host genome. Bioinformatic tools have been used to detect ERVs in several vertebrates, primarily primates and rodents. Less information is available on ERVs in other mammalian groups, and what is available comes mainly from experimental work. We analyzed the genome of the cow (Bos taurus) using three different methods. A BLAST-based method detected 928 possible ERVs, LTR_STRUC detected 4,487 elements flanked by long terminal repeats (LTRs), and Retrotector detected 9,698 ERVs. The ERVs were not homogeneously distributed across chromosomes: the number of ERVs was positively correlated with chromosomal size and negatively correlated with chromosomal GC content. The bovine ERVs (BoERVs) were classified into 24 putative families, 20 of which had not been described previously. One of these new families, BoERV1, was the most abundant and appeared to be specific to ruminants. An analysis of representative ERV families from rodents, primates, and ruminants showed that their phylogenetic relationships follow those of their hosts. This study demonstrates the importance of using multiple methods when trying to identify new ERVs and shows that the number of bovine ERV families is not as limited as previously thought.
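The chromosome-level correlations reported above can be illustrated with a minimal sketch. The abstract does not say which correlation statistic was used, so the Pearson coefficient here is an assumption, and the function name `pearson` and all `toy_*` values are hypothetical placeholders, not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-chromosome values, for illustration only (not study data).
toy_size_mb = [160.0, 120.0, 80.0, 50.0]   # chromosome size in Mb
toy_gc = [0.40, 0.41, 0.43, 0.45]          # chromosome GC fraction
toy_erv_count = [800, 610, 390, 260]       # ERVs detected per chromosome

r_size = pearson(toy_size_mb, toy_erv_count)  # positive for these toy values
r_gc = pearson(toy_gc, toy_erv_count)         # negative for these toy values
print(f"size vs. count: {r_size:.3f}; GC vs. count: {r_gc:.3f}")
```

With real data, each list would hold one entry per chromosome, taken from the genome assembly and from the combined output of the detection methods.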