Reference genome of the California glossy snake, Arizona elegans occidentalis: A declining California Species of Special Concern

J Hered. 2022 Nov 30;113(6):632-640.

doi: 10.1093/jhered/esac040.

Dustin A Wood, Jonathan Q Richmond, Merly Escalona, Mohan P A Marimuthu, Oanh Nguyen, Samuel Sacco, Eric Beraut, Michael Westphal, Robert N Fisher, Amy G Vandergast, Erin Toffelmier, Ian J Wang, H Bradley Shaffer

Dustin A Wood et al. J Hered. 2022.

Abstract

The glossy snake (Arizona elegans) is a polytypic species broadly distributed across southwestern North America. The species occupies habitats ranging from California's coastal chaparral to the shortgrass prairies of Texas and southeastern Nebraska, to the extensive arid scrublands of central México. Three subspecies are currently recognized in California, one of which is afforded state-level protection based on the extensive loss and modification of its preferred alluvial coastal scrub and inland desert habitat. We report the first genome assembly of A. elegans occidentalis as part of the California Conservation Genomics Project (CCGP). Consistent with the reference genome strategy of the CCGP, we used Pacific Biosciences HiFi long reads and Hi-C chromatin-proximity sequencing technologies to produce a de novo assembled genome. The assembly comprises a total of 140 scaffolds spanning 1,842,602,218 base pairs, has a contig NG50 of 61 Mb, a scaffold NG50 of 136 Mb, and a BUSCO complete score of 95.9%, and is one of the most complete snake genome assemblies. The A. e. occidentalis genome will be a key tool for understanding the genomic diversity and the basis of adaptations within this species and close relatives within the hyperdiverse snake family Colubridae.

Keywords: California Conservation Genomics Project (CCGP); Colubridae; Colubrinae; Species of Special Concern; alluvial soils; conservation genetics.

Published by Oxford University Press on behalf of The American Genetic Association 2022.

Conflict of interest statement

None declared.

Figures

Fig. 1.

A) Subspecies distributions of the glossy snake, Arizona elegans, within California. The gray star represents the geographic location where the voucher specimen was collected (Fairbanks Ranch, San Diego County). B) Photo of the reference genome specimen, HBS135684. C) A California glossy snake, A. e. occidentalis, from San Diego County, California. D) A Mojave glossy snake, A. e. candida, from Kern County, California. E) A Desert glossy snake, A. e. eburnata, from San Bernardino County, California. Representative examples of (F) mixed scrub alluvial habitat of coastal San Diego County, (G) saltbush scrub and grassland habitat of the San Joaquin Desert, and (H) arid scrub and grassland habitat of the Mojave Desert. Photo credits: Robert W. Hansen (E, H); Jeffrey A. Nordland (C); H. Bradley Shaffer (B, D); Dustin A. Wood (F, G).

Fig. 2.

Visual overview of genome assembly metrics. A) k-mer spectrum generated with GenomeScope2.0 from PacBio HiFi data after adapter removal. The bimodal pattern observed corresponds to a diploid genome. k-mers at lower coverage and lower frequency correspond to differences between haplotypes, whereas k-mers at higher coverage and higher frequency correspond to sequence shared between haplotypes. B) BlobToolkit Snail plot showing a graphical representation of the quality metrics presented in Table 2 for the A. e. occidentalis primary assembly (rAriEle1). The plot circle represents the full size of the assembly. From the inside out, the central plot covers length-related metrics. The red line represents the size of the longest scaffold; all other scaffolds are arranged in size order moving clockwise around the plot and drawn in gray starting from the outside of the central plot. Dark and light orange arcs show the scaffold N50 and scaffold N90 values, respectively. The central light gray spiral shows the cumulative scaffold count, with a white line at each order of magnitude. White regions in this area reflect the proportion of Ns in the assembly. The dark vs. light blue area around it shows mean, maximum, and minimum GC vs. AT content at 0.1% intervals (Challis et al. 2020). C, D) Hi-C contact maps for the primary (2C) and alternate (2D) genome assemblies generated with PretextSnapshot. Hi-C contact maps translate proximity of genomic regions in 3D space to contiguous linear organization. Each line in the contact map corresponds to sequencing data supporting the linkage (or join) between two such regions. Scaffolds are separated by black lines, and a higher density of lines corresponds to higher levels of fragmentation.
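
The scaffold N50 and N90 values shown in the snail plot, and the NG50 values reported in the abstract, are all length-weighted summary statistics. The following is a minimal sketch of how such values are conventionally computed, not the pipeline used for this assembly. The function name `nx` and the toy scaffold lengths are hypothetical and chosen only for illustration; they are not taken from the A. e. occidentalis assembly.

```python
from typing import Optional, Sequence


def nx(lengths: Sequence[int], fraction: float = 0.5,
       total: Optional[int] = None) -> int:
    """Return the Nx (or NGx) length for a set of sequence lengths.

    Nx is the length L of the sequence at which the running sum of
    lengths, taken from longest to shortest, first reaches `fraction`
    of the total. If `total` is None the assembly size is used (Nx);
    if an estimated genome size is passed instead, the result is NGx.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    target = fraction * (total if total is not None else sum(lengths))
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    # Only reachable for NGx when the assembly is shorter than the target.
    return 0


# Hypothetical toy scaffold lengths (arbitrary units), for illustration only.
toy_lengths = [50, 40, 30, 20, 10]

n50 = nx(toy_lengths, 0.5)               # half of the assembly size
n90 = nx(toy_lengths, 0.9)               # 90% of the assembly size
ng50 = nx(toy_lengths, 0.5, total=200)   # half of an assumed genome size
print(n50, n90, ng50)
```

Because NG50 is measured against an estimated genome size rather than the assembly's own length, it allows assemblies of different total sizes to be compared on a common scale.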

