Genomic features defining exonic variants that modulate splicing
Adam Woolfe et al. Genome Biol. 2010.
Abstract
Background: Single point mutations at both synonymous and non-synonymous positions within exons can have severe effects on gene function through disruption of splicing. Predicting these mutations in silico purely from the genomic sequence is difficult due to an incomplete understanding of the multiple factors that may be responsible. In addition, little is known about which computational prediction approaches, such as those involving exonic splicing enhancers and exonic splicing silencers, are most informative.
Results: We assessed the features of single-nucleotide genomic variants verified to cause exon skipping and compared them to a large set of coding SNPs common in the human population, which are likely to have no effect on splicing. Our findings implicate a number of features important for their ability to discriminate splice-affecting variants, including the naturally occurring density of exonic splicing enhancers and exonic splicing silencers of the exon and intronic environment, extensive changes in the number of predicted exonic splicing enhancers and exonic splicing silencers, proximity to the splice junctions and evolutionary constraint of the region surrounding the variant. By extending this approach to additional datasets, we also identified relevant features of variants that cause increased exon inclusion and ectopic splice site activation.
Conclusions: We identified a number of features that are significantly over-represented among exonic variants that modulate splicing. These analyses highlight putative mechanisms responsible for splicing outcome and emphasize the role of features important for exon definition. We developed a web tool, Skippy, to score coding variants for these relevant splice-modulating features.
Figures
Figure 1
Proportion of variants with gains or losses in exonic splicing regulatory sequence with significant differences between splice-affecting genome variants (SAVs) and HapMap SNPs (hSNPs). SAVs were characterized by (a) the loss of ESEs and (b) the gain of ESSs. As a comparison, ESEfinder, Ast-ESR and PESE losses are also included. These were not significantly different between SAVs and hSNPs. Z-score P-values from random bootstrap sampling for each type of change are shown to the right of the histogram.
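The caption mentions Z-score P-values obtained by random bootstrap sampling. One way such a comparison could be set up is sketched below; it is an illustration under stated assumptions, not the authors' procedure. The function name `bootstrap_z`, the number of resamples, the one-sided normal approximation and the toy data are all hypothetical.

```python
import random
import statistics
from math import erfc, sqrt


def bootstrap_z(sav_flags, hsnp_flags, n_samples=1000, seed=0):
    """Z score of the SAV proportion against resampled hSNP proportions.

    sav_flags / hsnp_flags: lists of booleans, True when a variant shows
    the change of interest (for example, loss of an ESE).
    Assumption: hSNP sets are resampled with replacement at the SAV set size.
    """
    rng = random.Random(seed)
    k = len(sav_flags)
    observed = sum(sav_flags) / k

    sampled = []
    for _ in range(n_samples):
        draw = rng.choices(hsnp_flags, k=k)  # sample with replacement
        sampled.append(sum(draw) / k)

    mean = statistics.fmean(sampled)
    sd = statistics.stdev(sampled)
    if sd == 0:
        raise ValueError("resampled proportions do not vary")

    z = (observed - mean) / sd
    p_upper = 0.5 * erfc(z / sqrt(2))  # upper-tail normal probability
    return z, p_upper


# Toy data: 80% of SAVs versus 20% of hSNPs carry the change.
savs = [True] * 40 + [False] * 10
hsnps = [True] * 200 + [False] * 800
z, p = bootstrap_z(savs, hsnps)
```

With a fixed seed the result is reproducible, which makes it easier to compare several feature types side by side.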
Figure 2
Splice-affecting genome variants are characterized by losses of large numbers of NI-ESEs and the gain of large numbers of NI-ESSs, often in combination. For both ESE losses and ESS gains, the proportion of SAVs with changes of two or more were significantly greater compared to hSNPs. Combinations of ESE losses and ESS gains, as opposed to each occurring independently, are highly enriched in SAVs compared to hSNPs (bottom graph).
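Counting how many ESE hexamers a substitution removes and how many ESS hexamers it creates can be sketched as follows. This is a minimal illustration, assuming that motifs are hexamers, that each hexamer overlapping the variant is compared position by position between wild-type and variant sequence, and that the motif sets are supplied by the caller. The function names and the two toy motif sets are hypothetical and are not the NI-ESE or NI-ESS sets.

```python
def hexamers_covering(seq, pos, k=6):
    """Return the k-mers of seq that overlap the 0-based position pos."""
    start = max(0, pos - k + 1)
    end = min(pos, len(seq) - k)
    return [seq[i:i + k] for i in range(start, end + 1)]


def count_esr_changes(seq, pos, alt, ese_set, ess_set):
    """Count ESE hexamers lost and ESS hexamers gained by one substitution."""
    variant = seq[:pos] + alt + seq[pos + 1:]
    wild_type = hexamers_covering(seq, pos)
    mutated = hexamers_covering(variant, pos)

    pairs = list(zip(wild_type, mutated))
    ese_lost = sum(w in ese_set and m not in ese_set for w, m in pairs)
    ess_gained = sum(w not in ess_set and m in ess_set for w, m in pairs)
    return ese_lost, ess_gained


# Toy motif sets, for illustration only.
toy_ese = {"GAAGAA"}
toy_ess = {"TGAATA"}
print(count_esr_changes("TTGAAGAATT", 5, "T", toy_ese, toy_ess))  # (1, 1)
```

A variant with both counts above zero would fall into the combined ESE-loss and ESS-gain class described in the caption.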
Figure 3
Distribution of specific types of NI-ESR changes for SAVs and hSNPs compared to neutral expectation. The tilde symbol (~) signifies an alteration where the hexamer is designated an ESE, neutral or ESS in both the wild-type and variant sequences. The arrow represents the direction of the change as a consequence of the change between wild type and variant hexamer. The neutral expected distribution reflects the underlying probability of each type of change given the ESE/ESS distribution among NI hexamers and the genome-wide nucleotide substitution bias in coding regions.
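The change categories in this caption can be tallied with a small helper. The sketch below assumes hexamer motifs and caller-supplied motif sets; it writes the arrow as `->` and keeps the tilde for changes that leave the class unchanged. The names `esr_label` and `change_types` and the toy sets are hypothetical.

```python
from collections import Counter


def esr_label(hexamer, ese_set, ess_set):
    """Classify a hexamer as ESE, ESS or neutral."""
    if hexamer in ese_set:
        return "ESE"
    if hexamer in ess_set:
        return "ESS"
    return "neutral"


def change_types(wt_seq, var_seq, ese_set, ess_set, k=6):
    """Tally class transitions for aligned hexamers that differ."""
    if len(wt_seq) != len(var_seq):
        raise ValueError("sequences must have equal length")
    tally = Counter()
    for i in range(len(wt_seq) - k + 1):
        w, v = wt_seq[i:i + k], var_seq[i:i + k]
        if w == v:
            continue  # hexamer not touched by the variant
        before = esr_label(w, ese_set, ess_set)
        after = esr_label(v, ese_set, ess_set)
        sep = "~" if before == after else "->"
        tally[f"{before}{sep}{after}"] += 1
    return tally


# Toy motif sets, for illustration only.
tally = change_types("TTGAAGAATT", "TTGAATAATT", {"GAAGAA"}, {"TGAATA"})
print(dict(tally))
```

Comparing such tallies with a neutral expectation would additionally require the substitution bias and motif frequencies mentioned in the caption, which are not modelled here.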
Figure 4
SAVs are enriched at the borders of exons. SAV and hSNP containing exons were divided into six equal sections and the proportion of variants falling into each section was plotted. While hSNPs were roughly distributed equally across the exon (with some depletion towards the edges), SAVs are significantly enriched at both edges of the exon (P = 0.005).
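Dividing each exon into six equal sections and tabulating where variants fall can be sketched as below. The sketch assumes 0-based offsets measured from the first base of the exon; the function names are hypothetical.

```python
def exon_section(offset, exon_length, n_sections=6):
    """Map a 0-based offset within an exon to a section index."""
    if not 0 <= offset < exon_length:
        raise ValueError("offset lies outside the exon")
    return offset * n_sections // exon_length


def section_proportions(variants, n_sections=6):
    """variants: iterable of (offset, exon_length) pairs."""
    counts = [0] * n_sections
    for offset, length in variants:
        counts[exon_section(offset, length, n_sections)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no variants supplied")
    return [c / total for c in counts]


# Four toy variants, all close to an exon edge.
print(section_proportions([(0, 120), (119, 120), (5, 60), (59, 60)]))
# [0.5, 0.0, 0.0, 0.0, 0.0, 0.5]
```

Scaling by exon length lets exons of different sizes contribute to the same six bins.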
Figure 5
Regions surrounding SAVs are under greater non-coding evolutionary constraint. (a) We created a 192-codon position-specific scoring matrix based on genome-wide conservation levels across mammals. Matrix scores are visualized increasing from green to red. As scores are inversely proportional to the genome-wide conservation of each codon position, conservation levels can also be visualized using the same matrix, decreasing from green to red. (b) For each variant, four-way mammalian multiple DNA alignments were extracted for a region surrounding the variant, and a score assigned to each fully conserved column via the scoring matrix, and the total normalized by the length of the alignment. An example of a random synonymous C>G variant is shown. (c) The mean conservation score for all SAVs (blue arrow) and SAVs on autosomes (yellow arrow) was compared to a distribution of randomly sampled sets of scores from all hSNPs (orange distribution). Randomly sampled distributions of hSNPs were also created controlling for minimum distance from a splice junction by having similar distributions in this regard as SAVs (blue distribution). A distribution of mean conservation scores was also produced for hSNPs from autosomes also controlled by minimum distance from the splice site (yellow distribution).
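The scoring step in panel (b) can be sketched as follows: each fully conserved alignment column contributes the matrix score for its codon and codon position, and the sum is divided by the alignment length. This is a rough illustration only. It assumes an ungapped reference sequence in the first row, a known reading frame, and a matrix stored as a dictionary keyed by (codon, position in codon); the function name and the matrix values are hypothetical.

```python
def conservation_score(alignment, frame_offset, matrix):
    """Length-normalized sum of scores over fully conserved columns.

    alignment: equal-length aligned sequences, reference first.
    frame_offset: first column that is codon position 0 in the reference.
    matrix: dict mapping (codon, codon_position) -> score.
    """
    ref = alignment[0]
    length = len(ref)
    total = 0.0
    for col in range(frame_offset, length):
        codon_pos = (col - frame_offset) % 3
        codon_start = col - codon_pos
        codon = ref[codon_start:codon_start + 3]
        if len(codon) < 3:
            continue  # incomplete codon at the end of the region
        column = {seq[col] for seq in alignment}
        if len(column) == 1 and "-" not in column:
            total += matrix.get((codon, codon_pos), 0.0)
    return total / length


# Toy four-way alignment and made-up matrix values.
toy_alignment = ["ATGGCC", "ATGGCA", "ATGGCC", "ATGGCC"]
toy_matrix = {
    ("ATG", 0): 1.0, ("ATG", 1): 1.0, ("ATG", 2): 2.0,
    ("GCC", 0): 1.0, ("GCC", 1): 1.0, ("GCC", 2): 3.0,
}
print(conservation_score(toy_alignment, 0, toy_matrix))  # 1.0
```

In the toy alignment the last column differs in one species, so its score of 3.0 is not added.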
Figure 6
Exons containing SAVs have significantly lower ESE and significantly higher ESS densities than exons containing hSNPs. As an illustration, the proportion of overlapping hexamers that are considered ESEs (green), ESSs (red) or splice neutral (grey) was plotted for 35 exons containing SAVs (that cause ESE/ESS changes) and a set of 35 randomly selected, length-matched hSNP-containing exons. Exons in both sets are sorted in descending order by ESS density.
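The density measure described here, the proportion of overlapping hexamers in an exon that are ESEs, ESSs or splice neutral, can be sketched as below. The function name and the toy motif sets are hypothetical; real motif sets would be supplied by the caller.

```python
def esr_densities(exon_seq, ese_set, ess_set, k=6):
    """Proportion of overlapping k-mers classed as ESE, ESS or neutral."""
    n = len(exon_seq) - k + 1
    if n <= 0:
        raise ValueError("exon shorter than the motif length")
    ese = ess = 0
    for i in range(n):
        hexamer = exon_seq[i:i + k]
        if hexamer in ese_set:
            ese += 1
        elif hexamer in ess_set:
            ess += 1
    return {"ESE": ese / n, "ESS": ess / n, "neutral": (n - ese - ess) / n}


# Toy motif sets, for illustration only.
print(esr_densities("TTGAAGAATT", {"GAAGAA"}, {"AGAATT"}))
# {'ESE': 0.2, 'ESS': 0.2, 'neutral': 0.6}
```

Sorting a list of exons by the "ESS" value of this result would reproduce the ordering used in the figure.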
Figure 7
Features that characterize variants that activate de novo ectopic splice sites ('ectopic SAVs'). (a) Most ectopic SAVs, in contrast to hSNPs and skipping SAVs, have a large Δ_SS_ value and create an ectopic splice site that is stronger than the natural splice site. (b) Hexamers in the vicinity of the splice junctions are largely made up of ESSs. The graph represents the proportion of positions occupied either by an ESE or ESS motif across approximately 25,000 internal exons. Each position on the graph represents the first base of a hexamer sliding across 100 bp of the upstream and downstream introns and the first and last 50 bp of the exon. (c) Ectopic SAVs are located predominantly in the vicinity of the splice site of the same type created, that is, the majority of ectopic splice sites created are 5' ectopic sites and are located towards the end of the exon close to the 5' splice site. hSNPs that create a strong ectopic splice site computationally ('ectopic-like' hSNPs) are distributed across the exon in quite the opposite way, indicating the same constraints do not apply to these variants.
Similar articles
- Vulnerable exons, like ACADM exon 5, are highly dependent on maintaining a correct balance between splicing enhancers and silencers.
Holm LL, Doktor TK, Hansen MB, Petersen USS, Andresen BS. Hum Mutat. 2022 Feb;43(2):253-265. doi: 10.1002/humu.24321. Epub 2021 Dec 30. PMID: 34923709
- Computational analysis of splicing errors and mutations in human transcripts.
Kurmangaliyev YZ, Gelfand MS. BMC Genomics. 2008 Jan 14;9:13. doi: 10.1186/1471-2164-9-13. PMID: 18194514 Free PMC article.
- MutPred Splice: machine learning-based prediction of exonic variants that disrupt splicing.
Mort M, Sterne-Weiler T, Li B, Ball EV, Cooper DN, Radivojac P, Sanford JR, Mooney SD. Genome Biol. 2014 Jan 13;15(1):R19. doi: 10.1186/gb-2014-15-1-r19. PMID: 24451234 Free PMC article.
- Rules and tools to predict the splicing effects of exonic and intronic mutations.
Ohno K, Takeda JI, Masuda A. Wiley Interdiscip Rev RNA. 2018 Jan;9(1). doi: 10.1002/wrna.1451. Epub 2017 Sep 26. PMID: 28949076 Review.
- Searching for splicing motifs.
Chasin LA. Adv Exp Med Biol. 2007;623:85-106. doi: 10.1007/978-0-387-77374-2_6. PMID: 18380342 Review.
Cited by
- Variant Impact Predictor database (VIPdb), version 2: trends from three decades of genetic variant impact predictors.
Lin YJ, Menon AS, Hu Z, Brenner SE. Hum Genomics. 2024 Aug 28;18(1):90. doi: 10.1186/s40246-024-00663-z. PMID: 39198917 Free PMC article.
- Variant Impact Predictor database (VIPdb), version 2: Trends from 25 years of genetic variant impact predictors.
Lin YJ, Menon AS, Hu Z, Brenner SE. bioRxiv [Preprint]. 2024 Jun 28:2024.06.25.600283. doi: 10.1101/2024.06.25.600283. PMID: 38979289 Free PMC article. Updated. Preprint.
- SPiP: Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing.
Leman R, Parfait B, Vidaud D, Girodon E, Pacot L, Le Gac G, Ka C, Ferec C, Fichou Y, Quesnelle C, Aucouturier C, Muller E, Vaur D, Castera L, Boulouard F, Ricou A, Tubeuf H, Soukarieh O, Gaildrat P, Riant F, Guillaud-Bataille M, Caputo SM, Caux-Moncoutier V, Boutry-Kryza N, Bonnet-Dorion F, Schultz I, Rossing M, Quenez O, Goldenberg L, Harter V, Parsons MT, Spurdle AB, Frébourg T, Martins A, Houdayer C, Krieger S. Hum Mutat. 2022 Dec;43(12):2308-2323. doi: 10.1002/humu.24491. Epub 2022 Nov 20. PMID: 36273432 Free PMC article.
- Challenges Related to the Use of Next-Generation Sequencing for the Optimization of Drug Therapy.
Zhou Y, Lauschke VM. Handb Exp Pharmacol. 2023;280:237-260. doi: 10.1007/164_2022_596. PMID: 35792943
- Splicing mutations in the CFTR gene as therapeutic targets.
Deletang K, Taulan-Cadars M. Gene Ther. 2022 Aug;29(7-8):399-406. doi: 10.1038/s41434-022-00347-0. Epub 2022 Jun 2. PMID: 35650428 Free PMC article. Review.