Beata Fonferko-Shadrach - Academia.edu
Papers by Beata Fonferko-Shadrach
Epilepsia
Objective: This study was undertaken to develop a novel pathway linking genetic data with routinely collected data for people with epilepsy, and to analyze the influence of rare, deleterious genetic variants on epilepsy outcomes. Methods: We linked whole‐exome sequencing (WES) data with routinely collected primary and secondary care data and natural language processing (NLP)‐derived seizure frequency information for people with epilepsy within the Secure Anonymised Information Linkage Databank. The study participants were adults who had consented to participate in the Swansea Neurology Biobank, Wales, between 2016 and 2018. DNA sequencing was carried out as part of the Epi25 collaboration. For each individual, we calculated the total number and cumulative burden of rare and predicted deleterious genetic variants and the total of rare and deleterious variants in epilepsy and drug metabolism genes. We compared these measures with the following outcomes: (1) no unscheduled hospital admissio...
International Journal for Population Data Science, Aug 28, 2018
Frontiers in Surgery, Aug 24, 2022
Seizure: European Journal of Epilepsy, Nov 1, 2017
International Journal for Population Data Science, Dec 7, 2020
Introduction: The risk of cardiovascular events amongst people with epilepsy who are receiving enzyme-inducing anti-epileptic drugs (EIAEDs) seems to be higher than in those on other medications and in the general population. National-level record linkage enables development of case-control studies at a wider scope, accounting for multiple factors. Objectives and Approach: People with epilepsy were identified between 2003-01-01 and 2017-12-31 and were matched to a control group on age, gender, deprivation quintile and year of diagnosis, accounting for any changes in clinical therapeutic guidelines. Primary and secondary care population records were linked to capture relevant comorbidities and major cardiovascular events. Annual district birth and death extracts were used in combination with the Welsh Demographic Service (WDS) dataset to capture demographic and cardiovascular-related death records. The WDS dataset was used to identify eligible control groups for each case, and a linkage approach between the control and case databases was developed for matching cases and controls with replacement and randomization. Survival analysis was conducted to evaluate the difference in time to first major cardiovascular event in patients receiving EIAEDs versus non-EIAEDs and controls. Results: 10,241 cases (mean age 49.6 years, 52.2% male) with a diagnosis of epilepsy were matched to 35,145 controls. 3,180 (31.1%) cases received EIAEDs and 7,061 (68.9%) received non-EIAEDs. The risk of experiencing a major cardiovascular event was higher in cases compared to controls (adjusted hazard ratio 1.52, 95% CI [1.50–1.55]; p<0.001). There was no significant difference in cardiovascular events between those treated with non-EIAEDs and EIAEDs (adjusted hazard ratio 1.04, 95% CI [0.95–1.12]; p=0.407).
Conclusion / Implications: Data linkage provides a unique opportunity and insight into studying disease risk factors. We have shown that individuals with epilepsy prescribed antiepileptic drugs are at an increased risk of major cardiovascular events regardless of treatment type (EIAED, non-EIAED) compared with a matched control population.
Journal of Neurology, Neurosurgery, and Psychiatry, May 27, 2022
Objective: To characterise the Welsh idiopathic intracranial hypertension (IIH) population, epidemiological trends and healthcare outcomes using routinely collected healthcare data. Methods: We used primary and secondary care healthcare diagnostic codes within the Secure Anonymised Information Linkage databank to ascertain IIH cases and controls in a retrospective cohort study between 2003 and 2017. We validated IIH diagnosis codes using anonymised secondary care lists of IIH cases. Results: We analysed 35 million patient-years of data (2003–2017). There were 1765 cases of IIH in 2017 (85% female). The prevalence and incidence of IIH in 2017 were 76/100,000 and 7.8/100,000, significantly increased from 2003 (prevalence=12/100,000, incidence=2.3/100,000). IIH prevalence is associated with socio-economic deprivation and increasing body mass index (BMI). 9% of people with IIH had CSF shunts, with less than 0.2% having bariatric surgery. Unscheduled hospital admissions were significantly higher in the IIH cohort compared to controls, and also in IIH patients with CSF shunts compared to those without. Conclusions: IIH incidence and prevalence are increasing significantly, corresponding to population increases in BMI. This has important implications for healthcare professionals and policy makers given the comorbidities, complications and increased healthcare utilisation and economic burden associated with IIH.
Nature Genetics
Epilepsy is a highly heritable disorder affecting over 50 million people worldwide, of which about one-third are resistant to current treatments. Here we report a multi-ancestry genome-wide association study including 29,944 cases, stratified into three broad categories and seven subtypes of epilepsy, and 52,538 controls. We identify 26 genome-wide significant loci, 19 of which are specific to genetic generalized epilepsy (GGE). We implicate 29 likely causal genes underlying these 26 loci. SNP-based heritability analyses show that common variants explain between 39.6% and 90% of genetic risk for GGE and its subtypes. Subtype analysis revealed markedly different genetic architectures between focal and generalized epilepsies. Gene-set analyses of GGE signals implicate synaptic processes in both excitatory and inhibitory neurons in the brain. Prioritized candidate genes overlap with monogenic epilepsy genes and with targets of current antiseizure medications. Finally, we leverage our r...
Nature Communications
Copy number variants (CNVs) are established risk factors for neurodevelopmental disorders with seizures or epilepsy. With the hypothesis that seizure disorders share genetic risk factors, we pooled CNV data from 10,590 individuals with seizure disorders, 16,109 individuals with clinically validated epilepsy, and 492,324 population controls and identified 25 genome-wide significant loci, 22 of which are novel for seizure disorders, such as deletions at 1p36.33, 1q44, 2p21-p16.3, 3q29, 8p23.3-p23.2, 9p24.3, 10q26.3, 15q11.2, 15q12-q13.1, 16p12.2, 17q21.31, duplications at 2q13, 9q34.3, 16p13.3, 17q12, 19p13.3, 20q13.33, and reciprocal CNVs at 16p11.2 and 22q11.21. Using genetic data from an additional 248,751 individuals with 23 neuropsychiatric phenotypes, we explored the pleiotropy of these 25 loci. Finally, in a subset of individuals with epilepsy and detailed clinical data available, we performed phenome-wide association analyses between individual CNVs and clinical annotations categ...
British Journal of Surgery
Journal of Neurology, Neurosurgery & Psychiatry
Background: Public Health England have recently reported that deaths associated with epilepsy are increasing and are associated with increased deprivation. We investigated comparable Welsh mortality trends and associations between epilepsy mortality and deprivation. Method: We used routinely-collected health data within the Secure Anonymised Information Linkage (SAIL) Databank. We recorded deaths associated with epilepsy (DAE), epilepsy recorded on death certificates, and deaths in people with epilepsy (DPWE), people with diagnoses of epilepsy and epilepsy prescriptions before death. We compared death rates in different deprivation deciles adjusting for epilepsy prevalence. Results: During 2005–2017 (41 million patient-years) there were 2116 DAE and 7821 DPWE. DAE and DPWE increased from 4.3/100,000/yr and 17.2/100,000/yr in 2005–2007 to 5.7/100,000/yr and 20.9/100,000/yr in 2015–2017. The age-standardised mortality rates (ASMR) in 2006–2008 for DAE and DPWE were 5.3/100,000/yr and 20/100,00...
Supplementary information for the main study 'Incidence, Prevalence and Healthcare Outcomes in Idiopathic Intracranial Hypertension: A population study'
International Journal of Population Data Science, 2020
Introduction: The risk of cardiovascular events amongst people with epilepsy who are receiving enzyme-inducing anti-epileptic drugs (EIAEDs) seems to be higher than in those on other medications and in the general population. National-level record linkage enables development of case-control studies at a wider scope, accounting for multiple factors. Objectives and Approach: People with epilepsy were identified between 2003-01-01 and 2017-12-31 and were matched to a control group on age, gender, deprivation quintile and year of diagnosis, accounting for any changes in clinical therapeutic guidelines. Primary and secondary care population records were linked to capture relevant comorbidities and major cardiovascular events. Annual district birth and death extracts were used in combination with the Welsh Demographic Service (WDS) dataset to capture demographic and cardiovascular-related death records. The WDS dataset was used to identify eligible control groups for each case and a linkage approach...
Neurology, 2021
Objective: To characterize trends in incidence, prevalence, and health care outcomes in the idiopathic intracranial hypertension (IIH) population in Wales using routinely collected health care data. Methods: We used and validated primary and secondary care IIH diagnosis codes within the Secure Anonymised Information Linkage databank to ascertain IIH cases and controls in a retrospective cohort study between 2003 and 2017. We recorded body mass index (BMI), deprivation quintile, CSF diversion surgery, and unscheduled hospital admissions in case and control cohorts. Results: We analyzed 35 million patient-years of data. There were 1,765 cases of IIH in 2017 (85% female). The prevalence and incidence of IIH in 2017 were 76/100,000 and 7.8/100,000/y, a significant increase from 2003 (corresponding figures = 12/100,000 and 2.3/100,000/y) (p < 0.001). IIH prevalence is associated with increasing BMI and increasing deprivation. The odds ratio for developing IIH in the least deprived quintil...
The American Journal of Human Genetics, 2021
Both mild and severe epilepsies are influenced by variants in the same genes, yet an explanation for the resulting phenotypic variation is unknown. As part of the ongoing Epi25 Collaboration, we performed a whole-exome sequencing analysis of 13,487 epilepsy-affected individuals and 15,678 control individuals. While prior Epi25 studies focused on gene-based collapsing analyses, we asked how the pattern of variation within genes differs by epilepsy type. Specifically, we compared the genetic architectures of severe developmental and epileptic encephalopathies (DEEs) and two generally less severe epilepsies, genetic generalized epilepsy and non-acquired focal epilepsy (NAFE). Our gene-based rare variant collapsing analysis used geographic ancestry-based clustering that included broader ancestries than previously possible and revealed novel associations. Using the missense intolerance ratio (MTR), we found that variants in DEE-affected individuals are in significantly more intolerant genic sub-regions than those in NAFE-affected individuals. Only previously reported pathogenic variants absent in available genomic datasets showed a significant burden in epilepsy-affected individuals compared with control individuals, and the ultra-rare pathogenic variants associated with DEE were located in more intolerant genic sub-regions than variants associated with non-DEE epilepsies. MTR filtering improved the yield of ultra-rare pathogenic variants in affected individuals compared with control individuals. Finally, analysis of variants in genes without a disease association revealed a significant burden of loss-of-function variants in the genes most intolerant to such variation, indicating additional epilepsy-risk genes yet to be discovered.
Taken together, our study suggests that genic and sub-genic intolerance are critical characteristics for interpreting the effects of variation in genes that influence epilepsy.
International Journal for Population Data Science, 2017
Objectives: Electronic healthcare records (EHR) are the main data sources that facilitate epidemiology research. Routinely collected data such as primary and secondary care are now easily linked to produce novel and high impact research. There are, however, rich data locked in the free text of clinical letters that are not otherwise translated into EHRs. It is highly desirable to be able to extract this information to strengthen the body of information in existing EHRs. The Swansea Collaborative in Analysis of NLP Research (SCANR) group at Swansea University has been established to evaluate the usage of Natural Language Processing platforms for obtaining new clinical data. To use Clix Enrich to extract SNOMED concepts from a variety of clinical free texts and produce EHRs from the extraction process. Approach: SNOMED concepts contain common items of interest such as diagnosis, medication and symptoms, as well as contextual concepts such as historical reference and negation. Clix Enrich...
Journal of Neurology, Neurosurgery & Psychiatry
Objective: To characterise the Welsh idiopathic intracranial hypertension (IIH) population, epidemiological trends and healthcare outcomes using routinely collected healthcare data. Methods: We used primary and secondary care healthcare diagnostic codes within the Secure Anonymised Information Linkage databank to ascertain IIH cases and controls in a retrospective cohort study between 2003 and 2017. We validated IIH diagnosis codes using anonymised secondary care lists of IIH cases. Results: We analysed 35 million patient-years of data (2003–2017). There were 1765 cases of IIH in 2017 (85% female). The prevalence and incidence of IIH in 2017 were 76/100,000 and 7.8/100,000, significantly increased from 2003 (prevalence=12/100,000, incidence=2.3/100,000). IIH prevalence is associated with socio-economic deprivation and increasing body mass index (BMI). 9% of people with IIH had CSF shunts, with less than 0.2% having bariatric surgery. Unscheduled hospital admissions were significantly higher in ...
International Journal for Population Data Science, 2017
Background: Free text documents in healthcare settings contain a wealth of information not captured in electronic healthcare records (EHRs). Epilepsy clinic letters are an example of an unstructured data source containing a large amount of intricate disease information. Extracting meaningful and contextually correct clinical information from free text sources, to enhance EHRs, remains a significant challenge. SCANR (Swansea University Collaborative in the Analysis of NLP Research) was set up to use natural language processing (NLP) technology to extract structured data from unstructured sources. IBM Watson Content Analytics software (ICA) uses NLP technology. It enables users to define annotations based on dictionaries and language characteristics to create parsing rules that highlight relevant items. These include clinical details such as symptoms and diagnoses, medication and test results, as well as personal identifiers. Approach: To use ICA to build a pipeline to accurately extract ...
Frontiers in Digital Health, 2021
Across various domains, such as health and social care, law, news, and social media, there are increasing quantities of unstructured texts being produced. These potential data sources often contain rich information that could be used for domain-specific and research purposes. However, the unstructured nature of free-text data poses a significant challenge for its utilisation due to the necessity of substantial manual intervention from domain experts to label embedded information. Annotation tools can assist with this process by providing functionality that enables the accurate capture and transformation of unstructured texts into structured annotations, which can be used individually or as part of larger Natural Language Processing (NLP) pipelines. We present Markup (https://www.getmarkup.com/), an open-source, web-based annotation tool that is undergoing continued development for use across all domains. Markup incorporates NLP and Active Learning (AL) technologies to enable rapid a...
International Journal for Population Data Science, 2020
Introduction: Idiopathic Intracranial Hypertension (IIH) is a condition of unknown aetiology that is strongly associated with obesity. IIH predominantly affects women of childbearing age and causes chronic disabling headaches, visual disturbance and, in a minority of patients, permanent visual loss. Objectives and Approach: We characterised the IIH population, epidemiological trends and healthcare outcomes in Wales using routinely collected healthcare data. We used primary and secondary care healthcare diagnosis codes within the Secure Anonymised Information Linkage databank to ascertain IIH cases and controls in a retrospective cohort study between 2003 and 2017. We validated IIH diagnosis codes using anonymised secondary care lists of IIH cases. Results: We analysed 35 million patient-years of data (2003–2017). There were 1765 cases of IIH in 2017 (85% female). The prevalence and incidence of IIH in 2017 were 76/100,000 and 7.8/100,000/year, a significant increase from 2003 (correspondin...
International Journal for Population Data Science, 2020
Introduction: Unstructured free-text clinical notes often contain valuable information relating to patient symptoms, prescriptions and diagnoses. These can assist with better care for patients and novel healthcare research if transformed into accessible, structured clinical text. In particular, Natural Language Processing (NLP) algorithms can produce such structured outputs, but require gold standard data to train and validate their accuracy. While existing tools such as Brat and Webanno provide interfaces to manually annotate text, there is a lack of capability to efficiently annotate complex clinical information. Objectives and Approach: We present Markup, an open-source, web-based annotation tool developed for use within clinical contexts by domain experts to produce gold standard annotations for NLP development. Markup incorporates NLP and Active Learning technologies to enable rapid and accurate annotation of unstructured documents. Markup supports custom user configurations, autom...
Epilepsia
ObjectiveThis study was undertaken to develop a novel pathway linking genetic data with routinely... more ObjectiveThis study was undertaken to develop a novel pathway linking genetic data with routinely collected data for people with epilepsy, and to analyze the influence of rare, deleterious genetic variants on epilepsy outcomes.MethodsWe linked whole‐exome sequencing (WES) data with routinely collected primary and secondary care data and natural language processing (NLP)‐derived seizure frequency information for people with epilepsy within the Secure Anonymised Information Linkage Databank. The study participants were adults who had consented to participate in the Swansea Neurology Biobank, Wales, between 2016 and 2018. DNA sequencing was carried out as part of the Epi25 collaboration. For each individual, we calculated the total number and cumulative burden of rare and predicted deleterious genetic variants and the total of rare and deleterious variants in epilepsy and drug metabolism genes. We compared these measures with the following outcomes: (1) no unscheduled hospital admissio...
International Journal for Population Data Science, Aug 28, 2018
Frontiers in Surgery, Aug 24, 2022
Seizure-european Journal of Epilepsy, Nov 1, 2017
International Journal for Population Data Science, Dec 7, 2020
IntroductionThe risk of cardiovascular events amongst people with epilepsy who are receiving enzy... more IntroductionThe risk of cardiovascular events amongst people with epilepsy who are receiving enzyme-inducing anti-epileptic drugs (EIAEDs) seems to be higher than those on other medications and the general population. National-level record linkage enables development of case-control studies at a wider scope accounting for multiple factors. Objectives and ApproachPeople with epilepsy were identified between 2003-01-01 and 2017-12-31 and were matched to a control group on: age, gender, deprivation quintile and year of diagnosis, accounting for any changes in clinical therapeutic guidelines. Primary and secondary care population records were linked to capture relevant comorbidities and major cardiovascular events. Annual district birth and death extract were used in combination with the Welsh Demographic Service (WDS) dataset to capture demographic and cardiovascular related death records. The WDS dataset was used to identify eligible control groups for each case and a linkage approach between the control and case database was developed for matching cases and controls with replacement and randomization. Survival analysis was conducted to evaluate the difference in time to first major cardiovascular event in patients receiving EIAED versus Non-EIAEDs and controls. Results10,241 cases (mean age 49.6 years, 52.2% male) with diagnosis of epilepsy were matched to 35,145 controls. 3,180 (31.1%) cases received EIAEDs and 7,061 (68.9%) received non-EIAEDs. The risk of experiencing a major cardiovascular event was higher in cases compared to controls (adjusted hazard ratio 1.52,95%CI[1.50–1.55];p&lt;0.001). There was no significant difference in cardiovascular events between those treated with non-EIAEDs and EIAEDs (adjusted hazard ratio 1.04,95%CI[0.95-1.12];p=0.407). 
Conclusion / ImplicationsData linkage provides a unique opportunity and insight into studying disease risk factors. We have shown that individuals with epilepsy prescribed antiepileptic drugs, re at an increased risk of a major cardiovascular events regardless of treatment type (EIAED,NEIAED) compared with a matched control population.
Journal of Neurology, Neurosurgery, and Psychiatry, May 27, 2022
ObjectiveTo characterise the Welsh idiopathic intracranial hypertension (IIH) population, epidemi... more ObjectiveTo characterise the Welsh idiopathic intracranial hypertension (IIH) population, epidemiologi- cal trends and healthcare outcomes using routinely collected healthcare data.MethodsWe used primary and secondary care healthcare diagnostic codes within the Secure Anonymised Information Linkage databank to ascertain IIH cases and controls in a retrospective cohort study between 2003–2017. We validated IIH diagnosis codes using anonymised secondary care lists of IIH cases.ResultsWe analysed 35 million patient years of data (2003–2017). There were 1765 cases of IIH in 2017 (85% female). The prevalence and incidence of IIH in 2017 was 76/100,000 and 7.8/100,000, significantly increased from 2003 (prevalence=12/100,000, incidence=2.3/100,000). IIH prevalence is associated with socio-economic deprivation and increasing body mass index (BMI). 9% of people with IIH had CSF shunts with less than 0.2% having bariatric surgery. Unscheduled hospital admissions were significantly higher in the IIH cohort compared to controls; and also in IIH patients with CSF shunts compared to those without.ConclusionsIIH incidence and prevalence is increasing significantly, corresponding to population increases in BMI. This has important implications for healthcare professionals and policy makers given the comor- bidities, complications and increased healthcare utilisation and economic burden associated with IIH.lotif_miah@hotmail.com
Nature Genetics
Epilepsy is a highly heritable disorder affecting over 50 million people worldwide, of which abou... more Epilepsy is a highly heritable disorder affecting over 50 million people worldwide, of which about one-third are resistant to current treatments. Here we report a multi-ancestry genome-wide association study including 29,944 cases, stratified into three broad categories and seven subtypes of epilepsy, and 52,538 controls. We identify 26 genome-wide significant loci, 19 of which are specific to genetic generalized epilepsy (GGE). We implicate 29 likely causal genes underlying these 26 loci. SNP-based heritability analyses show that common variants explain between 39.6% and 90% of genetic risk for GGE and its subtypes. Subtype analysis revealed markedly different genetic architectures between focal and generalized epilepsies. Gene-set analyses of GGE signals implicate synaptic processes in both excitatory and inhibitory neurons in the brain. Prioritized candidate genes overlap with monogenic epilepsy genes and with targets of current antiseizure medications. Finally, we leverage our r...
Nature Communications
Copy number variants (CNV) are established risk factors for neurodevelopmental disorders with sei... more Copy number variants (CNV) are established risk factors for neurodevelopmental disorders with seizures or epilepsy. With the hypothesis that seizure disorders share genetic risk factors, we pooled CNV data from 10,590 individuals with seizure disorders, 16,109 individuals with clinically validated epilepsy, and 492,324 population controls and identified 25 genome-wide significant loci, 22 of which are novel for seizure disorders, such as deletions at 1p36.33, 1q44, 2p21-p16.3, 3q29, 8p23.3-p23.2, 9p24.3, 10q26.3, 15q11.2, 15q12-q13.1, 16p12.2, 17q21.31, duplications at 2q13, 9q34.3, 16p13.3, 17q12, 19p13.3, 20q13.33, and reciprocal CNVs at 16p11.2, and 22q11.21. Using genetic data from additional 248,751 individuals with 23 neuropsychiatric phenotypes, we explored the pleiotropy of these 25 loci. Finally, in a subset of individuals with epilepsy and detailed clinical data available, we performed phenome-wide association analyses between individual CNVs and clinical annotations categ...
British Journal of Surgery
Journal of Neurology, Neurosurgery & Psychiatry
BackgroundPublic Health England have recently reported that deaths associated with epilepsy are i... more BackgroundPublic Health England have recently reported that deaths associated with epilepsy are increasing and are associated with increased deprivation. We investigated comparable Welsh mortality trends and associations between epilepsy mortality and deprivation.MethodWe used routinely-collected health data within the Secure Anonymised Information Linkage (SAIL) Databank. We recorded deaths associated with epilepsy (DAE), epilepsy recorded on death certificates, and deaths in people with epilepsy (DPWE), people with diagnoses of epilepsy and epilepsy prescriptions before death. We compared death rates in different deprivation deciles adjusting for epilepsy prevalence.ResultsDuring 2005–2017 (41million patient-years) there were 2116 DAE and 7821 DPWE. DAE and DPWE increased from 4.3/100,000/yr and 17.2/100,000/yr in 2005–2007 to 5.7/100,000/yr and 20.9/100,000/yr in 2015–2017. The age-standardised mortality rates (ASMR) in 2006–2008 for DAE and DPWE were 5.3/100,000/yr and 20/100,00...
Supplementary information for the main study '<strong>I</strong><strong>nci... more Supplementary information for the main study '<strong>I</strong><strong>ncidence, Prevalence and Healthcare Outcomes in Idiopathic Intracranial Hypertension: A population study</strong><strong> </strong>'
International Journal of Population Data Science, 2020
IntroductionThe risk of cardiovascular events amongst people with epilepsy who are receiving enzy... more IntroductionThe risk of cardiovascular events amongst people with epilepsy who are receiving enzyme-inducing anti-epileptic drugs (EIAEDs) seems to be higher than those on other medications and the general population. National-level record linkage enables development of case-control studies at a wider scope accounting for multiple factors. Objectives and ApproachPeople with epilepsy were identified between 2003-01-01 and 2017-12-31 and were matched to a control group on: age, gender, deprivation quintile and year of diagnosis, accounting for any changes in clinical therapeutic guidelines. Primary and secondary care population records were linked to capture relevant comorbidities and major cardiovascular events. Annual district birth and death extract were used in combination with the Welsh Demographic Service (WDS) dataset to capture demographic and cardiovascular related death records. The WDS dataset was used to identify eligible control groups for each case and a linkage approach...
Neurology, 2021
Objective To characterize trends in incidence, prevalence, and health care outcomes in the idiopa... more Objective To characterize trends in incidence, prevalence, and health care outcomes in the idiopathic intracranial hypertension (IIH) population in Wales using routinely collected health care data. Methods We used and validated primary and secondary care IIH diagnosis codes within the Secure Anonymised Information Linkage databank to ascertain IIH cases and controls in a retrospective cohort study between 2003 and 2017. We recorded body mass index (BMI), deprivation quintile, CSF diversion surgery, and unscheduled hospital admissions in case and control cohorts. Results We analyzed 35 million patient-years of data. There were 1,765 cases of IIH in 2017 (85% female). The prevalence and incidence of IIH in 2017 was 76/100,000 and 7.8/100,000/y, a significant increase from 2003 (corresponding figures = 12/100,000 and 2.3/100,000/y) (p < 0.001). IIH prevalence is associated with increasing BMI and increasing deprivation. The odds ratio for developing IIH in the least deprived quintil...
The American Journal of Human Genetics, 2021
Both mild and severe epilepsies are influenced by variants in the same genes, yet an explanation for the resulting phenotypic variation is unknown. As part of the ongoing Epi25 Collaboration, we performed a whole-exome sequencing analysis of 13,487 epilepsy-affected individuals and 15,678 control individuals. While prior Epi25 studies focused on gene-based collapsing analyses, we asked how the pattern of variation within genes differs by epilepsy type. Specifically, we compared the genetic architectures of severe developmental and epileptic encephalopathies (DEEs) and two generally less severe epilepsies, genetic generalized epilepsy and non-acquired focal epilepsy (NAFE). Our gene-based rare variant collapsing analysis used geographic ancestry-based clustering that included broader ancestries than previously possible and revealed novel associations. Using the missense intolerance ratio (MTR), we found that variants in DEE-affected individuals are in significantly more intolerant genic sub-regions than those in NAFE-affected individuals. Only previously reported pathogenic variants absent in available genomic datasets showed a significant burden in epilepsy-affected individuals compared with control individuals, and the ultra-rare pathogenic variants associated with DEE were located in more intolerant genic sub-regions than variants associated with non-DEE epilepsies. MTR filtering improved the yield of ultra-rare pathogenic variants in affected individuals compared with control individuals. Finally, analysis of variants in genes without a disease association revealed a significant burden of loss-of-function variants in the genes most intolerant to such variation, indicating additional epilepsy-risk genes yet to be discovered. Taken together, our study suggests that genic and sub-genic intolerance are critical characteristics for interpreting the effects of variation in genes that influence epilepsy.
International Journal for Population Data Science, 2017
Objectives: Electronic healthcare records (EHR) are the main data sources that facilitate epidemiology research. Routinely collected data such as primary and secondary care are now easily linked to produce novel and high impact research. There are, however, rich data locked in the free text of clinical letters that are not otherwise translated into EHRs. It is highly desirable to be able to extract this information to strengthen the body of information in existing EHRs. The Swansea Collaborative in Analysis of NLP Research (SCANR) group at Swansea University has been established to evaluate the usage of Natural Language Processing platforms for obtaining new clinical data. To use Clix Enrich to extract SNOMED concepts from a variety of clinical free texts and produce EHRs from the extraction process. Approach: SNOMED concepts contain common items of interest such as diagnosis, medication and symptoms, as well as contextual concepts such as historical reference and negation. Clix Enrich...
Journal of Neurology, Neurosurgery & Psychiatry
Objective: To characterise the Welsh idiopathic intracranial hypertension (IIH) population, epidemiological trends and healthcare outcomes using routinely collected healthcare data. Methods: We used primary and secondary care healthcare diagnostic codes within the Secure Anonymised Information Linkage databank to ascertain IIH cases and controls in a retrospective cohort study between 2003–2017. We validated IIH diagnosis codes using anonymised secondary care lists of IIH cases. Results: We analysed 35 million patient years of data (2003–2017). There were 1765 cases of IIH in 2017 (85% female). The prevalence and incidence of IIH in 2017 was 76/100,000 and 7.8/100,000, significantly increased from 2003 (prevalence=12/100,000, incidence=2.3/100,000). IIH prevalence is associated with socio-economic deprivation and increasing body mass index (BMI). 9% of people with IIH had CSF shunts with less than 0.2% having bariatric surgery. Unscheduled hospital admissions were significantly higher in ...
International Journal for Population Data Science, 2017
Background: Free text documents in healthcare settings contain a wealth of information not captured in electronic healthcare records (EHRs). Epilepsy clinic letters are an example of an unstructured data source containing a large amount of intricate disease information. Extracting meaningful and contextually correct clinical information from free text sources, to enhance EHRs, remains a significant challenge. SCANR (Swansea University Collaborative in the Analysis of NLP Research) was set up to use natural language processing (NLP) technology to extract structured data from unstructured sources. IBM Watson Content Analytics software (ICA) uses NLP technology. It enables users to define annotations based on dictionaries and language characteristics to create parsing rules that highlight relevant items. These include clinical details such as symptoms and diagnoses, medication and test results, as well as personal identifiers. Approach: To use ICA to build a pipeline to accurately extract ...
Frontiers in Digital Health, 2021
Across various domains, such as health and social care, law, news, and social media, there are increasing quantities of unstructured texts being produced. These potential data sources often contain rich information that could be used for domain-specific and research purposes. However, the unstructured nature of free-text data poses a significant challenge for its utilisation due to the necessity of substantial manual intervention from domain-experts to label embedded information. Annotation tools can assist with this process by providing functionality that enables the accurate capture and transformation of unstructured texts into structured annotations, which can be used individually, or as part of larger Natural Language Processing (NLP) pipelines. We present Markup (https://www.getmarkup.com/), an open-source, web-based annotation tool that is undergoing continued development for use across all domains. Markup incorporates NLP and Active Learning (AL) technologies to enable rapid a...
International Journal for Population Data Science, 2020
Introduction: Idiopathic Intracranial Hypertension (IIH) is a condition of unknown aetiology that is strongly associated with obesity. IIH predominantly affects women of childbearing age and causes chronic disabling headaches, visual disturbance and, in a minority of patients, permanent visual loss. Objectives and Approach: We characterised the IIH population, epidemiological trends and healthcare outcomes in Wales using routinely collected healthcare data. We used primary and secondary care healthcare diagnosis codes within the Secure Anonymised Information Linkage databank to ascertain IIH cases and controls in a retrospective cohort study between 2003 and 2017. We validated IIH diagnosis codes using anonymised secondary care lists of IIH cases. Results: We analysed 35 million patient years of data (2003–2017). There were 1765 cases of IIH in 2017 (85% female). The prevalence and incidence of IIH in 2017 was 76/100,000 and 7.8/100,000/year, a significant increase from 2003 (correspondin...
International Journal for Population Data Science, 2020
Introduction: Unstructured free-text clinical notes often contain valuable information relating to patient symptoms, prescriptions and diagnoses. These can assist with better care for patients and novel healthcare research if transformed into accessible, structured clinical text. In particular, Natural Language Processing (NLP) algorithms can produce such structured outputs, but require gold standard data to train and validate their accuracy. While existing tools such as Brat and Webanno provide interfaces to manually annotate text, there is a lack of capability to efficiently annotate complex clinical information. Objectives and Approach: We present Markup, an open-source, web-based annotation tool developed for use within clinical contexts by domain experts to produce gold standard annotations for NLP development. Markup incorporates NLP and Active Learning technologies to enable rapid and accurate annotation of unstructured documents. Markup supports custom user configurations, autom...