Development of a benchmark corpus to support the automatic extraction of drug-related adverse effects from medical case reports

Harsha Gurulingappa et al. J Biomed Inform. 2012 Oct.


Abstract

A significant amount of information about drug-related safety issues, such as adverse effects, is published in medical case reports that can only be explored by human readers because of their unstructured nature. The work presented here aims to generate a systematically annotated corpus that can support the development and validation of methods for the automatic extraction of drug-related adverse effects from medical case reports. The documents are systematically double-annotated in several rounds to ensure consistent annotations. The annotated documents are then harmonized to generate representative consensus annotations. To demonstrate an example use case, the corpus was used to train and validate models that classify sentences as informative or non-informative. A Maximum Entropy classifier trained with simple features and evaluated by 10-fold cross-validation achieved an F₁ score of 0.70, indicating a potentially useful application of the corpus.
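
The sentence classification use case described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes bag-of-words counts as the "simple features" (the abstract does not specify them), a binary maximum entropy model fit by plain stochastic gradient ascent, and an F₁ score computed over predictions pooled from 10 folds. All function names and the toy sentences are hypothetical and are not taken from the corpus.

```python
import math
import random
from collections import Counter


def featurize(sentence):
    """Bag-of-words counts (an assumed stand-in for the paper's 'simple features')."""
    return Counter(sentence.lower().split())


def train_maxent(examples, epochs=50, lr=0.1):
    """Fit a binary maximum entropy model (logistic regression) by SGD."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, label in examples:
            score = bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())
            score = max(-30.0, min(30.0, score))  # clamp to avoid overflow in exp
            prob = 1.0 / (1.0 + math.exp(-score))
            err = label - prob  # gradient of the log-likelihood w.r.t. the score
            bias += lr * err
            for f, v in feats.items():
                weights[f] = weights.get(f, 0.0) + lr * err * v
    return weights, bias


def predict(model, feats):
    """Return 1 (informative) if the model score is positive, else 0."""
    weights, bias = model
    score = bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())
    return 1 if score > 0 else 0


def f1_score(gold, pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cross_validate(data, k=10, seed=0):
    """k-fold cross-validation; F1 is computed on predictions pooled over folds."""
    data = list(data)
    random.Random(seed).shuffle(data)
    gold, pred = [], []
    for i in range(k):
        test = data[i::k]  # every k-th example, starting at i, is held out
        train = [ex for j, ex in enumerate(data) if j % k != i]
        model = train_maxent([(featurize(s), y) for s, y in train])
        for sentence, label in test:
            gold.append(label)
            pred.append(predict(model, featurize(sentence)))
    return f1_score(gold, pred)


# Hypothetical toy sentences, NOT drawn from the corpus: 1 = informative, 0 = not.
toy_data = (
    [(f"patient developed rash after drug{i}", 1) for i in range(10)]
    + [(f"patient was admitted for evaluation on day{i}", 0) for i in range(10)]
)
score = cross_validate(toy_data, k=10)
print(f"pooled F1 over 10 folds: {score:.2f}")
```

On real data the feature set, regularization, and the way F₁ is aggregated across folds (pooled versus averaged per fold) all affect the reported number, so this sketch should not be expected to reproduce the 0.70 figure.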



