Dependency and AMR Embeddings for Drug-Drug Interaction Extraction from Biomedical Literature

A novel feature-based approach to extract drug-drug interactions from biomedical text

Bioinformatics, 2014

Motivation: Knowledge of drug-drug interactions (DDIs) is crucial for health-care professionals to avoid adverse effects when co-administering drugs to patients. As most newly discovered DDIs are made available through scientific publications, automatic DDI extraction is highly relevant. Results: We propose a novel feature-based approach to extract DDIs from text. Our approach consists of three steps. First, we apply text preprocessing to convert input sentences from a given dataset into structured representations. Second, we map each candidate DDI pair from that dataset into a suitable syntactic structure. Based on that, a novel set of features is used to generate feature vectors for these candidate DDI pairs. Third, the obtained feature vectors are used to train a support vector machine (SVM) classifier. When evaluated on two DDI extraction challenge test datasets from 2011 and 2013, our system achieves F-scores of 71.1% and 83.5%, respectively, outperforming any state-of-the-art DDI extraction system. Availability and implementation: The source code is available for academic use at
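The second step above, turning a sentence into one instance per candidate DDI pair, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `candidate_pairs`, the placeholder tokens `DRUG1`, `DRUG2` and `DRUG_OTHER`, and the example sentence are all assumptions made for illustration. The sketch enumerates every unordered pair of drug mentions and masks the drug names, a common preprocessing practice that the abstract itself does not describe.

```python
from itertools import combinations

def candidate_pairs(tokens, drug_positions):
    """Yield ((i, j), masked_tokens) for every unordered pair of drug mentions.

    The placeholder names are hypothetical; the abstract does not say how
    candidate pairs are represented before feature extraction.
    """
    for i, j in combinations(sorted(drug_positions), 2):
        masked = []
        for k, token in enumerate(tokens):
            if k == i:
                masked.append("DRUG1")        # first drug of the candidate pair
            elif k == j:
                masked.append("DRUG2")        # second drug of the candidate pair
            elif k in drug_positions:
                masked.append("DRUG_OTHER")   # drug mention outside the pair
            else:
                masked.append(token)
        yield (i, j), masked

# Invented example sentence with three drug mentions at token indices 0, 6 and 8.
tokens = "Aspirin may increase the effect of warfarin and heparin".split()
pairs = list(candidate_pairs(tokens, {0, 6, 8}))
```

A sentence with n drug mentions yields n(n-1)/2 candidate pairs, so the three mentions here produce three instances; each masked sequence would then be passed to feature extraction and the SVM classifier.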

Automatic Extraction of Drug-Drug Interaction From Literature Through Detecting Clause Dependency and Linguistic-based Negation

Extracting biomedical relations such as drug-drug interactions (DDIs) from text is an important task in biomedical NLP. Because biomedical literature contains many complex sentences, researchers have used sentence simplification techniques to improve the performance of relation extraction methods. However, owing to the difficulty of the task, the research literature reports no noteworthy improvement. This paper explores clause-dependency features together with linguistic-based negation scope and cues to cope with sentence complexity. The experiments indicate that the ratio of negation cues, another source of inaccuracy, is higher in complex sentences than in simple ones. Additionally, the results show that combining the proposed features with a bag-of-words kernel improves the performance of the kernel methods used. Moreover, the experiments show that the enhanced local context kernel outperforms the other methods. The proposed method can serve as an alternative to sentence simplification, which is an error-prone task in the biomedical domain.

Extraction of Drug-Drug Interaction from Literature through Detecting Linguistic-based Negation and Clause Dependency

Extracting biomedical relations such as drug-drug interactions (DDIs) from text is an important task in biomedical natural language processing. Because biomedical literature contains many complex sentences, researchers have used sentence simplification techniques to improve the performance of relation extraction methods. However, no significant improvement has been reported in the literature, since the task is difficult. This paper explores clause-dependency features together with linguistic-based negation scope and cues to cope with sentence complexity. The results show that combining the proposed features with a bag-of-words kernel improves the performance of the kernel methods used. Moreover, the experiments show that the enhanced local context kernel outperforms the other methods. The proposed method can serve as an alternative to sentence simplification, which is an error-prone task in the biomedical domain.

Extracting Drug-Drug Interaction from Text Using Negation Features

Abstract: Extracting biomedical relations from text is an important task in biomedical NLP. Several systems have been developed for this purpose, but only a few address drug-drug interactions. In this paper we show the effectiveness of negation features for this task. We first describe how we extended the DrugDDI corpus by annotating it with the scope of negation, and then report a set of experiments showing that negation features benefit the detection of drug-drug interactions when combined with some simple relation extraction methods. Keywords: drug-drug interactions, negation, kernel functions, support vector machines.

NIL UCM: Extracting Drug-Drug interactions from text through combination of sequence and tree kernels

A drug-drug interaction (DDI) occurs when one drug affects the level or activity of another drug. The SemEval 2013 DDI Extraction challenge is being held with the aim of identifying state-of-the-art relation extraction algorithms. In this paper we first review existing approaches to relation extraction in general and to biomedical relation extraction in particular. We then explain our SVM-based approaches, which use lexical, morphosyntactic and parse tree features. Our combination of sequence and tree kernels has shown promising performance, with a best result of 0.54 macro-averaged F1 on the test dataset.
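Since the result above is reported as a macro-averaged F1, a small sketch of how macro and micro averaging differ may help when comparing it with scores from other entries. The per-class counts below are invented toy numbers, not results from any of the listed papers, and the function names are assumptions.

```python
def f1(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0

def macro_f1(counts):
    """Average the per-class F1 scores, so every class weighs the same."""
    return sum(f1(*c) for c in counts.values()) / len(counts)

def micro_f1(counts):
    """Pool the counts over all classes first, so frequent classes dominate."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp, fp, fn)

# Toy (tp, fp, fn) counts for a frequent and a rare class; purely illustrative.
toy_counts = {"frequent_type": (8, 2, 2), "rare_type": (1, 1, 3)}
macro = macro_f1(toy_counts)
micro = micro_f1(toy_counts)
```

With these toy counts the macro average is lower than the micro average because the poorly predicted rare class counts as much as the frequent one, which is one reason macro and micro scores should not be compared directly.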

Drug Drug Interaction Extraction from Biomedical Literature Using Syntax Convolutional Neural Network

Bioinformatics (Oxford, England), 2016

Detecting drug-drug interactions (DDIs) has become a vital part of public health safety. Consequently, using text mining techniques to extract DDIs from biomedical literature has received great attention. However, this research is still at an early stage and its performance has much room for improvement. In this paper, we present a syntax convolutional neural network (SCNN) based DDI extraction method. In this method, a novel word embedding, the syntax word embedding, is proposed to exploit the syntactic information of a sentence. Position and part-of-speech (POS) features are then introduced to extend the embedding of each word. Next, an auto-encoder is introduced to encode the traditional bag-of-words feature (a sparse 0-1 vector) as a dense real-valued vector. Finally, a combination of embedding-based convolutional features and traditional features is fed to the softmax classifier to extract DDIs from biomedical literature. Experimental results on the DDIExtraction 2013 corpus show that SCNN...
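The position features mentioned above are commonly computed as the relative offsets of each token from the two drug mentions. The sketch below shows that common formulation only; the abstract does not give the exact definition used in SCNN, and the function name `position_features`, the clipping threshold `max_distance` and the example sentence are assumptions.

```python
def position_features(num_tokens, drug1_index, drug2_index, max_distance=10):
    """Return one (offset_to_drug1, offset_to_drug2) pair per token.

    Offsets are clipped to [-max_distance, max_distance] so that they can
    index a small embedding table; the threshold is a hypothetical choice.
    """
    def clip(distance):
        return max(-max_distance, min(max_distance, distance))

    return [
        (clip(k - drug1_index), clip(k - drug2_index))
        for k in range(num_tokens)
    ]

# Invented example with the two candidate drugs already masked.
tokens = "DRUG1 may increase the effect of DRUG2".split()
features = position_features(len(tokens), drug1_index=0, drug2_index=6)
```

In an embedding-based model, each offset would typically be looked up in its own embedding table and concatenated with the word and POS embeddings of the token.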

Using a shallow linguistic kernel for drug–drug interaction extraction

Journal of biomedical …, 2011

A drug–drug interaction (DDI) occurs when one drug influences the level or activity of another drug. Information Extraction (IE) techniques can provide health care professionals with an interesting way to reduce time spent reviewing the literature for potential drug–drug interactions. Nevertheless, no approach has been proposed for the problem of extracting DDIs from biomedical texts. In this article, we study whether a machine learning-based method is appropriate for DDI extraction from biomedical texts and whether the results provided are superior to those obtained from our previously proposed pattern-based approach [1]. The method proposed here for DDI extraction is based on a supervised machine learning technique, more specifically, the shallow linguistic kernel proposed in Giuliano et al. (2006) [2]. Since no benchmark corpus was available to evaluate our approach to DDI extraction, we created the first such corpus, DrugDDI, annotated with 3169 DDIs. We performed several experiments varying the configuration parameters of the shallow linguistic kernel. The model that maximizes the F-measure was evaluated on the test data of the DrugDDI corpus, achieving a precision of 51.03%, a recall of 72.82% and an F-measure of 60.01%. To the best of our knowledge, this work has proposed the first full solution for the automatic extraction of DDIs from biomedical texts. Our study confirms that the shallow linguistic kernel outperforms our previous pattern-based approach. Additionally, it is our hope that the DrugDDI corpus will allow researchers to explore new solutions to the DDI extraction problem.

Our goal is to develop an IE system to extract drug-drug interactions from biomedical texts. We use the DrugBank database as the source of unstructured textual information on drugs and their interactions. These texts are analyzed by the MetaMap tool, which provides shallow syntactic and semantic information. Our system is based on a supervised machine learning approach, in particular, a shallow linguistic kernel-based approach that uses Support Vector Machines (SVM).

► We propose the first full solution for the automatic extraction of drug–drug interactions (DDIs) from biomedical texts.
► We create the first corpus annotated with DDIs in order to train and evaluate our system.
► Our system is based on a shallow linguistic kernel.
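As a quick consistency check on the figures reported in this entry, the F-measure is the harmonic mean of precision and recall. The sketch below recomputes it from the stated precision (51.03%) and recall (72.82%), assuming the balanced F1 variant is meant; the function name is illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract, in percent.
reported_f = f_measure(51.03, 72.82)
print(round(reported_f, 2))  # prints 60.01
```

The recomputed value agrees with the reported F-measure of 60.01%.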

One stage versus two stages deep learning approaches for the extraction of drug-drug interactions from texts

Proces. del Leng. Natural, 2020

Drug-drug interactions (DDIs) are a cause of adverse drug reactions. They occur when one drug has an impact on the effect of another drug. There is no complete, up-to-date database where health care professionals can look up the interactions of any drug, because most of the knowledge on DDIs is hidden in unstructured text. In recent years, deep learning has been successfully applied to the extraction of DDIs from texts, which requires the detection and subsequent classification of DDIs. Most of the deep learning systems for DDI extraction developed so far have addressed detection and classification in a single step. In this study, we compare the performance of one-stage and two-stage architectures for DDI extraction. Our architectures are based on a bidirectional recurrent neural network layer composed of Gated Recurrent Units. The two-stage system obtained a 67.45% micro-average F1 score on the test set.
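The difference between the two architectures can be sketched in terms of control flow alone. In the sketch below the classifiers are toy keyword rules standing in for the trained recurrent networks; every function name, label and example sentence is hypothetical and chosen only to make the code runnable.

```python
NEGATIVE = "none"  # label for candidate pairs that do not interact

def one_stage(sentence, classify_all):
    """One model chooses among all interaction types plus the negative class."""
    return classify_all(sentence)

def two_stage(sentence, detect, classify_type):
    """A binary detector filters pairs; a second model types the positives."""
    if not detect(sentence):
        return NEGATIVE
    return classify_type(sentence)

# Toy stand-ins for trained models (keyword rules, for illustration only).
def toy_detect(sentence):
    return "not" not in sentence.split()

def toy_classify_type(sentence):
    return "advise" if "should" in sentence.split() else "effect"

def toy_classify_all(sentence):
    return toy_classify_type(sentence) if toy_detect(sentence) else NEGATIVE

examples = [
    "DRUG1 should be used with caution alongside DRUG2",
    "DRUG1 does not interact with DRUG2",
]
two_stage_labels = [two_stage(s, toy_detect, toy_classify_type) for s in examples]
one_stage_labels = [one_stage(s, toy_classify_all) for s in examples]
```

In the two-stage setting the type classifier only ever sees pairs accepted by the detector, so it can be trained on positive instances alone, whereas the one-stage classifier must also learn the usually dominant negative class.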

BERTChem-DDI : Improved Drug-Drug Interaction Prediction from text using Chemical Structure Information

ArXiv, 2020

Traditional biomedical versions of embeddings obtained from pre-trained language models have recently shown state-of-the-art results for relation extraction (RE) tasks in the medical domain. In this paper, we explore how to incorporate domain knowledge, available in the form of the molecular structure of drugs, for predicting drug-drug interactions from a textual corpus. We propose a method, BERTChem-DDI, to efficiently combine drug embeddings obtained from the rich chemical structure of drugs (encoded in SMILES) with an off-the-shelf, domain-specific BioBERT embedding-based RE architecture. Experiments conducted on the DDIExtraction 2013 corpus clearly indicate that this strategy improves on other strong baseline architectures by 3.4% macro F1-score.

EGFI: Drug-Drug Interaction Extraction and Generation with Fusion of Enriched Entity and Sentence Information

2021

The rapid growth of the literature accumulates diverse yet comprehensive biomedical knowledge, such as drug interactions, that remains hidden and waiting to be mined. However, it is difficult to extract this heterogeneous knowledge so that the latest and novel findings can be retrieved, or even discovered, efficiently. To address this problem, we propose EGFI for extracting and consolidating drug interactions from large-scale medical literature text data. Specifically, EGFI consists of two parts: classification and generation. In the classification part, EGFI builds on the language model BioBERT, which has been comprehensively pre-trained on a biomedical corpus. In particular, we propose the multi-head attention mechanism and pack BiGRU to fuse multiple semantic information for rigorous context modeling. In the generation part, EGFI utilizes another pre-trained language model, BioGPT-2, where the generated sentences are selected based on filtering rules. We evaluated the classification part on "DDIs 2013"...