Identification of Occupation Mentions in Clinical Narratives

Lecture Notes in Computer Science, 2016

Abstract

A patient’s occupation is an important variable for disease surveillance and modeling, but this information is often available only in free-text clinical narratives. We have developed a large occupation dictionary that is used in both knowledge-driven (dictionary and rules) and data-driven (machine-learning) methods for identifying occupation mentions. We have evaluated the approaches on both public and non-public clinical datasets. A machine-learning method using linear-chain conditional random fields trained on a minimalistic set of features achieved up to 88% \( \text{F}_{1} \)-measure (token-level). The occupation feature derived from the knowledge-driven method had a notable positive impact across the datasets (up to an additional 32% \( \text{F}_{1} \)-measure).
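To make the two ingredients named above concrete, the following is a minimal sketch, not the authors' implementation. It shows how a dictionary lookup could yield a per-token occupation feature and how a token-level \( \text{F}_{1} \)-measure is computed. The toy dictionary `OCCUPATIONS`, the greedy longest-match strategy, and the function names `dictionary_feature` and `token_f1` are all assumptions made for illustration. In the approach described, such a feature would be one input to a linear-chain CRF; no CRF is trained here.

```python
from typing import List, Set, Tuple

# Hypothetical toy dictionary (assumption); the dictionary described in the
# abstract is far larger. Entries are stored as lowercase token tuples.
OCCUPATIONS: Set[Tuple[str, ...]] = {
    ("nurse",),
    ("truck", "driver"),
    ("school", "teacher"),
}
MAX_LEN = max(len(entry) for entry in OCCUPATIONS)


def dictionary_feature(tokens: List[str]) -> List[int]:
    """Return 1 for each token covered by a dictionary entry, else 0.

    Uses greedy longest-match lookup, an assumed matching strategy.
    """
    lowered = [t.lower() for t in tokens]
    flags = [0] * len(tokens)
    i = 0
    while i < len(lowered):
        matched = 0
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(MAX_LEN, len(lowered) - i), 0, -1):
            if tuple(lowered[i:i + n]) in OCCUPATIONS:
                matched = n
                break
        if matched:
            for j in range(i, i + matched):
                flags[j] = 1
            i += matched
        else:
            i += 1
    return flags


def token_f1(gold: List[int], pred: List[int]) -> float:
    """Token-level F1 over binary labels (1 = occupation token)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sentence = "He worked as a truck driver and his wife is a nurse .".split()
    print(dictionary_feature(sentence))
```

Used on its own, the dictionary flag acts as a simple knowledge-driven tagger. Passed to a sequence model alongside other token features, it plays the role of the occupation feature mentioned in the abstract.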

Goran Nenadic hasn't uploaded this paper.