H. Brugman - Academia.edu

Papers by H. Brugman

Evaluating a Thesaurus Browser for an Audio-Visual Archive

In this article we report on a user study aimed at evaluating and improving a thesaurus browser. The browser is intended to be used by documentalists of a large public audio-visual archive for finding appropriate indexing terms for TV programs. The subjects involved in the study were documentalists of the institutions involved. The study provides insight into the value of various thesaurus browsing and searching techniques.

Towards dynamic corpora

Moderne Informationstechnologie und ihre Auswirkungen auf die korpus-basierte Forschung [Modern information technology and its effects on corpus-based research]

Automatic Annotation Suggestions for Audiovisual Archives: Evaluation Aspects

Interdisciplinary Science Reviews, 2009

In the context of large and ever-growing archives, generating annotation suggestions automatically from textual resources related to the documents to be archived is an interesting option in theory. It could save a lot of work in the time-consuming and expensive task of manual annotation, and it could help cataloguers attain a higher inter-annotator agreement. However, some questions arise in practice: what is the quality of the automatically produced annotations? How do they compare with manual annotations and with the requirements for annotation that were defined in the archive? If different from the manual annotations, are the automatic annotations wrong? In the CHOICE project, partially hosted at the Netherlands Institute for Sound and Vision, the Dutch public archive for audiovisual broadcasts, we automatically generate annotation suggestions for cataloguers. In this paper, we define three types of evaluation of these annotation suggestions: (1) a classic and strict precision/recall measure expressing the overlap between automatically generated keywords and the manual annotations, (2) a loosened precision/recall measure for which semantically very similar annotations are also considered as relevant matches, (3) an in-use evaluation of the usefulness of manual versus automatic annotations in the context of serendipitous browsing. During serendipitous browsing the annotations (manual or automatic) are used to retrieve and visualize semantically related documents.
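The difference between the strict and the loosened precision/recall measures in the abstract above can be illustrated with a minimal sketch. It is not the CHOICE project's implementation: the function names, the example keywords, and the `related` mapping (standing in for whatever notion of "semantically very similar" the paper uses) are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple


def strict_precision_recall(suggested: Set[str], manual: Set[str]) -> Tuple[float, float]:
    """Strict measure: a suggestion counts only if it equals a manual keyword."""
    hits = len(suggested & manual)
    precision = hits / len(suggested) if suggested else 0.0
    recall = hits / len(manual) if manual else 0.0
    return precision, recall


def loosened_precision_recall(
    suggested: Set[str],
    manual: Set[str],
    related: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Loosened measure: semantically similar terms also count as matches.

    `related` is a hypothetical lookup from a term to its near-equivalents
    (for example, terms linked in a thesaurus); it is checked in both directions.
    """

    def matches(a: str, b: str) -> bool:
        return a == b or b in related.get(a, set()) or a in related.get(b, set())

    # Precision: share of suggestions that match at least one manual keyword.
    good = sum(1 for s in suggested if any(matches(s, m) for m in manual))
    # Recall: share of manual keywords matched by at least one suggestion.
    covered = sum(1 for m in manual if any(matches(s, m) for s in suggested))
    precision = good / len(suggested) if suggested else 0.0
    recall = covered / len(manual) if manual else 0.0
    return precision, recall


# Invented example data, not taken from the paper.
suggested = {"elections", "parliament", "football"}
manual = {"elections", "politics"}
related = {"parliament": {"politics"}}

print(strict_precision_recall(suggested, manual))
print(loosened_precision_recall(suggested, manual, related))
```

In this toy example the strict measure credits only "elections", while the loosened measure also credits "parliament" because it is listed as related to the manual keyword "politics", so both precision and recall rise.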

EUDICO, Annotation and Exploitation of Multi Media Corpora over the Internet

mpi.nl

In this paper we describe a software framework that supports media annotation and analysis of media-related corpora over the internet. We will present the layered architecture of this framework and we will introduce our Abstract Corpus Model with which we isolate corpus specific ...

Web services architecture for language resources

A web services based architecture for Language Resources utilizing existing technology such as XML, SOAP, WSDL and UDDI is presented. The web services architecture creates a pervasive information infrastructure that enables straightforward access to two kinds of Language Resources: traditional information sources and language processing resources. Details about two practical implementations of this web services architecture are given.

Evaluating the SharedCanvas Manuscript Data Model in CATCHPlus

In this paper, we present the SharedCanvas model for describing the layout of culturally important, hand-written objects such as medieval manuscripts, which is intended to be used as a common input format to presentation interfaces. The model is evaluated using two collections from CATCHPlus not consulted during the design phase, each with their own complex requirements, in order to determine if further development is required or if the model is ready for general usage. The model is applied to the new collections, revealing several new areas of concern for user interface production and discovery of the constituent resources. However, the fundamental information modelling aspects of SharedCanvas and the underlying Open Annotation Collaboration ontology are demonstrated to be sufficient to cover the challenging new requirements. The distributed, Linked Open Data approach is validated as an important methodology to seamlessly allow simultaneous interaction with multiple repositories, and at the same time to facilitate both scholarly commentary and crowd-sourcing of the production of transcriptions.

Anchoring Dutch cultural heritage thesauri to WordNet: two case studies

