Christine Jacquin - Academia.edu

Papers by Christine Jacquin

Indexing a Web Site to Highlight Its Content

This article presents a new approach to indexing a Web site. It uses ontologies and natural language techniques for information retrieval on the Internet. The main goal is to build a structured index of the Web site. This structure is given by a terminology-oriented ontology of a domain which is chosen a priori according to the content of the Web site. The indexing process uses improved natural language techniques.

Indexation sémantique de documents sur le Web: application aux ressources humaines

A Distributed Agent-Based System for Knowledge-Based Internet Information Retrieval

Systèmes question-réponse et EuroWordNet

Sources d'informations et de connaissances: de la gestion locale à la recherche distribuée

Indexation sémantique de documents sur le web: application aux ressources humaines

In this article we present our ongoing work on human resources management on the Web. We focus in particular on extracting and structuring the content of curricula vitae in order to facilitate the job search and job offer process on the Web.

Vers un système d’annotation distribué

Annotations sur le Web: notes de lecture

Many information-sharing systems exist today, but the specific characteristics of the Web make them extremely difficult tools to use. Annotation tools aim to improve exchange, communication and interoperability on the Web. Our objective is to synthesize the characteristics of annotations. This general study allows us to characterize semantic annotations.

BONOM: un système Multi-Agents pour la recherche d'informations sur Internet dirigée par la connaissance

Complément aux actes …, 2000

S. Cazalens, E. Desmontils, C. Jacquin, P. Lamarre

Using Web Sites in University Courses as Bulletin Boards and for Enrichment Use of Natural Language Technics to Improve Text Retrieval on The Web

World Conference on the WWW and Internet, 1997

A Web site indexing process for an Internet information retrieval agent system

Proceedings of the First International Conference on Web Information Systems Engineering, 2000

... At this stage, only senses used in a hypernym path are selected. Then, we match “isa” paths and hypernym paths to find possible senses for each term. ... (extend-category general-ont Artificial-Agent 0 1) Notice our automatic process provides good results. ...

Indexing a Web Site to Highlight Its Content

Lecture Notes in Computer Science, 2001

This article presents a new approach to indexing a Web site. It uses ontologies and natural language techniques for information retrieval on the Internet. The main goal is to build a structured index of the Web site. This structure is given by a terminology-oriented ontology of a domain which is chosen a priori according to the content of the ...

Automatic named identification of speakers using diarization and ASR systems

2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009

In this paper, we consider the extraction of speaker identity from audio recordings of broadcast news without a priori acoustic information about speakers. Using an automatic speech recognition system and an automatic speaker diarization system, we present improvements to a method that extracts speaker identities from automatic transcripts and assigns them to speech segments.

Clustering Short Text and Its Evaluation

Lecture Notes in Computer Science, 2012

Recently there has been increasing interest in clustering short text because it could be used in many NLP applications. Depending on the application, short texts can be characterized mainly by their length (e.g. sentences, paragraphs) and type (e.g. scientific papers, newspapers). Finding a clustering method that is able to cluster short text in general is difficult. In this paper, we cluster four different corpora containing different types of text of varying length and evaluate the clusters against the gold standard. Based on these clustering experiments, we show how different similarity measures, clustering algorithms, and cluster evaluation methods affect the resulting clusters. We discuss four existing corpus-based similarity methods (cosine similarity, Latent Semantic Analysis, Short text Vector Space Model, and Kullback-Leibler distance), four well-known clustering methods (Complete Link, Single Link and Average Link hierarchical clustering, and spectral clustering), and three evaluation methods (clustering F-measure, adjusted Rand Index, and V). Our experiments show that corpus-based similarity measures do not significantly affect the clusters and that spectral clustering performs better than hierarchical clustering. We also show that the values given by the evaluation methods do not always represent the usability of the clusters.

Dinosys: An Annotation Tool for Web-Based Learning

Lecture Notes in Computer Science, 2004

The main function of freeform annotation systems is to improve exchange, communication and interoperability on the Web. The purpose of this paper is, on the one hand, to synthesize the characteristics of annotations and the architectures of annotation systems and, on the other hand, to propose a new architecture for an annotation system that is easy to use, lightweight, efficient, non-intrusive, scalable, shared and platform-independent. The use of this tool within the e-learning framework is also studied.

The Answer Validation System ProdicosAV Dedicated to French

Lecture Notes in Computer Science, 2009

In this paper, we present the ProdicosAV answer validation system, which was developed by the NLP team from the LINA institute. ProdicosAV is based on the Prodicos system, which participated two years ago in the Question Answering CLEF evaluation campaign for French. We first present the modifications made to Prodicos to improve it and to adapt it to a new kind of exercise, describing in detail the passage ranking module and the temporal validator module. Second, we present the answer-validation module dedicated to the AVE task. Finally, the evaluation is put forward to justify the results obtained.

Question Types Specification for the Use of Specialized Patterns in Prodicos System

Lecture Notes in Computer Science, 2007

We present the second version of the Prodicos query answering system, which was developed by the TALN team from the LINA institute. The main improvements concern, on the one hand, the use of external knowledge (Wikipedia) to improve the passage selection step and, on the other hand, the answer extraction step, which is improved by defining four different strategies for locating the answer to a question according to its type. For the passage selection and answer extraction modules, the evaluation is then put forward to justify the results obtained.

The Query Answering System PRODICOS

Lecture Notes in Computer Science, 2006

In this paper, we present the PRODICOS query answering system, which was developed by the TALN team from the LINA institute. We present the various modules constituting our system and, for each of them, an evaluation in order to explain the results obtained. Then, we present the main improvement, which is based on the use of semantic data.

French EuroWordNet Lexical Database Improvements

Lecture Notes in Computer Science, 2007

... d'Informatique Nantes Atlantique 2 rue de la Houssinière BP92208, 44322 Nantes Cedex 03 France {christine.jacquin,emmanuel.desmontils,laura ... the semantic knowledge often comes from a thesaurus like WordNet [3]. For European languages such as Dutch, German, French ...

Analyse conjointe du signal sonore et de sa transcription pour l'identification nommée de locuteurs

For some years, processing large collections of multimedia documents has been a crucial issue for applications such as indexing or information retrieval. Among the targeted information, speaker identity can be very useful for such applications. A huge collection of documents cannot be manually processed at a reasonable cost: only automatic systems are a relevant solution. In this paper, we consider the extraction ...
