Heylen Kris - Academia.edu
Papers by Heylen Kris
Many approaches to term extraction focus on the extraction of multiword units, assuming that multiword units comprise the majority of terms in most subject fields. However, this supposed prevalence of multiword terms has gone largely untested in the literature. In this paper, we perform a quantitative corpus-based analysis of the claim that multiword units are more technical than single-word units, and that multiword units are more widespread in specialized domains. As a case study, we look at Dutch terminology from the Belgian legal domain. First, the relevant units are extracted using linguistic filters and an algorithm to identify Dutch compounds and multiword units. Second, we calculate for all units an association measure that captures the degree to which a linguistic unit belongs to the domain. Third, we analyze the relationship between the units' technicality, frequency and their status as a simplex, compound or multiword unit.
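The second and third steps of this abstract can be illustrated with a small sketch. The abstract does not name the association measure, so the example below assumes a signed log-likelihood ratio (a commonly used keyness score) as a stand-in. The function names `keyness` and `mean_by_type`, the corpus sizes, and all counts are hypothetical and serve only as illustration.

```python
import math
from collections import defaultdict


def keyness(freq_domain, size_domain, freq_ref, size_ref):
    """Signed log-likelihood ratio (an assumed stand-in measure).

    Positive values mean the unit is relatively more frequent in the
    domain corpus than in the reference corpus.
    """
    total = freq_domain + freq_ref
    combined_size = size_domain + size_ref
    expected_domain = size_domain * total / combined_size
    expected_ref = size_ref * total / combined_size

    ll = 0.0
    for observed, expected in ((freq_domain, expected_domain),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    ll *= 2

    # Attach a sign so that over- and under-representation are distinguished.
    overrepresented = freq_domain / size_domain >= freq_ref / size_ref
    return ll if overrepresented else -ll


def mean_by_type(units, size_domain=100_000, size_ref=100_000):
    """Average the keyness score per unit type (simplex, compound, multiword)."""
    scores = defaultdict(list)
    for _form, unit_type, f_dom, f_ref in units:
        scores[unit_type].append(keyness(f_dom, size_domain, f_ref, size_ref))
    return {t: sum(v) / len(v) for t, v in scores.items()}


# Hypothetical toy counts: (unit, type, freq in legal corpus, freq in reference corpus)
UNITS = [
    ("rechter", "simplex", 120, 40),
    ("huis", "simplex", 30, 90),
    ("arbeidsovereenkomst", "compound", 80, 5),
    ("hof van beroep", "multiword", 60, 2),
]

if __name__ == "__main__":
    for unit_type, score in mean_by_type(UNITS).items():
        print(f"{unit_type}: {score:.2f}")
```

Comparing the per-type averages is only a first look at the relationship the abstract describes; a real analysis would also have to control for frequency.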
TKE 2010: Presenting …, 2010
This paper describes how a toolset developed for the purposes of variational linguistics to identify regional variants can be used in the field of term extraction. The notion of stable lexical marker will be introduced as a method to quantify centrality and dispersion of terms ...
This article presents the results of a quantitative semantic analysis of specific lexical units in a technical corpus from the domain of machine tools for metalworking. The study aims to verify whether, and to what extent, the keywords of the technical corpus are monosemous. To this end, we carry out a simple regression analysis, which makes it possible to study the correlation between the keywords' specificity rank and their monosemy rank, but which raises statistical and methodological problems, notably a frequency bias. To remedy this, we adopt an alternative approach to identifying specific lexical units, namely Stable Lexical Marker Analysis (SLMA). We discuss the quantitative and statistical results of this approach with regard to the correlation between specificity rank and monosemy rank.
International Journal of Corpus Linguistics
... Using multifactorial statistical techniques, both Gries and Grondelaers are able to examine the combined effect of explanatory variables on syntactic variation. Gries examines the relative strength of variables by pair-wise comparison. ...
The study of lexical collocations occupies a central position in corpus linguistic research. Lexical restrictions on a word's combinatorial possibilities are often an integral part of corpus linguistic analyses and are applied in various domains (e.g. lexicography, language teaching). However, if a corpus is considered a sample of spontaneously realized language use by a linguistic community in (a) given setting(s), it is rather surprising that the settings of actual language use have received little attention in traditional corpus linguistics. In this contribution, we focus on the impact of the usage settings on the linguistic properties of the language use in a corpus. We investigate whether lexical collocability is subject to extra-linguistic constraints. Based on a variational case study, viz. the inflectional variation of attributive adjectives in Dutch, it will be demonstrated that the collocational strength of the AN pair is significantly modified by register, region...
Studies in Generative Grammar, 2005
Advances in Cognitive Sociolinguistics, 2010
Applying word space models to sociolinguistics. Religion names before and after 9/11. Yves Peirsman, Kris Heylen and Dirk Geeraerts. Abstract: Researchers in disciplines like lexical semantics and critical discourse analysis are in need of a quantitative method that allows them to ...
Proceedings of the EACL 2012 …, 2012
Corpus Linguistics and Linguistic Theory, 2000
This paper presents a technical state of the art in usage-based linguistics as defined in the context of Cognitive Linguistics. Starting from actual case studies rather than theoretical assumptions, methodological issues concerning the usage-based approach are ...