Wouter Voorspoels - Academia.edu
Papers by Wouter Voorspoels
In contrast to nouns, little is known about the graded structure of adjective categories. In this study we investigate whether adjective categories show a similar graded structure and explain it using a similarity-based account. The results show a reliable graded structure which is adequately explained by two formal models based on prototypes and exemplars, thereby generalizing the model performance from concrete nouns to adjectives. Finally, we show that the attention weights of these models deal with the additional challenge of antonymy in adjectives and discuss the findings in the light of alternative accounts that do not rely on item-similarity.
Acta Psychologica, 2014
We collected age-of-acquisition ratings for 30,000 Dutch words. We validated these ratings against existing, smaller databases. These ratings can be used in word processing experiments. Word processing studies increasingly make use of regression analyses based on large numbers of stimuli (the so-called megastudy approach) rather than experimental designs based on small factorial designs. This requires the availability of word features for many words. Following similar studies in English, we present and validate ratings of age of acquisition and concreteness for 30,000 Dutch words. These include nearly all lemmas language researchers are likely to be interested in. The ratings are freely available for research purposes.
Judgment and Decision Making
Whether it pertains to the foods to buy when one is on a diet, the items to take along to the beach on one's day off or (perish the thought) the belongings to save from one's burning house, choice is ubiquitous. We aim to determine from choices the criteria individuals use when they select objects from among a set of candidates. In order to do so we employ a mixture IRT (item-response theory) model that capitalizes on the insights that objects are chosen more often the better they meet the choice criteria and that the use of different criteria is reflected in inter-individual selection differences. The model is found to account for the inter-individual selection differences for 10 ad hoc and goal-derived categories. Its parameters can be related to selection criteria that are frequently thought of in the context of these categories. These results suggest that mixture IRT models allow one to infer from mere choice behavior the criteria individuals used to select/discard objec...
Cognitive Psychology, Jan 21, 2015
A robust finding in category-based induction tasks is for positive observations to raise the willingness to generalize to other categories while negative observations lower the willingness to generalize. This pattern is referred to as monotonic generalization. Across three experiments we find systematic non-monotonicity effects, in which negative observations raise the willingness to generalize. Experiments 1 and 2 show that this effect emerges in hierarchically structured domains when a negative observation from a different category is added to a positive observation. They also demonstrate that this is related to a specific kind of shift in the reasoner's hypothesis space. Experiment 3 shows that the effect depends on the assumptions that the reasoner makes about how inductive arguments are constructed. Non-monotonic reasoning occurs when people believe the facts were put together by a helpful communicator, but monotonicity is restored when they believe the observations were sa...
Frontiers in Psychology, 2014
When encountering an unknown individual, social categorization of the individual has been shown to automatically proceed on the basis of three fundamental dimensions: People seem to mandatorily encode race, sex and age. In contradiction to this general finding, Kurzban et al. (2001) showed that race encoding is not automatic and inevitable, but rather a byproduct of categorization in terms of coalitions. In particular, they argue, and support empirically, that when other coalitional information is present, the encoding of race is spectacularly reduced. In the present contribution, we present a replication of the race-erased effect reported by Kurzban et al. First, we give a detailed overview of the hypotheses, the experimental methodology, the derivation of the sample size required to achieve a power of 95%, and the criteria that need to be met for a successful replication. Then we present the findings of an empirical test that met the requirements of our power analyses. Our results i...
Acta Psychologica, 2014
Word processing studies increasingly make use of regression analyses based on large numbers of stimuli (the so-called megastudy approach) rather than experimental designs based on small factorial designs. This requires the availability of word features for many words. Following similar studies in English, we present and validate ratings of age of acquisition and concreteness for 30,000 Dutch words. These include nearly all lemmas language researchers are likely to be interested in. The ratings are freely available for research purposes.
Psychiatry Research, Jan 30, 2014
This study aims to assess the reliability and the validity of exemplar similarity derived from category fluency tasks. A homogeneous sample of 21 healthy participants completed a category fluency task twice with an interval of one week. They also rated pairs composed of the most frequently generated exemplars in terms of similarity. Similarities were derived from the fluency data by determining the average distance between generated exemplars and correcting it for repetitions and response sequence length. We calculated the correlation between the similarities derived from the two sessions of the fluency task and between the derived similarities and the directly rated similarities. Spatial representations of the similarities were constructed using multidimensional scaling to visualize the differences between both sessions of the fluency task and the pairwise rating task. We find that the derived similarities are not stable in time and show little correspondence with directly rated s...
Language, Cognition and Neuroscience, 2013
In contrast to noun categories, little is known about the graded structure of adjective categories. In this study, we investigated whether adjective categories show a similar graded structure and what determines this structure. The results show that adjective categories, like nouns, exhibit a reliable graded structure. As with nouns, we investigated whether similarity is the main determinant of the graded structure. We derived a low-dimensional similarity representation for adjective categories and found that valence differences in adjectives constitute an important organising principle in this similarity space. Valence was not implicated in the categories’ graded structure, however. A formal similarity-based model using exemplars accounted for the graded structure by effectively discarding the valence differences between adjectives in the similarity representation through dimensional weighting. Our results generalise similarity-based accounts of graded structure and highlight a closely knit relationship between adjectives and nouns on a representational level.
The Quarterly Journal of Experimental Psychology, 2012
We examine the influence of contrast categories on the internal graded membership structure of everyday concepts using computational models proposed in the artificial category learning tradition. In particular, the generalized context model, which assumes that only members of a given category contribute to the typicality of a category member, is contrasted with the similarity-dissimilarity generalized context model (SD-GCM), which assumes that members of other categories are also influential in determining typicality. The models are compared in a hierarchical Bayesian framework in their account of the typicality gradient of five animal categories and six artefact categories. For each target category, we consider all possible relevant contrast categories. Three separate issues are examined: (a) whether contrast effects can be found, (b) which categories are responsible for these effects, and (c) whether more than one category influences the typicality. Results indicate that the internal category structure is codetermined by dissimilarity towards potential contrast categories. In most cases, only a single contrast category contributed to the typicality. The present findings suggest that contrast effects might be more widespread than has previously been assumed. Further, they stress the importance of characteristics particular to everyday concepts, which require careful consideration when applying computational models of representation from the artificial category learning tradition to everyday concepts.
Psychonomic Bulletin & Review, 2008
Psychonomic Bulletin & Review, 2011
Inspired by Barsalou's proposal that categories can be represented by ideals, we develop and test a computational model, the ideal dimension model (IDM). The IDM is tested in its account of the typicality gradient in 11 superordinate natural language concepts and, using Bayesian model evaluation, contrasted with a standard exemplar model and a central prototype model. The IDM is found to capture typicality better than the exemplar model and the central tendency prototype model, both in terms of goodness-of-fit and generalizability. The present findings challenge the dominant view that exemplar representations are most successful, and present compelling evidence that superordinate natural language categories can be represented using an abstract summary, be it in the form of ideal representations. A supplemental appendix for this article can be downloaded from http://mc.psychonomic-journals.org/content/supplemental.
Memory & Cognition, 2013
The finding that the typicality gradient in goal-derived categories is mainly driven by ideals rather than by exemplar similarity has stood uncontested for nearly three decades. Due to the rather rigid earlier implementations of similarity, a key question has remained, namely whether a more flexible approach to similarity would alter the conclusions. In the present study, we evaluated whether a similarity-based approach that allows for dimensional weighting could account for findings in goal-derived categories. To this end, we compared a computational model of exemplar similarity (the generalized context model; Nosofsky, Journal of Experimental Psychology: General 115:39-57, 1986) and a computational model of ideal representation (the ideal-dimension model; Voorspoels, Vanpaemel, & Storms, Psychonomic Bulletin & Review 18:1006-114, 2011) in their accounts of exemplar typicality in ten goal-derived categories. In terms of both goodness-of-fit and generalizability, we found strong evidence for an ideal approach in nearly all categories. We conclude that focusing on a limited set of features is necessary but not sufficient to account for the observed typicality gradient. A second aspect of ideal representations, namely that extreme rather than common, central-tendency values drive typicality, seems to be crucial.
Memory & Cognition, 2011
Both intuitively and according to similarity-based theories of induction, relevant evidence raises argument strength when it is positive and lowers it when it is negative. In three experiments, we tested the hypothesis that argument strength can actually increase when negative evidence is introduced. Two kinds of argument were compared through forced choice or sequential evaluation: single positive arguments (e.g., "Shostakovich's music causes alpha waves in the brain; therefore, Bach's music causes alpha waves in the brain") and double mixed arguments (e.g., "Shostakovich's music causes alpha waves in the brain, X's music DOES NOT; therefore, Bach's music causes alpha waves in the brain"). Negative evidence in the second premise lowered credence when it applied to an item X from the same subcategory (e.g., Haydn) and raised it when it applied to a different subcategory (e.g., AC/DC). The results constitute a new constraint on models of induction.