Vincent Berthiaume - Academia.edu

Papers by Vincent Berthiaume

Comparing the inductive biases of simple neural networks and Bayesian models

Cognitive Science, 2012

Thomas L. Griffiths (tom_griffiths@berkeley.edu), Joseph L. Austerweil (joseph.austerweil@gmail.com), Vincent G. Berthiaume (vberthiaume@berkeley.edu), Department of Psychology, University of California, Berkeley, CA 94720 USA. Abstract: Understanding the relationship between connectionist and probabilistic models is important for evaluating the compatibility of these approaches. We use mathematical analyses and computer simulations to show that a linear neural network can approximate the generalization performance of a probabilistic model of property induction, and that training this network by gradient descent with early stopping results in similar performance to Bayesian inference with a particular prior. However, this prior differs from distributions defined using discrete structure, suggesting that neural networks have inductive biases that can be differentiated from probabilistic models with stru...
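As a rough illustration of the training regime named in the abstract above (gradient descent with early stopping on a linear network), here is a minimal sketch. It is not the authors' model: the one-weight network, the toy data, the learning rate, and the step counts are all assumptions chosen for illustration. It only shows the general intuition that stopping early leaves the weight closer to its zero initialization than full training does, which is qualitatively similar to the effect of a prior favoring small weights.

```python
def train_linear(xs, ys, lr=0.01, steps=5):
    """Gradient descent on mean squared error for y ~ w * x, starting at w = 0.

    Hypothetical one-weight 'linear network'; all names and values are
    illustrative assumptions, not taken from the paper.
    """
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of (1/n) * sum((w*x - y)^2) with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


# Toy data generated by y = 2 * x (assumed for illustration).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_early = train_linear(xs, ys, steps=2)    # stopped early: still close to 0
w_late = train_linear(xs, ys, steps=200)   # trained long: near the least-squares fit
print(w_early, w_late)
```

The number of training steps plays the role of a regularization knob here: fewer steps means a solution pulled more strongly toward the starting point.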

Where is Sally's marble? Learning to predict others' actions in true and false belief situations

Theory of Mind (ToM) is the lay person’s understanding that others have mental states. Central to the field of social cognition, this ability is fundamental to any individual living in society, if only to account for others’ points of view and beliefs. The standard False Belief (FB) task tests whether subjects can use a protagonist’s belief to predict that she will search for an object where she last saw it (e.g., Wimmer & Perner, 1983). There are four different possible conditions, depending on whether the protagonist pays attention (A) or not (NA) when the object is moved (M) from location X to Y or not (NM). Children younger than four usually fail the task, predicting in all conditions that the protagonist will search where the object ends up. However, recent work (Onishi & Baillargeon, 2005) showed 15-month-olds succeeding at a visual version of the task. Would it be possible by 15 months to have learned, through observation, to make accurate predictions about where a protagonis...

A Computational Developmental Model of the Implicit False Belief Task

Vincent G. Berthiaume (Vincent.Berthiaume@McGill.ca), Department of Psychology, 1205 Dr. Penfield Avenue, Montreal, Qc, H3A 1B1 Canada; Kristine H. Onishi (Kris.Onishi@McGill.ca), Department of Psychology, 1205 Dr. Penfield Avenue, Montreal, Qc, H3A 1B1 Canada; Thomas R. Shultz (Thomas.Shultz@McGill.ca), Department of Psychology and School of Computer Science, 1205 Dr. Penfield Avenue, Montreal, Qc, H3A 1B1 Canada. [...] years and eight months (Wellman, Cross, & Watson, 2001) will typically say that Sally will search in the box, an expectation that is consistent with an omniscient ToM – i.e., Sally will search in the actual location of the object. An older child will instead typically say that Sally will search in the basket, an expectation that is consistent with a representational ToM – i.e., Sally will search for the object in accord with her mental representation of its location. Recently, 15-month-olds were shown to solve a...

White- and Grey-Matter Damage Differentially Impair Learning and Generalization in a Computational Model of the Raven Matrices Task

Vincent G. Berthiaume (Vincent.Berthiaume@McGill.ca), Department of Psychology, McGill University, 1205 Dr. Penfield Avenue, Montreal, QC H3A 1B1 Canada; Thomas R. Shultz (Thomas.Shultz@McGill.ca), Department of Psychology and School of Computer Science, McGill University, 1205 Dr. Penfield Avenue, Montreal, QC H3A 1B1 Canada; Olaf Dammann (ODammann@TuftsMedicalCenter.org), Division of Newborn Medicine, Floating Hospital for Children at Tufts Medical Center, 800 Washington Street, Box 854, Boston, MA 02111 USA. [...] preterm brains (Leviton & Paneth, 1990). By contrast, grey matter consists of neuronal cell bodies and its damage is usually more constrained in the preterm brain (Billiards, Pierson, Haynes, Folkerth, & Kinney, 2006). Although the association between cognitive impairments and brain damage is well known in the pediatric community, not much is known about ...

A systematic comparison of flat and standard cascade-correlation using a student-teacher network approximation task

http://dx.doi.org/10.1080/09540090701528951, Aug 28, 2007

Cascade-correlation (cascor) networks grow by recruiting hidden units to adjust their computational power to the task being learned. The standard cascor algorithm recruits each hidden unit on a new layer, creating deep networks. In contrast, the flat cascor variant adds all recruited hidden units on a single hidden layer. Student-teacher network approximation tasks were used to investigate the ability of flat and standard cascor networks to learn the input-output mapping of other, randomly initialized flat and standard cascor networks. For low-complexity approximation tasks, there was no significant performance difference between flat and standard student networks. Contrary to the common belief that standard cascor does not generalize well due to cascading weights creating deep networks, we found that both standard and flat cascor generalized well on problems of varying complexity. On high-complexity tasks, flat cascor networks had fewer connection weights and learned with less computational cost than standard networks did.
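To make the structural difference between the two variants concrete, the sketch below counts connection weights for a flat and a standard cascor network with the same number of recruited hidden units. It assumes the usual cascade-correlation wiring (a bias unit, direct input-to-output connections, and, in the standard variant, each new hidden unit receiving input from all previously recruited hidden units); the function name and the example sizes are hypothetical and not taken from the paper.

```python
def cascor_weight_count(n_inputs, n_hidden, n_outputs, flat):
    """Count connection weights (bias included) in a cascor-style network.

    Assumed wiring, for illustration only:
    - every hidden unit receives all inputs plus a bias;
    - in the standard (deep) variant, hidden unit k also receives the
      outputs of the k hidden units recruited before it;
    - every output unit receives all inputs, the bias, and all hidden units.
    """
    bias = 1
    hidden_weights = 0
    for k in range(n_hidden):
        cascaded = 0 if flat else k  # extra hidden-to-hidden weights
        hidden_weights += n_inputs + bias + cascaded
    output_weights = n_outputs * (n_inputs + bias + n_hidden)
    return hidden_weights + output_weights


flat_count = cascor_weight_count(3, 4, 2, flat=True)
standard_count = cascor_weight_count(3, 4, 2, flat=False)
print(flat_count, standard_count)
```

Under these assumptions, for equal numbers of hidden units h, the standard network carries h(h-1)/2 additional cascading weights, so the gap grows quadratically with the number of recruits. Actual networks in the study may have recruited different numbers of units in each variant.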

Where is Sally's marble? Learning to predict others' actions in true and false belief situations

Cognitio 2007, Jun 15, 2007

A computational developmental model of the implicit false belief task

Do children understand that others have mental representations, for instance, mental representations of an object's location? This understanding, known as a representational Theory of Mind (ToM), has typically been studied using false-belief (FB) tasks. Standard, verbal FB tasks test whether a child can use protagonists' beliefs to say that they will search for objects where they last saw them. Whereas children under 3.5 years typically fail the task and expect protagonists to search where objects are (expectation consistent with an omniscient ToM), older children expect protagonists to search where they last saw the objects (expectation consistent with a representational ToM). Recently, 15-month-olds were shown to succeed at a visual, implicit version of the task. We present a sibling-descendant cascade-correlation connectionist model that learns to succeed at an implicit FB task. When trained on twice as many true- as false-belief trials, our model reproduced the omniscient-to-representational transition observed in explicit tasks. That is, networks first had expectations consistent with an omniscient ToM, and after further training had expectations consistent with a representational ToM. Thus, our model predicts that infants may also go through a transition on the implicit task, and suggests that this transition may be due in part to people holding more true than false beliefs.

A systematic comparison of flat and standard cascade-correlation using a student–teacher network approximation task

Connection Science, 2007

Cascade-correlation (cascor) networks grow by recruiting hidden units to adjust their computational power to the task being learned. The standard cascor algorithm recruits each hidden unit on a new layer, creating deep networks. In contrast, the flat cascor variant adds all recruited hidden units on a single hidden layer. Student-teacher network approximation tasks were used to investigate the ability of flat and standard cascor networks to learn the input-output mapping of other, randomly initialized flat and standard cascor networks. For low-complexity approximation tasks, there was no significant performance difference between flat and standard student networks. Contrary to the common belief that standard cascor does not generalize well due to cascading weights creating deep networks, we found that both standard and flat cascor generalized well on problems of varying complexity. On high-complexity tasks, flat cascor networks had fewer connection weights and learned with less computational cost than standard networks did.

A constructivist connectionist model of transitions on false-belief tasks

Cognition, 2013

How do children come to understand that others have mental representations, e.g., of an object's location? Preschoolers go through two transitions on verbal false-belief tasks, in which they have to predict where an agent will search for an object that was moved in her absence. First, while three-and-a-half-year-olds usually fail at approach tasks, in which the agent wants to find the object, children just under four succeed. Second, only after four do children succeed at tasks in which the agent wants to avoid the object. We present a constructivist connectionist model that autonomously reproduces the two transitions and suggests that the transitions are due to increases in general processing abilities enabling children to (1) overcome a default true-belief attribution by distinguishing false- from true-belief situations, and to (2) predict search in avoidance situations, where there is often more than one correct, empty search location. Constructivist connectionist models are rigorous, flexible and powerful tools that can be analyzed before and after transitions to uncover novel and emergent mechanisms of cognitive development.

White- and Grey-Matter Damage Differentially Impair Learning and Generalization in a Computational Model of the Raven Matrices Task

Many preterm neonates have white-matter damage (WMD, damaged connections between neurons) and grey-matter damage (GMD, dead neurons). These children are known to have lower IQs than their full-term peers, yet the mechanisms underlying this association are poorly understood. We designed a developmental connectionist model of the Raven Matrices IQ task in which (1) all neurons had intact output, simulating normal development, or (2) half the neurons had noisy output, simulating noisy transmission or WMD, or (3) half the neurons had no output, simulating cell death or GMD. We found that damage increased task error. Further, WMD was worse than GMD overall, yet GMD was at once worse for generalization problems not given in training and better for training problems. Our model is the first to simulate an effect of perinatal brain damage on a cognitive task, and predicts that different types of brain damage may lead to different cognitive impairments.
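The three conditions described in the abstract above can be sketched as a transformation applied to a layer's unit outputs. This is a hedged illustration, not the authors' implementation: the function name, the additive Gaussian form of the noise, its standard deviation, and the random choice of which half of the units is damaged are all assumptions.

```python
import random


def apply_damage(activations, condition, noise_sd=0.5, seed=0):
    """Return unit outputs under one of three hypothetical damage conditions.

    'intact': all outputs unchanged (normal development).
    'wmd':    half the units get additive Gaussian noise (assumed noise model).
    'gmd':    half the units output nothing, i.e. 0.0 (simulated cell death).
    """
    if condition not in ("intact", "wmd", "gmd"):
        raise ValueError(f"unknown condition: {condition}")
    rng = random.Random(seed)
    n = len(activations)
    # Randomly pick which half of the units is damaged (an assumption).
    damaged = set(rng.sample(range(n), n // 2))
    out = []
    for i, a in enumerate(activations):
        if condition == "intact" or i not in damaged:
            out.append(a)
        elif condition == "wmd":
            out.append(a + rng.gauss(0.0, noise_sd))
        else:  # "gmd"
            out.append(0.0)
    return out


acts = [0.2, 0.4, 0.6, 0.8]
for cond in ("intact", "wmd", "gmd"):
    print(cond, apply_damage(acts, cond))
```

Fixing the seed keeps the same units damaged across the WMD and GMD conditions, which makes the two easier to compare in a toy setting.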

Toddlers' transitions on non-verbal false-belief tasks involving a novel location: A constructivist connectionist model

Some argue that children learn a Theory of Mind (ToM), the understanding that others have mental states, at around 3.5 years. This is evidenced by their transition from failure to success on verbal false-belief tasks, when they begin to verbally predict an actress will search for a toy where she falsely believes it to be, rather than in its actual location. However, nonverbal measures have recently been used to show that children in their second year of life may already have some understanding of others' false beliefs. We present a Sibling-Descendant Cascade-Correlation neural-network model of one study that found 25-month-old toddlers correctly anticipated an actress would search according to her false belief. Networks were trained on true- and false-belief search patterns, simulating toddlers' everyday experience with true and false beliefs, and then tested on nonverbal true- and false-belief tasks involving a novel location. Networks transitioned from incorrectly predicting true-belief searches in both true- and false-belief tasks to making correct predictions in both tasks. Our model thus (1) reproduced the transition that has been observed in older children and (2) generalized its learning to a novel location. The model can be used to refine our understanding of the transitions while again demonstrating the usefulness of SDCC as an algorithm for modeling cognitive development.

Bootstrapping syntax from morpho-phonology

It has been a puzzle how the syntax of natural language could be learned from positive evidence alone. Here we present a hybrid neural-network model in which artificial syntactic categories are acquired through unsupervised competitive learning due to grouping together lexical words with consistent phonological endings. These relatively large syntactic categories then become target signals for a feed-forward error-reducing network that learns to pair these lexical items with smaller numbers of function words to form phrases. This hybrid model learns phrasal syntax from positive evidence alone, while covering the essential findings in recent experiments on adult humans learning an artificial language. The model further predicts generalization to novel lexical words (exceptions) from knowledge of function words.
