Michel Valstar | University of Nottingham

Papers by Michel Valstar

Automatic Behaviour Understanding in Medicine

Now that Affective Computing and Social Signal Processing methods are becoming increasingly robust and accurate, novel areas of application with significant societal impact are opening up for exploration. Perhaps one of the most promising areas is the application of automatic expressive behaviour understanding to help diagnose, monitor, and treat medical conditions that themselves alter a person's social and affective signals. This work argues that this is now essentially a new area of research, called behaviomedics. It gives a definition of the area, discusses the most important groups of medical conditions that could benefit from this, and makes suggestions for future directions.

FACIAL ACTION UNIT RECOGNITION USING TEMPORAL TEMPLATES

Automatic recognition of human facial expressions is a challenging problem with many applications in human-computer interaction. Most of the existing facial expression analyzers succeed only in recognizing a few emotional facial expressions, such as anger or happiness.

Meta-Analysis of the First Facial Expression Recognition Challenge

Automatic facial expression recognition has been an active topic in computer science for over two decades, in particular Facial Action Coding System (FACS) action unit (AU) detection and classification of a number of discrete emotion states from facial expressive imagery. Standardization and comparability have received some attention; for instance, there exist a number of commonly used facial expression databases.

Emotionally aware automated portrait painting

We combine a machine vision system that recognises emotions and a non-photorealistic rendering (NPR) system to automatically produce portraits which heighten the emotion of the sitter. To do this, the vision system analyses a short video clip of a person expressing an emotion, then tracks the movement of facial features and uses this tracking data to analyse which emotion was expressed and what the temporal dynamics of the expression were.

Facial action unit detection using probabilistic actively learned support vector machines on tracked facial point data

A system that could enable fast and robust facial expression recognition would have many applications in behavioral science, medicine, security and human-machine interaction. While working toward that goal, we do not attempt to recognize prototypic facial expressions of emotions but analyze subtle changes in facial behavior by recognizing facial muscle action units (AUs, i.e., atomic facial signals) instead. By detecting AUs we can analyze many more facial communicative signals than emotional expressions alone.
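
The pairing of a probabilistic SVM with active learning lends itself to a compact illustration. The sketch below shows pool-based uncertainty sampling with scikit-learn's SVC; the data, feature dimensionality, and query budget are placeholder assumptions, not the paper's actual tracked-point pipeline.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Hypothetical pool: rows are clips, columns are facial-point displacement features.
X_pool = rng.normal(size=(500, 40))
y_pool = (X_pool[:, 0] + 0.5 * X_pool[:, 1] > 0).astype(int)  # stand-in AU labels

labelled = list(rng.choice(len(X_pool), size=20, replace=False))
unlabelled = [i for i in range(len(X_pool)) if i not in labelled]

for _ in range(10):                                 # active-learning rounds
    clf = SVC(kernel="rbf", probability=True)
    clf.fit(X_pool[labelled], y_pool[labelled])
    # Query the pool sample the classifier is least sure about (prob near 0.5).
    probs = clf.predict_proba(X_pool[unlabelled])[:, 1]
    query = unlabelled[int(np.argmin(np.abs(probs - 0.5)))]
    labelled.append(query)          # in practice: a human FACS coder labels it
    unlabelled.remove(query)
```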

D8.1: Report on available data and annotations, identification of experimental procedures

A collaborative effort within WP8 of the SSPNet Network of Excellence is to automatically model, detect, and interpret social signals displayed in political debates. For this work, appropriate data (recordings of political debates and other relevant human behaviours) and data annotations were needed that the WP8 partners can use in their research. This document describes the data and annotations that will serve as the basis for research in WP8, and the way in which the data will be used.

Motion history for facial action detection in video

Enabling computer systems to recognize human facial expressions is a challenging research problem with many applications in behavioral science, medicine, security, and human-machine interaction. Instead of being another approach to automatic detection of prototypic facial expressions of emotion, this work attempts to analyze subtle changes in facial behavior by recognizing facial action units (AUs, i.e., atomic facial signals) that produce expressions.
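
Motion-history approaches of this kind are typically built on the Motion History Image, where recently moved pixels are bright and older motion decays. Below is a minimal sketch of the update rule, with an assumed frame-differencing threshold and decay horizon rather than the paper's parameters.

```python
import numpy as np

def update_mhi(mhi, prev_frame, frame, tau=30, diff_thresh=15):
    """One MHI step: moving pixels reset to tau, the rest decay by one."""
    moving = np.abs(frame.astype(int) - prev_frame.astype(int)) > diff_thresh
    return np.where(moving, tau, np.maximum(mhi - 1, 0))

frames = [np.random.randint(0, 256, (120, 160), np.uint8) for _ in range(5)]
mhi = np.zeros((120, 160), dtype=int)
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur)   # bright = recent motion, dark = old/none
```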

Local Evidence Aggregation for Regression Based Facial Point Detection

We propose a new algorithm to detect facial points in frontal and near-frontal face images. It combines a regression-based approach with a probabilistic graphical model-based face shape model that restricts the search to anthropomorphically consistent regions.
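
To make the idea of local evidence aggregation concrete, the sketch below accumulates offset votes from image patches into a confidence map for a single facial point. The regressor is a stub standing in for a learned appearance-based predictor, and the shape-model constraint is only indicated in a comment.

```python
import numpy as np

H, W = 100, 100
true_pt = np.array([40.0, 60.0])   # hypothetical ground-truth point (row, col)

def predict_offset(patch_centre):
    """Stub regressor: a noisy vote from a local patch toward the target."""
    return true_pt - patch_centre + np.random.normal(scale=3.0, size=2)

vote_map = np.zeros((H, W))
for r in range(0, H, 4):            # patch centres sampled on a coarse grid
    for c in range(0, W, 4):
        centre = np.array([r, c], dtype=float)
        vr, vc = np.round(centre + predict_offset(centre)).astype(int)
        if 0 <= vr < H and 0 <= vc < W:
            vote_map[vr, vc] += 1   # aggregate the local evidence

# A face shape model would zero out anthropomorphically implausible regions here.
best = np.unravel_index(np.argmax(vote_map), vote_map.shape)
print("detected point:", best, "truth:", true_pt)
```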

A demonstration of audiovisual sensitive artificial listeners

Sensitive artificial listeners (SAL) are virtual dialogue partners who, despite their very limited verbal understanding, intend to engage the user in a conversation by paying attention to the user's emotions and non-verbal expressions. The SAL characters have their own emotionally defined personality, and attempt to drag the user towards their dominant emotion through a combination of verbal and non-verbal expression.

Mahnob HMI IBug Mimicry Database (MHI-MIMICRY)

This database contains recordings of experiments conducted to analyse human interaction, in particular mimicry, and to elaborate on theoretical hypotheses about the relationship between the occurrence of mimicry and human affect. The recorded experiments are designed to explore this relationship.

Emotionally aware automated portrait painting demonstration

We propose to demonstrate the emotionally aware Painting Fool, a novel system that combines a machine vision system able to recognise emotions with a non-photorealistic rendering (NPR) system to automatically produce portraits of the sitter in an emotionally enhanced style. During the demonstration, the vision system records a short video clip of a person showing a basic emotion. The system then analyses this video clip, locating facial features and tracking their motion.

Come and have an emotional workout with sensitive artificial listeners!

This demonstration aims to showcase the recently completed SEMAINE system. The SEMAINE system is a publicly available, fully autonomous Sensitive Artificial Listeners (SAL) system that consists of virtual dialog partners based on audiovisual analysis and synthesis (see http://semaine.opendfki.de/wiki). The system runs in real time, and combines incremental analysis of user behavior, dialog management, and synthesis of speaker and listener behavior of a SAL character, displayed as a virtual agent.

Facial Action Detection using Block-based Pyramid Appearance Descriptors

Facial expression is one of the most important non-verbal behavioural cues in social signals. Constructing an effective face representation from images is an essential step for successful facial behaviour analysis. Most existing face descriptors operate at a single scale and do not exploit coarse-to-fine representations such as image pyramids. In this work, we propose the sparse appearance descriptors Block-based Pyramid Local Binary Pattern (B-PLBP) and Block-based Pyramid Local Phase Quantisation (B-PLPQ).
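
A block-based pyramid LBP descriptor of this flavour can be sketched with scikit-image: compute uniform LBP at several Gaussian-pyramid levels, histogram it per block, and concatenate. The scale count, block grid, and (P, R) settings here are illustrative assumptions, not the published B-PLBP configuration.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from skimage.transform import pyramid_gaussian

def block_pyramid_lbp(face, levels=3, grid=(4, 4), P=8, R=1):
    """Concatenate per-block uniform-LBP histograms over pyramid levels."""
    feats = []
    for img in list(pyramid_gaussian(face, max_layer=levels - 1))[:levels]:
        img_u8 = (img * 255).astype(np.uint8)       # LBP prefers integer images
        lbp = local_binary_pattern(img_u8, P, R, method="uniform")
        for row in np.array_split(lbp, grid[0], axis=0):
            for block in np.array_split(row, grid[1], axis=1):
                hist, _ = np.histogram(block, bins=P + 2, range=(0, P + 2),
                                       density=True)
                feats.append(hist)
    return np.concatenate(feats)

face = np.random.rand(128, 128)   # stand-in for an aligned grey face crop
descriptor = block_pyramid_lbp(face)
print(descriptor.shape)           # 3 levels x 16 blocks x 10 bins = (480,)
```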

Facial Action Recognition using sparse appearance descriptors and their pyramid representations

Most existing work on automatic analysis of facial expressions has focused on a small set of prototypic emotional facial expressions such as fear, happiness, and surprise. The system proposed here enables detection of a much larger range of facial behaviour by detecting facial muscle actions (action units, AUs). It automatically detects all nine upper-face AUs using local appearance descriptors. We also investigate the merits of the family of local binary pattern descriptors.

AVEC 2012 – The Continuous Audio/Visual Emotion Challenge

We present the second Audio-Visual Emotion recognition Challenge and workshop (AVEC 2012), which aims to bring together researchers from the audio and video analysis communities around the topic of emotion recognition. The goal of the challenge is to recognise four continuously valued affective dimensions: arousal, expectancy, power, and valence.
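
Continuous affect challenges in this series score per-dimension prediction traces against gold-standard annotations, commonly with a correlation-based measure. The sketch below uses plain Pearson correlation on synthetic traces, which approximates but is not guaranteed to match the official AVEC 2012 protocol.

```python
import numpy as np

def pearson(pred, gold):
    """Plain Pearson correlation between two 1-D traces."""
    p, g = pred - pred.mean(), gold - gold.mean()
    return float(p @ g / (np.linalg.norm(p) * np.linalg.norm(g)))

dims = ["arousal", "expectancy", "power", "valence"]
T = 200                                   # frames in one hypothetical session
gold = {d: np.random.rand(T) for d in dims}
pred = {d: gold[d] + 0.3 * np.random.randn(T) for d in dims}  # noisy predictions

scores = {d: pearson(pred[d], gold[d]) for d in dims}
print(scores, "mean:", np.mean(list(scores.values())))
```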

Facial Action Unit Detection in Face Video

Enabling computer systems to correctly analyse human behavior is an unsolved problem in Artificial Intelligence. Recognizing human facial expressions automatically by a robot or computer is an important aspect of this analysis. A system that enables fast and robust facial action unit (AU, i.e., atomic facial signal) recognition would have many applications in behavioral science, medicine, security and man-machine interaction.

Facial action unit recognition using temporal templates

Automatic recognition of human facial expressions is a challenging problem with many applications in human-computer interaction. Most of the existing facial expression analyzers succeed only in recognizing a few emotional facial expressions, such as anger or happiness.

Learning spatio-temporal models of facial expressions

The human face is used to regulate the conversation by gazing or nodding, to interpret what has been said by lip reading, and to communicate and understand somebody's affective state and intentions on the basis of the shown facial expression. Machine understanding of human facial behavior could revolutionize human-machine interaction technologies and fields as diverse as security, behavioral science, medicine, communication, and education.

Building Autonomous Sensitive Artificial Listeners

This paper describes a substantial effort to build a real-time interactive multimodal dialogue system with a focus on emotional and non-verbal interaction capabilities. The work is motivated by the aim to provide technology with competences in perceiving and producing the emotional and non-verbal behaviours required to sustain a conversational dialogue. We present the Sensitive Artificial Listener (SAL) scenario as a setting which seems particularly suited for the study of emotional and non-verbal behaviour, since it requires only very limited verbal understanding on the part of the machine. This scenario allows us to concentrate on non-verbal capabilities without having to address at the same time the challenges of spoken language understanding, task modelling, etc. We first report on three prototype versions of the SAL scenario, in which the behaviour of the Sensitive Artificial Listener characters was determined by a human operator. These prototypes served the purpose of verifying the effectiveness of the SAL scenario and allowed us to collect data required for building system components for analysing and synthesising the respective behaviours. We then describe the fully autonomous integrated real-time system we created, which combines incremental analysis of user behaviour, dialogue management, and synthesis of speaker and listener behaviour of a SAL character displayed as a virtual agent. We discuss principles that should underlie the evaluation of SAL-type systems. Since the system is designed for modularity and reuse, and since it is publicly available, the SAL system has potential as a joint research tool in the affective computing research community.
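
The analyse, manage, synthesise decomposition described above can be caricatured in a few lines. The sketch below is a deliberately minimal event loop with stub components; the function names and the rule in the dialogue manager are invented for illustration and bear no relation to the SEMAINE implementation.

```python
import random

def analyse_user(av_input):
    """Stub: incremental estimate of the user's affective state per tick."""
    return {"valence": random.uniform(-1, 1), "arousal": random.uniform(-1, 1)}

def dialogue_manager(state, character="listener"):
    """Stub: choose a listener/speaker action from the estimated state."""
    if state["arousal"] > 0.5:
        return {"action": "backchannel", "style": "energetic"}
    return {"action": "utterance", "style": character}

def synthesise(action):
    """Stub: render the agent's speech and non-verbal behaviour."""
    print("agent does:", action)

for tick in range(3):          # the real system runs this loop continuously
    state = analyse_user(av_input=None)
    synthesise(dialogue_manager(state))
```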

Fully Automatic Recognition of the Temporal Phases of Facial Actions

Past work on automatic analysis of facial expressions has focused mostly on detecting prototypic expressions of basic emotions like happiness and anger. The method proposed here enables detection of a much larger range of facial behaviour by recognising facial muscle actions (action units, AUs) that compound expressions. AUs are agnostic, leaving the inference about conveyed intent to higher order decision making (e.g., emotion recognition). The proposed fully automatic method not only allows recognition of 22 AUs, but also explicitly models their temporal characteristics (i.e., sequences of temporal segments: neutral, onset, apex, and offset). To do so it uses a facial point detector based on Gabor-feature-based boosted classifiers to automatically localise 20 facial fiducial points. These points are tracked through a sequence of images using a method called particle filtering with factorized likelihoods. To encode AUs and their temporal activation models based on the tracking data, it applies a combination of GentleBoost, Support Vector Machines, and Hidden Markov Models. We attain an average AU recognition rate of 95.3% when tested on a benchmark set of deliberately displayed facial expressions and 72% when tested on spontaneous expressions.
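
The last stage of this pipeline, segmenting a per-frame AU activation track into neutral, onset, apex, and offset phases, is naturally expressed as Viterbi decoding over a small left-to-right HMM. The sketch below uses hand-set transition and emission models purely for illustration; the paper learns these components from data via GentleBoost, SVMs, and HMMs.

```python
import numpy as np

phases = ["neutral", "onset", "apex", "offset"]
# Plausible left-to-right phase cycle: neutral -> onset -> apex -> offset -> neutral.
A = np.array([[0.9, 0.1, 0.0, 0.0],
              [0.0, 0.8, 0.2, 0.0],
              [0.0, 0.0, 0.8, 0.2],
              [0.3, 0.0, 0.0, 0.7]])
pi = np.array([1.0, 0.0, 0.0, 0.0])      # sequences start in neutral

def emission(score):
    """Likelihood of a per-frame activation score in [0, 1] under each phase."""
    centres = np.array([0.0, 0.5, 1.0, 0.5])  # typical intensity per phase
    return np.exp(-0.5 * ((score - centres) / 0.25) ** 2)

def viterbi(scores):
    T, N = len(scores), len(phases)
    delta = np.zeros((T, N))
    back = np.zeros((T, N), int)
    delta[0] = np.log(pi + 1e-12) + np.log(emission(scores[0]) + 1e-12)
    for t in range(1, T):
        trans = delta[t - 1][:, None] + np.log(A + 1e-12)
        back[t] = trans.argmax(axis=0)
        delta[t] = trans.max(axis=0) + np.log(emission(scores[t]) + 1e-12)
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [phases[i] for i in reversed(path)]

track = np.array([0.0, 0.1, 0.4, 0.8, 1.0, 0.9, 0.5, 0.2, 0.0])
print(viterbi(track))  # expect a neutral -> onset -> apex -> offset -> neutral progression
```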
