The Multimodal Dyadic Behavior Dataset

Introduction

We introduce the Multimodal Dyadic Behavior (MMDB) dataset, a unique collection of multimodal (video, audio, and physiological) recordings of the social and communicative behavior of toddlers. The MMDB contains 160 sessions of 3-5 minute semi-structured play interaction between a trained adult examiner and a child between the ages of 15 and 30 months. Our play protocol is designed to elicit social attention, back-and-forth interaction, and non-verbal communication from the child. These behaviors reflect key socio-communicative milestones which are implicated in autism spectrum disorders. The MMDB dataset supports a novel problem domain for activity recognition: the decoding of dyadic social interactions between adults and children in a developmental context.

Goal

Our overall goal is to facilitate the development of novel computational methods for measuring and analyzing the behavior of children and adults during face-to-face social interactions. We have explored the automatic analysis of three aspects of the dataset:

- Parsing sessions into stages and substages
- Detection of discrete behaviors (gaze shifts, smiling, and play gestures)
- Prediction of engagement ratings at the stage and session levels

Data

We have collected 160 sessions of 5-minute interactions from 121 children. All multimodal signals are synchronized (a sketch of aligning sample indices across streams appears at the end of this page), including:

- 2 frontal-view Basler cameras (1920x1080 at 60 FPS)
- An overhead-view Kinect (RGB-D) camera
- 8 side-view and 3 overhead-view AXIS cameras (640x480 at 30 FPS)
- An omnidirectional and a cardioid microphone, both ceiling-mounted
- 2 wireless lapel microphones, worn by the child and the adult
- 4 Affectiva Q-sensors for electrodermal activity and accelerometry, worn by both the adult and the child

Annotations

The MMDB dataset contains fine-grained annotations of behaviors, including:

- Ratings of engagement and responsiveness at the substage level
- Frame-level, continuous annotation of relevant child behaviors (attention shifts, facial expressions, gestures, and vocalizations); a sketch of one way to represent such annotations follows below
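Because the streams are sampled at different rates, indexing the same moment across modalities is a common first step when working with synchronized recordings like these. Below is a minimal Python sketch, assuming all streams share a common session start time. The 60 FPS and 30 FPS camera rates come from the hardware list above; the EDA sampling rate of 32 Hz is an illustrative assumption, not a documented property of the dataset.

```python
# Minimal sketch of cross-stream index alignment, assuming all streams
# start at the same session time. Camera rates are from the sensor list
# above; the EDA rate is an assumption for illustration only.

BASLER_FPS = 60.0
AXIS_FPS = 30.0
EDA_HZ = 32.0  # assumed sampling rate, not documented for the Q-sensors

def basler_frame_to_time(frame_idx: int) -> float:
    """Convert a Basler frame index to seconds from session start."""
    return frame_idx / BASLER_FPS

def time_to_index(t: float, rate: float) -> int:
    """Map a session timestamp to the nearest sample index of a stream."""
    return round(t * rate)

# Example: find the AXIS frame and EDA sample that co-occur with
# Basler frame 1800 (i.e., 30 seconds into the session).
t = basler_frame_to_time(1800)
axis_frame = time_to_index(t, AXIS_FPS)  # -> 900
eda_sample = time_to_index(t, EDA_HZ)    # -> 960

print(f"t={t:.2f}s  axis_frame={axis_frame}  eda_sample={eda_sample}")
```

For the frame-level behavior annotations, one natural representation is a set of labeled intervals that can be expanded to one label per video frame. The sketch below shows this idea; the class name, behavior labels, and interval values are hypothetical and do not reflect the dataset's actual annotation format.

```python
# Sketch of interval-based behavior annotations expanded to per-frame
# labels for a 60 FPS stream. All values below are invented for
# illustration; the MMDB annotation files may be structured differently.

from dataclasses import dataclass

@dataclass
class BehaviorInterval:
    label: str        # e.g. "attention_shift", "smile", "gesture"
    start_frame: int  # first frame of the behavior (inclusive)
    end_frame: int    # last frame of the behavior (inclusive)

def to_framewise(intervals, num_frames, background="none"):
    """Expand interval annotations into one label per video frame."""
    labels = [background] * num_frames
    for iv in intervals:
        for f in range(max(iv.start_frame, 0), min(iv.end_frame, num_frames - 1) + 1):
            labels[f] = iv.label
    return labels

# Hypothetical annotations for a short clip:
annotations = [
    BehaviorInterval("attention_shift", 120, 150),
    BehaviorInterval("smile", 300, 420),
]
framewise = to_framewise(annotations, num_frames=600)
print(framewise[130], framewise[350], framewise[500])  # attention_shift smile none
```

Per-frame label vectors like this plug directly into standard frame-level detection metrics, while the interval form stays convenient for computing substage-level statistics.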