Active Listener: Continuous Generation of Listener’s Head Motion Response in Dyadic Interactions

Ghosh, Bishal, Li, Emma ORCID: https://orcid.org/0000-0003-4200-0669 and Guha, Tanaya ORCID: https://orcid.org/0000-0003-2167-4891 (2024) Active Listener: Continuous Generation of Listener’s Head Motion Response in Dyadic Interactions. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025), Hyderabad, India, 6-11 April 2025, ISBN 9798350368741 (doi: 10.1109/ICASSP49660.2025.10889429)

Abstract

A key component of dyadic spoken interactions is contextually relevant non-verbal gestures, such as head movements that reflect a listener’s response to the interlocutor’s speech. Although significant progress has been made in generating co-speech gestures, generating a listener’s response remains a challenge. We introduce the task of generating a listener’s continuous head motion in response to the speaker’s speech in real time. To this end, we propose a graph-based end-to-end crossmodal model that takes the interlocutor’s speech audio as input and directly generates the listener’s head pose angles (roll, pitch, yaw) in real time. Unlike previous work, our approach is completely data-driven, requires no manual annotations, and does not oversimplify head motion to mere nods and shakes. Extensive evaluation on the dyadic interaction sessions of the IEMOCAP dataset shows that our model produces a low overall error (4.5 degrees) and a high frame rate, indicating its deployability in real-world human-robot interaction systems. Our code is available at https://github.com/bigzen/Active-Listener

Item Type: Conference Proceedings
Additional Information: We thankfully acknowledge support from EPSRC DTP EP/W524359/1 and Intel Corporation for this work.
Keywords: Speech analysis, gesture generation, crossmodal analysis, dyadic interaction.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Li, Dr Emma and Guha, Dr Tanaya and Ghosh, Mr Bishal
Authors: Ghosh, B., Li, E., and Guha, T.
College/School: College of Science and Engineering > School of Computing Science
ISSN: 2379-190X
ISBN: 9798350368741
Copyright Holders: Copyright © 2025 IEEE
Publisher Policy: Reproduced in accordance with the publisher copyright policy

Funder and Project Information

DTP 2224 University of Glasgow

Christopher Pearce

EP/W524359/1

US - Office of the Vice Principals

Deposit and Record Details

ID Code: 345947
Depositing User: Mrs Nora Helle
Datestamp: 21 Jan 2025 10:17
Last Modified: 04 Jun 2025 01:31
Date of acceptance: 20 December 2024
Date Deposited: 21 January 2025
Data Availability Statement: No