Active Listener: Continuous Generation of Listener’s Head Motion Response in Dyadic Interactions
Ghosh, Bishal, Li, Emma ORCID: https://orcid.org/0000-0003-4200-0669 and Guha, Tanaya ORCID: https://orcid.org/0000-0003-2167-4891 (2024) Active Listener: Continuous Generation of Listener’s Head Motion Response in Dyadic Interactions. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025), Hyderabad, India, 6-11 April 2025, ISBN 9798350368741 (doi: 10.1109/ICASSP49660.2025.10889429)
Abstract
A key component of dyadic spoken interactions is contextually relevant non-verbal gestures, such as head movements that reflect a listener’s response to the interlocutor’s speech. Although significant progress has been made in generating co-speech gestures, generating a listener’s response has remained a challenge. We introduce the task of generating a listener’s continuous head motion in response to the speaker’s speech in real time. To this end, we propose a graph-based end-to-end crossmodal model that takes the interlocutor’s speech audio as input and directly generates the listener’s head pose angles (roll, pitch, yaw) in real time. Unlike previous work, our approach is completely data-driven, does not require manual annotations, and does not oversimplify head motion to mere nods and shakes. Extensive evaluation on the dyadic interaction sessions of the IEMOCAP dataset shows that our model produces a low overall error (4.5 degrees) and a high frame rate, indicating its deployability in real-world human-robot interaction systems. Our code is available at https://github.com/bigzen/Active-Listener
| Item Type: | Conference Proceedings |
|---|---|
| Additional Information: | We thankfully acknowledge support from EPSRC DTP EP/W524359/1 and Intel Corporation for this work. |
| Keywords: | Speech analysis, gesture generation, crossmodal analysis, dyadic interaction. |
| Status: | Published |
| Refereed: | Yes |
| Glasgow Author(s) Enlighten ID: | Li, Dr Emma and Guha, Dr Tanaya and Ghosh, Mr Bishal |
| Authors: | Ghosh, B., Li, E., and Guha, T. |
| College/School: | College of Science and Engineering > School of Computing Science |
| ISSN: | 2379-190X |
| ISBN: | 9798350368741 |
| Copyright Holders: | Copyright © 2025 IEEE |
| Publisher Policy: | Reproduced in accordance with the publisher copyright policy |
Funder and Project Information
DTP 2224 University of Glasgow
Christopher Pearce
EP/W524359/1
US - Office of the Vice Principals
Deposit and Record Details
| ID Code: | 345947 |
|---|---|
| Depositing User: | Mrs Nora Helle |
| Datestamp: | 21 Jan 2025 10:17 |
| Last Modified: | 04 Jun 2025 01:31 |
| Date of acceptance: | 20 December 2024 |
| Date Deposited: | 21 January 2025 |
| Data Availability Statement: | No |