Social learning and multimodal interaction for designing artificial agents

International Conference on Multimodal Interaction ICMI 2016

Tokyo, JAPAN
November 16, 2016

August 28, 2016
September 4th 2016 *FINAL EXTENSION*
September 11th, 2016:
Submission deadline

September 28, 2016:
Notification of acceptance

October 8, 2016:
Camera ready submission







09:00 - 

Welcome and Introduction

09:05 - 

Louis-Philippe Morency


Modeling Human Communication Dynamics

Human face-to-face communication is a little like a dance, in that participants continuously adjust their behaviors based on verbal and nonverbal cues from the social context. Today's computers and interactive devices still lack many of these human-like abilities to hold fluid and natural interactions. Leveraging recent advances in machine learning, audio-visual signal processing and computational linguistics, my research focuses on creating computational technologies able to analyze, recognize and predict subtle human communicative behaviors in social context. I formalize this new research endeavor with a Human Communication Dynamics framework, addressing four key computational challenges: behavioral dynamic, multimodal dynamic, interpersonal dynamic and societal dynamic. Central to this research effort is the introduction of new probabilistic models able to learn the temporal and fine-grained latent dependencies across behaviors, modalities and interlocutors. In this talk, I will present some of our recent achievements in modeling multiple aspects of human communication dynamics, motivated by applications in healthcare (depression, PTSD, suicide, autism), education (learning analytics), business (negotiation, interpersonal skills) and social multimedia (opinion mining, social influence).

09:45 - 

Joaquín Pérez, Eva Cerezo and Francisco J. Serón


E-VOX: A Socially Enhanced Semantic ECA

In this paper, we present E-VOX, an emotionally enhanced semantic ECA designed to work as a virtual assistant that searches for information on Wikipedia. It includes a cognitive-affective architecture that integrates an emotion model based on ALMA and the Soar cognitive architecture. This allows the ECA to take into account features needed for social interaction, such as learning and emotion management. The architecture makes it possible to influence and modify the behavior of the agent depending on the feedback received from the user and other information from the environment, allowing the ECA to achieve a more realistic and believable interaction with the user. A fully functional prototype has been developed, showing the feasibility of our approach.

10:00 - 

Coffee Break

10:20 - 

Catherine Pelachaud


Interacting with socio-emotional embodied conversational agent

In this talk I will present our current work toward endowing virtual agents with socio-emotional capabilities. By applying different methodologies, based on corpus analysis, user-centered approaches, or motion capture, we have enriched the agent’s palette of multimodal behaviors. We have conducted various studies to simulate communicative behaviors, emotional behaviors, social attitudes and behavior expressivity. Through its behavior patterns, the agent can communicate with different social attitudes; its relationship towards its interlocutors influences how it behaves and places itself while conversing with them.

11:00 - 

Xiaojie Zha and Marie-Luce Bourguet


Experimental Study to Elicit Effective Multimodal Behaviour in Pedagogical Agents

This paper describes a small experimental study into the use of avatars to remediate the lecturer’s absence in voice-over-slide material. Four different avatar behaviours are tested. Avatar A performs all the upper-body gestures of the lecturer, which were recorded using a 3D depth sensor. Avatar B is animated using a few random gestures in order to create a natural presence that is unrelated to the speech. Avatar C only performs the lecturer’s pointing gestures, as these are known to indicate important parts of a lecture. Finally, Avatar D performs “lecturer-like” gestures, but these are desynchronised from the speech. Preliminary results indicate students’ preference for Avatars A and C. Although the effect of avatar behaviour on learning did not prove statistically significant, students’ comments indicate that an avatar that behaves quietly and only performs some of the lecturer’s gestures (pointing) is effective. The paper also presents a simple empirical method for automatically detecting pointing gestures in Kinect-recorded lecture data.

11:15 - 

Claire Rivoire and Angelica Lim


Habit Detection within a Long-term Interaction with a Social Robot - an Exploratory Study

In recent years, social robots have become more popular for use in the home. In this paper, we describe the problem of robot proactivity in long-term Human-Robot Interactions (HRI). In particular, it is difficult to find the right balance for a robot that speaks or proposes activities at the right moment and with appropriate frequency. Too little proactivity, and the robot becomes boring. Too much proactivity, and the robot becomes annoying. Further, the content of proactive utterances by the robot during a long-term HRI becomes tiresome unless it is contextual and grounded in reality. We propose a technical solution to this problem divided into three parts: 1) family-specific activity logging, 2) contextual comments that suggest consciousness of time and repeated interactions, and 3) proposals of activities based on the user's habits. Towards a robot that is accepted into people's homes, we propose an evolutive system that learns users' schedules over time and adapts its proactive utterances based on prior history. We show the preliminary results of an exploratory data collection study containing up to 8 weeks of usage of the Pepper humanoid robot in 10 European homes.

11:30 - 

Giuseppe Palestra, Giovanna Varni, Mohamed Chetouani and Floriana Esposito


A Multimodal and Multilevel System for Robotics Treatment of Autism in Children

Several studies suggest that robots can play a relevant role in addressing Autistic Spectrum Disorder (ASD). This paper presents a humanoid social robot-assisted behavioral system based on a therapeutic multilevel treatment protocol customized to improve eye contact, joint attention, symbolic play, and basic emotion recognition. In the system, the robot acts as a social mediator, trying to elicit specific behaviors in the child while taking into account the child's multimodal signals. Statistical differences in eye contact and facial expression imitation behaviors after the use of the system are reported as preliminary results.

11:45 - 

Iren Saltali, Sanem Sariel, Gökhan Ince


Scene Analysis For Lifelong Robot Learning Through Auditory Event Monitoring

The ability to categorize objects and outcomes of events using auditory signals is rather advanced in humans. When it comes to robots, limitations in sensing pose many challenges for this type of categorization, which is specifically required in many robotic applications. In this paper, we propose auditory scene analysis methods for robots in order to monitor events, detect failures and learn from their experiences. Audio data are convenient for these purposes: they reveal environmental changes surrounding a robot and, in particular, complement visual data. In our study, we investigate supervised learning methods using informative features from sound data for efficient categorization in manipulation scenarios. Furthermore, we use these data for robots to detect execution failures at runtime to prevent potential damage to their environment, objects of interest and even themselves. Firstly, the most distinguishing features for categorization of object materials from a set including glass, metal, porcelain, cardboard and plastic are determined, and then the performances of two supervised learning methods on these features for material categorization are evaluated. In our experimental framework, the performances of the learning methods for categorization of failed action outcomes are evaluated with a mobile robot and a robotic arm. In particular, drop and hit events are selected for this analysis since these are the most likely failure outcomes that occur during the manipulation of objects. Using the proposed techniques, material categories as well as the interaction events can be determined with high success rates.

12:00 - 

Conclusions and Lunch
