
TurnStyles:

A speech interface agent to support affective communication

Jonathan Klein

One aspect of affective human-computer communication is the development of systems that can communicate with the user in order to affect the user's emotional state. Such systems may have to respond to users experiencing strong, negative emotional reactions. For example, a user who is trying unsuccessfully to find information in a help system under deadline may become frustrated and upset. The computer should be able to work with the user and help address the user's frustration, if not the eliciting problem as well. Such a system might need to communicate sensitive information to a volatile user in ways that don't exacerbate the user's already excited state. A natural means of interaction for affective communication, especially for affective wearable computers, is speech I/O -- the user communicates with the system by talking to it, and the system responds with synthesized speech.

Synthetic speech systems, however, have not improved significantly since the early sixties, either in how easy they are to listen to or in their ability to express human-like emotion. To begin to address this problem, a prototype of a learning speech interface agent called TurnStyles has been designed and built. This interface agent dynamically learns critical pacing aspects of conversational style by "listening" to conversations, and adapts the system's synthetic speech output to reflect the stylistic preferences of the user. In a sense, TurnStyles enables a speech I/O system to "speak as it is spoken to."
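The idea of learning pacing from observed conversation can be illustrated with a minimal Python sketch. It is not the TurnStyles implementation: the `Turn` and `PacingProfile` types, the choice of speaking rate and between-turn pause as the learned features, the default values, and the exponential moving average are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One conversational turn (hypothetical record, not from TurnStyles)."""
    speaker: str   # e.g. "user" or "system"
    start: float   # start time in seconds
    end: float     # end time in seconds
    words: int     # number of words spoken in the turn


@dataclass
class PacingProfile:
    """Running estimate of a speaker's pacing (assumed features and defaults)."""
    words_per_minute: float = 160.0  # assumed starting rate
    switch_pause: float = 0.5        # assumed pause before taking a turn, in seconds
    alpha: float = 0.3               # smoothing factor for the moving average

    def observe(self, turn, previous_end=None):
        """Fold one observed turn into the running estimates."""
        duration = turn.end - turn.start
        if duration > 0 and turn.words > 0:
            rate = turn.words / duration * 60.0
            self.words_per_minute += self.alpha * (rate - self.words_per_minute)
        if previous_end is not None:
            # Silence between the previous turn ending and this one starting.
            gap = max(0.0, turn.start - previous_end)
            self.switch_pause += self.alpha * (gap - self.switch_pause)


def learn_profile(turns, speaker="user"):
    """Build a pacing profile for one speaker from a time-ordered list of turns."""
    profile = PacingProfile()
    previous_end = None
    for turn in turns:
        if turn.speaker == speaker:
            profile.observe(turn, previous_end)
        previous_end = turn.end
    return profile


if __name__ == "__main__":
    conversation = [
        Turn("system", 0.0, 2.0, 5),
        Turn("user", 3.0, 5.0, 4),
    ]
    print(learn_profile(conversation))
```

A synthesizer front end could then read `words_per_minute` and `switch_pause` from the learned profile when scheduling its own output, so that the system's pacing drifts toward the user's.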

The theory behind TurnStyles is consistent with research findings in linguistics and sociology, which suggest that conversational pacing, turn-taking patterns and "expressive paralinguistics" are relevant aspects of conversational style, and that people have clear stylistic preferences for their conversational partners. Conversational style has been known to the linguistics community for some time, and is introduced here as an important but hitherto unexamined component of conversational speech interaction with computers.

TurnStyles seeks to leverage these findings by applying conversational style "texture" to audio output interfaces. The current design and implementation is a proof of concept that learns several key aspects of conversational style. The goal of TurnStyles is to increase the subjective quality of user experience in dyadic speech interaction with computers. A research study is currently underway to determine how effectively TurnStyles meets this goal, but the theoretical motivation and system design represent a new and relevant approach to the problem of smoothing the way for affective speech communication, as well as the older problem of humanizing synthetic speech. The system is flexible, and can be adapted to contextual needs (e.g., when the user is stressed and in a hurry, it can communicate in a "hurry up" style; when the user is relaxed and wants to feel "soothed," it can adopt a calmer one). Eventually, TurnStyles is envisioned as one in a large palette of speech interface agents, all acting to create natural, human-like speech interaction.
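One way to picture the contextual adaptation described above is as a blend between a user's learned pacing and a context-specific preset. The following Python sketch is purely illustrative: the `Style` type, the preset names and values, and the linear blend are assumptions, not details of the TurnStyles system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Style:
    """Output pacing parameters (hypothetical, for illustration only)."""
    words_per_minute: float
    pause_seconds: float


# Assumed presets; the numbers are placeholders, not measured values.
PRESETS = {
    "hurry": Style(words_per_minute=200.0, pause_seconds=0.2),
    "soothe": Style(words_per_minute=130.0, pause_seconds=0.8),
}


def blend(learned, context, weight=0.5):
    """Move the learned style toward a context preset.

    weight=0 keeps the learned style unchanged; weight=1 uses the preset.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    preset = PRESETS[context]

    def mix(a, b):
        return (1.0 - weight) * a + weight * b

    return Style(
        words_per_minute=mix(learned.words_per_minute, preset.words_per_minute),
        pause_seconds=mix(learned.pause_seconds, preset.pause_seconds),
    )


if __name__ == "__main__":
    learned = Style(words_per_minute=160.0, pause_seconds=0.5)
    print(blend(learned, "hurry"))
    print(blend(learned, "soothe", weight=0.25))
```

Keeping the learned style and the context preset separate means the user's long-term preferences are not overwritten by a momentary situation; only the output for the current exchange shifts.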

This page was last updated October 15, 1997.