The perceptual interface for the affective wearable consists of sensors and algorithms that attempt to recognize the affective state of the user. The wearable can then try to learn which affective state variables are good predictors of user preferences in different physical and situational contexts [Pic97]. For example, the wearable could learn that when indoors studying, a high level of muscle tension is correlated with a preference for calming music, but while outside walking, a different pattern of physiological variables is correlated with a preference for more energizing music. The Affective DJ is presented here as an example of a computer application for the ambulatory environment that could potentially be improved by perceiving changes in the user's state, without requiring the user to input those changes directly through a keyboard or mouse.
The Affective DJ currently processes information about the user's affective state by detecting changes in the user's skin conductance on the palm of the hand. This response, one of two referred to as the galvanic skin response, has been shown to be an indicator of emotional responsiveness and to be only minimally involved with thermoregulation [SF90]. The measure correlates well with the emotional dimension of arousal and is most appropriate for making decisions about user preferences along this dimension. To avoid interrupting in the middle of a song, the Affective DJ makes music selections only at the end of each song.
Given this whole-song criterion, we chose a metric that captures the overall effect of a song. The current program calculates the average skin conductance over the last 30 seconds of the song and compares it to the average from the end of the previous song. Because the skin conductivity signal is received by the wearable computer through the A/D converter at 20 samples per second, this amounts to a running average of the last 600 samples. At the end of each song this average is piped to the selection algorithm. This coarse metric, which approximates the difference in skin conductivity between the beginning and the end of the song, seeks to sum up the overall effect of the song while ignoring the fast variations of the signal that are likely to be related to momentary qualities of the music.
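The end-of-song metric described above can be sketched as follows. This is a minimal illustration rather than the program actually used: the names RunningAverage and song_effect are hypothetical, and only the sample rate (20 samples per second) and the window length (30 seconds, or 600 samples) are taken from the text.

```python
from collections import deque

SAMPLE_RATE_HZ = 20   # from the text: 20 samples per second
WINDOW_SECONDS = 30   # from the text: last 30 seconds of the song
WINDOW_SIZE = SAMPLE_RATE_HZ * WINDOW_SECONDS  # 600 samples


class RunningAverage:
    """Mean of the most recent `size` samples (hypothetical helper)."""

    def __init__(self, size=WINDOW_SIZE):
        self.window = deque(maxlen=size)
        self.total = 0.0

    def add(self, sample):
        # When the window is full, the oldest sample is about to be
        # evicted by append(), so remove it from the running total first.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(sample)
        self.total += sample

    def value(self):
        if not self.window:
            raise ValueError("no samples received yet")
        return self.total / len(self.window)


def song_effect(previous_end_average, samples):
    """Return (end_average, change) for one song.

    `samples` holds the skin conductance readings for the whole song;
    `change` is the end-of-song average minus the previous song's
    end-of-song average.
    """
    average = RunningAverage()
    for sample in samples:
        average.add(sample)
    end_average = average.value()
    return end_average, end_average - previous_end_average
```

In this sketch the running total is updated incrementally, so each new sample costs constant time regardless of the window length.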
It may be that there is much more information to be extracted from the dynamic nature of the skin conductance response. The number of sudden increases in the skin conductance (see Figure 5) can be tracked and recorded [HP98] and may be useful as another metric for quantifying the response to the music. In future versions of the system, this and other measures, such as heart rate variability, respiration rate, and muscle tension, may be included in the decision algorithms.
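As an illustration of how sudden increases might be counted, the sketch below uses a simple threshold-based rise counter. It is not the method of [HP98]: the function name, the rise threshold (0.05 microsiemens), and the look-back window (20 samples, one second at 20 samples per second) are all assumptions chosen for the example.

```python
def count_sudden_increases(samples, rise_threshold=0.05, window=20):
    """Count distinct episodes in which the signal rises quickly.

    A sample is treated as "rising" when it exceeds the sample `window`
    positions earlier by at least `rise_threshold`. Consecutive rising
    samples are counted as a single episode. Both parameters are
    hypothetical defaults, not values from the original system.
    """
    count = 0
    in_rise = False
    for i in range(window, len(samples)):
        rising = samples[i] - samples[i - window] >= rise_threshold
        if rising and not in_rise:
            count += 1  # a new episode starts here
        in_rise = rising
    return count
```

A count obtained this way depends strongly on the chosen threshold and window, so both would need tuning against real recordings.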
Testing such a system is difficult and time-consuming: there are many variables to control for, and the system needs a large music collection and must be worn for a long time before it can deliver its intended benefits. At present, we have some interesting initial findings from an informal experiment in which four subjects used our prototype system. The system first trained on an individual while a random play list was presented. The algorithm then tried to create an ``energizing'' play list that would alternate songs to raise the user's baseline to a level 1 microsiemens higher than the previous level, and finally a ``relaxing'' play list that would lower the user's baseline by 1 microsiemens. Each session consisted of 10 songs and lasted approximately half an hour. The random play list consisted of selections from a pool of songs considered to be ``high arousal'' (modern alternative rock) or ``low arousal'' (mostly classical). During the first session of random song play, the change in skin conductivity for each song was recorded in a database for that user. The baseline was calculated as the average skin conductivity over the random song play session. This baseline was reassessed at the end of each type of play list: random, energizing, and relaxing.
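One way such a selection step could work is sketched below. The text does not give the selection rule in detail, so this is an assumption: the song whose recorded change is closest to the remaining gap between the current level and the target level is chosen. The function name pick_best_fit, the example song titles, and the numeric values are hypothetical and are not data from the experiment.

```python
def pick_best_fit(song_changes, current_level, target_level, exclude=()):
    """Pick the song whose recorded change best fits the remaining gap.

    song_changes maps a song title to the skin conductance change
    (in microsiemens) recorded for this user during random play.
    Titles listed in `exclude` are skipped.
    """
    gap = target_level - current_level
    candidates = {
        title: change
        for title, change in song_changes.items()
        if title not in exclude
    }
    if not candidates:
        raise ValueError("no candidate songs available")
    return min(candidates, key=lambda title: abs(candidates[title] - gap))


# Illustrative per-user database (made-up values).
recorded = {
    "rock_a": 0.8,
    "rock_b": 0.3,
    "classical_a": -0.6,
    "classical_b": -0.2,
}

# Energizing: aim 1 microsiemens above the current baseline.
energizing_choice = pick_best_fit(recorded, current_level=4.0, target_level=5.0)
# Relaxing: aim 1 microsiemens below the current baseline.
relaxing_choice = pick_best_fit(recorded, current_level=5.0, target_level=4.0)
```

The optional exclude argument is an addition of this sketch; it would let a caller avoid selecting the same song twice in a row.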
To simulate using the system in a natural environment, our users were
allowed to work on their homework or use the computer during the
experiment. Four subjects were run, but for one subject the skin
conductivity sensor was plugged into the wrong channel of the A/D
converter and the signal was not recorded properly.
During the hour-and-a-half session, users were asked to rate each song
they heard on how much they liked it (on a scale from 1, hate, to 7,
love) and on how exciting they found it (on a scale from 1, very
relaxing, to 5, very energizing).
When the songs were dummy-coded by their original category, with high
arousal as ``1'' and low arousal as ``0,'' it was found that skin
conductance was significantly correlated with this coding (p<0.001).
Skin conductance was also correlated with the
users' perception of the excitement level of the song (p<0.005).
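The text does not say which statistic was used. As one plausible reading, the sketch below computes a Pearson correlation between a 0/1 category code and a per-song skin conductance change, which is equivalent to a point-biserial correlation. The function name pearson_r and all numbers are illustrative assumptions, not data or results from the experiment.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Made-up example: 1 = high arousal song, 0 = low arousal song.
category = [1, 1, 1, 0, 0, 0]
# Made-up skin conductance change per song, in microsiemens.
change = [0.5, 0.7, 0.6, -0.4, -0.2, -0.3]

r = pearson_r(category, change)
```

A significance level such as those reported above would additionally require a test on r for the given number of songs, which this sketch does not perform.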
In all cases the relaxing play list was able to lower skin
conductivity; however, the arousing play list was not able to increase
skin conductance in all cases. A possible explanation for this
failure of the algorithm is that, given the small number of songs that
were rated (ten), the best-fit song was often a repeat, which two of the
users reported to be annoying.
This example illustrates how physiological signals could be perceived by an interface and used to adjust musical selection. In the not too distant future, when large databases of music can be automatically downloaded by a wireless system, the interface could try to pre-load selections that it thinks are most likely to be preferred by the user, and then offer that user a choice of music that is most likely to agree with their momentary preferences. Instead of offering the user an unwieldy list of ``everything out there,'' it might, for example, offer faster access to forty pieces of the type that best reflect the user's preferences, including their present mood. The user can then choose from this selection, or tell the system to choose; in either case, the system's knowledge of user preferences can help keep it from overloading the user with irrelevant selections. Affective responses, both physical and behavioral, are a key part of the context information that an intelligent interface should perceive in order to learn how to adjust system behavior on the user's behalf.