We "Hear" Personality In Computer-Generated Speech And The More It Sounds Like Us, The More We Like It
- Date:
- October 1, 2001
- Source:
- American Psychological Association
WASHINGTON — People read personality into a synthetic voice even when they know that it’s made by a computer. What’s more, if the “voice” mirrors their personalities, people will like and be more readily influenced by that voice. These new findings, which have implications for the design and use of increasingly widespread text-to-speech (TTS) systems, appear in the September issue of the Journal of Experimental Psychology: Applied, published by the American Psychological Association (APA).
At Stanford University, Clifford Nass, Ph.D. and Kwan Min Lee, Ph.D. assessed how people evaluated, liked and were influenced by computer-synthesized speech. Text-to-speech systems are growing more popular because they make computers and Internet content more accessible to the visually impaired and blind, an expanding group as the population ages, and to non-literate people, including young children. TTS systems also provide eyes-free information, for example in cell phones and cars.
In their first experiment (72 participants), Nass and Lee modulated the synthetic voice reading book reviews from a mock Web book store, making the voice louder or softer, faster or slower, more varied in frequency, etc. -- traits ascribed to extroversion or introversion (extroverts, for example, speak louder and faster than introverts). Participants accurately judged the voices as extroverted or introverted, which means they detected paralinguistic cues that are very hard to discern in typically flat synthetic speech. What’s more, the 36 participants in the sample who described themselves as extroverted (according to the Myers-Briggs Type Indicator and the Wiggins personality test) were more attracted to the extroverted computer voice, the site’s book reviews and the reviewer -- and vice versa for the 36 introverts. Notably for Web merchants, when the voice personality reading a book review matched their own, participants were more likely to say they’d buy the book.
The second experiment presented participants (40 extroverts and 40 introverts) with a mock Web auction. It tested what happened not only when the voice personality "matched" the participant, but also when the spoken text itself expressed a personality: the merchandise descriptions, unlike the book reviews, were written to sound extroverted or introverted. The results replicated the first experiment and also supported the power of consistency, both between voice and text and among voice, text and user. Participants strongly preferred a voice when voice and text personalities matched; in that situation they also liked the text much more. Participants hearing "matches" also found the writer to be more credible and likeable.
Significantly, participants responded as they did despite many reminders, from both the researchers and the voice itself, that the voice was not human. Thus, the results confirm the general observation that computers and computer-synthesized voices are "social actors": in other words, people respond to a computerized voice that sounds like them just as they would to a real person who sounds like them. Just as with real people, they prefer consistency in behavior because it’s easier to understand and predict.
The findings, say Nass and Lee, mean that text-to-speech systems are not merely a convenience, but also a "rich social modality that must be tuned to the user and the content being presented." Content providers and interface designers can use this information in order to make their products more appealing and persuasive, with obvious implications for Internet commerce. Nass and Lee write, "To maximize liking and trust, designers should set parameters, for example, words per minute or frequency range, that create a personality that is consistent with the user and the content being presented."
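The design advice above can be sketched in a few lines of code. The example below is only an illustration and is not taken from the study: the names `VoiceSettings`, `PRESETS`, `choose_voice` and `is_consistent` are hypothetical, and the numeric values are placeholders chosen only to reflect the direction reported above (extroverted voices louder, faster and more varied in frequency), not the parameters Nass and Lee used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical bundle of tunable text-to-speech parameters."""
    words_per_minute: int   # speech rate
    pitch_range_hz: float   # spread of the voice's frequency
    volume: float           # 0.0 (silent) to 1.0 (loudest)


# Placeholder values for illustration only; they are NOT the study's settings.
# They encode only the direction described in the release: the extroverted
# voice is faster, louder and more varied in frequency.
PRESETS = {
    "extrovert": VoiceSettings(words_per_minute=200, pitch_range_hz=40.0, volume=0.9),
    "introvert": VoiceSettings(words_per_minute=160, pitch_range_hz=15.0, volume=0.6),
}


def choose_voice(user_personality: str) -> VoiceSettings:
    """Return the preset whose personality label matches the user's."""
    key = user_personality.strip().lower()
    if key not in PRESETS:
        raise ValueError(f"unknown personality label: {user_personality!r}")
    return PRESETS[key]


def is_consistent(voice: str, text: str, user: str) -> bool:
    """True when voice, text and user all carry the same personality label."""
    return voice == text == user


if __name__ == "__main__":
    settings = choose_voice("Extrovert")
    print(settings)
    print(is_consistent("extrovert", "extrovert", "extrovert"))
```

A real system would have to map such settings onto whatever controls its speech engine actually exposes, and would need some way to estimate both the user's personality and the personality of the text being read.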
Article: "Does Computer-Synthesized Speech Manifest Personality? Experimental Tests of Recognition, Similarity-Attraction, and Consistency-Attraction," Clifford Nass and Kwan Min Lee, Department of Communication, Stanford University; Journal of Experimental Psychology: Applied, Vol. 7, No. 3.
(Full text of the article is available from the APA Public Affairs Office and at http://www.apa.org/journals/xap/press_releases/september01/pr2.html)
The American Psychological Association (APA), in Washington, DC, is the largest scientific and professional organization representing psychology in the United States and is the world’s largest association of psychologists. APA’s membership includes more than 155,000 researchers, educators, clinicians, consultants and students. Through its divisions in 53 subfields of psychology and affiliations with 60 state, territorial and Canadian provincial associations, APA works to advance psychology as a science, as a profession and as a means of promoting human welfare.
Story Source:
Materials provided by American Psychological Association. Note: Content may be edited for style and length.