Emotional stress in synthetic speech: Progress and future directions

Iain R. Murray, John L. Arnott, Elizabeth A. Rohwer

Research output: Contribution to journal › Article

28 Citations (Scopus)

Abstract

Current text-to-speech systems have very good intelligibility, but most are still easily identified as artificial voices and no commercial system incorporates prosodic variation resulting from emotion and related factors. This is largely due to the complexity of identifying and categorising the emotion factors in natural human speech, and implementing these factors within synthetic speech. However, prosodic content in synthetic speech is seen as increasingly important, and there is presently renewed interest in the investigation of human vocal emotion and the expansion of synthesis models to allow greater prosodic variation. Such models could also be used as practical tools in the investigation and validation of models of emotion and other speech-altering stressors. This paper reviews progress to date in the investigation of human vocal emotions and their simulation in synthetic speech, and requirements for future research which is required to develop this area are also presented.
Original language: English
Pages (from-to): 85-91
Number of pages: 7
Journal: Speech Communication
Volume: 20
Issue number: 1-2
DOI: 10.1016/S0167-6393(96)00046-5
Publication status: Published - Nov 1996

Fingerprint

Text-to-speech
Speech intelligibility
Emotion
Speech
Model
Synthesis
Requirements
Human
Simulation

Cite this

Murray, Iain R. ; Arnott, John L. ; Rohwer, Elizabeth A. / Emotional stress in synthetic speech : Progress and future directions. In: Speech Communication. 1996 ; Vol. 20, No. 1-2. pp. 85-91.
@article{c128bc2b3d614a34b34a4898e2b8ac06,
title = "Emotional stress in synthetic speech: Progress and future directions",
abstract = "Current text-to-speech systems have very good intelligibility, but most are still easily identified as artificial voices and no commercial system incorporates prosodic variation resulting from emotion and related factors. This is largely due to the complexity of identifying and categorising the emotion factors in natural human speech, and implementing these factors within synthetic speech. However, prosodic content in synthetic speech is seen as increasingly important, and there is presently renewed interest in the investigation of human vocal emotion and the expansion of synthesis models to allow greater prosodic variation. Such models could also be used as practical tools in the investigation and validation of models of emotion and other speech-altering stressors. This paper reviews progress to date in the investigation of human vocal emotions and their simulation in synthetic speech, and requirements for future research which is required to develop this area are also presented.",
author = "Murray, {Iain R.} and Arnott, {John L.} and Rohwer, {Elizabeth A.}",
note = "Copyright 2006 Elsevier B.V., All rights reserved.",
year = "1996",
month = "11",
doi = "10.1016/S0167-6393(96)00046-5",
language = "English",
volume = "20",
pages = "85--91",
journal = "Speech Communication",
issn = "0167-6393",
publisher = "Elsevier",
number = "1-2",

}

TY - JOUR
T1 - Emotional stress in synthetic speech
T2 - Progress and future directions
AU - Murray, Iain R.
AU - Arnott, John L.
AU - Rohwer, Elizabeth A.
N1 - Copyright 2006 Elsevier B.V., All rights reserved.
PY - 1996/11
Y1 - 1996/11

AB - Current text-to-speech systems have very good intelligibility, but most are still easily identified as artificial voices and no commercial system incorporates prosodic variation resulting from emotion and related factors. This is largely due to the complexity of identifying and categorising the emotion factors in natural human speech, and implementing these factors within synthetic speech. However, prosodic content in synthetic speech is seen as increasingly important, and there is presently renewed interest in the investigation of human vocal emotion and the expansion of synthesis models to allow greater prosodic variation. Such models could also be used as practical tools in the investigation and validation of models of emotion and other speech-altering stressors. This paper reviews progress to date in the investigation of human vocal emotions and their simulation in synthetic speech, and requirements for future research which is required to develop this area are also presented.

UR - http://www.scopus.com/inward/record.url?scp=0030291449&partnerID=8YFLogxK
U2 - 10.1016/S0167-6393(96)00046-5
DO - 10.1016/S0167-6393(96)00046-5
M3 - Article
VL - 20
SP - 85
EP - 91
JO - Speech Communication
JF - Speech Communication
SN - 0167-6393
IS - 1-2
ER -