Peter Juel Henrichsen
Copenhagen Business School, Copenhagen, Denmark
Jens Allwood
University of Gothenburg, Gothenburg, Sweden
In: NEALT Proceedings. Northern European Association for Language and Technology; 4th Nordic Symposium on Multimodal Communication, November 15-16, Gothenburg, Sweden
Linköping Electronic Conference Proceedings 93:7, p. 47-53
NEALT Proceedings Series 21:7, p. 47-53
Published: 2013-10-29
ISBN: 978-91-7519-461-5
ISSN: 1650-3686 (print), 1650-3740 (online)
We present our experiments on attitude detection based on annotated multimodal dialogue data. Our long-term goal is to establish a computational model able to predict the attitudinal patterns in human-human dialogue. We believe that such prediction algorithms are useful tools in the pursuit of realistic discourse behavior in conversational agents and other intelligent man-machine interfaces. The present paper deals with two important subgoals in particular: how to establish a meaningful and consistent set of annotation categories for attitude annotation, and how to relate the annotation data to the recorded data (audio and video) in computational models of attitude prediction. We present our current results, including a recommended set of analytical annotation labels and a recommended setup for extracting linguistically meaningful data even from noisy audio and video signals.
attitude detection; prediction of attitude flow; attitude annotation; multimodal speech cues