dc.description.abstract | Face-to-face communication is multimodal, with varying contributions from all sensory modalities; see e.g. Kopp (2013), Kendon (1980) and Allwood (1979). This paper reports a study of respondents interpreting vocal and gestural, verbal and non-verbal behavior. 10 clips from 5 different short video and audio recordings of two persons meeting for the first time were used as stimuli in a perception/classification study. The respondents were divided into 3 groups. The first group watched only the video part of the clips, without any sound. The second group listened to the audio track without video. The third group was exposed to both the audio and video tracks of the clips. The data were collected using a crowdsourcing questionnaire. The study reports on how respondents classified clips containing 4 different types of behavior (looking up, looking down, nodding and laughing), found to be frequent in a previous study (Lanzini 2013), according to which Affective Epistemic State (AES) the behaviors were perceived as expressing.
We grouped the linguistic terms for the affective epistemic states used by the respondents into 27 different semantic fields. In this paper we focus on the 7 most common fields, i.e. Thinking, Nervousness, Happiness, Assertiveness, Embarrassment, Indifference and Interest. The aim of the study is to increase understanding of how exposure to the video and/or audio modalities affects the interpretation of vocal and gestural, verbal and non-verbal behavior when it is displayed unimodally and multimodally. | sv |