TRANSLITERATION BETWEEN SPOKEN LANGUAGE CORPORA: MOVING BETWEEN DANISH BYSOC AND SWEDISH GSLC
Abstract
The paper discusses problems that arise in trying to transfer a spoken language corpus
transcribed and formatted according to one standard into the standard and format of another
corpus. Some of these problems stem from differences between the standards and formats of
the corpora. Others stem from human error and limited reliability in creating the
transcriptions.
Although the discussion is based on transfer and transliteration between two specific corpora
(the Swedish GSLC (Göteborg Spoken Language Corpus) and the Danish BySoc (By
Sociolingvistik Corpus)), we believe the article documents and highlights
problems of a general kind which have to be faced whenever spoken language corpora of
different formats are to be compared.
Publisher
Institutionen för lingvistik
Date
2002
Author
Allwood, Jens
Henrichsen, Peter Juel
Grönqvist, Leif
Ahlsén, Elisabeth
Gunnarsson, Magnus
Publication type
report
ISSN
0349-1021
Series/Report no.
Gothenburg Papers in Theoretical Linguistics
86
Language
eng