TRANSLITERATION BETWEEN SPOKEN LANGUAGE CORPORA: MOVING BETWEEN DANISH BYSOC AND SWEDISH GSLC
Abstract
The paper discusses problems that arise when a spoken language corpus, transcribed and
formatted according to one standard, is transferred into the standard and format of another
corpus. Some of these problems stem from differences between the standards and formats of
the corpora. Others stem from human error and a lack of reliability in creating the
transcriptions.
Although the discussion is based on transfer and transliteration between two specific corpora,
the Swedish GSLC (Göteborg Spoken Language Corpus) and the Danish BySoc (By
Sociolingvistik Corpus), we believe that it documents and highlights problems of a general
kind, which must be faced whenever spoken language corpora of different formats are to be
compared.
Publisher
Institutionen för lingvistik
Date
2002
Authors
Allwood, Jens
Henrichsen, Peter Juel
Grönqvist, Leif
Ahlsén, Elisabeth
Gunnarsson, Magnus
Publication type
report
ISSN
0349-1021
Series/report no.
Gothenburg Papers in Theoretical Linguistics
86
Language
English