
Studies in computational historical linguistics: Models and analyses

Abstract
Computational analysis of historical and typological data has made great progress in the last fifteen years. In this thesis, I use vocabulary lists to address some classical problems in historical linguistics: cognate identification, discriminating related languages from unrelated languages, assigning possible dates to splits in a language family, and providing an internal structure to a language family. I compare the internal structure inferred from vocabulary lists with the family trees given in Ethnologue. I explore the ranking of lexical items in the widely used Swadesh word list and compare my ranking to another quantitative reranking method and to short word lists composed for discovering long-distance genetic relationships. I show that the choice of string similarity measure is important for internal classification and for discriminating related from unrelated languages. The dating system presented in this thesis can be used to assign age estimates to any new language group and avoids the assumption of a constant rate of lexical replacement made by glottochronology. I train and test a linear classifier based on gap-weighted subsequence features for the purpose of cognate identification. An important conclusion from these results is that n-gram approaches can be used for different historical linguistic purposes.
Degree
Doctor of Philosophy
University
Göteborgs universitet. Humanistiska fakulteten
University of Gothenburg. Faculty of Arts
Institution
Department of Swedish ; Institutionen för svenska språket
Public defence
13 November 2015 at 13:15 in Lilla hörsalen, Humanisten.
Date of defence
2015-11-13
URI
http://hdl.handle.net/2077/40571
Collections
  • Doctoral Theses / Doktorsavhandlingar Institutionen för svenska språket
  • Doctoral Theses from University of Gothenburg / Doktorsavhandlingar från Göteborgs universitet
View/Open
spikbladet (43.12Kb)
Date
2015-10-22
Author
Rama, Taraka
Keywords
Automatic language classification, calibration dates, cognate identification, computational historical linguistics, internal classification, language families, n-grams, skip-grams, string similarity measures, typological data, word lists.
Publication type
Doctoral thesis
ISBN
978-91-87850-58-5
Series/Report no.
Data linguistica 27
Language
eng

DSpace software copyright © 2002-2016  DuraSpace