
I see what you mean

Assessing readability for specific target groups

Abstract
This thesis aims to identify linguistic factors that affect readability and text comprehension, viewed as a function of text complexity. Features at various linguistic levels suggested in the existing literature are evaluated, including the Swedish readability formula LIX. Natural language processing methods and resources are employed to investigate characteristics that go beyond traditional superficial measures. A comparable corpus of easy-to-read and ordinary texts from three genres is investigated, and it is shown how features present at various levels of representation differ quantitatively across text types and genres. The findings are confirmed by significance tests as well as principal component analysis. Three machine learning algorithms are employed and evaluated in order to build a statistical model for text classification. The results demonstrate that a proposed language model for Swedish (SVIT), which combines linguistic features, predicts text complexity and genre with higher accuracy than LIX. It is suggested that the SVIT language model be adopted to assess surface language properties, vocabulary load, sentence structure, and idea density levels, as well as the personal interest of different texts. Specific target groups of readers may then be provided with materials tailored to their level of proficiency.
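For readers unfamiliar with LIX, the baseline the abstract compares against, the following is a minimal sketch of the formula as it is commonly stated: average sentence length in words plus the percentage of words longer than six letters. It is not taken from the thesis. The function name `lix`, the regex-based tokenization, and the choice to split sentences on `.`, `!` and `?` only are assumptions made for illustration.

```python
import re


def lix(text: str) -> float:
    """Compute a LIX score: words per sentence + percentage of long words.

    Assumptions (not from the thesis): sentences end at '.', '!' or '?',
    words are runs of letters, and a long word has more than six letters.
    """
    # Split on sentence-final punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Runs of Unicode letters, so Swedish å, ä and ö count as letters.
    words = re.findall(r"[^\W\d_]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    long_words = [w for w in words if len(w) > 6]
    return len(words) / len(sentences) + 100 * len(long_words) / len(words)


# Six words, two sentences, three long words: 6/2 + 100*3/6 = 53.0
print(lix("The cat sat. Readability matters greatly."))
```

A surface measure of this kind sees only word and sentence length, which is the limitation the multilevel features described in the abstract are meant to address.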
Degree
Doctor of Philosophy
University
Göteborgs universitet. Humanistiska fakulteten
University of Gothenburg. Faculty of Arts
Institution
Department of Swedish ; Institutionen för svenska språket
Public defence
Friday 26 April 2013, 10:15, Stora hörsalen, Humanisten
Date of defence
2013-04-26
E-mail
katarina.heimann.muhlenbock@gu.se
URI
http://hdl.handle.net/2077/32472
Collections
  • Doctoral Theses / Doktorsavhandlingar Institutionen för svenska språket
  • Doctoral Theses from University of Gothenburg / Doktorsavhandlingar från Göteborgs universitet
View/Open
Thesis (2.527Mb)
Spikblad (notice of public defence) (207.3Kb)
Date
2013-04-03
Author
Heimann Mühlenbock, Katarina
Keywords
readability
text complexity
computational linguistics
language resources
language technology
linguistic features
LIX
SVIT
corpus linguistics
text classification
quantitative methods
natural language processing
multilevel text analysis
Publication type
Doctoral thesis
ISBN
978-91-87850-50-9
Series/Report no.
Data Linguistica
24
Language
eng