
Steg för steg. Naturvetenskapligt ämnesspråk som räknas

Step by step. A computational analysis of Swedish textbook language

Abstract
In this work, I present a linguistic investigation of the language of Swedish textbooks in the natural sciences, i.e., biology, physics and chemistry. The textbooks, which are used in secondary and upper secondary school, are examined with respect to traditional readability measures, e.g., LIX, OVIX and nominal ratio. I also extract typical linguistic features of the texts, with typicality determined by a proposed quantitative method, labelled the index principle. This empirical, corpus-based method relies on automatic linguistic annotations produced by language technology tools to calculate what I call index lists: rank-ordered lists of linguistic features that are characteristic of a specific text corpus as compared to reference texts. I produce index lists for typical vocabulary, noun phrase structures and syntactic structures, extracted from a 5.2 million word textbook corpus compiled as part of the work presented. As well as being frequent and well dispersed, the linguistic variables selected for the index lists are characteristic of the text type in question, as is evident when they are compared to a reference corpus comprising textbooks in the social sciences and mathematics, as well as narrative and academic (university-level) texts. The results show that textbooks in natural science contain a large amount of content-specific, technical vocabulary. This characteristic distinguishes natural scientific language not only from everyday language but also from social scientific language, which on the lexical level has more in common with narrative texts. On the other hand, the textbook language as a whole is structurally distinguishable from narrative texts, as is clearly seen, e.g., in its noun phrase complexity. In the transition between secondary and upper secondary school, the scores of almost every readability measure go up, indicating an increase in the linguistic demands on readers.
In the upper secondary textbooks, the words are longer, the vocabulary more varied, the noun phrases longer and more elaborate, and the most typical syntactic structures more complex. Notably, the linguistic development between the form levels is more marked in the natural science textbooks than in those for the social sciences and mathematics. Nevertheless, the textbook language overall shows relatively low complexity in comparison to academic language.
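As a rough illustration of two of the readability measures named in the abstract, the sketch below computes LIX and OVIX from raw text using their commonly cited definitions: LIX is the mean sentence length plus the percentage of words longer than six letters, and OVIX is a word variation index based on the logarithms of token and type counts. This is not the thesis's implementation. The function names, the regular-expression tokenizer and the punctuation-based sentence splitter are assumptions made for this example, whereas the thesis relies on annotations from language technology tools. Nominal ratio is left out because it requires part-of-speech tags.

```python
import math
import re


def tokenize(text):
    """Return lowercase word tokens (naive: runs of Unicode letters only)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def split_sentences(text):
    """Naive sentence split on '.', '!' and '?' (an assumption of this sketch)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def lix(text):
    """LIX = words per sentence + 100 * share of words longer than six letters."""
    words = tokenize(text)
    sentences = split_sentences(text)
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    long_words = sum(1 for w in words if len(w) > 6)
    return len(words) / len(sentences) + 100 * long_words / len(words)


def ovix(text):
    """OVIX = log(tokens) / log(2 - log(types) / log(tokens))."""
    words = tokenize(text)
    n_tokens = len(words)
    n_types = len(set(words))
    if n_tokens < 2:
        raise ValueError("OVIX needs at least two tokens")
    if n_types == n_tokens:
        # Every word is unique: the denominator is log(1) = 0, so OVIX is unbounded.
        return math.inf
    return math.log(n_tokens) / math.log(2 - math.log(n_types) / math.log(n_tokens))
```

Calling `lix(text)` and `ovix(text)` on samples from two school levels gives a quick way to compare word length, sentence length and vocabulary variation, although scores from such a naive tokenizer are not comparable with published figures.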
Degree
Doctor of Philosophy
University
Göteborgs universitet. Humanistiska fakulteten
University of Gothenburg. Faculty of Arts
Institution
Department of Swedish ; Institutionen för svenska språket
Disputation
Friday, 4 December 2015, at 13:15, Lilla Hörsalen, Humanisten
Date of defence
2015-12-04
URI
http://hdl.handle.net/2077/40506
Collections
  • Doctoral Theses / Doktorsavhandlingar Institutionen för svenska språket
  • Doctoral Theses from University of Gothenburg / Doktorsavhandlingar från Göteborgs universitet
View/Open
Abstract (53.50Kb)
Cover (1.546Mb)
Thesis (15.56Mb)
Date
2015-11-13
Author
Ribeck, Judy
Keywords
academic language
computational linguistics
corpus linguistics
language technology
natural language processing
scientific language
subject-specific language
Swedish textbooks
quantitative stylistics
Publication type
Doctoral thesis
ISBN
978-91-87850-59-2
ISSN
0347-948X
Series/Report no.
Data linguistica
28
Language
swe

Related items

Showing items related by title, author, creator and subject.

  • Why the pond is not outside the frog? Grounding in contextual representations by neural language models 

    Ghanimifard, Mehdi (2020-05-05)
    In this thesis, to build a multi-modal system for language generation and understanding, we study grounded neural language models. Literature in psychology informs us that spatial cognition involves different aspects of ...
  • Proceedings of the 2022 CLASP Conference on (Dis)embodiment 

    Dobnik, Simon; Grove, Julian; Sayeed, Asad; Department of Philosophy, Linguistics and Theory of Science (FLoV); Centre for Linguistic Theory and Studies in Probability (CLASP) (The Association for Computational Linguistics, 2022-09-14)
    (Dis)embodiment brings together researchers from several areas examining the role of grounding and embodiment in modelling human language and behaviour – or limits thereof. The conference covers areas such as machine learning, ...
  • LIVE and LEARN - Festschrift in honor of Lars Borin 

    Volodina, Elena; Dannélls, Dana; Berdicevskis, Aleksandrs; Forsberg, Markus; Virk, Shafqat; Institutionen för svenska, flerspråkighet och språkteknologi, Göteborgs universitet (2022-11)
    This Festschrift has been compiled to honor Professor Lars Borin on his 65th anniversary. It consists of 30 articles which reflect a fraction of Lars’ scholarly interests within computational linguistics and related fields. ...

DSpace software copyright © 2002-2016  DuraSpace
Contact Us | Send Feedback
Theme by 
Atmire NV
 

 
