
From corpus to language classroom: reusing Stockholm Umeå Corpus in a vocabulary exercise generator SCORVEX

Abstract
This master's thesis evaluates the Stockholm Umeå Corpus (SUC) as a source of teaching materials for learners of Swedish as a second language. The evaluation has been carried out both theoretically and practically. On the theoretical side, readability tests were run on all SUC texts to analyze whether appropriate texts can be selected automatically for each proficiency level. To make the readability analysis more “vocabulary aware”, the lexical frequency profile of each text was collected, analyzed and embedded into the final readability score assigned to that text. SUC proved to be a rich source of texts at different proficiency levels that are appropriate for language training purposes. Advantages and disadvantages of SUC as a source of pedagogical materials were identified in the course of the work. On the practical side, as a by-product of the theoretical analysis, a pedagogical tool, SCORVEX (Swedish CORpus-based Vocabulary EXercise generator), was designed and implemented. The existing modules of SCORVEX demonstrate to what extent pedagogically acceptable vocabulary items can be generated with SUC as the only language resource. The thesis demonstrates how wordbank items, multiple-choice items and c-tests can be generated automatically for a specified proficiency level, word frequency band and word class. In yes/no items, potential words are generated on the basis of existing morphemes. All four modules are therefore “language-aware”. Access to frequency data obtained from SUC is a prerequisite for exercise generation, while the SUC text archive is the only source of texts, sentences and words for vocabulary items. The thesis will hopefully awaken teachers' interest in testing the generator in real-life conditions and may even convince some teachers of the usefulness of this pedagogical tool. Numerous directions for further development of the software are outlined in the thesis.
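To illustrate one of the item types named in the abstract, here is a minimal Python sketch of c-test generation. It is not taken from SCORVEX: the abstract does not say how the tool builds its c-tests, so the sketch assumes the commonly described c-test convention (first sentence left intact, then the second half of every second word removed, with one-letter words skipped). The function name `make_c_test`, the parameter `gap_char` and the sample text are hypothetical.

```python
import re


def make_c_test(text, gap_char="_"):
    """Turn a short text into a c-test; return (gapped_text, answers).

    Assumed convention (not confirmed by the thesis abstract): the first
    sentence stays intact; in the remaining text the second half of every
    second word is replaced by gap characters. Words of odd length lose
    the larger half, and one-letter words are neither counted nor gapped.
    """
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    if len(sentences) < 2:
        return text.strip(), []

    answers = []
    word_index = 0

    def gap(match):
        nonlocal word_index
        word = match.group(0)
        if len(word) < 2:
            return word  # one-letter words are skipped entirely
        word_index += 1
        if word_index % 2 == 1:
            return word  # odd-numbered words stay intact
        keep = len(word) // 2
        answers.append(word)
        return word[:keep] + gap_char * (len(word) - keep)

    body = " ".join(sentences[1:])
    # [^\W\d_]+ matches runs of letters, including Swedish å, ä and ö.
    gapped = re.sub(r"[^\W\d_]+", gap, body)
    return sentences[0] + " " + gapped, answers


if __name__ == "__main__":
    sample = "Det är en katt. Katten sover på mattan i dag."
    gapped_text, answers = make_c_test(sample)
    print(gapped_text)  # Det är en katt. Katten so___ på mat___ i dag.
    print(answers)      # ['sover', 'mattan']
```

A generator of the kind the abstract describes would additionally select the source text by proficiency level and restrict gaps to a chosen frequency band and word class; those steps depend on SUC frequency data and annotation and are left out of this sketch.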
Degree
Student essay
URI
http://hdl.handle.net/2077/22229
Collections
  • Magisteruppsatser (Department of Swedish / Institutionen för svenska språket)
View/Open
gupea_2077_22229_1.pdf (1.360Mb)
Date
2008
Author
Volodina, Elena
Language
eng

DSpace software copyright © 2002-2016  DuraSpace
Contact Us | Send Feedback
Theme by 
Atmire NV
 

 
