NLP methods for the automatic generation of exercises for second language learning from parallel corpus data

Abstract
Intelligent Computer Assisted Language Learning (ICALL), or Intelligent Computer Assisted Language Instruction (ICALI), is a field of research that combines Artificial Intelligence and Computer Assisted Language Learning (CALL) to produce tools that can aid second language learners without human intervention. Automatically generating exercises for language learners from a corpus lets students pace their own learning activities and offers theoretically infinite, unmediated and unbiased content. In recent years, advances in NLP technology and the growth of available resources have brought this possibility closer. Particularly relevant sources of knowledge are the large collections of aligned parallel texts: corpora containing sentences in different languages that can be considered translations of one another. The present work explores the possibility of extracting candidate sentences and their translations from a parallel corpus and using them to generate exercises for different proficiency levels. The research was conducted by experimenting with several available NLP tools and qualitatively evaluating the results on a training set of documents, in order to define a pipeline for the language pairs Swedish-English, English-Italian and Swedish-Italian. Finally, a set of 30 random documents was extracted and annotated manually to obtain a quantitative evaluation. The results showed a mean accuracy between 70% and 90% in the sentence selection, depending on the language pair, and between 80% and 96% when stricter selection criteria were applied, at the cost of lower recall. Notably, the implementation is mostly language-independent: only one language-specific component is needed, to estimate the target proficiency level of a sentence, so future work could extend the same pipeline to other language pairs.
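The trade-off mentioned in the abstract, where stricter selection criteria raise accuracy while lowering recall, can be illustrated with a minimal sketch. It is not the thesis pipeline: the `AlignedPair` type, the `score` field, the thresholds and the five toy sentence pairs are all hypothetical, and "accuracy" is assumed here to mean the share of selected pairs that a manual annotator judged acceptable, which the abstract does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One sentence pair from a parallel corpus (hypothetical structure)."""
    source: str
    target: str
    score: float      # assumed quality score in [0, 1]; not from the thesis
    acceptable: bool  # manual annotation: usable as an exercise item?


def select(pairs, threshold):
    """Keep only the pairs whose score meets the selection threshold."""
    return [p for p in pairs if p.score >= threshold]


def precision_recall(pairs, threshold):
    """Compare the automatic selection with the manual annotation."""
    selected = select(pairs, threshold)
    relevant = [p for p in pairs if p.acceptable]
    true_pos = sum(p.acceptable for p in selected)
    precision = true_pos / len(selected) if selected else 0.0
    recall = true_pos / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data invented for illustration only.
PAIRS = [
    AlignedPair("Jag läser en bok.", "I am reading a book.", 0.95, True),
    AlignedPair("Hon bor i Göteborg.", "She lives in Gothenburg.", 0.90, True),
    AlignedPair("Vi ses i morgon.", "See you tomorrow.", 0.70, True),
    AlignedPair("Det regnar.", "The meeting was cancelled.", 0.65, False),
    AlignedPair("Tack!", "Thanks a lot for everything you did.", 0.40, False),
]

if __name__ == "__main__":
    for label, threshold in (("lenient", 0.6), ("strict", 0.8)):
        precision, recall = precision_recall(PAIRS, threshold)
        print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

On this toy data the strict threshold selects only acceptable pairs but misses one that the lenient threshold keeps, which is the same direction of change the abstract reports.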
Degree
Student essay
URI
http://hdl.handle.net/2077/66583
Collections
  • Magisteruppsatser/ Institutionen för filosofi, lingvistik och vetenskapsteori
View/Open
Student essay (859.7Kb)
Date
2020-09-25
Author
Zanetti, Arianna
Keywords
ICALL
language learning
parallel corpus
exercise generation
Language
eng