
NLP methods for the automatic generation of exercises for second language learning from parallel corpus data

Abstract
Intelligent Computer Assisted Language Learning (ICALL), also called Intelligent Computer Assisted Language Instruction (ICALI), is a field of research that combines Artificial Intelligence and Computer Assisted Language Learning (CALL) to produce tools that can aid second language learners without human intervention. Automatically generating exercises from a corpus lets students pace their own learning activities and offers a theoretically unlimited supply of unmediated, unbiased content. In recent years, advances in NLP technology and the growth of available resources have brought this possibility closer. Particularly relevant sources of knowledge are large collections of aligned parallel texts: corpora containing sentences in different languages that can be considered translations of one another. The present work explores the possibility of extracting candidate sentences and their translations from a parallel corpus and using them to generate exercises for different proficiency levels. The research was conducted by experimenting with several available NLP tools and qualitatively evaluating the results on a training set of documents, in order to define a pipeline for the language pairs Swedish-English, English-Italian, and Swedish-Italian. Finally, a set of 30 randomly selected documents was extracted and manually annotated to obtain a quantitative evaluation. The results showed a mean accuracy of 70-90% in sentence selection, depending on the language pair, and of 80-96% when stricter selection criteria were used at the cost of reduced recall. The implementation is mostly language independent: only one component, which estimates the target proficiency level of a sentence, is language-specific, so in future work the same pipeline could be extended to other language pairs.
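To make the idea of selecting sentence pairs and turning them into exercises concrete, here is a minimal sketch. It is not the thesis implementation: the function names (`select_candidates`, `make_cloze`), the length and ratio thresholds, the gap-fill exercise type, and the toy Swedish-English corpus are all hypothetical choices made for illustration, and the proficiency-level estimation mentioned in the abstract is left out entirely.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str]  # (source sentence, aligned translation)


def select_candidates(pairs: List[Pair], min_tokens: int = 4,
                      max_tokens: int = 12, max_ratio: float = 1.5) -> List[Pair]:
    """Keep pairs whose two sides are of moderate and similar length.

    The thresholds are illustrative assumptions, not values from the thesis.
    """
    selected = []
    for src, tgt in pairs:
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if not (min_tokens <= n_src <= max_tokens
                and min_tokens <= n_tgt <= max_tokens):
            continue  # too short or too long for an exercise
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            continue  # lengths differ too much; likely a loose translation
        selected.append((src, tgt))
    return selected


def make_cloze(pair: Pair, rng: random.Random) -> dict:
    """Blank out one longer word on the target side; the source is the hint."""
    src, tgt = pair
    tokens = tgt.split()
    # Candidate gaps: words longer than three letters, ignoring punctuation.
    gaps = [i for i, tok in enumerate(tokens) if len(tok.strip(".,!?")) > 3]
    if not gaps:
        raise ValueError("no suitable word to blank out")
    idx = rng.choice(gaps)
    answer = tokens[idx].strip(".,!?")
    tokens[idx] = tokens[idx].replace(answer, "____", 1)
    return {"hint": src, "prompt": " ".join(tokens), "answer": answer}


# Toy aligned corpus (invented for this example).
corpus = [
    ("Jag läser en bok i parken idag.", "I am reading a book in the park today."),
    ("Ja.", "Yes."),
    ("Hon köper bröd.", "She buys bread every single morning before going to work."),
]
candidates = select_candidates(corpus)
exercise = make_cloze(candidates[0], random.Random(0))
print(exercise["hint"])
print(exercise["prompt"])
```

Only the first pair survives the filter here: the second is too short, and the third has a source side below the minimum length. Stricter thresholds would reject more pairs, which mirrors the precision and recall trade-off described in the abstract.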
Degree
Student essay
URL:
http://hdl.handle.net/2077/66583
Collections
  • Master's theses / Department of Philosophy, Linguistics and Theory of Science
File(s)
Student essay (859.7Kb)
Date
2020-09-25
Author
Zanetti, Arianna
Keywords
ICALL
language learning
parallel corpus
exercise generation
Language
eng

DSpace software copyright © 2002-2016 DuraSpace
gup@ub.gu.se | Technical support
Theme by Atmire NV
 

 
