Methods and Tools for Automating Language Engineering

Abstract
Language-processing software is becoming increasingly present in our society. Making such tools available to as many people as possible is not just a question of access to technology but also a question of language, as the tools need to be adapted, or localized, to each linguistic community. It is thus important to make the tools needed to engineer language-processing systems as accessible as possible, for instance through automation. The aim is not so much to help traditional software creators as to enable communities to bring their language use into the digital world on their own terms. Smart paradigms are created in the hope that they can reduce the amount of work for the lexicographer who wishes to create or update a morphological lexicon. In the first paper, we evaluate smart paradigms implemented in Grammatical Framework (GF). How good are they at guessing the correct inflection tables? How much information do they require? How good are they at compressing the lexicon? In the second paper, we step back from smart paradigms: although they are used in this work, they are not the main focus of the study. Instead, we compare two rule-based machine translation systems based on different translation models and try to determine the potential of a possible hybridization. In the third paper, we come back to smart paradigms. Even if they can reduce the work of the lexicographer, someone still needs to create the smart paradigms in the first place. In this paper, we explore the possibility of automatically creating smart paradigms from existing traditional paradigms using machine-learning techniques. Finally, the last paper presents a collection of tools meant to help grammar engineering work in the Grammatical Framework community: a tokenizer, a library to embed grammars in Java applications, a build server, a document translator, and a kernel for Jupyter notebooks.
Included papers
G. Détrez and A. Ranta (2012). “Smart paradigms and the predictability and complexity of inflectional morphology”. In: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 645–653
 
G. Détrez, V. M. Sánchez-Cartagena, and A. Ranta (2014). “Sharing resources between free/open-source rule-based machine translation systems: Grammatical Framework and Apertium”. In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14). European Language Resources Association (ELRA)
 
G. Détrez. “Learning Smart Paradigms”. Under journal submission.
 
G. Détrez (2015). Tools for a grammar engineering community. Tech. rep.
 
Degree
Doctor of Philosophy
University
Göteborgs universitet. IT-fakulteten
Department
Department of Computer Science and Engineering ; Institutionen för data- och informationsteknik
Public defence
Thursday, 2 June 2016, 10:00, Room EA, Hörsalsvägen 11, Göteborg
Date of defence
2016-06-02
E-mail
gregoire.detrez@gu.se
URL:
http://hdl.handle.net/2077/43364
Collections
  • Doctoral Theses / Doktorsavhandlingar Institutionen för data- och informationsteknik
  • Doctoral Theses from University of Gothenburg / Doktorsavhandlingar från Göteborgs universitet
File(s)
Thesis content (2.006Mb)
Notification of defence (spikblad) (76.88Kb)
Thesis cover (1.550Mb)
Date
2016-05-12
Author
Détrez, Grégoire
Keywords
Natural language processing
Language Engineering
Morphology
Lexicon
Complexity
Publication type
Doctoral thesis
ISBN
978-91-628-9854-0
978-91-628-9855-7
Series/report no.
Technical Report
127D
Language
eng