dc.contributor.author: Kolachina, Prasanth
dc.date.accessioned: 2019-05-24T15:09:46Z
dc.date.available: 2019-05-24T15:09:46Z
dc.date.issued: 2019-05-24
dc.identifier.isbn: 978-91-7833-509-1
dc.identifier.uri: http://hdl.handle.net/2077/60331
dc.description.abstract: This thesis studies the connections between parsing-friendly representations and interlingua grammars developed for multilingual language generation. Parsing-friendly representations refer to dependency tree representations that can be used for robust, accurate and scalable analysis of natural language text. Shared multilingual abstractions are central to both these representations. Universal Dependencies (UD) is a framework to develop cross-lingual representations, using dependency trees for multilingual representations. Similarly, Grammatical Framework (GF) is a framework for interlingual grammars, used to derive abstract syntax trees (ASTs) corresponding to sentences. The first half of this thesis explores the connections between the representations behind these two multilingual abstractions. The first study presents a conversion method from abstract syntax trees (ASTs) to dependency trees and presents the mapping between the two abstractions – GF and UD – by applying the conversion from ASTs to UD. Experiments show that there is a lot of similarity behind these two abstractions, and our method is used to bootstrap parallel UD treebanks for 31 languages. In the second study, we study the inverse problem, i.e., converting UD trees to ASTs. This is motivated by the goal of helping GF-based interlingual translation by using dependency parsers as a robust front end instead of the parser used in GF. The second half of this thesis focuses on the topic of data augmentation for parsing – specifically using grammar-based backends for aiding in dependency parsing. We propose a generic method to generate synthetic UD treebanks using interlingua grammars and the methods developed in the first half. Results show that these synthetic treebanks are an alternative for developing parsing models, especially for under-resourced languages.
This study is followed up by another study on out-of-vocabulary words (OOVs) – a more focused problem in parsing. OOVs pose an interesting problem in parser development, and the method we present in this paper is a generic simplification that can act as a drop-in replacement for any symbolic parser. Our idea of replacing unknown words with known, similar words results in small but significant improvements in experiments using two parsers and a range of 7 languages.
dc.language.iso: eng
dc.relation.ispartofseries: 174D
dc.relation.haspart: Prasanth Kolachina and Aarne Ranta. "From Abstract Syntax to Universal Dependencies". Linguistic Issues in Language Technology 13(3), 2016.
dc.relation.haspart: Aarne Ranta and Prasanth Kolachina. "From Universal Dependencies to Abstract Syntax". Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies (UDW 2017), pp. 107–116.
dc.relation.haspart: Prasanth Kolachina and Martin Riedl and Chris Biemann. "Replacing OOV Words For Dependency Parsing With Distributional Semantics". Proceedings of the 21st Nordic Conference on Computational Linguistics (NoDaLiDa), 2017, pp. 11–19.
dc.relation.haspart: Prasanth Kolachina and Aarne Ranta. "Bootstrapping UD treebanks for Delexicalized Parsing". Under Submission.
dc.subject: Grammatical Framework
dc.subject: Universal Dependencies
dc.subject: Natural Language Processing
dc.subject: multilinguality
dc.subject: abstract syntax trees
dc.subject: dependency trees
dc.subject: multilingual generation
dc.subject: multilingual parsers
dc.title: Multilingual Abstractions: Abstract Syntax Trees and Universal Dependencies
dc.type: Text
dc.type.svep: Doctoral thesis
dc.gup.mail: prasanth.kolachina@cse.gu.se
dc.type.degree: Doctor of Philosophy
dc.gup.origin: Göteborgs universitet. IT-fakulteten
dc.gup.department: Department of Computer Science and Engineering ; Institutionen för data- och informationsteknik
dc.citation.doi: ITF
dc.gup.defenceplace: June 14th, 10:00 AM, Room EA, Hörsalsvägen 11
dc.gup.defencedate: 2019-06-14

