Show simple item record

dc.contributor.author: Giovanni, Pagliarini
dc.contributor.author: Azfar, Imtiaz
dc.date.accessioned: 2020-11-06T08:58:07Z
dc.date.available: 2020-11-06T08:58:07Z
dc.date.issued: 2020-11-06
dc.identifier.uri: http://hdl.handle.net/2077/66921
dc.description.abstract: Visual Relationship Detection (VRD) is a relatively young research area, where the goal is to develop prediction models for detecting the relationships between objects depicted in an image. A relationship is modeled as a subject-predicate-object triplet, where the predicate (e.g., an action or a spatial relation, such as “eat”, “chase” or “next to”) describes how the subject and the object are interacting in the given image. VRD can be formulated as a classification problem, but suffers from the effects of having a combinatorial output space; some of the major issues to overcome are long-tail class distribution, class overlap and intra-class variance. Machine learning models have been found effective for the task and, more specifically, many works have shown that combining visual, spatial and semantic features from the detected objects is key to achieving good predictions. This work investigates the use of distributional embeddings, often used to discover and encode semantic information, in order to improve the results of an existing neural network-based architecture for VRD. Some experiments are performed in order to make the model semantically aware of the classification output domain, namely, the predicate classes. Additionally, different word embedding models are trained from scratch to better account for multi-word objects and predicates, and are then fine-tuned on VRD-related text corpora. We evaluate our methods on two datasets. Ultimately, we show that, for some sets of predicate classes, semantic knowledge of the predicates exported from trained-from-scratch distributional embeddings can be leveraged to greatly improve prediction, and that this is especially effective for zero-shot learning. [sv]
dc.language.iso: eng [sv]
dc.subject: Deep Learning [sv]
dc.subject: Natural Language Processing [sv]
dc.subject: Computer Vision [sv]
dc.subject: Visual Relationship Detection [sv]
dc.subject: Object Detection [sv]
dc.title: Interactionwise Semantic Awareness in Visual Relationship Detection [sv]
dc.type: text
dc.setspec.uppsok: Technology
dc.type.uppsok: H2
dc.contributor.department: Göteborgs universitet/Institutionen för data- och informationsteknik [swe]
dc.contributor.department: University of Gothenburg/Department of Computer Science and Engineering [eng]
dc.type.degree: Student essay


Files in this item

This item appears in the following collection(s)