EMBODIED QUESTION ANSWERING IN ROBOTIC ENVIRONMENT: Automatic generation of a synthetic question-answer data-set
Abstract
Embodied question answering is the task of asking a robot about objects in a 3D environment. The robot has to navigate the environment, find the entities in question, and then stop to answer the question. The answering system consists of a navigation component and a visual question answering component. The agent is trained on EQA-MP3D, a synthetic data-set of question-answer pairs and navigational paths. Each question in the data-set is an executable function that can be run in the environment to yield an answer. EQA-MP3D includes only two types of questions: color questions and location questions. These questions could be considered unnatural, and we observe that the question-answer pairs contain biases.
Our work extends the data-set by automatically generating size and spatial questions. We generate a total of 19 207 question-answer pairs for training and 3 186 question-answer pairs for validation. Our data extension is intended to train the system to answer more question types and to enhance the system’s overall ability to perform the task.
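The idea of a question as an executable function can be illustrated with a minimal sketch. This is not the EQA-MP3D code: the `SceneObject` class, the list-based scene, the `volume` field used as a size measure, and both question templates are assumptions made here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical object annotation; not the EQA-MP3D schema."""
    name: str
    room: str
    color: str
    volume: float  # assumed size measure, e.g. bounding-box volume


Scene = List[SceneObject]
Question = Tuple[str, Callable[[Scene], Optional[str]]]


def color_question(obj_name: str, room: str) -> Question:
    """Build a color question: its text plus a function that answers it."""
    text = f"What color is the {obj_name} in the {room}?"

    def run(scene: Scene) -> Optional[str]:
        matches = [o for o in scene if o.name == obj_name and o.room == room]
        # Answer only when the referent is unambiguous.
        return matches[0].color if len(matches) == 1 else None

    return text, run


def size_question(first: str, second: str, room: str) -> Question:
    """Build a size comparison question (hypothetical template)."""
    text = f"Is the {first} bigger than the {second} in the {room}?"

    def run(scene: Scene) -> Optional[str]:
        in_room = {o.name: o for o in scene if o.room == room}
        if first not in in_room or second not in in_room:
            return None
        return "yes" if in_room[first].volume > in_room[second].volume else "no"

    return text, run


scene = [
    SceneObject("sofa", "living room", "grey", 1.8),
    SceneObject("table", "living room", "brown", 0.6),
]
text, run = size_question("sofa", "table", "living room")
answer = run(scene)
```

Running the same function on every annotated scene yields question-answer pairs automatically; questions whose function returns `None` would simply be discarded in this sketch.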
Degree level
Student essay
Date
2021-11-12
Author
Aruqi, Ali
Keywords
Embodied Question Answering
Question Generation
Spatial Relations
Synthetic Data-sets
Multi-Modality
Language
eng