Masteruppsatser / Master in Language Technology
Permanent URI for this collection: https://gupea-staging.ub.gu.se/handle/2077/61848
Browsing Masteruppsatser / Master in Language Technology by Subject "grounding"
Now showing 1 - 2 of 2
Item: Fast visual grounding in interaction (2019-10-04)
Cano Santín, José Miguel
University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science

A big challenge for the development of situated agents is that they need to be capable of grounding real objects in their environment to representations with semantic meaning, so that these can be communicated to human agents in natural language. de Graaf (2016) developed the KILLE framework, a static camera-based robot capable of learning objects and spatial relations from very few samples, using image processing algorithms suited to small training sets. However, this framework has a major shortcoming: the time needed to recognise an object grows greatly as the system learns more objects, which motivated us to design a more efficient object recognition module. This project investigates a way to improve object recognition in the same robot framework using a neural network approach suitable for learning from very few image samples: Matching Networks (Vinyals et al., 2016). Our work also investigates how transfer learning from large datasets can improve object recognition performance and make learning faster, both important properties for a robot that interacts online with humans. We therefore evaluate the performance of our situated agent with transfer learning from pre-trained models and with different conversational strategies with a human tutor.
Results show that the robot system can train models very fast and achieves very good object recognition performance for small domains.

Item: Kille: Learning Objects and Spatial Relations with Kinect (2020-08-26)
de Graaf, Erik
University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science

For humans to have meaningful interactions with a robotic system, the system should be capable of grounding semantic representations to their real-world referents, learning spatial relations, and communicating in spoken human language. End users also need to be able to query the system about which objects it already knows, for more efficient learning. Such systems exist, but they require large sample sizes and thus do not allow end users to teach the system new objects when needed. To overcome this problem, we developed a non-mobile system dubbed Kille that uses a 3D camera, SIFT features, and machine learning to let a tutor teach the system objects and spatial relations. The system is built on the ROS (Robot Operating System) framework and uses the OpenDial software as its dialogue system, for which ROS support was written as part of this project. We describe the system's hardware, the software used and developed, and evaluate its performance. Our results show that Kille performs well on small learning sets, considering the low number of samples it learns from. In contrast to other approaches, we focus on learning from a tutor presenting objects rather than from a provided dataset. Recognition of spatial relations works well; however, no definitive conclusions can be drawn, largely due to the small number of participants and the subjective nature of spatial relations.
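The core idea behind the Matching Networks approach cited in the first abstract is that a query image is classified by soft attention over a small labelled support set, rather than by a classifier retrained per class. A minimal sketch of that prediction step is shown below; it assumes embeddings have already been produced by some encoder, and all names (`matching_net_predict`, the toy vectors) are illustrative, not code from either thesis.

```python
import math

def matching_net_predict(support_emb, support_labels, query_emb, n_classes):
    """Classify a query by cosine-similarity attention over a labelled
    support set, in the style of Matching Networks (Vinyals et al., 2016)."""
    def norm(v):
        length = math.sqrt(sum(x * x for x in v))
        return [x / length for x in v]

    q = norm(query_emb)
    # Cosine similarity between the query and each support embedding.
    sims = [sum(a * b for a, b in zip(norm(s), q)) for s in support_emb]
    # Softmax over similarities gives one attention weight per support item.
    exps = [math.exp(x) for x in sims]
    total = sum(exps)
    attn = [e / total for e in exps]
    # Accumulate attention mass per class; predict the heaviest class.
    probs = [0.0] * n_classes
    for w, y in zip(attn, support_labels):
        probs[y] += w
    best = max(range(n_classes), key=probs.__getitem__)
    return best, probs

# Toy usage: two support examples of class 0, one of class 1.
support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
pred, probs = matching_net_predict(support, [0, 0, 1], [1.0, 0.05], 2)
```

Because prediction is just attention over stored examples, adding a new object class only adds support items; no per-class retraining is needed, which is what makes the approach attractive for online teaching by a human tutor.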