"MOVE TOWARDS THE BIG BLACK PIANO": HOW FINE-GRAINED FEATURES AFFECT THE GOAL OF NAVIGATION Improving salient landmark features in an end-to-end system

Le Glouanec, Bérénice

dc.contributor.author	Le Glouanec, Bérénice
dc.contributor.department	University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science	eng
dc.contributor.department	Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori	swe
dc.date.accessioned	2024-10-30T13:02:52Z
dc.date.available	2024-10-30T13:02:52Z
dc.date.issued	2024-10-30
dc.description.abstract	Navigational instructions such as “Move towards the big black piano” or “Head past the green armchair” are intuitive for humans because they rely on salient landmarks to guide movement through space. This thesis explores how fine-grained features, such as spatial location, shape, and color, influence the salience of landmarks in navigation systems. Through linguistic analysis of textual descriptions and object recognition using Faster R-CNN with a bottom-up attention mechanism, we captured key attributes that enhance the clarity of instructions. Our experiments used the Room-to-Room dataset (Anderson et al., 2018), which provides human instructions for indoor navigation, and the Matterport3D environment (Chang et al., 2017), which offers egocentric visual data. By clustering nouns and attributes based on frequency and semantic similarity, we identified important objects and attributes that guide users efficiently. By examining object distribution in skyboxes and mapping instructions to visual scenes, we evaluated whether accessing multiple skybox views (top, back, left, front, right, and bottom) instead of a single, centered view provides additional contextual value in goal-oriented navigation systems. Finally, we extended previous research by applying a bi-directional boost attention mask over salient landmarks within the Seq2Seq LSTM model of Anderson et al. (2018). Our experiments showed significant improvements: the dynamic weights in the attention class achieved success rates of 37.65% on seen data and 22.22% on unseen data, outperforming the baseline. Using linguistic salience to guide visual attention therefore improves the navigation task and demonstrates how language refines the model’s focus.
Future work should continue refining the attention mechanism and explore further strategies, such as integrating additional views, to provide even richer contextual information and further boost navigation accuracy.	sv
dc.identifier.uri	https://hdl.handle.net/2077/83907
dc.language.iso	eng	sv
dc.setspec.uppsok	HumanitiesTheology
dc.subject	salience, clustering, machine learning	sv
dc.title	"MOVE TOWARDS THE BIG BLACK PIANO": HOW FINE-GRAINED FEATURES AFFECT THE GOAL OF NAVIGATION Improving salient landmark features in an end-to-end system	sv
dc.title.alternative	"MOVE TOWARDS THE BIG BLACK PIANO": HOW FINE-GRAINED FEATURES AFFECT THE GOAL OF NAVIGATION Improving salient landmark features in an end-to-end system	sv
dc.type	Text
dc.type.degree	Student essay
dc.type.uppsok	H2

Files

Original bundle

Name: Thesis_Berenice_Le_Glouanec.pdf
Size: 3.6 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 4.68 KB
Format: Item-specific license agreed to upon submission

Collections

Masteruppsatser / Master in Language Technology