Object Instance Retrieval in Assistive Robotics: Leveraging Fine-Tuned SimSiam with Multi-View Images Based on 3D Semantic Map

Sakaguchi, Taichi; Taniguchi, Akira; Hagiwara, Yoshinobu; Hafi, Lotfi El; Hasegawa, Shoichi; Taniguchi, Tadahiro

arXiv:2404.09647 (cs)

[Submitted on 15 Apr 2024 (v1), last revised 13 Sep 2024 (this version, v2)]

Abstract: Robots that assist humans in their daily lives should be able to locate specific instances of objects in an environment that match a user's desired objects. This task is known as instance-specific image goal navigation (InstanceImageNav), which requires a model that can distinguish different instances of an object within the same class. A significant challenge in robotics is that when a robot observes the same object from various 3D viewpoints, its appearance may differ significantly, making it difficult to recognize and locate accurately. In this paper, we introduce a method called SimView, which leverages multi-view images based on a 3D semantic map of an environment and self-supervised learning using SimSiam to train an instance-identification model on-site. The effectiveness of our approach was validated using a photorealistic simulator, Habitat Matterport 3D, created by scanning actual home environments. Our results demonstrate a 1.7-fold improvement in task accuracy compared with contrastive language-image pre-training (CLIP), a pre-trained multimodal contrastive learning method for object searching. This improvement highlights the benefits of our proposed fine-tuning method in enhancing the performance of assistive robots in InstanceImageNav tasks. The project website is this https URL.
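
As a rough illustration of the two ideas the abstract combines, the sketch below shows the symmetric negative cosine similarity objective used by SimSiam, followed by one possible way to match a query embedding against multi-view embeddings of each object instance. It is not the authors' implementation: the function names, the `gallery` layout, and the "best matching view" rule in `retrieve_instance` are assumptions made for illustration, and the embeddings are plain Python lists rather than outputs of a trained encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def simsiam_loss(p1, z1, p2, z2):
    """Symmetric negative cosine similarity in the style of SimSiam.

    p1, p2: predictor outputs for two views of the same object.
    z1, z2: projector outputs for the same two views. In real training
            these are treated as constants (stop-gradient); that has no
            effect here because nothing is differentiated.
    """
    return -0.5 * cosine_similarity(p1, z2) - 0.5 * cosine_similarity(p2, z1)


def retrieve_instance(query, gallery):
    """Return the instance id whose best-matching view is closest to the query.

    gallery: dict mapping a hypothetical instance id to a list of view
             embeddings. Scoring an instance by its best view is an
             assumption of this sketch, not a rule taken from the paper.
    """
    best_id, best_score = None, -math.inf
    for instance_id, views in gallery.items():
        score = max(cosine_similarity(query, view) for view in views)
        if score > best_score:
            best_id, best_score = instance_id, score
    return best_id


# Two views that agree perfectly reach the minimum loss of -1.
aligned_loss = simsiam_loss([1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0])

# Toy gallery: two instances of the same class, two views each.
gallery = {
    "mug_a": [[1.0, 0.0], [0.9, 0.1]],
    "mug_b": [[0.0, 1.0], [0.1, 0.9]],
}
match = retrieve_instance([0.2, 1.0], gallery)
```

Scoring each instance by its closest view is one simple way to tolerate viewpoint changes; averaging over views would be an equally plausible alternative under the same assumptions.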

Comments:	See website at this https URL. Accepted to IROS 2024
Subjects:	Robotics (cs.RO)
Cite as:	arXiv:2404.09647 [cs.RO]
	(or arXiv:2404.09647v2 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2404.09647

Submission history

From: Akira Taniguchi
[v1] Mon, 15 Apr 2024 10:25:14 UTC (4,740 KB)
[v2] Fri, 13 Sep 2024 03:13:55 UTC (5,278 KB)
