Johari, K., Tong, C. T. Z., Subbaraju, V., Kim, J.-J., & Tan, U.-X. (2021). Gaze Assisted Visual Grounding. Lecture Notes in Computer Science, 191–202. doi:10.1007/978-3-030-90525-5_17
Abstract:
There has been increasing demand for visual grounding in human-robot interaction applications. However, grounding accuracy is typically limited by the size of the dataset that can be collected, which remains a challenge. Hence, this paper proposes using the natural, implicit input modality of human gaze to assist and improve the visual grounding of human instructions given to robotic agents. To demonstrate the capability, mechanical gear objects are used. We employ a transformer-based text classifier trained on a small corpus to build a baseline phrase grounding model, and we evaluate this system with and without gaze input to measure the improvement. Gaze information (obtained from the Microsoft HoloLens 2) improves grounding accuracy from 26% to 65%, enabling more efficient human-robot collaboration and making the approach applicable to hands-free scenarios. The approach is data-efficient, as it requires only a small training dataset to ground natural language referring expressions.
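The abstract does not specify how gaze and language evidence are combined, so the sketch below is only a minimal illustration of the general idea: a gaze-proximity score (here a Gaussian kernel over the distance between the gaze fixation and each candidate object) is fused with the text classifier's per-object scores by a weighted sum, and the highest-scoring candidate is taken as the referent. All object names, coordinates, scores, and the weights sigma and alpha are hypothetical, not values from the paper.

```python
import math

# Hypothetical candidate gear objects: bounding-box centre (x, y) in image
# pixels and a score from the text-only phrase grounding model.
candidates = [
    {"label": "small spur gear", "centre": (120, 340), "text_score": 0.31},
    {"label": "large spur gear", "centre": (420, 310), "text_score": 0.29},
    {"label": "bevel gear",      "centre": (640, 180), "text_score": 0.40},
]

# Hypothetical gaze fixation point projected into the same image frame
# (e.g. from the HoloLens 2 eye tracker).
gaze_point = (410, 300)

def gaze_score(centre, gaze, sigma=80.0):
    """Score that decays with distance between the gaze fixation and an
    object's centre (Gaussian kernel; sigma is an illustrative choice)."""
    dx, dy = centre[0] - gaze[0], centre[1] - gaze[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))

def fuse(text_score, g_score, alpha=0.5):
    """Convex combination of text and gaze evidence; alpha is assumed."""
    return alpha * text_score + (1.0 - alpha) * g_score

scored = [
    (c["label"], fuse(c["text_score"], gaze_score(c["centre"], gaze_point)))
    for c in candidates
]
best = max(scored, key=lambda s: s[1])
print(f"Grounded referent: {best[0]} (fused score {best[1]:.2f})")
```

With these illustrative values, the gaze fixation near the large spur gear outweighs the slightly higher text score of the bevel gear, showing how gaze can disambiguate referring expressions that language alone grounds poorly.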
License type:
Publisher Copyright
Funding Info:
This research/project is supported by the A*STAR AME Programmatic Grant (Reference No.: A18A2b0046).
Description:
This is a post-peer-review, pre-copyedit version of an article published in Social Robotics. The final authenticated version is available online at: http://dx.doi.org/10.1007/978-3-030-90525-5_17