OVAL‑Grasp: Open‑Vocabulary Affordance Localization for Task Oriented Grasping

Edmond Tong*1, Advaith Balaji*1, Anthony Opipari1, Stanley Lewis1, Zhen Zeng2, Odest Chadwicke Jenkins1
*Equal Contribution
University of Michigan1 and J.P. Morgan AI Research2
ISER 2025

System Overview

The robot generates task-oriented grasps in three stages: an LLM identifies the grasp‑relevant object parts, a VLM segments them, and a heatmap filters the grasp candidates down to those that fulfill the given task.
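The three stages can be wired together as in the minimal Python sketch below. It is an illustration of the modular structure only, not the authors' implementation: `select_task_grasp`, `name_parts`, `segment`, and `score` are hypothetical names, the LLM and VLM are passed in as plain callables, and a grasp candidate is assumed to be a `(row, col)` pixel.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Pixel = Tuple[int, int]  # assumed grasp representation: (row, col) in the image


def select_task_grasp(
    image: object,
    task: str,
    candidates: Sequence[Pixel],
    name_parts: Callable[[str], Tuple[List[str], List[str]]],
    segment: Callable[[object, str], object],
    score: Callable[[Pixel, Dict[str, object], Dict[str, object]], float],
) -> Pixel:
    """Pick the candidate that best matches the task (hypothetical interface).

    name_parts stands in for the LLM: task -> (parts to grasp, parts to avoid).
    segment stands in for the VLM: (image, part name) -> mask for that part.
    score stands in for the heatmap lookup used to rank candidates.
    """
    grasp_parts, avoid_parts = name_parts(task)
    grasp_masks = {part: segment(image, part) for part in grasp_parts}
    avoid_masks = {part: segment(image, part) for part in avoid_parts}
    # Keep the single highest-scoring candidate.
    return max(candidates, key=lambda g: score(g, grasp_masks, avoid_masks))
```

Because each stage is an injected callable, any one of them can be swapped or ablated without touching the others.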

Abstract

To manipulate objects in novel, unstructured environments, robots need task-oriented grasps that target the object part appropriate to the given task. Geometry-based methods often struggle with visually defined parts, occlusions, and unseen objects. We introduce OVAL‑Grasp, a zero‑shot, open‑vocabulary approach to task‑oriented, affordance‑based grasping that uses large language models (LLMs) and vision‑language models (VLMs) to let a robot grasp an object at the correct part for a given task. Given an RGB image and a task, OVAL‑Grasp identifies parts to grasp or avoid with an LLM, segments them with a VLM, and generates a 2D heatmap of actionable regions on the object. In our evaluations, OVAL‑Grasp outperformed two task‑oriented grasping baselines in experiments with 20 household objects and three tasks per object. In real‑world experiments with the Fetch mobile manipulator, OVAL‑Grasp identifies and segments the correct object part 95% of the time and grasps the correct actionable area 78.3% of the time. OVAL‑Grasp also finds correct object parts under partial occlusion, achieving an 80% part‑selection success rate in cluttered scenes, and ablation studies show the benefit of its modular design.
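One simple way to turn part masks into a 2D heatmap and use it to filter grasp candidates is sketched below in plain Python. This is an assumption-laden illustration, not the method from the paper: the function names, the +1 / -1 / 0 scoring, the rule that "avoid" parts override "grasp" parts, and the `(row, col)` grasp representation are all hypothetical choices made for the example.

```python
from typing import List, Sequence, Tuple

Mask = List[List[int]]   # binary mask: 1 where the pixel belongs to the part
Pixel = Tuple[int, int]  # assumed grasp representation: (row, col)


def build_heatmap(grasp_masks: Sequence[Mask], avoid_masks: Sequence[Mask],
                  height: int, width: int) -> List[List[float]]:
    """Return a height x width map: +1 on grasp parts, -1 on avoid parts, 0 elsewhere."""
    heat = [[0.0] * width for _ in range(height)]
    for mask in grasp_masks:
        for r in range(height):
            for c in range(width):
                if mask[r][c]:
                    heat[r][c] = 1.0
    # Avoid parts are written last, so they win where masks overlap (assumed rule).
    for mask in avoid_masks:
        for r in range(height):
            for c in range(width):
                if mask[r][c]:
                    heat[r][c] = -1.0
    return heat


def filter_grasps(candidates: Sequence[Pixel], heat: List[List[float]],
                  threshold: float = 0.5) -> List[Pixel]:
    """Keep only candidates whose heatmap value reaches the threshold."""
    return [(r, c) for r, c in candidates if heat[r][c] >= threshold]


# Toy 4x4 example: a "handle" in the top-left and a "blade" in the bottom-right.
handle = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
blade = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
heat = build_heatmap([handle], [blade], height=4, width=4)
kept = filter_grasps([(0, 0), (3, 3), (0, 3)], heat)  # only (0, 0) lies on the handle
```

A real system would likely use soft scores and array operations rather than nested loops; the hard-valued version is kept here only because it makes the filtering step easy to follow.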

Summary Video

Experiment Video

Results

BibTeX

@misc{tong2024ovalprompt,
  title={OVAL‑Prompt: Open‑Vocabulary Affordance Localization for Robot Manipulation through LLM Affordance‑Grounding},
  author={Edmond Tong and Anthony Opipari and Stanley Lewis and Zhen Zeng and Odest Chadwicke Jenkins},
  year={2024},
  eprint={2404.11000},
  archivePrefix={arXiv},
  primaryClass={cs.RO}
}