Bilkent University
Department of Computer Engineering
CS 590/690 SEMINAR

 

Autonomous Robotic Manipulation with LLMs using Visual Keypoints and Scene Descriptions

 

Ekin Berk Ekinci
Master's Student
(Supervisor: Asst. Prof. Özgür Salih Öğüz)
Computer Engineering Department
Bilkent University

Abstract: Extracting accurate grasping locations and making contextual decisions are essential for scalable and versatile robot learning. Large vision models (LVMs), which are trained on extensive datasets, have proven effective as feature extractors and as descriptors of context-rich information. Using them, we can extract visual features of scenes and grasping keypoints for manipulators. Large language models (LLMs), likewise trained on enormous amounts of data, can learn patterns from a few examples and apply them to new scenarios. Building on this, we aim to combine LLMs with the extracted scene descriptions and grasp locations. Our goal is to develop a generic robotic manipulator that can adapt to different scenarios in similar environments. We will present related work in this domain and outline our research plan.

 

DATE: Monday, November 18 @ 16:50 PLACE: EA 502