Gemini agent for TurtleBot3 perception
Improving Gemini’s visual capabilities with Grounding DINO
What it does
The app uses an agent backed by Gemini to monitor the location of a TurtleBot3 robot on a 4x5 grid, so that the location can be used later for path planning and navigation. Testing Gemini's multimodal capabilities made it clear that object detection and localization are not something it can do reliably out of the box. For that reason, I decided to integrate a specialized model (Grounding DINO) as a tool that helps the Gemini agent perform better at robot detection. Once the robot is detected on the grid, we can ask the agent to carry out more complex tasks, such as planning a path to move the robot from one cell to another, and even sending control commands to execute that path through a ROS bridge integration.
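To make the grid idea concrete, here is a minimal sketch of how a detection could be turned into a grid cell and how a path between two cells could be planned. It is not the project's actual code. It assumes the detector tool returns a pixel bounding box as (x_min, y_min, x_max, y_max), that the camera image covers exactly the grid, and that "4x5" means 4 rows by 5 columns. The function names `bbox_to_cell` and `plan_path` are hypothetical, and breadth-first search is used only as a simple stand-in planner.

```python
from collections import deque

# Assumption: the 4x5 grid is 4 rows by 5 columns and fills the whole image.
ROWS, COLS = 4, 5


def bbox_to_cell(bbox, image_width, image_height, rows=ROWS, cols=COLS):
    """Map a pixel bounding box (x_min, y_min, x_max, y_max) to a (row, col) cell.

    The cell is chosen from the center of the box. Hypothetical helper:
    the real tool output format may differ.
    """
    x_min, y_min, x_max, y_max = bbox
    cx = (x_min + x_max) / 2
    cy = (y_min + y_max) / 2
    # Scale the center into grid units, then clamp to valid indices.
    col = max(0, min(int(cx / image_width * cols), cols - 1))
    row = max(0, min(int(cy / image_height * rows), rows - 1))
    return row, col


def plan_path(start, goal, rows=ROWS, cols=COLS, blocked=frozenset()):
    """Return a shortest 4-connected path of cells from start to goal, or None.

    Uses breadth-first search; `blocked` is an optional set of cells to avoid.
    """
    queue = deque([start])
    parent = {start: None}  # also serves as the visited set
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and nxt not in blocked and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None  # goal is unreachable


if __name__ == "__main__":
    # Example: a detection in a 500x400 pixel image.
    robot_cell = bbox_to_cell((100, 50, 200, 150), 500, 400)
    print("robot cell:", robot_cell)
    print("path:", plan_path(robot_cell, (3, 4)))
```

In a setup like the one described, the agent would call the detection tool first, convert the result to a cell, and then reason about or request a path. Each step of the returned path could then be translated into a motion command for the robot, which is outside the scope of this sketch.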
Built with
- Vertex AI
Team
By
bracavisionai
From
United States