VISION_AI

Revolutionizing navigation for the visually impaired.

What it does

Vision_AI is an AI guidance system that gives blind users real-time audio feedback and instructions based on their surroundings and needs. Using generative AI and computer vision, the system captures images and sounds from the environment and processes them to offer comprehensive assistance. The automated pipeline includes real-time obstacle detection and avoidance as well as integration with wearable devices, enabling blind users to travel independently and safely while accessing information and services more easily.

Technology stack:
- **Generative AI (Gemini) for image processing**
- **HTML/CSS**
- **Depth Sensing API, speech recognition APIs, text-to-speech libraries**
- **Blind-stick integration using an ESP32 module**
- **Integration with smart glasses**
- **Firebase**
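Firebase appears in the stack above as the backend; as a minimal sketch of how detections could be logged there (the database path, record shape, and function names are our assumptions, not the project's actual schema):

```javascript
// Build the record for one obstacle detection. Kept as a pure
// helper so it works without the Firebase SDK installed.
function makeDetectionRecord(label, distanceMeters, timestampMs) {
  return { label, distanceMeters, timestampMs };
}

// Push a detection record to a Firebase Realtime Database list.
async function logDetection(record) {
  // Lazy-load the Firebase web SDK so the helper above stays pure.
  const { initializeApp } = require("firebase/app");
  const { getDatabase, ref, push } = require("firebase/database");
  const app = initializeApp({ databaseURL: process.env.FIREBASE_DB_URL });
  const db = getDatabase(app);
  await push(ref(db, "detections"), record);
}
```

Example: `logDetection(makeDetectionRecord("curb", 1.5, Date.now()))` would append one entry under `detections/`.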

How we used the Gemini API

Our pipeline continuously streams sensor data from the LiDAR and camera modules to the AI processing unit, so VISION_AI always has the latest environmental information and can analyze and respond in real time. Through the Gemini API, VISION_AI integrates Google's generative AI for object recognition and scene description, supporting our image-processing workflow of image capture, preprocessing, and feature extraction.
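As a minimal sketch of such a scene-description request (assuming the Node `@google/generative-ai` SDK, a `GEMINI_API_KEY` environment variable, a base64-encoded JPEG camera frame, and a model name of our choosing; function names are ours):

```javascript
// Wrap a captured camera frame (base64 JPEG) in the inlineData
// shape the Gemini SDK expects for image inputs.
function toImagePart(base64Jpeg) {
  return { inlineData: { data: base64Jpeg, mimeType: "image/jpeg" } };
}

// Ask Gemini for a short, pedestrian-oriented scene description.
async function describeScene(base64Jpeg) {
  // Lazy-load the SDK so the pure helper above works without it.
  const { GoogleGenerativeAI } = require("@google/generative-ai");
  const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
  const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });
  const result = await model.generateContent([
    "Briefly describe obstacles and hazards in this scene for a blind pedestrian.",
    toImagePart(base64Jpeg),
  ]);
  return result.response.text();
}
```

The returned text is what the downstream feedback layer would read aloud to the user.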
Feedback Mechanisms:
Leveraging the capabilities of the Gemini API, VISION_AI delivers instantaneous auditory and tactile feedback to the user.
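The auditory half of that feedback can be sketched with the browser's built-in Web Speech API (the spoken alert format and function names here are our assumptions):

```javascript
// Format a short spoken alert; round the distance so the
// utterance stays brief and timely.
function formatAlert(objectLabel, distanceMeters) {
  const d = Math.round(distanceMeters * 10) / 10;
  return `${objectLabel} ahead, ${d} meters`;
}

// Speak an alert through the browser's speech synthesizer.
function speak(text) {
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.rate = 1.1; // slightly faster speech for urgent alerts
  window.speechSynthesis.speak(utterance);
}
```

Example: `speak(formatAlert("staircase", 2.34))` would announce "staircase ahead, 2.3 meters".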

Built with

  • Web/Chrome
  • Firebase

Team

By

VISION_AI

From

India