SecondEye

Your interactive and teachable visual assistant

What it does

SecondEye is an interactive, teachable visual assistant. It handles a variety of vision-related tasks and use cases, and it can be taught to perform personalized vision tasks interactively and iteratively, without any model training or programming.
SecondEye can currently be built only with Gemini, because it depends on the following capabilities unique to the model:
1. Object detection with the ability to return accurate bounding-box positions
2. Native video support with the ability to return time-stamped information
3. Large context window
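To make capability N° 1 concrete, here is a minimal sketch of how a returned bounding box could be mapped onto an image for annotation. It assumes each box arrives as `[ymin, xmin, ymax, xmax]` with values normalized to 0-1000, which is the convention the Gemini documentation describes; the format SecondEye actually uses is not stated here. The names `NormalizedBox`, `PixelBox`, and `toPixelBox` are hypothetical.

```typescript
// Assumption: boxes are [ymin, xmin, ymax, xmax], each value in the range 0-1000.
type NormalizedBox = [number, number, number, number];

// A rectangle in image pixels, ready to draw as an overlay.
interface PixelBox {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Hypothetical helper: scales a normalized box to the pixel size of the image.
function toPixelBox(
  box: NormalizedBox,
  imageWidth: number,
  imageHeight: number
): PixelBox {
  const [ymin, xmin, ymax, xmax] = box;
  // Multiply before dividing so integer inputs stay exact.
  const left = (xmin * imageWidth) / 1000;
  const top = (ymin * imageHeight) / 1000;
  const right = (xmax * imageWidth) / 1000;
  const bottom = (ymax * imageHeight) / 1000;
  return { left, top, width: right - left, height: bottom - top };
}
```

The resulting rectangle can be drawn on a canvas or an absolutely positioned element placed over the image, which is one way to support annotating or asking about a specific part of an image.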
SecondEye harnesses these capabilities to offer the following new AI experiences:
• For images (using capability N° 1):
◦ Annotated object definitions
◦ Asking about a specific part of an image
◦ Enhancing:
▪ Object search
▪ How-to questions about repair or assembly
▪ Visual feedback requests
◦ Teaching the model about an annotated part of an image
• For video (using capability N° 2):
◦ Enhanced video search experience
• For live camera video (using capability N° 3):
◦ Personalized real-time video analysis
◦ Teaching the model something with a video
◦ Real-time visual assistance for people with visual impairments, with the ability to memorize faces, objects, and places for future recognition.
• For live screen sharing (using capability N° 3):
◦ Teaching the model a workflow
◦ IT or programming support
◦ Enhanced web browsing and general computer experience for people with visual impairments
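
As an illustration of the video search experience listed above, the sketch below converts a time-stamped answer into a seek position. It assumes timestamps come back as `MM:SS` or `HH:MM:SS` strings; the format SecondEye actually receives is not stated here, and `timestampToSeconds` is a hypothetical helper.

```typescript
// Assumption: timestamps are "MM:SS" or "HH:MM:SS" strings made of digits and colons.
// Hypothetical helper: converts such a timestamp to a number of seconds.
function timestampToSeconds(timestamp: string): number {
  const parts = timestamp.trim().split(":");
  if (parts.length < 2 || parts.length > 3) {
    throw new Error(`Unsupported timestamp format: ${timestamp}`);
  }
  let seconds = 0;
  for (const part of parts) {
    // Reject empty or non-numeric fields instead of silently producing NaN.
    if (!/^\d+$/.test(part)) {
      throw new Error(`Invalid timestamp field in: ${timestamp}`);
    }
    // Each field is worth 60 times the one that follows it.
    seconds = seconds * 60 + Number(part);
  }
  return seconds;
}
```

In a web page, the result could be assigned to the `currentTime` property of a video element so that selecting a search result jumps to the matching moment.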

Built with

  • Web/Chrome
  • Firebase
  • Firebase Genkit
  • Google Speech-to-Text/Text-to-Speech

Team

By

Zakaria KADDARI

From

Morocco