Gemini Voice Companion

Use Gemini as a voice-driven personal assistant

What it does

This is a voice assistant powered by Gemini AI. Local text-to-speech and speech-to-text bridge voice and text interactions. The Gemini API is good at understanding context and dispatching commands across different scenarios. With accumulated context and API integrations, Gemini can trigger Python scripts that perform a range of functions.
In this application, users can interact with the assistant entirely through voice, enabling hands-free and eyes-free operation. This makes it particularly useful in situations where voice is the only available means of communication. The assistant's capabilities include:

Multiple speaker recognition
Filtering out noise from unknown voices
Controlling smart home devices
Mimicking the user's voice
Switching between different voices and personalities
Reading and summarizing news articles
Getting weather and other information
Playing Spotify music
Capturing photos and analyzing them
Opening links in Chrome
Scheduling voice reminders or generic actions
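
The command dispatching described above can be illustrated with a minimal sketch. It assumes the model replies either with plain text or with a small JSON object naming a command and its arguments. That reply format, the handler names (get_weather, play_music), and the dispatch function are hypothetical and are not taken from the project's code; the handlers return placeholder strings instead of calling real services.

```python
import json
from typing import Callable, Dict

# Hypothetical registry mapping command names to Python handlers.
HANDLERS: Dict[str, Callable[..., str]] = {}


def command(name: str):
    """Decorator that registers a handler under the given command name."""
    def register(func: Callable[..., str]) -> Callable[..., str]:
        HANDLERS[name] = func
        return func
    return register


@command("get_weather")
def get_weather(city: str) -> str:
    # Placeholder: a real handler would query a weather service.
    return f"Looking up the weather for {city}."


@command("play_music")
def play_music(query: str) -> str:
    # Placeholder: a real handler would call a music service.
    return f"Playing {query}."


def dispatch(model_reply: str) -> str:
    """Run the handler named in the reply, or pass plain text through.

    Assumed reply format: {"command": "<name>", "args": {...}}.
    The returned string is what the assistant would speak.
    """
    try:
        request = json.loads(model_reply)
    except json.JSONDecodeError:
        # Not JSON, so treat the reply as ordinary text to be spoken.
        return model_reply
    if not isinstance(request, dict):
        return model_reply
    handler = HANDLERS.get(request.get("command"))
    if handler is None:
        return "Sorry, I don't know how to do that."
    return handler(**request.get("args", {}))
```

In this sketch, a reply such as {"command": "get_weather", "args": {"city": "Sydney"}} runs the weather handler, while an ordinary sentence is returned unchanged and can go straight to text-to-speech. Keeping the handlers in a registry means a new capability only needs one more decorated function.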

The Gemini-powered assistant's versatility and voice-centric design make it a useful tool for a wide range of hands-free and eyes-free applications, such as in-car entertainment, walking guidance, and household management.

Built with

  • Web/Chrome

Team

By

Zhenya Yang

From

Australia