AutoFlow
Empowering effortless computer control through natural language.
What it does
AutoFlow is a voice accessibility assistant designed to simplify computer use and navigation for users with physical disabilities. It integrates Gemini as a natural-language-driven agent.
Gemini serves as the brain of our three agents.
### Planning agent
The planning agent is responsible for creating a plan from the UI elements and a screenshot of the screen. The UI elements are extracted with the Win32 UI Automation API and the screenshot is taken with the Win32 User API; the agent then asks Gemini to create a plan from this data.
This agent exposes only one function, which starts plan execution and hands off to the ring planning system.
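As a rough illustration of the planning step, the sketch below combines a user request and a list of extracted UI elements into a text prompt, and describes the single function the agent is allowed to call. The `UIElement` fields, the `START_PLAN_TOOL` dictionary, and the name `start_plan_execution` are all hypothetical; the real project may structure this differently, and the screenshot and the actual Gemini call are left out.

```python
from dataclasses import dataclass


@dataclass
class UIElement:
    """One control reported by the accessibility tree (fields are illustrative)."""
    element_id: int
    control_type: str
    name: str


# Hypothetical description of the only function the planning agent may call.
START_PLAN_TOOL = {
    "name": "start_plan_execution",
    "description": "Begin executing an ordered list of plan steps.",
    "parameters": {"steps": "list of short natural-language instructions"},
}


def build_planning_prompt(user_request: str, elements: list[UIElement]) -> str:
    """Combine the spoken request and the extracted UI elements into one prompt."""
    lines = [f"User request: {user_request}", "Visible UI elements:"]
    for el in elements:
        lines.append(f"- [{el.element_id}] {el.control_type}: {el.name}")
    lines.append(f"Reply by calling {START_PLAN_TOOL['name']} with the plan steps.")
    return "\n".join(lines)


elements = [UIElement(1, "Button", "Send"), UIElement(2, "Edit", "Message")]
prompt = build_planning_prompt("send hello to Sam", elements)
print(prompt)
```

In a full pipeline, this prompt would be sent to the model together with the screenshot, and the returned function call would start execution.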
### Identify agent
The identify agent is responsible for identifying the UI element that the user wants to interact with, such as a button or a link. It uses Gemini to pick that element.
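One way to make the identification step robust is to have the model answer with an element id and then validate that id against the elements actually on screen. The sketch below assumes that convention; `UIElement`, `parse_element_choice`, and the reply format are hypothetical and not taken from the project.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class UIElement:
    """One control reported by the accessibility tree (fields are illustrative)."""
    element_id: int
    control_type: str
    name: str


def parse_element_choice(reply: str, elements: list[UIElement]) -> Optional[UIElement]:
    """Return the element whose id is the first number in the model reply.

    Returns None when the reply has no number or the id is not on screen,
    so the caller can ask the user to rephrase instead of clicking blindly.
    """
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    by_id = {el.element_id: el for el in elements}
    return by_id.get(int(match.group()))


elements = [UIElement(1, "Button", "Send"), UIElement(2, "Edit", "Message")]
chosen = parse_element_choice("2", elements)
```

Rejecting unknown ids keeps a mistaken model reply from turning into an interaction with the wrong control.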
### Navigation agent
The navigation agent is responsible for executing mouse and keyboard events to interact with the screen. It uses Gemini to convert natural-language commands such as `left click` into mouse and keyboard actions.
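The sketch below shows how a structured command could be expanded into low-level input events once the model has interpreted a phrase such as `left click`. The command dictionary shape, `InputEvent`, and `expand_command` are assumptions made for illustration; dispatching the events to the operating system is deliberately omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputEvent:
    device: str  # "mouse" or "keyboard"
    action: str  # "move", "down", or "up"
    detail: str  # button name, key character, or "x,y" coordinates


def expand_command(command: dict) -> list[InputEvent]:
    """Expand one structured command (hypothetical shape) into input events."""
    kind = command["type"]
    if kind in ("left_click", "right_click"):
        button = kind.split("_")[0]  # "left" or "right"
        return [
            InputEvent("mouse", "move", f"{command['x']},{command['y']}"),
            InputEvent("mouse", "down", button),
            InputEvent("mouse", "up", button),
        ]
    if kind == "type_text":
        events = []
        for ch in command["text"]:
            # Each character becomes a key press followed by a key release.
            events.append(InputEvent("keyboard", "down", ch))
            events.append(InputEvent("keyboard", "up", ch))
        return events
    raise ValueError(f"unsupported command type: {kind}")


click_events = expand_command({"type": "left_click", "x": 10, "y": 20})
```

Keeping the expansion separate from the code that sends events makes it possible to log or test a command before anything touches the real mouse or keyboard.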
Built with
- Google Speech To Text (STT)
Team
By AutoFlow, from Thailand