OCT 20, 2025

Bug0 reduces manual test debugging by 60% with Gemini 2.5 Pro

Sandeep Panda

Co-founder & CTO of Bug0

Vishal Dharmadhikari

Product Solutions Engineer

Traditional software quality assurance (QA) often relies on brittle, selector-based tests that break when user interfaces change. Debugging these failures typically requires engineers to manually review test logs and recordings, a time-consuming process that slows development velocity.

Bug0, an AI-powered QA platform, automates browser and mobile testing for engineering teams. Their platform is designed to generate, maintain, and auto-heal tests at scale, reducing the friction associated with traditional QA.

To improve test reliability and automate the debugging process, Bug0 uses the multimodal reasoning capabilities of Gemini 2.5 Pro to analyze test recordings, validate outcomes, and automatically determine the root cause of failures.

Automating QA analysis with multimodal reasoning

Bug0 sought to reduce reliance on traditional selector-based assertions, such as those written with Playwright, which depend on specific code selectors that frequently become outdated. They also needed a scalable way to analyze test outcomes without manual intervention.

"Watching full test recordings to identify the root cause of a failure was time-consuming, and maintaining complex selectors or flaky assertions slowed us down," said Sandeep Panda, Co-founder and CTO of Bug0. "We needed a way to summarize test intent and outcomes automatically using AI."

Bug0 selected Gemini 2.5 Pro specifically for its advanced multimodal capabilities, particularly its ability to interpret video.

They implemented Gemini 2.5 Pro for two primary functions:

AI assertion engine: The engine evaluates whether a test objective was met based on visual or structural evidence, such as video recordings, page screenshots, or accessibility snapshots. This replaces fragile code locators with robust, AI-powered assertions.
Failure summarization: An AI agent analyzes video recordings of failed tests and summarizes the root cause (e.g., a missing button or an incorrect redirect), reducing the need for engineers to review the footage manually.

Implementing video-based assertions and summaries

Bug0 integrated Gemini 2.5 Pro using the Google Gen AI SDK in Node.js. The initial integration, including prompt experimentation and tuning, took approximately three days.

Their AI assertion engine combines the actions of their testing framework with the evaluation capabilities of Gemini 2.5 Pro. The framework executes the test steps, and Gemini 2.5 Pro evaluates the resulting output.

"In our assertion engine, we combine Gemini 2.5 Pro with Playwright. Playwright performs steps. Gemini 2.5 Pro evaluates the visual output and confirms whether the expected outcome was met," Panda explained. "This allows us to skip writing fragile locators or hard-coded expectations and rely on natural-language assertions powered by Gemini 2.5 Pro."
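The division of labor Panda describes can be sketched roughly as follows. This is a minimal illustration, not Bug0's actual implementation: the prompt wording, the JSON verdict shape, and the names buildAssertionContents and parseVerdict are all assumptions made for this example. The sketch covers only the two steps around the model call: packaging a natural-language objective with a screenshot, and turning the model's reply into a pass/fail verdict.

```typescript
// Hypothetical verdict shape for this sketch; not Bug0's actual type.
interface Verdict {
  passed: boolean;
  reason: string;
}

// Mirrors the text and inlineData part shapes accepted by the Gemini API.
type AssertionPart =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

// Build the user turn: the objective in plain language plus a PNG screenshot
// (base64-encoded) captured after the test steps have run.
function buildAssertionContents(objective: string, screenshotBase64: string) {
  const parts: AssertionPart[] = [
    {
      text:
        "You are verifying a UI test. Objective: " + objective + "\n" +
        'Reply with JSON only: {"passed": boolean, "reason": string}.',
    },
    { inlineData: { mimeType: "image/png", data: screenshotBase64 } },
  ];
  return [{ role: "user", parts }];
}

// Parse the reply defensively: anything malformed counts as a failure,
// so a garbled response can never turn into a false pass.
function parseVerdict(raw: string): Verdict {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start !== -1 && end > start) {
    try {
      const data = JSON.parse(raw.slice(start, end + 1));
      if (typeof data.passed === "boolean" && typeof data.reason === "string") {
        return { passed: data.passed, reason: data.reason };
      }
    } catch {
      // Invalid JSON: fall through to the failure verdict below.
    }
  }
  return { passed: false, reason: "Unparseable model response" };
}
```

In a real pipeline, the screenshot could come from Playwright's page.screenshot() encoded as base64, and the returned array could be passed as the contents field of a generateContent call in the Google Gen AI SDK. Treating an unparseable reply as a failed assertion is a conservative design choice for this sketch, not something the source specifies.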

For failure summaries, Bug0 uses a specialized prompt format that includes the video recording, failure logs, and expected behaviors. Gemini 2.5 Pro processes this input to generate human-readable summaries explaining why the test failed. The accuracy of Gemini 2.5 Pro was essential for these critical QA tasks.
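One way to picture a prompt that combines these three inputs is the sketch below. Bug0's real prompt format is not published, so the FailedRun shape, the function name buildFailureSummaryParts, the prompt text, and the log-truncation limit are all hypothetical. The sketch assumes the recording has already been uploaded and is referenced by URI.

```typescript
// Hypothetical input shape for this sketch; not Bug0's actual type.
interface FailedRun {
  videoUri: string;      // URI of an already-uploaded recording
  videoMimeType: string; // for example "video/webm"
  logLines: string[];    // raw failure log, one entry per line
  expected: string[];    // expected behaviors in plain language
}

// Mirrors the fileData and text part shapes accepted by the Gemini API.
type SummaryPart =
  | { text: string }
  | { fileData: { fileUri: string; mimeType: string } };

// Assemble the parts of a single request: the recording, followed by a text
// block holding the expected behaviors and the tail of the failure log.
function buildFailureSummaryParts(run: FailedRun, maxLogLines = 200): SummaryPart[] {
  // Keep only the last maxLogLines entries so the prompt stays bounded.
  const from = Math.max(0, run.logLines.length - maxLogLines);
  const tail = run.logLines.slice(from);

  const lines: string[] = [
    "The attached recording shows a failed end-to-end test.",
    "Expected behaviors:",
    ...run.expected.map((e, i) => (i + 1) + ". " + e),
    "Failure log (last " + tail.length + " lines):",
    ...tail,
    "Summarize the most likely root cause in two or three sentences.",
  ];

  return [
    { fileData: { fileUri: run.videoUri, mimeType: run.videoMimeType } },
    { text: lines.join("\n") },
  ];
}
```

Keeping only the tail of the log is an assumption made here on the grounds that the relevant error usually appears near the end; a production system might select log lines differently.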

Reducing manual test review by 60%

The integration of Gemini 2.5 Pro significantly improved Bug0's debugging workflows and the overall reliability of their platform. By replacing manual debugging and assertion writing with AI-driven workflows, Bug0 accelerated development velocity for its customers.

Key results include:

60% reduction in the number of test failure videos that engineers need to manually watch
Over 70% of test failures are now successfully auto-summarized with accurate root cause explanations
A significant drop in assertion flakiness compared to traditional selector-based methods

"Gemini 2.5 Pro accelerated our velocity," Panda said. "It elevated our core product experience by turning test review from a bottleneck into a fast-feedback loop."

Bug0 is now developing an AI test authoring feature. Users will be able to submit a video of a user flow, and Bug0 will use Gemini 2.5 Pro to analyze the video and automatically generate the corresponding test script and assertions.

To start building your own applications, explore the multimodal capabilities of Gemini models in our API documentation.
