With tool use, the Live API goes beyond just conversation: it can perform actions in the real world and pull in external context while maintaining a real-time connection. You can define tools such as function calling and Google Search with the Live API.
Overview of supported tools
Here's a brief overview of the tools available for Live API models:
| Tool | gemini-2.5-flash-native-audio-preview-09-2025 |
|---|---|
| Search | Yes |
| Function calling | Yes |
| Google Maps | No |
| Code execution | No |
| URL context | No |
Function calling
The Live API supports function calling, just like regular content generation requests. Function calling lets the Live API interact with external data and programs, greatly increasing what your applications can accomplish.
You can define function declarations as part of the session configuration.
After receiving a tool call, the client should respond with a list of FunctionResponse objects using the session.send_tool_response method.
See the Function calling tutorial to learn more.
Python
```python
import asyncio
import wave
from google import genai
from google.genai import types

client = genai.Client()
model = "gemini-2.5-flash-native-audio-preview-09-2025"

# Simple function definitions
turn_on_the_lights = {"name": "turn_on_the_lights"}
turn_off_the_lights = {"name": "turn_off_the_lights"}

tools = [{"function_declarations": [turn_on_the_lights, turn_off_the_lights]}]
config = {"response_modalities": ["AUDIO"], "tools": tools}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        prompt = "Turn on the lights please"
        await session.send_client_content(turns={"parts": [{"text": prompt}]})

        wf = wave.open("audio.wav", "wb")
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(24000)  # Output is 24kHz

        async for response in session.receive():
            if response.data is not None:
                wf.writeframes(response.data)
            elif response.tool_call:
                print("The tool was called")
                function_responses = []
                for fc in response.tool_call.function_calls:
                    function_response = types.FunctionResponse(
                        id=fc.id,
                        name=fc.name,
                        response={"result": "ok"}  # simple, hard-coded function response
                    )
                    function_responses.append(function_response)
                await session.send_tool_response(function_responses=function_responses)

        wf.close()

if __name__ == "__main__":
    asyncio.run(main())
```
JavaScript
```javascript
import { GoogleGenAI, Modality } from '@google/genai';
import * as fs from "node:fs";
import pkg from 'wavefile';  // npm install wavefile
const { WaveFile } = pkg;

const ai = new GoogleGenAI({});
const model = 'gemini-2.5-flash-native-audio-preview-09-2025';

// Simple function definitions
const turn_on_the_lights = { name: "turn_on_the_lights" } // , description: '...', parameters: { ... }
const turn_off_the_lights = { name: "turn_off_the_lights" }

const tools = [{ functionDeclarations: [turn_on_the_lights, turn_off_the_lights] }]

const config = {
  responseModalities: [Modality.AUDIO],
  tools: tools
}

async function live() {
  const responseQueue = [];

  async function waitMessage() {
    let done = false;
    let message = undefined;
    while (!done) {
      message = responseQueue.shift();
      if (message) {
        done = true;
      } else {
        await new Promise((resolve) => setTimeout(resolve, 100));
      }
    }
    return message;
  }

  async function handleTurn() {
    const turns = [];
    let done = false;
    while (!done) {
      const message = await waitMessage();
      turns.push(message);
      if (message.serverContent && message.serverContent.turnComplete) {
        done = true;
      } else if (message.toolCall) {
        done = true;
      }
    }
    return turns;
  }

  const session = await ai.live.connect({
    model: model,
    callbacks: {
      onopen: function () {
        console.debug('Opened');
      },
      onmessage: function (message) {
        responseQueue.push(message);
      },
      onerror: function (e) {
        console.debug('Error:', e.message);
      },
      onclose: function (e) {
        console.debug('Close:', e.reason);
      },
    },
    config: config,
  });

  const inputTurns = 'Turn on the lights please';
  session.sendClientContent({ turns: inputTurns });

  let turns = await handleTurn();

  for (const turn of turns) {
    if (turn.toolCall) {
      console.debug('A tool was called');
      const functionResponses = [];
      for (const fc of turn.toolCall.functionCalls) {
        functionResponses.push({
          id: fc.id,
          name: fc.name,
          response: { result: "ok" } // simple, hard-coded function response
        });
      }
      console.debug('Sending tool response...\n');
      session.sendToolResponse({ functionResponses: functionResponses });
    }
  }

  // Check again for new messages
  turns = await handleTurn();

  // Combine audio data strings and save as wave file
  const combinedAudio = turns.reduce((acc, turn) => {
    if (turn.data) {
      const buffer = Buffer.from(turn.data, 'base64');
      const intArray = new Int16Array(buffer.buffer, buffer.byteOffset, buffer.byteLength / Int16Array.BYTES_PER_ELEMENT);
      return acc.concat(Array.from(intArray));
    }
    return acc;
  }, []);

  const audioBuffer = new Int16Array(combinedAudio);

  const wf = new WaveFile();
  wf.fromScratch(1, 24000, '16', audioBuffer);  // output is 24kHz
  fs.writeFileSync('audio.wav', wf.toBuffer());

  session.close();
}

async function main() {
  await live().catch((e) => console.error('got error', e));
}

main();
```
From a single prompt, the model can generate multiple function calls and the code required to chain their outputs. This code executes in a sandbox environment, generating subsequent BidiGenerateContentToolCall messages.
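The declarations in these examples are deliberately bare. In practice you will usually also give each function a description and a parameters schema so the model knows when and how to call it. A minimal Python sketch, using a hypothetical set_light_values device function (the name, fields, and values here are illustrative, not part of the example above):

```python
# Hypothetical, fuller function declaration: a description plus a
# JSON-schema style parameter spec the model can fill in when calling.
set_light_values = {
    "name": "set_light_values",
    "description": "Set the brightness and color temperature of a room light.",
    "parameters": {
        "type": "object",
        "properties": {
            "brightness": {
                "type": "integer",
                "description": "Light level from 0 to 100.",
            },
            "color_temp": {
                "type": "string",
                "enum": ["daylight", "cool", "warm"],
                "description": "Color temperature of the light.",
            },
        },
        "required": ["brightness", "color_temp"],
    },
}

# Used in the session config exactly like the simple declarations above.
tools = [{"function_declarations": [set_light_values]}]
config = {"response_modalities": ["AUDIO"], "tools": tools}
```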
Asynchronous function calling
Function calling executes sequentially by default, meaning execution pauses until the result of each function call is available. This ensures sequential processing, which means you won't be able to continue interacting with the model while the functions are being run.
If you don't want to block the conversation, you can tell the model to run the functions asynchronously. To do so, you first need to add a behavior to the function definitions:
Python
```python
# Non-blocking function definitions
turn_on_the_lights = {"name": "turn_on_the_lights", "behavior": "NON_BLOCKING"}  # turn_on_the_lights will run asynchronously
turn_off_the_lights = {"name": "turn_off_the_lights"}  # turn_off_the_lights will still pause all interactions with the model
```
JavaScript
```javascript
import { GoogleGenAI, Modality, Behavior } from '@google/genai';

// Non-blocking function definitions
const turn_on_the_lights = { name: "turn_on_the_lights", behavior: Behavior.NON_BLOCKING }

// Blocking function definitions
const turn_off_the_lights = { name: "turn_off_the_lights" }

const tools = [{ functionDeclarations: [turn_on_the_lights, turn_off_the_lights] }]
```
NON_BLOCKING ensures the function runs asynchronously while you can continue interacting with the model.

Then you need to tell the model how to behave when it receives the FunctionResponse using the scheduling parameter. It can either:

- Interrupt what it's doing and tell you about the response it got right away (scheduling="INTERRUPT"),
- Wait until it's done with what it's currently doing (scheduling="WHEN_IDLE"),
- Or do nothing and use that knowledge later on in the discussion (scheduling="SILENT")
Python
```python
# for a non-blocking function definition, apply scheduling in the function response:
function_response = types.FunctionResponse(
    id=fc.id,
    name=fc.name,
    response={
        "result": "ok",
        "scheduling": "INTERRUPT"  # Can also be WHEN_IDLE or SILENT
    }
)
```
JavaScript
```javascript
import { GoogleGenAI, Modality, Behavior, FunctionResponseScheduling } from '@google/genai';

// for a non-blocking function definition, apply scheduling in the function response:
const functionResponse = {
  id: fc.id,
  name: fc.name,
  response: {
    result: "ok",
    scheduling: FunctionResponseScheduling.INTERRUPT  // Can also be WHEN_IDLE or SILENT
  }
}
```
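Putting the two pieces together, here is a minimal end-to-end Python sketch, assuming the same model and session setup as the earlier examples: it declares one NON_BLOCKING function and answers it with a WHEN_IDLE scheduled response (audio handling omitted for brevity):

```python
import asyncio
from google import genai
from google.genai import types

client = genai.Client()
model = "gemini-2.5-flash-native-audio-preview-09-2025"

# Non-blocking declaration: the conversation continues while it runs.
turn_on_the_lights = {"name": "turn_on_the_lights", "behavior": "NON_BLOCKING"}
tools = [{"function_declarations": [turn_on_the_lights]}]
config = {"response_modalities": ["AUDIO"], "tools": tools}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        await session.send_client_content(
            turns={"parts": [{"text": "Turn on the lights please"}]}
        )
        async for response in session.receive():
            if response.tool_call:
                function_responses = [
                    types.FunctionResponse(
                        id=fc.id,
                        name=fc.name,
                        # WHEN_IDLE: the model finishes what it is saying
                        # before working the result into the conversation.
                        response={"result": "ok", "scheduling": "WHEN_IDLE"},
                    )
                    for fc in response.tool_call.function_calls
                ]
                await session.send_tool_response(function_responses=function_responses)

asyncio.run(main())
```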
Grounding with Google Search
You can enable Grounding with Google Search as part of the session configuration. This increases the Live API's accuracy and helps prevent hallucinations. See the Grounding tutorial to learn more.
Python
```python
import asyncio
import wave
from google import genai
from google.genai import types

client = genai.Client()
model = "gemini-2.5-flash-native-audio-preview-09-2025"

tools = [{'google_search': {}}]
config = {"response_modalities": ["AUDIO"], "tools": tools}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        prompt = "When did the last Brazil vs. Argentina soccer match happen?"
        await session.send_client_content(turns={"parts": [{"text": prompt}]})

        wf = wave.open("audio.wav", "wb")
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(24000)  # Output is 24kHz

        async for chunk in session.receive():
            if chunk.server_content:
                if chunk.data is not None:
                    wf.writeframes(chunk.data)

                # The model might generate and execute Python code to use Search
                model_turn = chunk.server_content.model_turn
                if model_turn:
                    for part in model_turn.parts:
                        if part.executable_code is not None:
                            print(part.executable_code.code)
                        if part.code_execution_result is not None:
                            print(part.code_execution_result.output)

        wf.close()

if __name__ == "__main__":
    asyncio.run(main())
```
JavaScript
```javascript
import { GoogleGenAI, Modality } from '@google/genai';
import * as fs from "node:fs";
import pkg from 'wavefile';  // npm install wavefile
const { WaveFile } = pkg;

const ai = new GoogleGenAI({});
const model = 'gemini-2.5-flash-native-audio-preview-09-2025';

const tools = [{ googleSearch: {} }]

const config = {
  responseModalities: [Modality.AUDIO],
  tools: tools
}

async function live() {
  const responseQueue = [];

  async function waitMessage() {
    let done = false;
    let message = undefined;
    while (!done) {
      message = responseQueue.shift();
      if (message) {
        done = true;
      } else {
        await new Promise((resolve) => setTimeout(resolve, 100));
      }
    }
    return message;
  }

  async function handleTurn() {
    const turns = [];
    let done = false;
    while (!done) {
      const message = await waitMessage();
      turns.push(message);
      if (message.serverContent && message.serverContent.turnComplete) {
        done = true;
      } else if (message.toolCall) {
        done = true;
      }
    }
    return turns;
  }

  const session = await ai.live.connect({
    model: model,
    callbacks: {
      onopen: function () {
        console.debug('Opened');
      },
      onmessage: function (message) {
        responseQueue.push(message);
      },
      onerror: function (e) {
        console.debug('Error:', e.message);
      },
      onclose: function (e) {
        console.debug('Close:', e.reason);
      },
    },
    config: config,
  });

  const inputTurns = 'When did the last Brazil vs. Argentina soccer match happen?';
  session.sendClientContent({ turns: inputTurns });

  let turns = await handleTurn();

  let combinedData = '';
  for (const turn of turns) {
    if (turn.serverContent && turn.serverContent.modelTurn && turn.serverContent.modelTurn.parts) {
      for (const part of turn.serverContent.modelTurn.parts) {
        if (part.executableCode) {
          console.debug('executableCode: %s\n', part.executableCode.code);
        }
        else if (part.codeExecutionResult) {
          console.debug('codeExecutionResult: %s\n', part.codeExecutionResult.output);
        }
        else if (part.inlineData && typeof part.inlineData.data === 'string') {
          combinedData += atob(part.inlineData.data);
        }
      }
    }
  }

  // Convert the base64-encoded string of bytes into a Buffer.
  const buffer = Buffer.from(combinedData, 'binary');

  // The buffer contains raw bytes. For 16-bit audio, we need to interpret every 2 bytes as a single sample.
  const intArray = new Int16Array(buffer.buffer, buffer.byteOffset, buffer.byteLength / Int16Array.BYTES_PER_ELEMENT);

  const wf = new WaveFile();
  // The API returns 16-bit PCM audio at a 24kHz sample rate.
  wf.fromScratch(1, 24000, '16', intArray);
  fs.writeFileSync('audio.wav', wf.toBuffer());

  session.close();
}

async function main() {
  await live().catch((e) => console.error('got error', e));
}

main();
```
Combining multiple tools
You can combine multiple tools within the Live API, increasing your application's capabilities even more:
Python
prompt = """
Hey, I need you to do two things for me.
1. Use Google Search to look up information about the largest earthquake in California the week of Dec 5 2024?
2. Then turn on the lights
Thanks!
"""
tools = [
{"google_search": {}},
{"function_declarations": [turn_on_the_lights, turn_off_the_lights]},
]
config = {"response_modalities": ["AUDIO"], "tools": tools}
# ... remaining model call
JavaScript
```javascript
const prompt = `Hey, I need you to do two things for me.

1. Use Google Search to look up information about the largest earthquake in California the week of Dec 5 2024?
2. Then turn on the lights

Thanks!
`

const tools = [
  { googleSearch: {} },
  { functionDeclarations: [turn_on_the_lights, turn_off_the_lights] }
]

const config = {
  responseModalities: [Modality.AUDIO],
  tools: tools
}

// ... remaining model call
```
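The remaining model call elided above is the same connect-and-receive loop as in the earlier examples. As a rough Python sketch of how it might look with both tools enabled (reusing client, model, types, prompt, config, and the light functions defined earlier, and omitting audio handling):

```python
# Rough sketch of the elided model call, adapted from the earlier
# examples; Search output and function calls arrive in the same loop.
async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        await session.send_client_content(turns={"parts": [{"text": prompt}]})

        async for response in session.receive():
            if response.data is not None:
                # Audio bytes; write them to a .wav file as shown earlier.
                pass
            elif response.tool_call:
                # Answer each function call, as in the function calling example.
                function_responses = [
                    types.FunctionResponse(id=fc.id, name=fc.name, response={"result": "ok"})
                    for fc in response.tool_call.function_calls
                ]
                await session.send_tool_response(function_responses=function_responses)

asyncio.run(main())
```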
What's next
- Check out more examples of using tools with the Live API in the Tool use cookbook.
- Get the full story on features and configurations from the Live API Capabilities guide.