Batch Mode

The Gemini API's batch mode is designed to process large volumes of requests asynchronously at 50% of the standard cost. The target turnaround time is 24 hours, but in most cases jobs complete much faster.

Use batch mode for large-scale, non-urgent tasks that don't require an immediate response, such as data preprocessing or running evaluations.

Creating a batch job

There are two ways to submit requests in batch mode:

  • Inline requests: A list of GenerateContentRequest objects included directly in the batch creation request. This is suitable for smaller batches that keep the total request size under 20 MB. The output returned by the model is a list of inlineResponse objects.
  • Input file: A JSON Lines (JSONL) file in which each line contains a complete GenerateContentRequest object. This method is recommended for larger requests. The output returned by the model is a JSONL file in which each line is either a GenerateContentResponse or a status object.
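As a hedged sketch (the helper and the size check are illustrative, not part of the SDK), you could decide between the two methods by measuring the serialized size of your requests against the 20 MB inline limit described above:

```python
import json

# Illustrative helper, not part of the SDK: choose inline vs. file input
# based on the serialized size of the requests, using the 20 MB inline
# limit described above.
INLINE_LIMIT_BYTES = 20 * 1024 * 1024

def should_use_inline(requests: list[dict]) -> bool:
    """Return True if the requests are small enough to submit inline."""
    total = sum(len(json.dumps(req).encode("utf-8")) for req in requests)
    return total < INLINE_LIMIT_BYTES
```

Requests over the threshold would instead be written to a JSONL input file, as described below.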

Inline requests

If you have a small number of requests, you can embed the GenerateContentRequest objects directly in your BatchGenerateContentRequest. The following example calls the BatchGenerateContent method with inline requests:

Python


from google import genai
from google.genai import types

client = genai.Client()

# A list of dictionaries, where each is a GenerateContentRequest
inline_requests = [
    {
        'contents': [{
            'parts': [{'text': 'Tell me a one-sentence joke.'}],
            'role': 'user'
        }]
    },
    {
        'contents': [{
            'parts': [{'text': 'Why is the sky blue?'}],
            'role': 'user'
        }]
    }
]

inline_batch_job = client.batches.create(
    model="models/gemini-2.5-flash",
    src=inline_requests,
    config={
        'display_name': "inlined-requests-job-1",
    },
)

print(f"Created batch job: {inline_batch_job.name}")

JavaScript


import {GoogleGenAI} from '@google/genai';
const GEMINI_API_KEY = process.env.GEMINI_API_KEY;

const ai = new GoogleGenAI({apiKey: GEMINI_API_KEY});

const inlinedRequests = [
    {
        contents: [{
            parts: [{text: 'Tell me a one-sentence joke.'}],
            role: 'user'
        }]
    },
    {
        contents: [{
            parts: [{text: 'Why is the sky blue?'}],
            role: 'user'
        }]
    }
]

const response = await ai.batches.create({
    model: 'gemini-2.5-flash',
    src: inlinedRequests,
    config: {
    displayName: 'inlined-requests-job-1',
    }
});

console.log(response);

REST

curl https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:batchGenerateContent \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-X POST \
-H "Content-Type:application/json" \
-d '{
    "batch": {
        "display_name": "my-batch-requests",
        "input_config": {
            "requests": {
                "requests": [
                    {
                        "request": {"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}]},
                        "metadata": {
                            "key": "request-1"
                        }
                    },
                    {
                        "request": {"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}]},
                        "metadata": {
                            "key": "request-2"
                        }
                    }
                ]
            }
        }
    }
}'

Input file

To handle large volumes of requests, prepare a JSON Lines (JSONL) file. Each line in this file must be a JSON object containing a user-defined key and a request object, where the request is a valid GenerateContentRequest object. The user-defined key is used in the response to indicate which output is the result of which request. For example, the response to a request defined with the key request-1 will be annotated with that same key name.

This file is uploaded using the File API. The maximum allowed file size for an input file is 2 GB.

The following is an example of a JSONL file. You can save it in a file named my-batch-requests.jsonl:

{"key": "request-1", "request": {"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}], "generation_config": {"temperature": 0.7}}}
{"key": "request-2", "request": {"contents": [{"parts": [{"text": "What are the main ingredients in a Margherita pizza?"}]}]}}
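If you're working with multimodal input, a request line can also reference a previously uploaded file by its File API URI. The following is a hedged sketch; the URI files/abc123 is a hypothetical placeholder:

```python
import json

# Hypothetical example: a JSONL request line referencing a previously
# uploaded image via the File API. "files/abc123" is a placeholder URI.
multimodal_request = {
    "key": "request-3",
    "request": {
        "contents": [{
            "parts": [
                {"file_data": {"file_uri": "files/abc123", "mime_type": "image/jpeg"}},
                {"text": "Describe this image."}
            ],
            "role": "user"
        }]
    }
}

# Append the request as one line of the JSONL input file
with open("my-batch-requests.jsonl", "a") as f:
    f.write(json.dumps(multimodal_request) + "\n")
```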

As with inline requests, you can specify other parameters in each request JSON, such as system instructions, tools, or other configurations.

You can upload this file using the File API, as shown in the following example. If you're working with multimodal input, you can reference other uploaded files within your JSONL file.

Python


import json

from google import genai
from google.genai import types

client = genai.Client()

# Create a sample JSONL file
with open("my-batch-requests.jsonl", "w") as f:
    requests = [
        {"key": "request-1", "request": {"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}]}},
        {"key": "request-2", "request": {"contents": [{"parts": [{"text": "What are the main ingredients in a Margherita pizza?"}]}]}}
    ]
    for req in requests:
        f.write(json.dumps(req) + "\n")

# Upload the file to the File API
uploaded_file = client.files.upload(
    file='my-batch-requests.jsonl',
    config=types.UploadFileConfig(display_name='my-batch-requests', mime_type='jsonl')
)

print(f"Uploaded file: {uploaded_file.name}")

JavaScript


import {GoogleGenAI} from '@google/genai';
import * as fs from "fs";
import * as path from "path";
import { fileURLToPath } from 'url';

const GEMINI_API_KEY = process.env.GEMINI_API_KEY;
const ai = new GoogleGenAI({apiKey: GEMINI_API_KEY});
const fileName = "my-batch-requests.jsonl";

// Define the requests
const requests = [
    { "key": "request-1", "request": { "contents": [{ "parts": [{ "text": "Describe the process of photosynthesis." }] }] } },
    { "key": "request-2", "request": { "contents": [{ "parts": [{ "text": "What are the main ingredients in a Margherita pizza?" }] }] } }
];

// Construct the full path to file
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename); 
const filePath = path.join(__dirname, fileName); // __dirname is the directory of the current script

// Write the requests to a JSONL file, one JSON object per line.
// fs.promises.writeFile returns a promise, so we can await completion
// before uploading the file.
async function writeBatchRequestsToFile(requests, filePath) {
    const jsonlContent = requests.map((req) => JSON.stringify(req)).join('\n') + '\n';
    await fs.promises.writeFile(filePath, jsonlContent);
    console.log(`Successfully wrote batch requests to ${filePath}`);
}

// Write the requests, then upload the file once the write has finished
await writeBatchRequestsToFile(requests, filePath);
// Upload the file to the File API
const uploadedFile = await ai.files.upload({file: filePath, config: {
    mimeType: 'jsonl',
}});
console.log(uploadedFile.name);

REST

tmp_batch_input_file=batch_input.tmp
echo -e '{"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}], "generationConfig": {"temperature": 0.7}}\n{"contents": [{"parts": [{"text": "What are the main ingredients in a Margherita pizza?"}]}]}' > batch_input.tmp
MIME_TYPE=$(file -b --mime-type "${tmp_batch_input_file}")
NUM_BYTES=$(wc -c < "${tmp_batch_input_file}")
DISPLAY_NAME=BatchInput

tmp_header_file=upload-header.tmp

# Initial resumable request defining metadata.
# The upload URL is in the response headers; dump them to a file.
curl "https://generativelanguage.googleapis.com/upload/v1beta/files" \
-D "${tmp_header_file}" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "X-Goog-Upload-Protocol: resumable" \
-H "X-Goog-Upload-Command: start" \
-H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
-H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
-H "Content-Type: application/jsonl" \
-d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null

upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"

# Upload the actual bytes.
curl "${upload_url}" \
-H "Content-Length: ${NUM_BYTES}" \
-H "X-Goog-Upload-Offset: 0" \
-H "X-Goog-Upload-Command: upload, finalize" \
--data-binary "@${tmp_batch_input_file}" 2> /dev/null > file_info.json

file_uri=$(jq ".file.uri" file_info.json)

The following example calls the BatchGenerateContent method with the input file uploaded using the File API:

Python


# Assumes `uploaded_file` is the file object from the previous step
file_batch_job = client.batches.create(
    model="gemini-2.5-flash",
    src=uploaded_file.name,
    config={
        'display_name': "file-upload-job-1",
    },
)

print(f"Created batch job: {file_batch_job.name}")

JavaScript

// Assumes `uploadedFile` is the file object from the previous step
const fileBatchJob = await ai.batches.create({
    model: 'gemini-2.5-flash',
    src: uploadedFile.name,
    config: {
    displayName: 'file-upload-job-1',
    }
});

console.log(fileBatchJob);

REST

BATCH_INPUT_FILE='files/123456' # File ID
curl https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:batchGenerateContent \
-X POST \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type:application/json" \
-d "{
    'batch': {
        'display_name': 'my-batch-requests',
        'input_config': {
            'requests': {
                'file_name': '${BATCH_INPUT_FILE}'
            }
        }
    }
}"

When you create a batch job, a job name is returned. Use this name to monitor the job's status and to retrieve the results once the job completes.

The following is an example output that contains a job name:


Created batch job from file: batches/123456789

Request configuration

You can include any request configuration that you would use in a standard non-batch request. For example, you can specify the temperature or system instructions, or even pass in other modalities. The following example shows inline requests where one of them includes a system instruction:

Python

inline_requests_list = [
    {'contents': [{'parts': [{'text': 'Write a short poem about a cloud.'}]}]},
    {'contents': [{'parts': [{'text': 'Write a short poem about a cat.'}]}], 'system_instruction': {'parts': [{'text': 'You are a cat. Your name is Neko.'}]}}
]

JavaScript

const inlineRequestsList = [
    {contents: [{parts: [{text: 'Write a short poem about a cloud.'}]}]},
    {contents: [{parts: [{text: 'Write a short poem about a cat.'}]}], systemInstruction: {parts: [{text: 'You are a cat. Your name is Neko.'}]}}
]

Similarly, you can specify which tools to use for a request. The following example shows a request that enables the Google Search tool:

Python

inline_requests_list = [
    {'contents': [{'parts': [{'text': 'Who won the euro 1998?'}]}]},
    {'contents': [{'parts': [{'text': 'Who won the euro 2025?'}]}], 'tools': [{'google_search': {}}]}
]

JavaScript

const inlineRequestsList = [
    {contents: [{parts: [{text: 'Who won the euro 1998?'}]}]},
    {contents: [{parts: [{text: 'Who won the euro 2025?'}]}], tools: [{googleSearch: {}}]}
]

You can also specify structured output. The following example shows how to specify it for a batch request.

Python

import time

from google import genai
from pydantic import BaseModel

class Recipe(BaseModel):
    recipe_name: str
    ingredients: list[str]

client = genai.Client()

# A list of dictionaries, where each is a GenerateContentRequest
inline_requests = [
    {
        'contents': [{
            'parts': [{'text': 'List a few popular cookie recipes, and include the amounts of ingredients.'}],
            'role': 'user'
        }],
        'config': {
            'response_mime_type': 'application/json',
            'response_schema': list[Recipe]
        }
    },
    {
        'contents': [{
            'parts': [{'text': 'List a few popular gluten free cookie recipes, and include the amounts of ingredients.'}],
            'role': 'user'
        }],
        'config': {
            'response_mime_type': 'application/json',
            'response_schema': list[Recipe]
        }
    }
]

inline_batch_job = client.batches.create(
    model="models/gemini-2.5-flash",
    src=inline_requests,
    config={
        'display_name': "structured-output-job-1"
    },
)

# wait for the job to finish
job_name = inline_batch_job.name
print(f"Polling status for job: {job_name}")

while True:
    batch_job_inline = client.batches.get(name=job_name)
    if batch_job_inline.state.name in ('JOB_STATE_SUCCEEDED', 'JOB_STATE_FAILED', 'JOB_STATE_CANCELLED', 'JOB_STATE_EXPIRED'):
        break
    print(f"Job not finished. Current state: {batch_job_inline.state.name}. Waiting 30 seconds...")
    time.sleep(30)

print(f"Job finished with state: {batch_job_inline.state.name}")

# print the response
for i, inline_response in enumerate(batch_job_inline.dest.inlined_responses):
    print(f"\n--- Response {i+1} ---")

    # Check for a successful response
    if inline_response.response:
        # The .text property is a shortcut to the generated text.
        print(inline_response.response.text)

JavaScript


import {GoogleGenAI, Type} from '@google/genai';
const GEMINI_API_KEY = process.env.GEMINI_API_KEY;

const ai = new GoogleGenAI({apiKey: GEMINI_API_KEY});

const inlinedRequests = [
    {
        contents: [{
            parts: [{text: 'List a few popular cookie recipes, and include the amounts of ingredients.'}],
            role: 'user'
        }],
        config: {
            responseMimeType: 'application/json',
            responseSchema: {
            type: Type.ARRAY,
            items: {
                type: Type.OBJECT,
                properties: {
                'recipeName': {
                    type: Type.STRING,
                    description: 'Name of the recipe',
                    nullable: false,
                },
                'ingredients': {
                    type: Type.ARRAY,
                    items: {
                    type: Type.STRING,
                    description: 'Ingredients of the recipe',
                    nullable: false,
                    },
                },
                },
                required: ['recipeName'],
            },
            },
        }
    },
    {
        contents: [{
            parts: [{text: 'List a few popular gluten free cookie recipes, and include the amounts of ingredients.'}],
            role: 'user'
        }],
        config: {
            responseMimeType: 'application/json',
            responseSchema: {
            type: Type.ARRAY,
            items: {
                type: Type.OBJECT,
                properties: {
                'recipeName': {
                    type: Type.STRING,
                    description: 'Name of the recipe',
                    nullable: false,
                },
                'ingredients': {
                    type: Type.ARRAY,
                    items: {
                    type: Type.STRING,
                    description: 'Ingredients of the recipe',
                    nullable: false,
                    },
                },
                },
                required: ['recipeName'],
            },
            },
        }
    }
]

const inlinedBatchJob = await ai.batches.create({
    model: 'gemini-2.5-flash',
    src: inlinedRequests,
    config: {
    displayName: 'inlined-requests-job-1',
    }
});

Monitoring job status

Use the job name you obtained when creating the batch job to poll its status. The state field of the batch job indicates its current status. A batch job can be in one of the following states:

  • JOB_STATE_PENDING: The job has been created and is waiting to be processed by the service.
  • JOB_STATE_RUNNING: The job is in progress.
  • JOB_STATE_SUCCEEDED: The job completed successfully. You can now retrieve the results.
  • JOB_STATE_FAILED: The job failed. Check the error details for more information.
  • JOB_STATE_CANCELLED: The job was cancelled by the user.
  • JOB_STATE_EXPIRED: The job expired because it was running or pending for more than 48 hours. The job will have no results to retrieve. You can try submitting the job again, or splitting the requests into smaller batches.

You can poll the job's status periodically to check whether it has completed.

Python


import time

# Use the name of the job you want to check
# e.g., inline_batch_job.name from the previous step
job_name = "YOUR_BATCH_JOB_NAME"  # (e.g. 'batches/your-batch-id')

completed_states = set([
    'JOB_STATE_SUCCEEDED',
    'JOB_STATE_FAILED',
    'JOB_STATE_CANCELLED',
    'JOB_STATE_EXPIRED',
])

print(f"Polling status for job: {job_name}")
batch_job = client.batches.get(name=job_name) # Initial get
while batch_job.state.name not in completed_states:
  print(f"Current state: {batch_job.state.name}")
  time.sleep(30) # Wait for 30 seconds before polling again
  batch_job = client.batches.get(name=job_name)

print(f"Job finished with state: {batch_job.state.name}")
if batch_job.state.name == 'JOB_STATE_FAILED':
    print(f"Error: {batch_job.error}")

JavaScript

// Use the name of the job you want to check
// e.g., inlinedBatchJob.name from the previous step
let batchJob;
const completedStates = new Set([
    'JOB_STATE_SUCCEEDED',
    'JOB_STATE_FAILED',
    'JOB_STATE_CANCELLED',
    'JOB_STATE_EXPIRED',
]);

try {
batchJob = await ai.batches.get({name: inlinedBatchJob.name});
while (!completedStates.has(batchJob.state.name)) {
    console.log(`Current state: ${batchJob.state.name}`);
    // Wait for 30 seconds before polling again
    await new Promise(resolve => setTimeout(resolve, 30000));
    batchJob = await ai.batches.get({ name: batchJob.name });
}
console.log(`Job finished with state: ${batchJob.state.name}`);
if (batchJob.state.name === 'JOB_STATE_FAILED') {
    // The exact structure of `error` may vary depending on the SDK.
    // This assumes `error` is an object with a `message` property.
    console.error(`Error: ${batchJob.error ? batchJob.error.message : 'Unknown error'}`);
}
} catch (error) {
        console.error(`An error occurred while polling job ${batchJob.name}:`, error);
}

Retrieving results

Once the job status indicates that the batch job has succeeded, the results are available in the response field.

Python

import json

# Use the name of the job you want to check
# e.g., inline_batch_job.name from the previous step
job_name = "YOUR_BATCH_JOB_NAME"
batch_job = client.batches.get(name=job_name)

if batch_job.state.name == 'JOB_STATE_SUCCEEDED':

    # If batch job was created with a file
    if batch_job.dest and batch_job.dest.file_name:
        # Results are in a file
        result_file_name = batch_job.dest.file_name
        print(f"Results are in file: {result_file_name}")

        print("Downloading result file content...")
        file_content = client.files.download(file=result_file_name)
        # Process file_content (bytes) as needed
        print(file_content.decode('utf-8'))
        # Parse the JSONL string into a list of dictionaries
        parsed_responses = [
            json.loads(line) for line in file_content.decode('utf-8').strip().split('\n')
        ]

    # If batch job was created with inline request
    elif batch_job.dest and batch_job.dest.inlined_responses:
        # Results are inline
        print("Results are inline:")
        for i, inline_response in enumerate(batch_job.dest.inlined_responses):
            print(f"Response {i+1}:")
            if inline_response.response:
                # Accessing response, structure may vary.
                try:
                    print(inline_response.response.text)
                except AttributeError:
                    print(inline_response.response) # Fallback
            elif inline_response.error:
                print(f"Error: {inline_response.error}")
    else:
        print("No results found (neither file nor inline).")
else:
    print(f"Job did not succeed. Final state: {batch_job.state.name}")
    if batch_job.error:
        print(f"Error: {batch_job.error}")

JavaScript

// Use the name of the job you want to check
// e.g., inlinedBatchJob.name from the previous step
const jobName = "YOUR_BATCH_JOB_NAME"

let batchJob;
try {
batchJob = await ai.batches.get({ name: jobName });

if (batchJob.state.name === 'JOB_STATE_SUCCEEDED') {
    // If batch job was created with a file destination
    if (batchJob.dest && batchJob.dest.fileName) {
    const resultFileName = batchJob.dest.fileName;
    console.log(`Results are in file: ${resultFileName}`);

    console.log("Downloading result file content...");
    const fileContentBuffer = await ai.files.download({ file: resultFileName });
    // Process fileContentBuffer (Buffer) as needed
    console.log(fileContentBuffer.toString('utf-8'));

    }
    // If batch job was created with inline responses
    else if (batchJob.dest && batchJob.dest.inlinedResponses) {
    console.log("Results are inline:");
    for (let i = 0; i < batchJob.dest.inlinedResponses.length; i++) {
        const inlineResponse = batchJob.dest.inlinedResponses[i];
        console.log(`Response ${i + 1}:`);
        if (inlineResponse.response) {
        // Accessing response, structure may vary.
        if (inlineResponse.response.text !== undefined) {
            console.log(inlineResponse.response.text);
        } else {
            console.log(inlineResponse.response); // Fallback
        }
        } else if (inlineResponse.error) {
        console.error(`Error: ${inlineResponse.error}`);
        }
    }
    } else {
    console.log("No results found (neither file nor inline).");
    }
} else {
    console.log(`Job did not succeed. Final state: ${batchJob.state.name}`);
    if (batchJob.error) {
    console.error(`Error: ${typeof batchJob.error === 'string' ? batchJob.error : batchJob.error.message || JSON.stringify(batchJob.error)}`);
    }
}
} catch (error) {
    console.error(`An error occurred while processing job ${jobName}:`, error);
}

REST

BATCH_NAME="batches/123456" # Your batch job name

curl https://generativelanguage.googleapis.com/v1beta/$BATCH_NAME \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type:application/json" 2> /dev/null > batch_status.json

if jq -r '.done' batch_status.json | grep -q "false"; then
    echo "Batch has not finished processing"
fi

batch_state=$(jq -r '.metadata.state' batch_status.json)
if [[ $batch_state = "JOB_STATE_SUCCEEDED" ]]; then
    if [[ $(jq '.response | has("inlinedResponses")' batch_status.json) = "true" ]]; then
        jq -r '.response.inlinedResponses' batch_status.json
        exit
    fi
    responses_file_name=$(jq -r '.response.responsesFile' batch_status.json)
    curl https://generativelanguage.googleapis.com/download/v1beta/$responses_file_name:download?alt=media \
    -H "x-goog-api-key: $GEMINI_API_KEY" 2> /dev/null
elif [[ $batch_state = "JOB_STATE_FAILED" ]]; then
    jq '.error' batch_status.json
elif [[ $batch_state == "JOB_STATE_CANCELLED" ]]; then
    echo "Batch was cancelled by the user"
elif [[ $batch_state == "JOB_STATE_EXPIRED" ]]; then
    echo "Batch expired after 48 hours"
fi

Cancelling a batch job

You can cancel an in-progress batch job using its name. When a job is cancelled, it stops processing new requests.

Python

# Cancel a batch job
client.batches.cancel(name=batch_job_to_cancel.name)

JavaScript

await ai.batches.cancel({name: batchJobToCancel.name});

REST

BATCH_NAME="batches/123456" # Your batch job name

# Cancel the batch
curl https://generativelanguage.googleapis.com/v1beta/$BATCH_NAME:cancel \
-H "x-goog-api-key: $GEMINI_API_KEY"

# Confirm that the status of the batch after cancellation is JOB_STATE_CANCELLED
curl https://generativelanguage.googleapis.com/v1beta/$BATCH_NAME \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type:application/json" 2> /dev/null | jq -r '.metadata.state'

Deleting a batch job

You can delete an existing batch job using its name. When a job is deleted, it stops processing new requests and is removed from the list of batch jobs.

Python

# Delete a batch job
client.batches.delete(name=batch_job_to_delete.name)

JavaScript

await ai.batches.delete({name: batchJobToDelete.name});

REST

BATCH_NAME="batches/123456" # Your batch job name

# Delete the batch job
curl https://generativelanguage.googleapis.com/v1beta/$BATCH_NAME:delete \
-H "x-goog-api-key: $GEMINI_API_KEY"

Technical details

  • Supported models: Batch mode supports a range of Gemini models. Refer to the models page to see whether a given model supports batch mode. The supported modalities for batch mode are the same as those supported by the interactive (non-batch) API.
  • Pricing: Batch mode usage is priced at 50% of the standard interactive API cost for the equivalent model. Refer to the pricing page for details, and to the rate limits page for details on the rate limits that apply to this feature.
  • Service level objective (SLO): Batch jobs are designed to complete within a 24-hour turnaround time. Many jobs may complete much faster, depending on their size and the current system load.
  • Caching: Context caching is enabled for batch requests. If a request in your batch results in a cache hit, the cached tokens are priced the same as for non-batch-mode traffic.

Best practices

  • Use input files for large requests: For a large number of requests, always use the file input method for better manageability and to avoid hitting the request size limit of the BatchGenerateContent call itself. Note that there's a 2 GB file size limit per input file.
  • Error handling: Check batchStats for failedRequestCount after a job completes. If using file output, parse each line to check whether it's a GenerateContentResponse or a status object indicating an error for that specific request. Refer to the troubleshooting guide for a complete list of error codes.
  • Submit jobs once: The creation of a batch job is not idempotent. If you send the same creation request twice, two separate batch jobs will be created.
  • Break up very large batches: While the target turnaround time is 24 hours, actual processing time can vary based on system load and job size. For large jobs, consider breaking them into smaller batches if intermediate results are needed sooner.
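The error-handling advice above can be sketched as follows. This helper is illustrative (not part of the SDK) and assumes each result line carries the user-defined key plus either a response or an error field, consistent with the output format described earlier:

```python
import json

# Illustrative sketch: split a downloaded JSONL results file into
# successes and failures, keyed by the user-defined request key.
def split_results(jsonl_text: str):
    successes, failures = {}, {}
    for line in jsonl_text.strip().splitlines():
        record = json.loads(line)
        key = record.get("key")
        if "error" in record:
            failures[key] = record["error"]
        else:
            successes[key] = record.get("response")
    return successes, failures

# Example with two hypothetical result lines
ok, bad = split_results(
    '{"key": "request-1", "response": {"candidates": []}}\n'
    '{"key": "request-2", "error": {"code": 400, "message": "Invalid request"}}\n'
)
```

Failures collected this way can then be resubmitted as a new, smaller batch job.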

Next steps

Check out the batch mode notebook for more examples.