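File input methods

Note: This version of the page describes the new Interactions API, which is currently in beta.
For stable production deployments, we recommend that you continue to use the generateContent API. You can switch between versions using the toggle on this page.

This guide describes the different ways to include media files such as images, audio, video, and documents when making requests to the Gemini API. The new methods are supported on all Gemini API endpoints, including Batch, Interactions, and the Live API. Which method fits best depends on the size of the file, where the data is stored, and how often you use the file.

The simplest way to include a file as input is to read a local file and include it in the prompt. The following example shows how to read a local PDF file. With this method, PDFs are limited to 50 MB. For the full list of file input types and limits, see the input method comparison table.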

Python

from google import genai
import pathlib

client = genai.Client()

filepath = pathlib.Path('my_local_file.pdf')

prompt = "Summarize this document"
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {"type": "text", "text": prompt},
        {"type": "document", "data": filepath.read_bytes(), "mime_type": "application/pdf"}
    ]
)
# Print the model's text response
for step in interaction.steps:
    if step.type == "model_output":
        for content_block in step.content:
            if content_block.type == "text":
                print(content_block.text)

JavaScript

import { GoogleGenAI } from "@google/genai";
import * as fs from 'node:fs';

const client = new GoogleGenAI({});
const prompt = "Summarize this document";

async function main() {
    const filePath = 'my_local_file.pdf';

    const interaction = await client.interactions.create({
        model: "gemini-3-flash-preview",
        input: [
            { type: "text", text: prompt },
            {
                type: "document",
                data: fs.readFileSync(filePath).toString("base64"),
                mimeType: "application/pdf"
            }
        ]
    });
    const modelStep = interaction.steps.find(s => s.type === 'model_output');
    if (modelStep) {
      for (const contentBlock of modelStep.content) {
        if (contentBlock.type === 'text') console.log(contentBlock.text);
      }
    }
}

main();

REST

# Encode the local file to base64
B64_CONTENT=$(base64 -w 0 my_local_file.pdf)

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gemini-3-flash-preview",
    "input": [
      {"type": "text", "text": "Summarize this document"},
      {
        "type": "document",
        "data": "'${B64_CONTENT}'",
        "mimeType": "application/pdf"
      }
    ]
  }'

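Input method comparison

The following table compares each input method, its file limits, and its best use cases. File size limits may vary with the file type and with the model or tokenizer used to process the file.

Method	Best for	Max file size	Persistence
Inline data	Quick tests, small files, real-time applications.	100 MB per request or payload (50 MB for PDFs)	None (sent with every request)
File API upload	Large files, files used multiple times.	2 GB per file, up to 20 GB per project	48 hours
File API GCS URI registration	Large files already in Google Cloud Storage, files used multiple times.	2 GB per file, no overall storage limit	None (fetched per request). A single registration gives access for up to 30 days.
External URLs	Public data or data in cloud buckets (AWS, Azure, GCS), used without re-uploading.	100 MB per request or payload	None (fetched per request)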
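
The comparison above can be read as a small decision procedure. The following is a minimal sketch of that reading, not an official algorithm: the function name, its flags, and the returned labels are hypothetical, and it assumes decimal megabytes (1 MB = 1,000,000 bytes), which the table does not specify.

```python
MB = 1_000_000  # assumption: decimal megabytes, the conservative reading
GB = 1_000 * MB

INLINE_LIMIT = 100 * MB        # inline data, per request or payload
INLINE_PDF_LIMIT = 50 * MB     # inline PDFs are capped lower
EXTERNAL_URL_LIMIT = 100 * MB  # external URLs, per request or payload
FILE_API_LIMIT = 2 * GB        # File API upload or GCS registration, per file


def choose_input_method(size_bytes, mime_type, *, in_gcs=False,
                        has_url=False, reused=False):
    """Map a file's properties to one row of the comparison table."""
    if size_bytes > FILE_API_LIMIT:
        raise ValueError("file exceeds the 2 GB per-file limit")
    if in_gcs:
        # Already in Cloud Storage: register it instead of re-uploading.
        return "gcs_registration"
    if has_url and size_bytes <= EXTERNAL_URL_LIMIT:
        # Hosted elsewhere and small enough to be fetched per request.
        return "external_url"
    inline_cap = INLINE_PDF_LIMIT if mime_type == "application/pdf" else INLINE_LIMIT
    if not reused and size_bytes <= inline_cap:
        return "inline"
    # Large or reused local files go through a File API upload.
    return "file_upload"


print(choose_input_method(60 * MB, "application/pdf"))  # file_upload
```

A 60 MB PDF is above the inline PDF cap, so the sketch falls through to a File API upload, while a 60 MB video used once would still qualify for inline data.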

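Inline data

For smaller files (under 100 MB, or 50 MB for PDFs), you can pass the data directly in the request payload. This is the simplest method and works well for quick tests and for applications that handle real-time, transient data. You can provide the data as a Base64-encoded string or by reading a local file directly.

For an example of reading from a local file, see the example at the beginning of this page.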
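
Because inline data has a size cap, it can help to check a file before building the request. The sketch below is one way to do that with the standard library only. The helper name `inline_part` is hypothetical, the limits assume decimal megabytes, and the check is applied to the raw file size; whether the limit is measured before or after Base64 encoding is not stated here, so treat the check as approximate.

```python
import base64
import mimetypes
import pathlib

INLINE_LIMIT = 100 * 1_000_000      # assumption: decimal megabytes
INLINE_PDF_LIMIT = 50 * 1_000_000   # PDFs have a lower inline cap


def inline_part(path, part_type="document"):
    """Build an inline input part from a local file, or raise if it is too large."""
    file_path = pathlib.Path(path)
    # Guess the MIME type from the file extension.
    mime_type, _ = mimetypes.guess_type(file_path.name)
    if mime_type is None:
        raise ValueError(f"cannot guess a MIME type for {file_path.name}")
    limit = INLINE_PDF_LIMIT if mime_type == "application/pdf" else INLINE_LIMIT
    size = file_path.stat().st_size
    if size > limit:
        raise ValueError(f"{file_path.name} is {size} bytes, above the inline limit")
    # Base64-encode the bytes, as the JavaScript and REST samples do.
    data = base64.b64encode(file_path.read_bytes()).decode("ascii")
    return {"type": part_type, "data": data, "mime_type": mime_type}
```

The returned dictionary has the same keys as the inline parts in the Python samples on this page, so it could be placed in the `input` list next to a text part. If the helper raises, a File API upload is the fallback.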

Fetch from a URL

You can also fetch a file from a URL, convert it to bytes, and include it in the input.

Python

from google import genai
import httpx

client = genai.Client()

doc_url = "https://discovery.ucl.ac.uk/id/eprint/10089234/1/343019_3_art_0_py4t4l_convrt.pdf"
doc_data = httpx.get(doc_url).content

prompt = "Summarize this document"

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {"type": "document", "data": doc_data, "mime_type": "application/pdf"},
        {"type": "text", "text": prompt}
    ]
)
# Print the model's text response
for step in interaction.steps:
    if step.type == "model_output":
        for content_block in step.content:
            if content_block.type == "text":
                print(content_block.text)

JavaScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});
const docUrl = 'https://discovery.ucl.ac.uk/id/eprint/10089234/1/343019_3_art_0_py4t4l_convrt.pdf';
const prompt = "Summarize this document";

async function main() {
    const pdfResp = await fetch(docUrl)
      .then((response) => response.arrayBuffer());

    const interaction = await client.interactions.create({
        model: "gemini-3-flash-preview",
        input: [
            { type: "text", text: prompt },
            {
                type: "document",
                data: Buffer.from(pdfResp).toString("base64"),
                mimeType: "application/pdf"
            }
        ]
    });
    const modelStep = interaction.steps.find(s => s.type === 'model_output');
    if (modelStep) {
      for (const contentBlock of modelStep.content) {
        if (contentBlock.type === 'text') console.log(contentBlock.text);
      }
    }
}

main();

REST

DOC_URL="https://discovery.ucl.ac.uk/id/eprint/10089234/1/343019_3_art_0_py4t4l_convrt.pdf"
PROMPT="Summarize this document"
DISPLAY_NAME="base64_pdf"

# Download the PDF
wget -O "${DISPLAY_NAME}.pdf" "${DOC_URL}"

# Check for FreeBSD base64 and set flags accordingly
if [[ "$(base64 --version 2>&1)" = *"FreeBSD"* ]]; then
  B64FLAGS="--input"
else
  B64FLAGS="-w0"
fi

# Base64 encode the PDF
ENCODED_PDF=$(base64 $B64FLAGS "${DISPLAY_NAME}.pdf")

# Generate content using interactions
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
      "model": "gemini-3-flash-preview",
      "input": [
        {"type": "document", "data": "'"$ENCODED_PDF"'", "mimeType": "application/pdf"},
        {"type": "text", "text": "'"$PROMPT"'"}
      ]
    }' 2> /dev/null > response.json

cat response.json
echo

jq ".steps[] | select(.type == \"model_output\") | .content[] | select(.type == \"text\") | .text" response.json
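
The jq filter above selects the text blocks of every model_output step. A Python equivalent for a decoded response might look like the following sketch. The field names come from that filter; the sample response is invented for illustration and is not real API output.

```python
def extract_text(response):
    """Collect the text of every text block in model_output steps."""
    texts = []
    for step in response.get("steps", []):
        if step.get("type") != "model_output":
            continue
        for block in step.get("content", []):
            if block.get("type") == "text":
                texts.append(block["text"])
    return texts


# Hypothetical response, shaped only after the fields the jq filter reads.
sample = {
    "steps": [
        {"type": "user_input", "content": [{"type": "text", "text": "Summarize"}]},
        {
            "type": "model_output",
            "content": [
                {"type": "text", "text": "The document discusses file inputs."}
            ],
        },
    ]
}

print("\n".join(extract_text(sample)))
```

To use it with the file written by the curl command, load `response.json` with `json.load` and pass the resulting dictionary to `extract_text`.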

Gemini File API

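The File API is designed for large files (up to 2 GB) and for files that you use in multiple requests.

Standard file upload

Upload a local file to the Gemini API. Files uploaded this way are stored temporarily (for 48 hours) and processed for efficient retrieval by the model.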

Python

from google import genai

client = genai.Client()

# Upload the file
audio_file = client.files.upload(file="path/to/your/sample.mp3")
prompt = "Describe this audio clip"

# Use the uploaded file in an interaction
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {"type": "text", "text": prompt},
        {"type": "audio", "uri": audio_file.uri, "mime_type": audio_file.mime_type}
    ]
)
# Print the model's text response
for step in interaction.steps:
    if step.type == "model_output":
        for content_block in step.content:
            if content_block.type == "text":
                print(content_block.text)

JavaScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});
const prompt = "Describe this audio clip";

async function main() {
  const filePath = "path/to/your/sample.mp3";

  const myfile = await client.files.upload({
    file: filePath,
    config: { mimeType: "audio/mpeg" },
  });

  const interaction = await client.interactions.create({
    model: "gemini-3-flash-preview",
    input: [
        { type: "text", text: prompt },
        { type: "audio", uri: myfile.uri, mimeType: myfile.mimeType }
    ]
  });
  const modelStep = interaction.steps.find(s => s.type === 'model_output');
  if (modelStep) {
    for (const contentBlock of modelStep.content) {
      if (contentBlock.type === 'text') console.log(contentBlock.text);
    }
  }
}

await main();

REST

AUDIO_PATH="path/to/sample.mp3"
MIME_TYPE=$(file -b --mime-type "${AUDIO_PATH}")
NUM_BYTES=$(wc -c < "${AUDIO_PATH}")
DISPLAY_NAME=AUDIO

tmp_header_file=upload-header.tmp

# Initial resumable request defining metadata.
curl "https://generativelanguage.googleapis.com/upload/v1beta/files" \
  -D "${tmp_header_file}" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "X-Goog-Upload-Protocol: resumable" \
  -H "X-Goog-Upload-Command: start" \
  -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
  -H "Content-Type: application/json" \
  -d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null

upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"

# Upload the actual bytes.
curl "${upload_url}" \
  -H "Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Offset: 0" \
  -H "X-Goog-Upload-Command: upload, finalize" \
  --data-binary "@${AUDIO_PATH}" 2> /dev/null > file_info.json

file_uri=$(jq ".file.uri" file_info.json)

# Now use in an interaction
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
      "model": "gemini-3-flash-preview",
      "input": [
        {"type": "text", "text": "Describe this audio clip"},
        {"type": "audio", "uri": '$file_uri', "mimeType": "'${MIME_TYPE}'"}
      ]
    }'
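
Since uploaded files are kept for 48 hours, a long-running application may want to know when a stored URI is about to stop working. The sketch below tracks the upload time locally, which is an assumption: the file metadata returned by the API may carry its own expiration value, and that would be the better source if available. The function names and the five-minute safety margin are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=48)  # storage period for File API uploads


def expires_at(uploaded_at):
    """Return the time at which an upload made at uploaded_at expires."""
    return uploaded_at + RETENTION


def needs_reupload(uploaded_at, now=None, safety_margin=timedelta(minutes=5)):
    """Return True if the file has expired or will expire within the margin."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expires_at(uploaded_at) - safety_margin


uploaded = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(expires_at(uploaded).isoformat())  # 2025-01-03T12:00:00+00:00
```

Recording the upload time next to the file URI lets the application upload again before a request fails on an expired file.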

Register Google Cloud Storage files

If your data is already in Google Cloud Storage, you don't need to download and re-upload it. You can register it directly with the File API.

Grant the service agent access to each bucket

1. Enable the Gemini API in your Google Cloud project.
2. Create the service agent:
  
  gcloud beta services identity create --service=generativelanguage.googleapis.com --project=<your_project>
3. Grant the Gemini API service agent permission to read your storage buckets.
  
  You must assign the Storage Object Viewer IAM role to this service agent on each specific storage bucket that you intend to use.

This access does not expire by default, but you can change it at any time. You can also grant the permission with the Google Cloud Storage IAM SDK commands.
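
Authenticate your service

Prerequisites
- Enable the API.
- Create a service account or agent with the appropriate permissions.

First, you must authenticate as a service that has the Storage Object Viewer permission. How you do this depends on the environment in which your file management code runs.

Outside Google Cloud

If your code runs outside Google Cloud (for example, on your desktop), download the account credentials from the Google Cloud console as follows:
1. Go to the Service Accounts console.
2. Select the relevant service account.
3. Select the Keys tab, then select Add key, Create new key.
4. Choose JSON as the key type, and note where the file is downloaded on your machine.

For details, see the official Google Cloud documentation on managing service account keys.

Use the following commands to authenticate. They assume that the service account file is in the current directory and is named service-account.json.
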
Python
```
from google.oauth2.service_account import Credentials

GCS_READ_SCOPES = [       
  'https://www.googleapis.com/auth/devstorage.read_only',
  'https://www.googleapis.com/auth/cloud-platform'
]

SERVICE_ACCOUNT_FILE = 'service-account.json'

credentials = Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE,
    scopes=GCS_READ_SCOPES
)
```
JavaScript
```
const { GoogleAuth } = require('google-auth-library');

const GCS_READ_SCOPES = [
  'https://www.googleapis.com/auth/devstorage.read_only',
  'https://www.googleapis.com/auth/cloud-platform'
];

const SERVICE_ACCOUNT_FILE = 'service-account.json';

const auth = new GoogleAuth({
  keyFile: SERVICE_ACCOUNT_FILE,
  scopes: GCS_READ_SCOPES
});
```
CLI
```
gcloud auth application-default login \
  --client-id-file=service-account.json \
  --scopes='https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/devstorage.read_only'
```
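
On Google Cloud

If you are running directly on Google Cloud, for example with Cloud Run functions or a Compute Engine instance, you have implicit credentials, but you need to re-authenticate to grant the appropriate scopes.

Python
This code assumes that the service runs in an environment where Application Default Credentials can be obtained automatically, such as Cloud Run or Compute Engine.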
```
import google.auth

GCS_READ_SCOPES = [       
  'https://www.googleapis.com/auth/devstorage.read_only',
  'https://www.googleapis.com/auth/cloud-platform'
]

credentials, project = google.auth.default(scopes=GCS_READ_SCOPES)
```
JavaScript
This code assumes that the service runs in an environment where Application Default Credentials can be obtained automatically, such as Cloud Run or Compute Engine.
```
const { GoogleAuth } = require('google-auth-library');

const auth = new GoogleAuth({
  scopes: [
    'https://www.googleapis.com/auth/devstorage.read_only',
    'https://www.googleapis.com/auth/cloud-platform'
  ]
});
```
CLI
This is an interactive command. For services such as Compute Engine, you can attach scopes to the running service at the configuration level. For an example, see the user-managed service documentation.
```
gcloud auth application-default login \
--scopes="https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/devstorage.read_only"
```

File registration (Files API)

Use the Files API to register files and produce a Files API path that can be used directly in the Gemini API.

Python

from google import genai

# Note that you must provide an API key in the GEMINI_API_KEY
# environment variable, but it is unused for the registration endpoint.
client = genai.Client(credentials=credentials)

registered_gcs_files = client.files.register_files(
    uris=["gs://my_bucket/some_object.pdf", "gs://bucket2/object2.txt"]
)
prompt = "Summarize this file."

# call interactions.create for each file
for f in registered_gcs_files.files:
  print(f.name)
  interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
      {"type": "text", "text": prompt},
      {"type": "document", "uri": f.uri, "mime_type": f.mime_type}
    ],
  )
  # Print the model's text response
  for step in interaction.steps:
      if step.type == "model_output":
          for content_block in step.content:
              if content_block.type == "text":
                  print(content_block.text)

JavaScript

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ auth: auth });

async function main() {
    const registeredGcsFiles = await ai.files.registerFiles({
        uris: ["gs://my_bucket/some_object.pdf", "gs://bucket2/object2.txt"]
    });

    const prompt = "Summarize this file.";

    for (const file of registeredGcsFiles.files) {
        console.log(file.name);
        const interaction = await ai.interactions.create({
            model: "gemini-3-flash-preview",
            input: [
                { type: "text", text: prompt },
                { type: "document", uri: file.uri, mimeType: file.mimeType }
            ]
        });

        const modelStep = interaction.steps.find(s => s.type === 'model_output');
        if (modelStep) {
            for (const contentBlock of modelStep.content) {
                if (contentBlock.type === 'text') console.log(contentBlock.text);
            }
        }
    }
}

main();

CLI

access_token=$(gcloud auth application-default print-access-token)
project_id=$(gcloud config get-value project)
curl -X POST https://generativelanguage.googleapis.com/v1beta/files:register \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer ${access_token}" \
    -H "x-goog-user-project: ${project_id}" \
    -d '{"uris": ["gs://bucket/object1", "gs://bucket/object2"]}'
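
The registration request body is a JSON object with a `uris` list, as the curl command shows. Before sending it, a client could check that each entry really is a `gs://bucket/object` URI. The following sketch does that with the standard library; the helper name `register_payload` is hypothetical, and the validation rules are an assumption about what a well-formed URI looks like rather than the service's own checks.

```python
import json
from urllib.parse import urlparse


def register_payload(uris):
    """Validate gs:// URIs and return the JSON body for files:register."""
    checked = []
    for uri in uris:
        parsed = urlparse(uri)
        has_object = bool(parsed.path.strip("/"))
        # Require the gs scheme, a bucket name, and an object path.
        if parsed.scheme != "gs" or not parsed.netloc or not has_object:
            raise ValueError(f"not a gs://bucket/object URI: {uri!r}")
        checked.append(uri)
    return json.dumps({"uris": checked})


print(register_payload(["gs://my_bucket/some_object.pdf"]))
```

Rejecting malformed URIs locally gives a clearer error than a failed registration call, and the returned string can be passed as the request body.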

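External HTTP / signed URLs

You can pass publicly accessible HTTPS URLs or pre-signed URLs directly in your request. The Gemini API fetches the content securely during processing. This is ideal for files up to 100 MB that you don't want to re-upload.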

Python

from google import genai

uri = "https://ontheline.trincoll.edu/images/bookdown/sample-local-pdf.pdf"
prompt = "Summarize this file"

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {"type": "document", "uri": uri, "mime_type": "application/pdf"},
        {"type": "text", "text": prompt}
    ]
)
# Print the model's text response
for step in interaction.steps:
    if step.type == "model_output":
        for content_block in step.content:
            if content_block.type == "text":
                print(content_block.text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const uri = "https://ontheline.trincoll.edu/images/bookdown/sample-local-pdf.pdf";

async function main() {
  const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: [
      { type: "document", uri: uri, mimeType: "application/pdf" },
      { type: "text", text: "summarize this file" }
    ]
  });

  const modelStep = interaction.steps.find(s => s.type === 'model_output');
  if (modelStep) {
    for (const contentBlock of modelStep.content) {
      if (contentBlock.type === 'text') console.log(contentBlock.text);
    }
  }
}

main();

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
      -H "x-goog-api-key: $GEMINI_API_KEY" \
      -H 'Content-Type: application/json' \
      -d '{
          "model": "gemini-3-flash-preview",
          "input": [
            {"type": "text", "text": "Summarize this pdf"},
            {
              "type": "document",
              "uri": "https://ontheline.trincoll.edu/images/bookdown/sample-local-pdf.pdf",
              "mimeType": "application/pdf"
            }
          ]
        }'

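Accessibility

Verify that the URLs you provide do not lead to pages that require a login or sit behind a paywall. For private databases, create a signed URL with the correct access permissions and expiration.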
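
A signed URL that has already expired will fail when it is fetched, so it can be worth checking the expiry before sending the request. The sketch below assumes a Google Cloud Storage V4 signed URL, which carries `X-Goog-Date` and `X-Goog-Expires` query parameters; signed URLs from other providers use different parameters and are not handled. The helper name and the example URL are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse


def signed_url_expiry(url):
    """Return the expiry time of a GCS V4 signed URL, or None if not signed."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        raise ValueError("only HTTPS URLs are expected here")
    # Lowercase the parameter names so the lookup ignores case.
    query = {key.lower(): values[0] for key, values in parse_qs(parsed.query).items()}
    if "x-goog-date" not in query or "x-goog-expires" not in query:
        return None
    issued = datetime.strptime(query["x-goog-date"], "%Y%m%dT%H%M%SZ")
    issued = issued.replace(tzinfo=timezone.utc)
    # X-Goog-Expires is a lifetime in seconds counted from X-Goog-Date.
    return issued + timedelta(seconds=int(query["x-goog-expires"]))


example = (
    "https://storage.googleapis.com/example-bucket/report.pdf"
    "?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Date=20250101T120000Z"
    "&X-Goog-Expires=3600&X-Goog-Signature=abc"
)
print(signed_url_expiry(example))  # 2025-01-01 13:00:00+00:00
```

Comparing the result with the current UTC time shows whether the URL will still be valid when the content is fetched.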

Safety checks

The system runs a content moderation check on the URL to confirm that it meets safety and policy standards. If the URL fails this check, the url_retrieval_status is URL_RETRIEVAL_STATUS_UNSAFE.
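
This page does not say where `url_retrieval_status` appears in a response, so the following sketch avoids guessing: it walks a decoded JSON response and collects every value stored under that key. The function names and the sample response are hypothetical and only illustrate the search.

```python
UNSAFE = "URL_RETRIEVAL_STATUS_UNSAFE"


def find_url_retrieval_statuses(node):
    """Recursively collect all url_retrieval_status values in a JSON tree."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "url_retrieval_status" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(find_url_retrieval_statuses(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_url_retrieval_statuses(item))
    return found


def has_unsafe_url(response):
    """Return True if any URL in the response failed the safety check."""
    return UNSAFE in find_url_retrieval_statuses(response)


# Hypothetical response layout, used only to exercise the search.
sample_response = {
    "steps": [
        {"type": "model_output",
         "metadata": [{"url_retrieval_status": "URL_RETRIEVAL_STATUS_UNSAFE"}]}
    ]
}
print(has_unsafe_url(sample_response))  # True
```

Once the actual location of the field is known, a direct lookup is simpler and cheaper than a recursive search.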

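Supported content types

This list of supported file types and limitations is intended as initial guidance and is not comprehensive. The effective set of supported types is subject to change and may vary with the specific model and tokenizer version in use. Using an unsupported type results in an error. In addition, content retrieval for these file types supports only publicly accessible URLs.

Text file types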

text/html
text/css
text/plain
text/xml
text/csv
text/rtf
text/javascript

Application file types

application/json
application/pdf

Image file types

image/bmp
image/jpeg
image/png
image/webp
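
The lists above can be turned into a quick pre-flight check for external URLs. This sketch guesses the MIME type from the URL's file extension, which is only a heuristic: the type that matters is the one the server reports, and the supported set itself may change. The function name is hypothetical.

```python
import mimetypes
from urllib.parse import urlparse

# MIME types listed on this page for external URL retrieval.
SUPPORTED_URL_TYPES = frozenset({
    "text/html", "text/css", "text/plain", "text/xml", "text/csv",
    "text/rtf", "text/javascript",
    "application/json", "application/pdf",
    "image/bmp", "image/jpeg", "image/png", "image/webp",
})


def guess_supported(url):
    """Return (guessed MIME type, whether it is in the listed set)."""
    path = urlparse(url).path
    mime_type, _ = mimetypes.guess_type(path)
    return mime_type, mime_type in SUPPORTED_URL_TYPES


print(guess_supported(
    "https://ontheline.trincoll.edu/images/bookdown/sample-local-pdf.pdf"
))  # ('application/pdf', True)
```

A URL whose type cannot be guessed returns `(None, False)`, which is a cue to check the server's Content-Type header before relying on external URL input.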

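Best practices

Choose the right method: Use inline data for small, transient files. Use the File API for larger or frequently used files. Use external URLs for data that is already hosted online.
Specify MIME types: Always provide the correct MIME type for the file data to ensure proper processing.
Handle errors: Implement error handling in your code to manage potential issues such as network failures, file access problems, and API errors.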
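
As one illustration of the error-handling advice, the sketch below retries a call with exponential backoff. It is a generic pattern, not part of the Gemini SDK: the helper name is hypothetical, and the exception types in `retry_on` are placeholders, because this page does not list which exceptions the client raises.

```python
import time


def with_retries(call, *, attempts=3, base_delay=1.0,
                 retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Run call(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the last error propagate.
                raise
            # Wait 1x, 2x, 4x, ... the base delay between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```

A request could be wrapped as `with_retries(lambda: client.interactions.create(...))`. Errors that retrying cannot fix, such as an unsupported MIME type or a missing file, should stay out of `retry_on` so that they surface immediately.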

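Limitations

File size limits vary by method (see the comparison table) and by file type.
Inline data increases the request payload size.
File API uploads are temporary and expire after 48 hours.
External URL fetching is limited to 100 MB per payload and supports specific content types.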

What's next

Try writing your own multimodal prompts using Google AI Studio.
For information on including files in your prompts, see the Vision, Audio, and Document processing guides.
