Google の最先端モデルである Gemini 2.5 Pro 試験運用版が利用可能になりました。詳細

このページは Cloud Translation API によって翻訳されました。

チュートリアル: Gemini API のスタートガイド

Google AI で表示

Google Colab で実行

GitHub のソースを表示

このクイックスタートでは、Gemini API 用の Python SDK の使用方法を説明します。により、Google の Gemini 大規模言語モデルにアクセスできます。このクイックスタートでは次の方法を学習します。

Gemini を使用するための開発環境と API アクセスを設定します。
テキスト入力からテキストレスポンスを生成する。
マルチモーダル入力（テキストと画像）からテキストレスポンスを生成する。
マルチターンの会話（チャット）に Gemini を使用します。
大規模言語モデルにはエンベディングを使用します。

前提条件

このクイックスタートは Google Colab このノートブックをブラウザで直接実行するため、追加の必要があります。

このクイックスタートをローカルで完了する場合は、次の要件を満たす必要があります。

Python 3.9 以降
ノートブックを実行するための jupyter のインストール。

セットアップ

Python SDK をインストールする

Gemini API 用の Python SDK は、 google-generativeai パッケージ: pip を使用して依存関係をインストールします。

pip install -q -U google-generativeai

パッケージをインポートする

必要なパッケージをインポートします。

import pathlib
import textwrap

import google.generativeai as genai

from IPython.display import display
from IPython.display import Markdown

def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

# Used to securely store your API key
from google.colab import userdata

API キーを設定する

Gemini API を使用するには、まず API キーを取得する必要があります。もしキーがない場合は、Google AI Studio でワンクリックでキーを作成します。

API キーを取得する

Colab で、シークレットマネージャーに鍵を追加する「メンズ」を使用します。をクリックします。 GOOGLE_API_KEY という名前を付けます。

API キーを取得したら、SDK に渡します。作成する方法は次の 2 つです。

鍵を GOOGLE_API_KEY 環境変数に設定します（SDK はそこから自動的に選択されます）。
鍵を genai.configure(api_key=...) に渡す

# Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)

モデルの一覧表示

これで、Gemini API を呼び出す準備が整いました。list_models を使用して、利用可能な Gemini モデル:

gemini-1.5-flash: Google の最速マルチモーダルモデル
gemini-1.5-pro: Google の最も高性能でインテリジェントなマルチモーダルモデル

で確認できます。

for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

テキスト入力からテキストを生成する

テキストのみのプロンプトの場合は、Gemini 1.5 モデルまたは Gemini 1.0 Pro モデルを使用します。

model = genai.GenerativeModel('gemini-1.5-flash')

generate_content メソッドは、次のようなさまざまなユースケースに対応できます。マルチターンチャットとマルチモーダル入力を、基盤となるモデルによってサポートします。使用可能なモデルは入力としてテキストと画像のみをサポートし、テキストは渡します。

最も単純なケースでは、プロンプトの文字列を GenerativeModel.generate_content メソッド:

%%time
response = model.generate_content("What is the meaning of life?")

CPU times: user 110 ms, sys: 12.3 ms, total: 123 ms
Wall time: 8.25 s

シンプルなケースでは、response.text アクセサがあれば十分です。表示する to_markdown 関数を使用します。

to_markdown(response.text)

The query of life's purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.

1. **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one's physical and mental health, and pursuing personal goals and interests.

2. **Meaningful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.

3. **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one's boundaries, confronting personal obstacles, and evolving as a person.

4. **Ethical and Moral Behavior:** Some believe that the goal of life is to act ethically and morally. This might entail adhering to one's moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.

5. **Spiritual Fulfillment:** For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.

6. **Experiencing Life to the Fullest:** Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.

7. **Legacy and Impact:** Others believe that the purpose of life is to leave a lasting legacy and impact on the world. This might entail accomplishing something noteworthy, being remembered for one's contributions, or inspiring and motivating others.

8. **Finding Balance and Harmony:** For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one's values and beliefs.

Ultimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them.

API が結果を返せなかった場合は、次のコマンドを使用します。 GenerateContentResponse.prompt_feedback プロンプトに関する安全性上の懸念からブロックされたかどうかを確認します。

response.prompt_feedback

safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

Gemini は、1 つのプロンプトに対して複数の回答を生成できます。これらの候補のレスポンスは candidates と呼ばれます。これを確認して、最も適切なものをレスポンスとして生成します

次を使用して回答候補を表示する: GenerateContentResponse.candidates:

response.candidates

[
content {
parts {
text: "The query of life\'s purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.\n\n1. **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one\'s physical and mental health, and pursuing personal goals and interests.\n\n2. **Meaningful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.\n\n3. **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one\'s boundaries, confronting personal obstacles, and evolving as a person.\n\n4. **Ethical and Moral Behavior:** Some believe that the goal of life is to act ethically and morally. This might entail adhering to one\'s moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.\n\n5. **Spiritual Fulfillment:** For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.\n\n6. **Experiencing Life to the Fullest:** Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.\n\n7. **Legacy and Impact:** Others believe that the purpose of life is to leave a lasting legacy and impact on the world. This might entail accomplishing something noteworthy, being remembered for one\'s contributions, or inspiring and motivating others.\n\n8. **Finding Balance and Harmony:** For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one\'s values and beliefs.\n\nUltimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them."
}
role: "model"
}
finish_reason: STOP
index: 0
safety_ratings {
category: HARM_CATEGORY_SEXUALLY_EXPLICIT
probability: NEGLIGIBLE
}
safety_ratings {
category: HARM_CATEGORY_HATE_SPEECH
probability: NEGLIGIBLE
}
safety_ratings {
category: HARM_CATEGORY_HARASSMENT
probability: NEGLIGIBLE
}
safety_ratings {
category: HARM_CATEGORY_DANGEROUS_CONTENT
probability: NEGLIGIBLE
}
]

デフォルトでは、モデルは生成全体を完了するとレスポンスを返します。プロセスですまた、レスポンスの生成時にストリーミングすることもできます。モデルは、レスポンスのチャンクが生成されるとすぐに返します。

レスポンスをストリーミングするには、GenerativeModel.generate_content(..., stream=True) を使用します。

%%time
response = model.generate_content("What is the meaning of life?", stream=True)

CPU times: user 102 ms, sys: 25.1 ms, total: 128 ms
Wall time: 7.94 s

for chunk in response:
  print(chunk.text)
  print("_"*80)

The query of life's purpose has perplexed people across centuries, cultures, and
________________________________________________________________________________
continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences
________________________________________________________________________________
.

2. **Meaning
________________________________________________________________________________
ful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.

3. **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, exploring one's interests and abilities, overcoming obstacles, and becoming the best version of oneself.

4. **Connection and Relationships:** For many individuals, the purpose of life is found in their relationships with others. This might entail building
________________________________________________________________________________
strong bonds with family and friends, fostering a sense of community, and contributing to the well-being of those around them.

5. **Spiritual Fulfillment:** For those with religious or spiritual beliefs, the purpose of life may be centered on seeking spiritual fulfillment or enlightenment. This might entail following religious teachings, engaging in spiritual practices, or seeking a deeper understanding of the divine.

6. **Experiencing the Journey:** Some believe that the purpose of life is simply to experience the journey itself, with all its joys and sorrows. This perspective emphasizes embracing the present moment, appreciating life's experiences, and finding meaning in the act of living itself.

7. **Legacy and Impact:** For others, the goal of life is to leave a lasting legacy or impact on the world. This might entail making a significant contribution to a particular field, leaving a positive mark on future generations, or creating something that will be remembered and cherished long after one's lifetime.

Ultimately, the meaning of life is a personal and subjective question, and there is no single, universally accepted answer. It is about discovering what brings you fulfillment, purpose, and meaning in your own life, and living in accordance with those values.
________________________________________________________________________________

ストリーミングの場合、一部のレスポンス属性は反復処理が完了するまで使用できません。すべてのレスポンスチャンクを通過します詳細は以下のとおりです。

response = model.generate_content("What is the meaning of life?", stream=True)

prompt_feedback 属性は以下のように機能します。

response.prompt_feedback

safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

ただし、text などの属性では、次の処理は行われません。

try:
  response.text
except Exception as e:
  print(f'{type(e).__name__}: {e}')

IncompleteIterationError: Please let the response complete iteration before accessing the final accumulated
attributes (or call `response.resolve()`)

画像とテキスト入力からテキストを生成する

Gemini には、マルチモーダル入力を処理できるさまざまなモデルが用意されています（Gemini 1.5 テキストと画像の両方を入力できるようにします。必ずプロンプトの画像の要件をご覧ください。

プロンプト入力にテキストと画像の両方が含まれている場合は、Gemini 1.5 をテキスト出力を生成する GenerativeModel.generate_content メソッド:

画像を追加しましょう。

curl -o image.jpg https://t0.gstatic.com/licensed-image?q=tbn:ANd9GcQ_Kevbk21QBRy-PgB4kQpS79brbmmEG7m3VOTShAn4PecDU5H5UxrJxE3Dw1JiaG17V88QIol19-3TM2wCHw

% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  405k  100  405k    0     0  6982k      0 --:--:-- --:--:-- --:--:-- 7106k

import PIL.Image

img = PIL.Image.open('image.jpg')
img

png

Gemini 1.5 モデルを使用し、generate_content で画像をモデルに渡します。

model = genai.GenerativeModel('gemini-1.5-flash')

response = model.generate_content(img)

to_markdown(response.text)

Chicken Teriyaki Meal Prep Bowls with brown rice, roasted broccoli and bell peppers.

プロンプトでテキストと画像の両方を指定するには、文字列を含むリストを渡します。および画像:

response = model.generate_content(["Write a short, engaging blog post based on this picture. It should include a description of the meal in the photo and talk about my journey meal prepping.", img], stream=True)
response.resolve()

to_markdown(response.text)

Meal prepping is a great way to save time and money, and it can also help you to eat healthier. This meal is a great example of a healthy and delicious meal that can be easily prepped ahead of time.

This meal features brown rice, roasted vegetables, and chicken teriyaki. The brown rice is a whole grain that is high in fiber and nutrients. The roasted vegetables are a great way to get your daily dose of vitamins and minerals. And the chicken teriyaki is a lean protein source that is also packed with flavor.

This meal is easy to prepare ahead of time. Simply cook the brown rice, roast the vegetables, and cook the chicken teriyaki. Then, divide the meal into individual containers and store them in the refrigerator. When you're ready to eat, simply grab a container and heat it up.

This meal is a great option for busy people who are looking for a healthy and delicious way to eat. It's also a great meal for those who are trying to lose weight or maintain a healthy weight.

If you're looking for a healthy and delicious meal that can be easily prepped ahead of time, this meal is a great option. Give it a try today!

チャットの会話

Gemini を使用すると、複数のターンで自由形式の会話を行うことができます。「 ChatSession クラスは、リソースの状態を管理してプロセスを簡素化し、そのため、generate_content とは異なり、モデルの一時テーブルを会話履歴をリストで表示できます。

チャットを初期化します。

model = genai.GenerativeModel('gemini-1.5-flash')
chat = model.start_chat(history=[])
chat

<google.generativeai.generative_models.ChatSession at 0x7b7b68250100>

「 ChatSession.send_message メソッドはGenerateContentResponse GenerativeModel.generate_content。また、メッセージと返信がチャット履歴に追加されます。

response = chat.send_message("In one sentence, explain how a computer works to a young child.")
to_markdown(response.text)

A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!

chat.history

[
  parts {
    text: "In one sentence, explain how a computer works to a young child."
  }
  role: "user",
  parts {
    text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!"
  }
  role: "model"
]

メッセージの送信を続けることで会話を続けることができます。こちらのチャットをストリーミングするための stream=True 引数:

response = chat.send_message("Okay, how about a more detailed explanation to a high schooler?", stream=True)

for chunk in response:
  print(chunk.text)
  print("_"*80)

A computer works by following instructions, called a program, which tells it what to
________________________________________________________________________________
 do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor
________________________________________________________________________________
, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.

To give you a simple analogy, imagine a computer as a
________________________________________________________________________________
 chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).

In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.
________________________________________________________________________________

glm.Content オブジェクトには、それぞれに固有の glm.Part オブジェクトのリストが含まれます。テキスト（文字列）または inline_data（glm.Blob）のいずれか。blob にはバイナリが含まれます。データ、mime_type です。チャットの履歴は glm.Content のリストとして表示されます。オブジェクト ChatSession.history:

for message in chat.history:
  display(to_markdown(f'**{message.role}**: {message.parts[0].text}'))

**user**: In one sentence, explain how a computer works to a young child.

**model**: A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!

**user**: Okay, how about a more detailed explanation to a high schooler?

**model**: A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.

To give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).

In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.

トークンをカウントする

大規模言語モデルにはコンテキストウィンドウがあり、コンテキストの長さは トークン数で測定されます。Gemini API を使用すると、 genai.protos.Content オブジェクトあたりのトークンの数を決定します。最も単純なケースですが、クエリ文字列を GenerativeModel.count_tokens メソッドを次のように指定します。

model.count_tokens("What is the meaning of life?")

total_tokens: 7

同様に、token_count で ChatSession を確認できます。

model.count_tokens(chat.history)

total_tokens: 501

エンベディングを使用する

エンベディング浮動小数点数のリストとして情報を表すために使用される手法配列内に格納されますGemini では、テキスト（単語、文、ブロック）を表現できます。ベクトル化されているので、比較や対比が容易支援します。たとえば、類似した主題または共通の 2 つのテキストが類似したエンべディングが必要です。エンべディングはコサイン類似度などの数学的な比較手法を使用します。P-MAX キャンペーンとエンべディングを使用する理由については、エンべディング（Embeddings）ガイドをご覧ください。

embed_content メソッドを使用してエンベディングを生成します。このメソッドは、次のタスク（task_type）のエンベディング:

タスクの種類	説明
RETRIEVAL_QUERY	指定したテキストが検索 / 取得設定のクエリであることを指定します。
RETRIEVAL_DOCUMENT	指定したテキストが検索/取得設定のドキュメントであることを指定します。このタスクタイプを使用するには、`title` が必要です。
SEMANTIC_SIMILARITY	指定したテキストが意味論的テキスト類似性（STS）で使用されることを指定します。
分類	エンベディングを分類に使用することを指定します。
クラスタリング	エンベディングをクラスタリングに使用することを指定します。

以下では、ドキュメントを取得するための単一の文字列のエンベディングを生成します。

result = genai.embed_content(
    model="models/embedding-001",
    content="What is the meaning of life?",
    task_type="retrieval_document",
    title="Embedding of single string")

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED]')

[-0.003216741, -0.013358698, -0.017649598, -0.0091 ... TRIMMED]

文字列のバッチを処理するには、文字列のリストを content に渡します。

result = genai.embed_content(
    model="models/embedding-001",
    content=[
      'What is the meaning of life?',
      'How much wood would a woodchuck chuck?',
      'How does the brain work?'],
    task_type="retrieval_document",
    title="Embedding of list of strings")

# A list of inputs > A list of vectors output
for v in result['embedding']:
  print(str(v)[:50], '... TRIMMED ...')

[0.0040260437, 0.004124458, -0.014209415, -0.00183 ... TRIMMED ...
[-0.004049845, -0.0075574904, -0.0073463684, -0.03 ... TRIMMED ...
[0.025310587, -0.0080734305, -0.029902633, 0.01160 ... TRIMMED ...

genai.embed_content 関数は文字列または文字列のリストを受け入れますが、実際には genai.protos.Content タイプ（ GenerativeModel.generate_content）。 glm.Content オブジェクトは、API における会話の基本単位です。

genai.protos.Content オブジェクトはマルチモーダルですが、embed_content メソッドはテキストエンベディングのみをサポートします。この設計により、API は 可能性があります。

response.candidates[0].content

parts {
  text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
}
role: "model"

result = genai.embed_content(
    model = 'models/embedding-001',
    content = response.candidates[0].content)

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED ...')

[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED ...

同様に、チャットの履歴には genai.protos.Content オブジェクトのリストが含まれます。これは embed_content 関数に直接渡すことができます。

chat.history

[
  parts {
    text: "In one sentence, explain how a computer works to a young child."
  }
  role: "user",
  parts {
    text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!"
  }
  role: "model",
  parts {
    text: "Okay, how about a more detailed explanation to a high schooler?"
  }
  role: "user",
  parts {
    text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
  }
  role: "model"
]

result = genai.embed_content(
    model = 'models/embedding-001',
    content = chat.history)

# 1 input > 1 vector output
for i,v in enumerate(result['embedding']):
  print(str(v)[:50], '... TRIMMED...')

[-0.014632266, -0.042202696, -0.015757175, 0.01548 ... TRIMMED...
[-0.010979066, -0.024494737, 0.0092659835, 0.00803 ... TRIMMED...
[-0.010055617, -0.07208932, -0.00011750793, -0.023 ... TRIMMED...
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED...

高度なユースケース

以降のセクションでは、高度なユースケースと、 Gemini API 用の Python SDK。

安全性設定

safety_settings 引数を使用すると、モデルでブロックする対象と、プロンプトとレスポンスの両方で許可されるようにしますデフォルトでは、安全性設定によりコンテンツがブロックされます全体にわたって安全でないコンテンツである可能性が中程度または高い定義できます。安全性の詳細設定をご覧ください。

疑問が生まれるプロンプトを入力し、デフォルトの安全性設定でモデルを実行します。その場合、候補は返されません。

response = model.generate_content('[Questionable prompt here]')
response.candidates

[
  content {
    parts {
      text: "I\'m sorry, but this prompt involves a sensitive topic and I\'m not allowed to generate responses that are potentially harmful or inappropriate."
    }
    role: "model"
  }
  finish_reason: STOP
  index: 0
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
  }
]

prompt_feedback には、プロンプトをブロックした安全フィルタが表示されます。

response.prompt_feedback

safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

次に、新しく構成した安全性設定を使用して同じプロンプトをモデルに提供します。レスポンスが返される場合があります。

response = model.generate_content('[Questionable prompt here]',
                                  safety_settings={'HARASSMENT':'block_none'})
response.text

また、プロンプトで各候補が独自の safety_ratings を持つことにも注意してください。個々のレスポンスが安全チェックに不合格だった場合。

メッセージのエンコード

前のセクションでは、プロンプトを簡単に送信できるようにするため、SDK を利用してきました定義できます。このセクションでは、先ほど説明したアクセスの仕組みに関する下位レベルの詳細が SDK がメッセージをエンコードします。

SDK はメッセージを genai.protos.Content オブジェクトに変換しようとします。これには、それぞれ次のいずれかを含む genai.protos.Part オブジェクトのリストが含まれます。

text（文字列）
inline_data（genai.protos.Blob）。ここで、blob にはバイナリの data と mime_type。
データを格納できます。

これらのクラスのいずれかを同等の辞書として渡すこともできます。

したがって、前の例の完全型は次のようになります。

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(
    genai.protos.Content(
        parts = [
            genai.protos.Part(text="Write a short, engaging blog post based on this picture."),
            genai.protos.Part(
                inline_data=genai.protos.Blob(
                    mime_type='image/jpeg',
                    data=pathlib.Path('image.jpg').read_bytes()
                )
            ),
        ],
    ),
    stream=True)

response.resolve()

to_markdown(response.text[:100] + "... [TRIMMED] ...")

Meal prepping is a great way to save time and money, and it can also help you to eat healthier. By ... [TRIMMED] ...

マルチターンの会話

前述の genai.ChatSession クラスは多くのユースケースを処理できますが、なんらかの仮定を立てます。ユースケースがこのチャットに当てはまらない場合 genai.ChatSession は単なるラッパーであることを忘れないでください。約 GenerativeModel.generate_content。単一のリクエストだけでなく、マルチターンの会話を処理できます。

個々のメッセージが genai.protos.Content オブジェクトであるか、互換性がある場合使用する必要があります。辞書として、メッセージは role キーと parts キーが必要です。会話内の role は、次のいずれかになります。プロンプトを提供する user、またはレスポンスを提供する model。

genai.protos.Content オブジェクトのリストを渡すと、マルチターンチャット:

model = genai.GenerativeModel('gemini-1.5-flash')

messages = [
    {'role':'user',
     'parts': ["Briefly explain how a computer works to a young child."]}
]
response = model.generate_content(messages)

to_markdown(response.text)

Imagine a computer as a really smart friend who can help you with many things. Just like you have a brain to think and learn, a computer has a brain too, called a processor. It's like the boss of the computer, telling it what to do.

Inside the computer, there's a special place called memory, which is like a big storage box. It remembers all the things you tell it to do, like opening games or playing videos.

When you press buttons on the keyboard or click things on the screen with the mouse, you're sending messages to the computer. These messages travel through special wires, called cables, to the processor.

The processor reads the messages and tells the computer what to do. It can open programs, show you pictures, or even play music for you.

All the things you see on the screen are created by the graphics card, which is like a magic artist inside the computer. It takes the processor's instructions and turns them into colorful pictures and videos.

To save your favorite games, videos, or pictures, the computer uses a special storage space called a hard drive. It's like a giant library where the computer can keep all your precious things safe.

And when you want to connect to the internet to play games with friends or watch funny videos, the computer uses something called a network card to send and receive messages through the internet cables or Wi-Fi signals.

So, just like your brain helps you learn and play, the computer's processor, memory, graphics card, hard drive, and network card all work together to make your computer a super-smart friend that can help you do amazing things!

会話を続けるには、返信と別のメッセージを追加します。

messages.append({'role':'model',
                 'parts':[response.text]})

messages.append({'role':'user',
                 'parts':["Okay, how about a more detailed explanation to a high school student?"]})

response = model.generate_content(messages)

to_markdown(response.text)

At its core, a computer is a machine that can be programmed to carry out a set of instructions. It consists of several essential components that work together to process, store, and display information:

**1. Processor (CPU):**
- The brain of the computer.
- Executes instructions and performs calculations.
- Speed measured in gigahertz (GHz).
- More GHz generally means faster processing.

**2. Memory (RAM):**
- Temporary storage for data being processed.
- Holds instructions and data while the program is running.
- Measured in gigabytes (GB).
- More GB of RAM allows for more programs to run simultaneously.

**3. Storage (HDD/SSD):**
- Permanent storage for data.
- Stores operating system, programs, and user files.
- Measured in gigabytes (GB) or terabytes (TB).
- Hard disk drives (HDDs) are traditional, slower, and cheaper.
- Solid-state drives (SSDs) are newer, faster, and more expensive.

**4. Graphics Card (GPU):**
- Processes and displays images.
- Essential for gaming, video editing, and other graphics-intensive tasks.
- Measured in video RAM (VRAM) and clock speed.

**5. Motherboard:**
- Connects all the components.
- Provides power and communication pathways.

**6. Input/Output (I/O) Devices:**
- Allow the user to interact with the computer.
- Examples: keyboard, mouse, monitor, printer.

**7. Operating System (OS):**
- Software that manages the computer's resources.
- Provides a user interface and basic functionality.
- Examples: Windows, macOS, Linux.

When you run a program on your computer, the following happens:

1. The program instructions are loaded from storage into memory.
2. The processor reads the instructions from memory and executes them one by one.
3. If the instruction involves calculations, the processor performs them using its arithmetic logic unit (ALU).
4. If the instruction involves data, the processor reads or writes to memory.
5. The results of the calculations or data manipulation are stored in memory.
6. If the program needs to display something on the screen, it sends the necessary data to the graphics card.
7. The graphics card processes the data and sends it to the monitor, which displays it.

This process continues until the program has completed its task or the user terminates it.

生成構成

generation_config 引数を使用すると、生成パラメータを変更できます。モデルに送信するすべてのプロンプトには、モデルにどのようにモデルがレスポンスを生成します。

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(
    'Tell me a story about a magic backpack.',
    generation_config=genai.types.GenerationConfig(
        # Only one candidate for now.
        candidate_count=1,
        stop_sequences=['x'],
        max_output_tokens=20,
        temperature=1.0)
)

text = response.text

if response.candidates[0].finish_reason.name == "MAX_TOKENS":
    text += '...'

to_markdown(text)

Once upon a time, in a small town nestled amidst lush green hills, lived a young girl named...

次のステップ

プロンプト設計は、望ましい結果を得るためのプロンプトを作成するプロセス返すことができます。適切に構造化されたプロンプトを記述することは、正確で質の高い回答を得るために不可欠な要素ですモデルです。プロンプトのベストプラクティスについて学習する記述をご覧ください。
Gemini には、さまざまな用途のニーズに対応する複数のモデルバリエーションが用意されています。入力の種類や複雑さ、チャットやその他のツールの実装、タスク、サイズの制約です。利用可能な Gemini モデル。