開始使用 Gemini API:Python

在 Google AI 中查看 在 Google Colab 中執行 在 GitHub 上查看原始碼

本快速入門導覽課程說明如何針對 Gemini API 使用 Python SDK,以存取 Google Gemini 大型語言模型。在本快速入門導覽課程中,您將瞭解以下內容:

  1. 設定開發環境和 API 存取權以使用 Gemini。
  2. 根據文字輸入產生文字回應。
  3. 透過多模態輸入內容 (文字和圖片) 產生文字回應。
  4. 使用 Gemini 進行多輪對話 (即時通訊)。
  5. 針對大型語言模型使用嵌入功能。

必要條件

您可以在 Google Colab 中執行本快速入門導覽課程,直接在瀏覽器中執行這個筆記本,無須進行其他環境設定。

或者,如要在本機完成本快速入門導覽課程,請確保您的開發環境符合下列需求:

  • Python 3.9 以上版本
  • 安裝 jupyter 以執行筆記本。

設定

安裝 Python SDK

Gemini API 的 Python SDK 已納入 google-generativeai 套件中。使用 pip 安裝依附元件:

pip install -q -U google-generativeai

匯入套件

匯入必要的套件。

import pathlib
import textwrap

import google.generativeai as genai

from IPython.display import display
from IPython.display import Markdown


def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))
# Used to securely store your API key
from google.colab import userdata

設定 API 金鑰

您必須先取得 API 金鑰,才能使用 Gemini API。如果還沒有金鑰,請在 Google AI Studio 中按一下即可建立。

取得 API 金鑰

在 Colab 中,請在左側面板的「🔑?」下方,將密鑰新增至密鑰管理工具。輸入名稱 GOOGLE_API_KEY

取得 API 金鑰後,請傳遞至 SDK。操作方式有以下兩種:

  • 將金鑰放入 GOOGLE_API_KEY 環境變數中 (SDK 會自動從該處取得金鑰)。
  • 將金鑰傳送至 genai.configure(api_key=...)
# Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)

列出模型

您現在可以呼叫 Gemini API 了。使用 list_models 查看可用的 Gemini 模型:

  • gemini-pro:針對純文字提示進行最佳化。
  • gemini-pro-vision:已針對文字和圖片提示進行最佳化。
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

genai 套件也支援 PaLM 系列模型,但只有 Gemini 模型支援 generateContent 方法的一般多模態功能。

根據輸入文字產生文字

若是純文字提示,請使用 gemini-pro 模型:

model = genai.GenerativeModel('gemini-pro')

generate_content 方法可處理各種用途,包括多輪聊天和多模態輸入,視基礎模型支援的類型而定。可用的模型僅支援文字和圖像做為輸入內容,以及將文字做為輸出。

在最簡單的情況下,您可以將提示字串傳遞至 GenerativeModel.generate_content 方法:

%%time
response = model.generate_content("What is the meaning of life?")
CPU times: user 110 ms, sys: 12.3 ms, total: 123 ms
Wall time: 8.25 s

在簡單的情況下,您只需要 response.text 存取子即可。如要顯示格式化的 Markdown 文字,請使用 to_markdown 函式:

to_markdown(response.text)

人們對生活目的的查詢造成了幾個世紀、文化和大陸的困境。雖然這不是眾人都認可的回覆,但許多想法都不斷推陳出新,回應通常會因個人的想法、信念和生活經驗而異。

  1. 幸福與福祉:許多人認為生活的目的在於實現個人幸福感和身心健康。這可能包括尋找追尋喜悅、建立顯著人際關係、照顧個人身心健康,以及追求個人目標和興趣。

  2. 有意義的貢獻:某些人認為生活的宗旨是對全世界做出有意義的貢獻。這類行動包括追求有益於他人的職業、參與志工或慈善活動、產生藝術或文學,或是發明。

  3. 自我實現與個人成長:在現實生活中,另一項常見的目標就是自我實現與個人發展。這類遊戲可能包括學習新技能、挑戰自身極限、相互對抗個人障礙,以及隨個人變革。

  4. 道德和道德行為:某些人相信生活的目標在於實踐道德且道德行為。這可能伴隨一個道德原則、即使在困難時也做正確的事,並且嘗試打造更美好的世界。

  5. 精神履行:對某些人來說,生活的目的與靈性或宗教信仰有關。這可能包括尋求與更高能力接觸、練習宗教儀式,或遵循靈修教學。

  6. 體驗至滿滿載:有些人認為人生的用意是要體驗其所含一切。例如旅遊、嘗試新事物、面臨風險及擁抱新接觸。

  7. 舊版與影響:一些人認為生命的意義是讓世人留下深遠的遺產,對世界產生影響。這類成果包括成就值得注意的事、對某人的貢獻留下印象,或是激勵及鼓舞他人。

  8. 尋找平衡與和諧:對某些人來說,生命的目的就是在生活的各個層面之間找到平衡和和諧。這可能包括照顧個人、職業和社會義務、尋求內心的平靜與調適,以及符合個人價值觀和信念的生活。

歸根究柢,人生的意義是一段個人旅程,不同人可能會透過經歷、反思以及與周遭世界的互動而發現自己的獨特目的。

如果 API 無法傳回結果,請使用 GenerateContentResponse.prompt_feedback 確認該結果是否因提示安全考量而遭到封鎖。

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

Gemini 可以為單一提示產生多個可能的回應。這些可能的回應稱為 candidates,您可以查看這些回應,選取最合適的回應。

使用 GenerateContentResponse.candidates 查看應徵者:

response.candidates
[content {
  parts {
    text: "The query of life\'s purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.\n\n1. **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one\'s physical and mental health, and pursuing personal goals and interests.\n\n2. **Meaningful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.\n\n3. **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one\'s boundaries, confronting personal obstacles, and evolving as a person.\n\n4. **Ethical and Moral Behavior:** Some believe that the goal of life is to act ethically and morally. This might entail adhering to one\'s moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.\n\n5. **Spiritual Fulfillment:** For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.\n\n6. **Experiencing Life to the Fullest:** Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.\n\n7. **Legacy and Impact:** Others believe that the purpose of life is to leave a lasting legacy and impact on the world. This might entail accomplishing something noteworthy, being remembered for one\'s contributions, or inspiring and motivating others.\n\n8. **Finding Balance and Harmony:** For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one\'s values and beliefs.\n\nUltimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them."
  }
  role: "model"
}
finish_reason: STOP
index: 0
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}
]

根據預設,模型會在完成整個產生程序後傳回回應。您也可以在系統產生回應時串流處理回應,而且模型會在產生回應後立即傳回區塊。

如要串流回覆,請使用 GenerativeModel.generate_content(..., stream=True)

%%time
response = model.generate_content("What is the meaning of life?", stream=True)
CPU times: user 102 ms, sys: 25.1 ms, total: 128 ms
Wall time: 7.94 s
for chunk in response:
  print(chunk.text)
  print("_"*80)
The query of life's purpose has perplexed people across centuries, cultures, and
________________________________________________________________________________
 continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences
________________________________________________________________________________
.

1. **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one's physical and mental health, and pursuing personal goals and aspirations.

2. **Meaning
________________________________________________________________________________
ful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.

3. **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, exploring one's interests and abilities, overcoming obstacles, and becoming the best version of oneself.

4. **Connection and Relationships:** For many individuals, the purpose of life is found in their relationships with others. This might entail building
________________________________________________________________________________
 strong bonds with family and friends, fostering a sense of community, and contributing to the well-being of those around them.

5. **Spiritual Fulfillment:** For those with religious or spiritual beliefs, the purpose of life may be centered on seeking spiritual fulfillment or enlightenment. This might entail following religious teachings, engaging in spiritual practices, or seeking a deeper understanding of the divine.

6. **Experiencing the Journey:** Some believe that the purpose of life is simply to experience the journey itself, with all its joys and sorrows. This perspective emphasizes embracing the present moment, appreciating life's experiences, and finding meaning in the act of living itself.

7. **Legacy and Impact:** For others, the goal of life is to leave a lasting legacy or impact on the world. This might entail making a significant contribution to a particular field, leaving a positive mark on future generations, or creating something that will be remembered and cherished long after one's lifetime.

Ultimately, the meaning of life is a personal and subjective question, and there is no single, universally accepted answer. It is about discovering what brings you fulfillment, purpose, and meaning in your own life, and living in accordance with those values.
________________________________________________________________________________

在串流時,您必須先逐一處理所有回應區塊,才能使用部分回應屬性。示範如下:

response = model.generate_content("What is the meaning of life?", stream=True)

prompt_feedback 屬性的運作方式如下:

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

text 這類屬性不會:

try:
  response.text
except Exception as e:
  print(f'{type(e).__name__}: {e}')
IncompleteIterationError: Please let the response complete iteration before accessing the final accumulated
attributes (or call `response.resolve()`)

根據圖片和文字輸入內容產生文字

Gemini 提供的多模態模型 (gemini-pro-vision) 可接受文字、圖片和輸入內容。GenerativeModel.generate_content API 旨在處理多模態提示並傳回文字輸出內容。

以下加入圖片:

curl -o image.jpg https://t0.gstatic.com/licensed-image?q=tbn:ANd9GcQ_Kevbk21QBRy-PgB4kQpS79brbmmEG7m3VOTShAn4PecDU5H5UxrJxE3Dw1JiaG17V88QIol19-3TM2wCHw
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  405k  100  405k    0     0  6982k      0 --:--:-- --:--:-- --:--:-- 7106k
import PIL.Image

img = PIL.Image.open('image.jpg')
img

png

使用 gemini-pro-vision 模型,並使用 generate_content 將圖片傳送至模型。

model = genai.GenerativeModel('gemini-pro-vision')
response = model.generate_content(img)

to_markdown(response.text)

雞肉醬油炸碗,搭配褐色米飯、烤花椰菜和鈴鐺。

如要在提示中提供文字和圖片,請傳送包含字串和圖片的清單:

response = model.generate_content(["Write a short, engaging blog post based on this picture. It should include a description of the meal in the photo and talk about my journey meal prepping.", img], stream=True)
response.resolve()
to_markdown(response.text)

備餐既省時又省錢,還能幫助養成更健康的飲食習慣。這種餐點是健康且可輕鬆事先準備的理想飲食之一。

供應棕色米飯、烤蔬菜和雞肉照燒料理。棕色的米粉實在一顆顆粒,具有高纖維和營養素。如果想每日增加維生素和礦物質,使用烤蔬菜是不錯的方法。雞肉醬油是簡潔的蛋白質來源,也充滿了各種口味。

這種餐點可以事先做好準備。只需輕鬆烹煮棕色米飯、烤蔬菜,還可以烹調雞肉照燒。接著將餐點分成個別容器,並存放在冰箱中。準備吃時,只要拿起容器並將容器加熱即可。

這種餐點十分適合忙碌、想擁有健康美味用餐方式的人使用。對於想要減重或維持健康體重的人而言,這會是一大餐利器。

如果您正在尋找健康又美味的餐點,且希望能事先做好準備,那麼這道餐點對您來說是個不錯的選擇。現在就試試看吧!

即時通訊對話

Gemini 可讓您跨輪進行任意形式的對話。使用 ChatSession 類別可管理對話狀態來簡化程序,因此與 generate_content 不同,您不必將對話記錄儲存為清單。

初始化即時通訊:

model = genai.GenerativeModel('gemini-pro')
chat = model.start_chat(history=[])
chat
<google.generativeai.generative_models.ChatSession at 0x7b7b68250100>

ChatSession.send_message 方法會傳回與 GenerativeModel.generate_content 相同的 GenerateContentResponse 類型。你的訊息和回覆也會附加到即時通訊記錄中:

response = chat.send_message("In one sentence, explain how a computer works to a young child.")
to_markdown(response.text)

電腦就像是非常的智慧型裝置,可以理解並遵循我們的指示,幫助我們完成工作,甚至和我們一起玩遊戲!

chat.history
[parts {
   text: "In one sentence, explain how a computer works to a young child."
 }
 role: "user",
 parts {
   text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!"
 }
 role: "model"]

你可以繼續傳送訊息,延續對話。使用 stream=True 引數來串流聊天內容:

response = chat.send_message("Okay, how about a more detailed explanation to a high schooler?", stream=True)

for chunk in response:
  print(chunk.text)
  print("_"*80)
A computer works by following instructions, called a program, which tells it what to
________________________________________________________________________________
 do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor
________________________________________________________________________________
, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.

To give you a simple analogy, imagine a computer as a
________________________________________________________________________________
 chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).

In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.
________________________________________________________________________________

glm.Content 物件包含 glm.Part 物件清單,每個物件都包含文字 (字串) 或 inline_data (glm.Blob),其中 blob 包含二進位資料和 mime_type。您可以在 ChatSession.history 中,以 glm.Content 物件清單的形式查看即時通訊記錄:

for message in chat.history:
  display(to_markdown(f'**{message.role}**: {message.parts[0].text}'))

使用者:用一句話說明電腦運作方式。

model:電腦就像是非常的智慧型裝置,可以理解並遵循我們的指示,協助我們完成工作,甚至和我們一起玩遊戲!

user:好吧,那是給高中生的詳細說明呢?

model:電腦運作時會遵循稱為「程式」的指令,程式會指示該怎麼做。這些指示是以電腦可以理解的特殊語言寫成,並儲存在電腦的記憶體中。電腦的處理器 (或稱為 CPU) 會從記憶體中讀取指令並進行執行,根據程式的邏輯執行計算和決策。這些計算結果與決定的結果會顯示在電腦的螢幕上,或是儲存在記憶體中,以供日後使用。

要提供簡單的比喻,不妨把電腦想像成依照食譜烹飪的廚師。食譜就跟程式一樣,主廚的行動就像電腦中的指示操作。廚師會讀取食譜 (程式) 並執行各種操作,例如收集食材 (從記憶體中擷取資料)、混合在一起 (執行計算) 及烹飪 (處理資料)。最終的料理 (輸出內容) 會隨即顯示在固定板 (電腦螢幕) 上。

簡單來說,電腦的運作方式是執行一系列的指示,並儲存在記憶體中,以便執行計算、做出決策,以及顯示或儲存結果。

計算符記

大型語言模型有背景期間,背景資訊長度通常會根據符記數量來評估。使用 Gemini API 時,您可以決定每個 glm.Content 物件的權杖數量。在最簡單的情況下,您可以將查詢字串傳送至 GenerativeModel.count_tokens 方法,如下所示:

model.count_tokens("What is the meaning of life?")
total_tokens: 7

同樣地,您可以檢查 ChatSessiontoken_count

model.count_tokens(chat.history)
total_tokens: 501

使用嵌入

「嵌入」是用來在陣列中以浮點數清單表示資訊的技術。使用 Gemini 時,您可以使用向量化格式來表示文字 (字詞、句子和文字區塊),讓它們更容易比較及對比嵌入。舉例來說,如果兩段文字具有相同的主題或情緒,就應該採用類似的嵌入,而這些嵌入可透過數學相似度等數學比較技巧加以識別。如要進一步瞭解嵌入功能的使用方式和好處,請參閱嵌入指南

使用 embed_content 方法產生嵌入。這個方法會處理下列工作 (task_type) 的嵌入作業:

工作類型 說明
RETRIEVAL_QUERY 指出指定文字是搜尋/擷取設定中的查詢。
RETRIEVAL_DOCUMENT 指定指定文字是搜尋/擷取設定中的文件。必須使用 title,才能使用這個工作類型。
SEMANTIC_SIMILARITY 指定指定文字將用於語意文字相似度 (STS)。
分類 指定將嵌入用於分類。
叢集 指定會使用嵌入進行叢集處理。

以下內容會針對文件擷取的單一字串產生嵌入:

result = genai.embed_content(
    model="models/embedding-001",
    content="What is the meaning of life?",
    task_type="retrieval_document",
    title="Embedding of single string")

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED]')
[-0.003216741, -0.013358698, -0.017649598, -0.0091 ... TRIMMED]

如要處理一批字串,請在 content 中傳遞字串清單:

result = genai.embed_content(
    model="models/embedding-001",
    content=[
      'What is the meaning of life?',
      'How much wood would a woodchuck chuck?',
      'How does the brain work?'],
    task_type="retrieval_document",
    title="Embedding of list of strings")

# A list of inputs > A list of vectors output
for v in result['embedding']:
  print(str(v)[:50], '... TRIMMED ...')
[0.0040260437, 0.004124458, -0.014209415, -0.00183 ... TRIMMED ...
[-0.004049845, -0.0075574904, -0.0073463684, -0.03 ... TRIMMED ...
[0.025310587, -0.0080734305, -0.029902633, 0.01160 ... TRIMMED ...

雖然 genai.embed_content 函式接受簡易字串或字串清單,但實際上是以 glm.Content 類型 (例如 GenerativeModel.generate_content) 建構而成。glm.Content 物件是 API 中對話的主要單元。

雖然 glm.Content 物件為多模態,但 embed_content 方法僅支援文字嵌入。這樣的設計為 API 提供可能展開為多模態嵌入的功能。

response.candidates[0].content
parts {
  text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
}
role: "model"
result = genai.embed_content(
    model = 'models/embedding-001',
    content = response.candidates[0].content)

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED ...')
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED ...

同樣地,即時通訊記錄包含 glm.Content 物件清單,您可以將這些物件直接傳遞至 embed_content 函式:

chat.history
[parts {
   text: "In one sentence, explain how a computer works to a young child."
 }
 role: "user",
 parts {
   text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!"
 }
 role: "model",
 parts {
   text: "Okay, how about a more detailed explanation to a high schooler?"
 }
 role: "user",
 parts {
   text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
 }
 role: "model"]
result = genai.embed_content(
    model = 'models/embedding-001',
    content = chat.history)

# 1 input > 1 vector output
for i,v in enumerate(result['embedding']):
  print(str(v)[:50], '... TRIMMED...')
[-0.014632266, -0.042202696, -0.015757175, 0.01548 ... TRIMMED...
[-0.010979066, -0.024494737, 0.0092659835, 0.00803 ... TRIMMED...
[-0.010055617, -0.07208932, -0.00011750793, -0.023 ... TRIMMED...
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED...

進階用途

以下各節將說明 Gemini API 的 Python SDK 進階用途和較低層級的詳細資料。

安全性設定

safety_settings 引數可讓您設定模型封鎖的項目,以及允許在提示和回應中允許的內容。根據預設,安全設定會在所有維度中封鎖中度和/或極有可能為不安全的內容。進一步瞭解安全設定

輸入可疑的提示,以預設安全設定執行模型。模型不會傳回任何候選項目:

response = model.generate_content('[Questionable prompt here]')
response.candidates
[content {
  parts {
    text: "I\'m sorry, but this prompt involves a sensitive topic and I\'m not allowed to generate responses that are potentially harmful or inappropriate."
  }
  role: "model"
}
finish_reason: STOP
index: 0
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}
]

prompt_feedback 會顯示封鎖提示的安全性篩選器:

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

現在,系統會按照新設定的安全設定向模型提供相同提示,這樣您可能就會收到回應。

response = model.generate_content('[Questionable prompt here]',
                                  safety_settings={'HARASSMENT':'block_none'})
response.text

另請注意,每個候選項目都有專屬的 safety_ratings,以防提示通過,但個別回應未通過安全檢查。

為訊息編碼

前幾節依附於 SDK,讓您輕鬆將提示傳送至 API。本節提供完全與上述範例相同的類型,方便您進一步瞭解 SDK 如何將訊息編碼。

Python SDK 的基礎是 google.ai.generativelanguage 用戶端程式庫:

import google.ai.generativelanguage as glm

SDK 會嘗試將您的訊息轉換為 glm.Content 物件,該物件包含 glm.Part 物件清單,每個物件都包含以下其中一項:

  1. text (字串)
  2. inline_data (glm.Blob),其中 blob 包含二進位 datamime_type

您也可以傳遞任何這些類別,做為對等的字典。

因此,完全符合上一個範例的範例如下:

model = genai.GenerativeModel('gemini-pro-vision')
response = model.generate_content(
    glm.Content(
        parts = [
            glm.Part(text="Write a short, engaging blog post based on this picture."),
            glm.Part(
                inline_data=glm.Blob(
                    mime_type='image/jpeg',
                    data=pathlib.Path('image.jpg').read_bytes()
                )
            ),
        ],
    ),
    stream=True)
response.resolve()

to_markdown(response.text[:100] + "... [TRIMMED] ...")

備餐既省時又省錢,還能幫助養成更健康的飲食習慣。根據 ... [TRIMMED] ...

多輪對話

雖然先前顯示的 genai.ChatSession 類別可以處理許多用途,但還是有部分假設。如果您的用途不符合此即時通訊實作項目,最好記住 genai.ChatSession 只是 GenerativeModel.generate_content 的包裝函式。除了單一要求外,這個功能也可以處理多輪對話。

如前幾節所述,個別訊息是 glm.Content 物件或相容的字典。做為字典,訊息需要 roleparts 鍵。對話中的 role 可以是提供提示的 user,也可以是提供回應的 model

傳遞 glm.Content 物件清單,系統就會將此物件視為多輪即時通訊:

model = genai.GenerativeModel('gemini-pro')

messages = [
    {'role':'user',
     'parts': ["Briefly explain how a computer works to a young child."]}
]
response = model.generate_content(messages)

to_markdown(response.text)

想像一下,電腦就像聰明的朋友,能夠協助你處理生活大小事。就像您渴望思考和學習的腦力,電腦也有所謂的「處理器」。這就像是老闆,告訴機器怎麼做。

電腦中有一個稱為「記憶體」的特殊地點,就像一個大型儲存盒。這項功能會記住你要求執行的所有操作,例如開啟遊戲或播放影片。

當你按下鍵盤上的按鈕或用滑鼠點按畫面上的項目,訊息就會傳送至電腦。這些訊息會透過特殊電線 (稱為傳輸線) 傳輸到處理器。

處理器會讀取訊息並指示電腦該做什麼事。它可以開啟程式、向您顯示圖片,甚至為您播放音樂。

螢幕上顯示的所有內容都是圖形卡,就像是電腦的魔法藝術家。並擷取處理器的指示,然後轉化成色彩繽紛的相片和影片。

為了儲存喜愛的遊戲、影片或相片,電腦使用稱為「硬碟」的特殊儲存空間。就像是一座巨大的圖書館,電腦可以為您保護所有珍貴的事物。

當您想要連線到網際網路並與好友一起玩遊戲或觀看趣味影片時,電腦會使用稱為網路卡的應用程式,透過網際網路纜線或 Wi-Fi 訊號收發訊息。

因此,就像腦力激盪一樣,電腦的處理器、記憶體、顯示卡、硬碟和網路卡都能搭配運作,讓電腦成為超級聰明的朋友,協助你創造出色的作品!

如要延續對話,請新增回覆內容和其他訊息。

messages.append({'role':'model',
                 'parts':[response.text]})

messages.append({'role':'user',
                 'parts':["Okay, how about a more detailed explanation to a high school student?"]})

response = model.generate_content(messages)

to_markdown(response.text)

本質上,電腦是指可透過程式編寫一系列指示的電腦。它是由數個必要元件組成,共同處理、儲存及顯示資訊:

1. 處理器 (CPU): - 電腦的大腦。- 執行指示並執行計算。 - 測量的速度,單位為 gigahertz (GHz)。 - 通常 GHz 越多,處理速度就越快。

2. 記憶體 (RAM): - 處理資料的臨時儲存空間。 - 在程式執行期間保留指示和資料。 - 以 GB 為單位計算。 - 更多的 RAM 可讓更多程式同時執行。

3. 儲存空間 (HDD/SSD): - 用來存放資料的永久儲存空間。 - 儲存作業系統、程式和使用者檔案。 - 以 GB 或 TB 為單位計算。- 傳統硬碟 (HDD) 較傳統、速度較慢,費用較低。 - 固態硬碟 (SSD) 較新、速度更快,且費用高昂。

4.顯示卡 (GPU): - 處理及顯示圖片。 - 遊戲、影片編輯和其他需要大量使用圖像的工作不可或缺。 - 以視訊 RAM (VRAM) 和時脈速度計算。

5. 主機板: - 連結所有元件。- 提供力量和溝通管道。

6. 輸入/輸出 (I/O) 裝置: - 允許使用者與電腦互動。- 範例:鍵盤、滑鼠、螢幕、印表機。

7. 作業系統 (OS): - 管理電腦資源的軟體。 - 提供使用者介面和基本功能。 - 範例:Windows、macOS、Linux

在電腦上執行程式時,會發生下列情況:

  1. 程式指示會從儲存空間載入記憶體。
  2. 處理器會從記憶體中讀取指令,並逐一執行。
  3. 如果指令涉及計算,處理方會使用其算術邏輯單位 (ALU) 執行運算。
  4. 如果指令涉及資料,處理器會在記憶體中讀取或寫入資料。
  5. 計算結果或資料操縱結果會儲存在記憶體中。
  6. 如果程式需要在螢幕上顯示某些內容,它會將必要的資料傳送至顯示卡。
  7. 顯示卡會處理資料並將其傳送至監視器,以便顯示資料。

這項程序會持續到程式完成工作或使用者終止為止。

產生設定

generation_config 引數可讓您修改產生參數。您傳送至模型的每個提示都含有參數值,用來控制模型生成回覆的方式,

model = genai.GenerativeModel('gemini-pro')
response = model.generate_content(
    'Tell me a story about a magic backpack.',
    generation_config=genai.types.GenerationConfig(
        # Only one candidate for now.
        candidate_count=1,
        stop_sequences=['x'],
        max_output_tokens=20,
        temperature=1.0)
)
text = response.text

if response.candidates[0].finish_reason.name == "MAX_TOKENS":
    text += '...'

to_markdown(text)

從前,有個小鎮陷入綠意盎然的山丘之中,住在一個年輕的女孩...

後續步驟

  • 「提示設計」是指建立提示,引領語言模型產生所需回應的過程。撰寫結構周全的提示,是確保語言模型提供準確優質回覆的關鍵。瞭解提示撰寫的最佳做法。
  • Gemini 提供多種模型變化版本,以滿足不同用途的需求,例如輸入類型和複雜度、即時通訊或其他對話方塊語言工作的實作方式,以及大小限制。瞭解可用的 Gemini 模型
  • Gemini 提供提高頻率限制的選項。Gemini- Pro 模型的頻率上限為每分鐘 60 個要求 (RPM)。