Tutorial: Get started with the Gemini API



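This quickstart shows how to use the Python SDK for the Gemini API, which gives you access to Google's Gemini large language models. In this quickstart, you will learn how to:

  1. Set up your development environment and API access to use Gemini.
  2. Generate text responses from text inputs.
  3. Generate text responses from multimodal inputs (text and images).
  4. Use Gemini for multi-turn conversations (chat).
  5. Use embeddings for large language models.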

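Prerequisites

You can run this quickstart in Google Colab, which runs this notebook directly in the browser and requires no additional environment configuration.

Alternatively, to complete this quickstart locally, make sure your development environment meets the following requirements:

  • Python 3.9 or later
  • An installation of jupyter to run the notebook.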

Setup

Install the Python SDK

The Python SDK for the Gemini API is contained in the google-generativeai package. Install the dependency using pip:

pip install -q -U google-generativeai

Import packages

Import the necessary packages.

import pathlib
import textwrap

import google.generativeai as genai

from IPython.display import display
from IPython.display import Markdown


def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))
# Used to securely store your API key
from google.colab import userdata

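Set up your API key

Before you can use the Gemini API, you must obtain an API key. If you don't already have one, you can create a key with one click in Google AI Studio.

Get an API key

In Colab, add the key to the secrets manager under the "🔑" in the left panel. Give it the name GOOGLE_API_KEY.

Once you have the API key, pass it to the SDK. You can do this in two ways:

  • Put the key in the GOOGLE_API_KEY environment variable (the SDK picks it up automatically from there).
  • Pass the key to genai.configure(api_key=...)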
# Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)
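
Outside Colab, the two options above can be combined in a small helper. The following is a minimal sketch, not part of the SDK: `resolve_api_key` is a hypothetical name, and the key shown is a placeholder. It prefers an explicitly supplied key, falls back to the GOOGLE_API_KEY environment variable, and fails early with a clear message if neither is set.

```python
import os


def resolve_api_key(explicit_key=None, env=None, var_name="GOOGLE_API_KEY"):
    """Return the API key to use, preferring an explicit value over the environment.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = explicit_key or env.get(var_name)
    if not key:
        raise RuntimeError(
            f"No API key found: set {var_name} or pass a key explicitly."
        )
    return key


# Demonstration with a stand-in environment mapping and a placeholder key.
print(resolve_api_key(env={"GOOGLE_API_KEY": "placeholder-key"}))
```

The returned value could then be handed to genai.configure(api_key=...) as shown above.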

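List models

Now you're ready to call the Gemini API. Use list_models to see the available Gemini models:

  • gemini-1.5-flash: our fastest multimodal model
  • gemini-1.5-pro: our most capable and intelligent multimodal model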
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

Generate text from text inputs

For text-only prompts, use a Gemini 1.5 model or the Gemini 1.0 Pro model:

model = genai.GenerativeModel('gemini-1.5-flash')

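The generate_content method can handle a wide variety of use cases, including multi-turn chat and multimodal input, depending on what the underlying model supports. The available models accept only text and images as input, and produce text as output.

In the simplest case, you can pass a prompt string to the GenerativeModel.generate_content method: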

%%time
response = model.generate_content("What is the meaning of life?")
CPU times: user 110 ms, sys: 12.3 ms, total: 123 ms
Wall time: 8.25 s

In simple cases, the response.text accessor is all you need. To display formatted Markdown text, use the to_markdown function:

to_markdown(response.text)

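The query of life's purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.

  1. Happiness and Well-being: Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one's physical and mental health, and pursuing personal goals and interests.

  2. Meaningful Contribution: Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.

  3. Self-realization and Personal Growth: The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one's boundaries, confronting personal obstacles, and evolving as a person.

  4. Ethical and Moral Behavior: Some believe that the goal of life is to act ethically and morally. This might entail adhering to one's moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.

  5. Spiritual Fulfillment: For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.

  6. Experiencing Life to the Fullest: Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.

  7. Legacy and Impact: Others believe that the purpose of life is to leave a lasting legacy and impact on the world. This might entail accomplishing something noteworthy, being remembered for one's contributions, or inspiring and motivating others.

  8. Finding Balance and Harmony: For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one's values and beliefs.

Ultimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them.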

If the API failed to return a result, use GenerateContentResponse.prompt_feedback to check whether the prompt was blocked because of safety concerns.

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

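Gemini can generate multiple possible responses for a single prompt. These possible responses are called candidates, and you can review them to select the most suitable one.

View the response candidates with GenerateContentResponse.candidates: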

response.candidates
[content {
  parts {
    text: "The query of life\'s purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.\n\n1. **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one\'s physical and mental health, and pursuing personal goals and interests.\n\n2. **Meaningful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.\n\n3. **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one\'s boundaries, confronting personal obstacles, and evolving as a person.\n\n4. **Ethical and Moral Behavior:** Some believe that the goal of life is to act ethically and morally. This might entail adhering to one\'s moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.\n\n5. **Spiritual Fulfillment:** For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.\n\n6. **Experiencing Life to the Fullest:** Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.\n\n7. **Legacy and Impact:** Others believe that the purpose of life is to leave a lasting legacy and impact on the world. 
This might entail accomplishing something noteworthy, being remembered for one\'s contributions, or inspiring and motivating others.\n\n8. **Finding Balance and Harmony:** For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one\'s values and beliefs.\n\nUltimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them."
  }
  role: "model"
}
finish_reason: STOP
index: 0
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}
]

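By default, the model returns a response only after completing the entire generation process. You can also stream the response while it is being generated, in which case the model returns chunks of the response as soon as they are produced.

To stream responses, use GenerativeModel.generate_content(..., stream=True).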

%%time
response = model.generate_content("What is the meaning of life?", stream=True)
CPU times: user 102 ms, sys: 25.1 ms, total: 128 ms
Wall time: 7.94 s
for chunk in response:
  print(chunk.text)
  print("_"*80)
The query of life's purpose has perplexed people across centuries, cultures, and
________________________________________________________________________________
 continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences
________________________________________________________________________________
.

1. **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one's physical and mental health, and pursuing personal goals and aspirations.

2. **Meaning
________________________________________________________________________________
ful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.

3. **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, exploring one's interests and abilities, overcoming obstacles, and becoming the best version of oneself.

4. **Connection and Relationships:** For many individuals, the purpose of life is found in their relationships with others. This might entail building
________________________________________________________________________________
 strong bonds with family and friends, fostering a sense of community, and contributing to the well-being of those around them.

5. **Spiritual Fulfillment:** For those with religious or spiritual beliefs, the purpose of life may be centered on seeking spiritual fulfillment or enlightenment. This might entail following religious teachings, engaging in spiritual practices, or seeking a deeper understanding of the divine.

6. **Experiencing the Journey:** Some believe that the purpose of life is simply to experience the journey itself, with all its joys and sorrows. This perspective emphasizes embracing the present moment, appreciating life's experiences, and finding meaning in the act of living itself.

7. **Legacy and Impact:** For others, the goal of life is to leave a lasting legacy or impact on the world. This might entail making a significant contribution to a particular field, leaving a positive mark on future generations, or creating something that will be remembered and cherished long after one's lifetime.

Ultimately, the meaning of life is a personal and subjective question, and there is no single, universally accepted answer. It is about discovering what brings you fulfillment, purpose, and meaning in your own life, and living in accordance with those values.
________________________________________________________________________________
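
A common pattern when streaming is to show each chunk as it arrives while also keeping the full text for later use. The sketch below illustrates that pattern without calling the API: `fake_stream` is a hypothetical stand-in that yields plain strings, whereas a real streamed response yields chunk objects whose text is read from `chunk.text`.

```python
def fake_stream():
    """Stand-in for a streamed response: yields text chunks one at a time."""
    yield "The query of life's purpose "
    yield "has perplexed people "
    yield "across centuries."


def collect_stream(chunks, on_chunk=None):
    """Consume an iterable of text chunks and return the concatenated text.

    `on_chunk`, if given, is called with each chunk as soon as it arrives,
    which is where incremental display would happen.
    """
    parts = []
    for text in chunks:
        if on_chunk is not None:
            on_chunk(text)
        parts.append(text)
    return "".join(parts)


# Print chunks as they arrive, then keep the accumulated text.
full_text = collect_stream(
    fake_stream(), on_chunk=lambda t: print(t, end="", flush=True)
)
print()
```

With a real response, the same loop would pass `chunk.text` for each chunk instead of the plain strings used here.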

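When streaming, some response attributes are not available until you have iterated through all of the response chunks. This is demonstrated below: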

response = model.generate_content("What is the meaning of life?", stream=True)

The prompt_feedback attribute works:

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

But attributes such as text do not:

try:
  response.text
except Exception as e:
  print(f'{type(e).__name__}: {e}')
IncompleteIterationError: Please let the response complete iteration before accessing the final accumulated
attributes (or call `response.resolve()`)

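Generate text from image and text inputs

Gemini provides several models that can handle multimodal input (the Gemini 1.5 models), so you can supply both text and images. Be sure to review the image requirements for prompts.

When the prompt input includes both text and images, use Gemini 1.5 with the GenerativeModel.generate_content method to generate text output:

Include an image: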

curl -o image.jpg https://t0.gstatic.com/licensed-image?q=tbn:ANd9GcQ_Kevbk21QBRy-PgB4kQpS79brbmmEG7m3VOTShAn4PecDU5H5UxrJxE3Dw1JiaG17V88QIol19-3TM2wCHw
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  405k  100  405k    0     0  6982k      0 --:--:-- --:--:-- --:--:-- 7106k
import PIL.Image

img = PIL.Image.open('image.jpg')
img


Use a Gemini 1.5 model and pass the image to the model with generate_content.

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(img)

to_markdown(response.text)

Chicken Teriyaki Meal Prep Bowls with brown rice, roasted broccoli and bell peppers.

To provide both text and images in a prompt, pass a list containing the strings and images:

response = model.generate_content(["Write a short, engaging blog post based on this picture. It should include a description of the meal in the photo and talk about my journey meal prepping.", img], stream=True)
response.resolve()
to_markdown(response.text)

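Meal prepping is a great way to save time and money, and it can also help you to eat healthier. This meal is a great example of a healthy and delicious meal that can be easily prepped ahead of time.

This meal features brown rice, roasted vegetables, and chicken teriyaki. The brown rice is a whole grain that is high in fiber and nutrients. The roasted vegetables are a great way to get your daily dose of vitamins and minerals. And the chicken teriyaki is a common protein source that is also packed with flavor.

This meal is easy to prepare. Simply cook the brown rice, roast the vegetables, and cook the chicken teriyaki. Then, divide the meal into individual containers and store them in the refrigerator. When you're ready to eat, simply grab a container and heat it up.

This meal is a great option for busy people who want to eat well. It's also a great meal for those who are trying to lose weight or maintain a healthy weight.

If you're looking for a healthy and delicious meal that can be easily prepped ahead of time, this meal is a great option. Give it a try today!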

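Chat conversations

Gemini lets you have freeform conversations across multiple turns. The ChatSession class simplifies the process by managing the state of the conversation, so unlike with generate_content, you do not have to store the conversation history as a list.

Initialize the chat: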

model = genai.GenerativeModel('gemini-1.5-flash')
chat = model.start_chat(history=[])
chat
<google.generativeai.generative_models.ChatSession at 0x7b7b68250100>

The ChatSession.send_message method returns the same GenerateContentResponse type as GenerativeModel.generate_content. It also appends your message and the response to the chat history:

response = chat.send_message("In one sentence, explain how a computer works to a young child.")
to_markdown(response.text)

A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!

chat.history
[parts {
   text: "In one sentence, explain how a computer works to a young child."
 }
 role: "user",
 parts {
   text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!"
 }
 role: "model"]

You can keep sending messages to continue the conversation. Use the stream=True argument to stream the chat:

response = chat.send_message("Okay, how about a more detailed explanation to a high schooler?", stream=True)

for chunk in response:
  print(chunk.text)
  print("_"*80)
A computer works by following instructions, called a program, which tells it what to
________________________________________________________________________________
 do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor
________________________________________________________________________________
, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.

To give you a simple analogy, imagine a computer as a
________________________________________________________________________________
 chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).

In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.
________________________________________________________________________________

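genai.protos.Content objects contain a list of genai.protos.Part objects, each of which contains either text (a string) or inline_data (a genai.protos.Blob), where a blob holds binary data and a mime_type. The chat history is available as a list of genai.protos.Content objects in ChatSession.history: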

for message in chat.history:
  display(to_markdown(f'**{message.role}**: {message.parts[0].text}'))

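user: In one sentence, explain how a computer works to a young child.

model: A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!

user: Okay, how about a more detailed explanation to a high schooler?

model: A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.

To give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).

In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.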

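Count tokens

Large language models have a context window, and the context length is usually measured in tokens. With the Gemini API, you can determine the number of tokens in any genai.protos.Content object. In the simplest case, you can pass a query string to the GenerativeModel.count_tokens method as follows: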

model.count_tokens("What is the meaning of life?")
total_tokens: 7

Similarly, you can check the token count for your ChatSession:

model.count_tokens(chat.history)
total_tokens: 501
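
One practical use of token counts is keeping a conversation within a model's context window. The following is a rough sketch of that idea only: `approx_token_count` is a hypothetical stand-in that counts whitespace-separated words, which is not how the Gemini tokenizer works, and `trim_history` is an illustrative helper rather than an SDK function. In real use, the counts would come from count_tokens.

```python
def approx_token_count(text):
    """Hypothetical stand-in tokenizer: counts whitespace-separated words."""
    return len(text.split())


def trim_history(messages, budget, count=approx_token_count):
    """Keep the most recent messages whose combined count fits within `budget`.

    Walks the history from newest to oldest and stops at the first message
    that would exceed the budget. Returns (kept_messages, total_count).
    """
    kept = []
    total = 0
    for message in reversed(messages):
        cost = count(message)
        if total + cost > budget:
            break
        kept.append(message)
        total += cost
    kept.reverse()  # restore chronological order
    return kept, total


history = ["one two three", "four five", "six"]
print(trim_history(history, budget=3))
```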

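Use embeddings

Embedding is a technique for representing information as a list of floating point numbers in an array. With Gemini, you can represent text (words, sentences, and blocks of text) in vectorized form, which makes it easier to compare and contrast embeddings. For example, two texts that share a similar subject matter or sentiment should have similar embeddings, which can be identified through mathematical comparison techniques such as cosine similarity. For more on how and why to use embeddings, refer to the Embeddings guide.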
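
Cosine similarity is one common way to compare two embeddings: it is the dot product of the vectors divided by the product of their lengths, so it ranges from -1 (opposite direction) to 1 (same direction). The sketch below uses short made-up vectors purely for illustration; real embeddings returned by the API are much longer lists of floats.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length.")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector.")
    return dot / (norm_a * norm_b)


# Toy vectors (assumed values, not real embeddings).
query = [0.1, 0.3, -0.2]
close = [0.2, 0.6, -0.4]   # same direction as `query`
far = [-0.3, 0.1, 0.0]     # orthogonal to `query`

print(cosine_similarity(query, close))
print(cosine_similarity(query, far))
```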

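Use the embed_content method to generate embeddings. The method handles embedding for the following tasks (task_type):

Task type             Description
RETRIEVAL_QUERY       Specifies that the given text is a query in a search/retrieval setting.
RETRIEVAL_DOCUMENT    Specifies that the given text is a document in a search/retrieval setting. Using this task type requires a title.
SEMANTIC_SIMILARITY   Specifies that the given text will be used for Semantic Textual Similarity (STS).
CLASSIFICATION        Specifies that the embeddings will be used for classification.
CLUSTERING            Specifies that the embeddings will be used for clustering.

The following code generates an embedding for a single string for document retrieval: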

result = genai.embed_content(
    model="models/embedding-001",
    content="What is the meaning of life?",
    task_type="retrieval_document",
    title="Embedding of single string")

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED]')
[-0.003216741, -0.013358698, -0.017649598, -0.0091 ... TRIMMED]

To handle batches of strings, pass a list of strings in content:

result = genai.embed_content(
    model="models/embedding-001",
    content=[
      'What is the meaning of life?',
      'How much wood would a woodchuck chuck?',
      'How does the brain work?'],
    task_type="retrieval_document",
    title="Embedding of list of strings")

# A list of inputs > A list of vectors output
for v in result['embedding']:
  print(str(v)[:50], '... TRIMMED ...')
[0.0040260437, 0.004124458, -0.014209415, -0.00183 ... TRIMMED ...
[-0.004049845, -0.0075574904, -0.0073463684, -0.03 ... TRIMMED ...
[0.025310587, -0.0080734305, -0.029902633, 0.01160 ... TRIMMED ...

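While the genai.embed_content function accepts simple strings or lists of strings, it is actually built around the genai.protos.Content type (like GenerativeModel.generate_content). genai.protos.Content objects are the primary units of conversation in the API.

While the genai.protos.Content object is multimodal, the embed_content method only supports text embeddings. This design leaves room for the API to expand to multimodal embeddings.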

response.candidates[0].content
parts {
  text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
}
role: "model"
result = genai.embed_content(
    model = 'models/embedding-001',
    content = response.candidates[0].content)

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED ...')
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED ...

Similarly, the chat history contains a list of genai.protos.Content objects, which you can pass directly to the embed_content function:

chat.history
[parts {
   text: "In one sentence, explain how a computer works to a young child."
 }
 role: "user",
 parts {
   text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!"
 }
 role: "model",
 parts {
   text: "Okay, how about a more detailed explanation to a high schooler?"
 }
 role: "user",
 parts {
   text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
 }
 role: "model"]
result = genai.embed_content(
    model = 'models/embedding-001',
    content = chat.history)

# A list of inputs > A list of vectors output
for i,v in enumerate(result['embedding']):
  print(str(v)[:50], '... TRIMMED...')
[-0.014632266, -0.042202696, -0.015757175, 0.01548 ... TRIMMED...
[-0.010979066, -0.024494737, 0.0092659835, 0.00803 ... TRIMMED...
[-0.010055617, -0.07208932, -0.00011750793, -0.023 ... TRIMMED...
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED...

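Advanced use cases

The following sections discuss advanced use cases and lower-level details of the Python SDK for the Gemini API.

Safety settings

The safety_settings argument lets you configure what the model blocks and allows in both prompts and responses. By default, safety settings block content with a medium and/or high probability of being unsafe. Learn more about safety settings.

Enter a questionable prompt and run the model with the default safety settings, and it will not return any candidates: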

response = model.generate_content('[Questionable prompt here]')
response.candidates
[content {
  parts {
    text: "I\'m sorry, but this prompt involves a sensitive topic and I\'m not allowed to generate responses that are potentially harmful or inappropriate."
  }
  role: "model"
}
finish_reason: STOP
index: 0
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}
]

The prompt_feedback shows which safety filter blocked the prompt:

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

Now provide the same prompt to the model with the newly configured safety settings, and you may get a response.

response = model.generate_content('[Questionable prompt here]',
                                  safety_settings={'HARASSMENT':'block_none'})
response.text

Also note that each candidate has its own safety_ratings, in case the prompt passes but individual responses fail the safety checks.
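
When inspecting safety ratings, it can help to list only the categories at or above a chosen probability level. The sketch below illustrates this with plain dictionaries standing in for the rating objects shown in the outputs above; the dictionary shape, the `flagged_categories` helper, and the assumed ordering NEGLIGIBLE < LOW < MEDIUM < HIGH are illustrative assumptions rather than SDK behavior.

```python
# Assumed ordering of probability levels, from least to most likely unsafe.
PROBABILITY_ORDER = ["NEGLIGIBLE", "LOW", "MEDIUM", "HIGH"]


def flagged_categories(safety_ratings, threshold="MEDIUM"):
    """Return the categories whose probability is at or above `threshold`."""
    limit = PROBABILITY_ORDER.index(threshold)
    return [
        rating["category"]
        for rating in safety_ratings
        if PROBABILITY_ORDER.index(rating["probability"]) >= limit
    ]


# Stand-in ratings mirroring the fields printed in the examples above.
ratings = [
    {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "probability": "NEGLIGIBLE"},
    {"category": "HARM_CATEGORY_HATE_SPEECH", "probability": "LOW"},
    {"category": "HARM_CATEGORY_HARASSMENT", "probability": "MEDIUM"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "probability": "NEGLIGIBLE"},
]

print(flagged_categories(ratings))  # ['HARM_CATEGORY_HARASSMENT']
```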

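Encode messages

The previous sections relied on the SDK to make it easy to send prompts to the API. This section offers a fully typed equivalent of the previous example, so you can better understand the lower-level details of how the SDK encodes messages.

Underneath, the Python SDK is built on the google.ai.generativelanguage client library.

The SDK attempts to convert your message to a genai.protos.Content object, which contains a list of genai.protos.Part objects that each contain one of the following:

  1. text (a string)
  2. inline_data (a genai.protos.Blob), where a blob contains binary data and a mime_type

You can also pass any of these classes as an equivalent dictionary.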
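
As a rough illustration of the dictionary form, the sketch below builds a plain dict with the same field names described above (parts, text, inline_data, mime_type, data). The `build_content` helper is hypothetical, the bytes are a placeholder rather than a real image, and the exact dictionary shapes the SDK accepts should be confirmed against its reference documentation.

```python
import mimetypes


def build_content(text, image_bytes, filename):
    """Build a dict mirroring a Content with one text part and one image part."""
    # Guess the MIME type from the file name, e.g. 'image.jpg' -> 'image/jpeg'.
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {filename!r}.")
    return {
        "parts": [
            {"text": text},
            {"inline_data": {"mime_type": mime_type, "data": image_bytes}},
        ]
    }


# Placeholder bytes stand in for the contents of an image file.
content = build_content("Describe this picture.", b"\xff\xd8\xff", "image.jpg")
print(content["parts"][1]["inline_data"]["mime_type"])  # image/jpeg
```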

So, the fully typed equivalent of the previous example is:

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(
    genai.protos.Content(
        parts = [
            genai.protos.Part(text="Write a short, engaging blog post based on this picture."),
            genai.protos.Part(
                inline_data=genai.protos.Blob(
                    mime_type='image/jpeg',
                    data=pathlib.Path('image.jpg').read_bytes()
                )
            ),
        ],
    ),
    stream=True)
response.resolve()

to_markdown(response.text[:100] + "... [TRIMMED] ...")

Meal prepping is a great way to save time and money, and it can also help you to eat healthier. By ... [TRIMMED] ...

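Multi-turn conversations

While the genai.ChatSession class shown earlier can handle many use cases, it does make some assumptions. If your use case doesn't fit this chat implementation, it helps to remember that genai.ChatSession is just a wrapper around GenerativeModel.generate_content. In addition to single requests, generate_content can handle multi-turn conversations.

The individual messages are genai.protos.Content objects or compatible dictionaries, as seen in previous sections. As a dictionary, a message requires role and parts keys. The role in a conversation can be either user, which provides the prompts, or model, which provides the responses.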
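
A small helper can make it harder to build a malformed message dictionary. This is a minimal sketch under the description above: `make_message` is a hypothetical name, not an SDK function, and it only checks that the role is one of the two values named in the text.

```python
VALID_ROLES = ("user", "model")


def make_message(role, text):
    """Return a message dict with the required 'role' and 'parts' keys."""
    if role not in VALID_ROLES:
        raise ValueError(f"role must be one of {VALID_ROLES}, got {role!r}")
    return {"role": role, "parts": [text]}


# Build a short conversation as a list of message dictionaries.
conversation = [
    make_message("user", "Briefly explain how a computer works to a young child."),
    make_message("model", "A computer is like a very smart machine."),
]
print([m["role"] for m in conversation])  # ['user', 'model']
```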

Pass a list of genai.protos.Content objects and it will be treated as a multi-turn chat:

model = genai.GenerativeModel('gemini-1.5-flash')

messages = [
    {'role':'user',
     'parts': ["Briefly explain how a computer works to a young child."]}
]
response = model.generate_content(messages)

to_markdown(response.text)

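Imagine a computer as a really smart friend who can help you with many things. Just like you have a brain to think and learn, a computer has a brain too, called a processor. It's like the boss of the computer, telling it what to do.

Inside the computer, there's a special place called memory, which is like a big storage box. It remembers all the things you tell it to do, like opening games or playing videos.

When you press buttons on the keyboard or click things on the screen, you're sending messages to the computer. These messages travel through special wires, called cables, to the processor.

The processor reads the messages and tells the computer what to do. It can open programs, show you pictures, or even play music for you.

All the things you see on the screen are created by the graphics card, which is like a magic artist inside the computer. It takes the processor's instructions and turns them into colorful pictures and videos.

To save your favorite games, videos, or pictures, the computer uses a special storage space called a hard drive. It's like a giant library where the computer can keep all your precious things.

And when you want to connect to the internet to play games with friends or watch funny videos, the computer uses something called a network card to send and receive messages through internet cables or Wi-Fi signals.

So, just like your brain helps you learn and play, the computer's processor, memory, graphics card, hard drive, and network card all work together to make your computer a super-smart friend that can help you do amazing things!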

To continue the conversation, add the response and another message.

messages.append({'role':'model',
                 'parts':[response.text]})

messages.append({'role':'user',
                 'parts':["Okay, how about a more detailed explanation to a high school student?"]})

response = model.generate_content(messages)

to_markdown(response.text)

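At its core, a computer is a machine that can be programmed to carry out a set of instructions. It consists of several essential components that work together to process, store, and display information:

1. Processor (CPU): - The brain of the computer. - Executes instructions and performs calculations. - Speed measured in gigahertz (GHz). - More GHz generally means faster processing.

2. Memory (RAM): - Temporary storage for data being processed. - Holds instructions and data while the program is running. - Measured in gigabytes (GB). - More RAM allows more programs to run simultaneously.

3. Storage (HDD/SSD): - Permanent storage for data. - Stores the operating system, programs, and user files. - Measured in gigabytes (GB) or terabytes (TB). - Hard disk drives (HDDs) are traditional, slower, and cheaper. - Solid-state drives (SSDs) are newer, faster, and more expensive.

4. Graphics Card (GPU): - Processes and displays images. - Essential for gaming, video editing, and other graphics-intensive tasks. - Measured in video RAM (VRAM) and clock speed.

5. Motherboard: - Connects all the components. - Provides power and communication pathways.

6. Input/Output (I/O) Devices: - Allow the user to interact with the computer. - Examples: keyboard, mouse, monitor, printer.

7. Operating System (OS): - Software that manages the computer's resources. - Provides a user interface and basic functionality. - Examples: Windows, macOS, Linux.

When you run a program on your computer, the following happens:

  1. The program instructions are loaded from storage into memory.
  2. The processor reads the instructions from memory and executes them one by one.
  3. If an instruction involves calculations, the processor performs them using its arithmetic logic unit (ALU).
  4. If an instruction involves data, the processor reads from or writes to memory.
  5. The results of the calculations or data manipulation are stored in memory.
  6. If the program needs to display something on the screen, it sends the necessary data to the graphics card.
  7. The graphics card processes the data and sends it to the monitor, which displays it.

This process continues until the program has completed its task or the user terminates it.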

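Generation configuration

The generation_config argument lets you modify the generation parameters. Every prompt you send to the model includes parameter values that control how the model generates responses.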

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(
    'Tell me a story about a magic backpack.',
    generation_config=genai.types.GenerationConfig(
        # Only one candidate for now.
        candidate_count=1,
        stop_sequences=['x'],
        max_output_tokens=20,
        temperature=1.0)
)
text = response.text

if response.candidates[0].finish_reason.name == "MAX_TOKENS":
    text += '...'

to_markdown(text)

Once upon a time, in a small town nestled amidst lush green hills, lived a young girl...

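What's next

  • Prompt design is the process of creating prompts that elicit the desired response from language models. Writing well-structured prompts is an essential part of ensuring accurate, high-quality responses from a language model. Learn about best practices for prompt writing.
  • Gemini offers several model variations to meet the needs of different use cases, such as input types and complexity, implementations for chat or other dialog language tasks, and size constraints. Learn about the available Gemini models.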