Tutorial: Getting started with the Gemini API


This quickstart shows how to use the Python SDK for the Gemini API, which gives you access to Google's Gemini large language models. In this quickstart, you will learn how to:

  1. Set up your development environment and API access to use Gemini.
  2. Generate text responses from text inputs.
  3. Generate text responses from multimodal inputs (text and images).
  4. Use Gemini for multi-turn conversations (chat).
  5. Use embeddings for large language models.

Prerequisites

You can run this quickstart in Google Colab, which runs this notebook directly in the browser and requires no additional environment configuration.

Alternatively, to follow this quickstart locally, make sure your development environment meets the following requirements:

  • Python 3.9 or later
  • An installation of jupyter to run the notebook
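For a local setup, the Python requirement above can be checked before installing anything. The following is a minimal sketch using only the standard library; the helper name meets_minimum is hypothetical and not part of any SDK.

```python
import sys

# Minimum version taken from the requirement list above.
MINIMUM_PYTHON = (3, 9)


def meets_minimum(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the given (or current) interpreter version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); patch releases do not matter here.
    return tuple(version_info[:2]) >= tuple(minimum)


print("Python OK" if meets_minimum() else "Python 3.9 or later is required")
```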

Setup

Install the Python SDK

The Python SDK for the Gemini API is contained in the google-generativeai package. Install the dependency using pip:

pip install -q -U google-generativeai

Import packages

Import the necessary packages.

import pathlib
import textwrap

import google.generativeai as genai

from IPython.display import display
from IPython.display import Markdown

def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))
# Used to securely store your API key
from google.colab import userdata

Set up your API key

Before you can use the Gemini API, you must obtain an API key. If you don't already have one, create a key with one click in Google AI Studio.

Get an API key

In Colab, add the key to the secrets manager under the key icon in the left panel. Name it GOOGLE_API_KEY.

Once you have the API key, pass it to the SDK. You can do this in two ways:

  • Put the key in the GOOGLE_API_KEY environment variable (the SDK picks it up automatically from there).
  • Pass the key to genai.configure(api_key=...)
# Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)
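Outside Colab, the environment-variable route described above can be wrapped in a small check that fails early when the key is missing. This is a sketch under stated assumptions: resolve_api_key is a hypothetical helper, not an SDK function, and it only reads the variable; it does not call the API.

```python
import os


def resolve_api_key(env_var="GOOGLE_API_KEY", environ=None):
    """Read the API key from an environment mapping and fail early if it is missing.

    `environ` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if environ is None else environ
    key = (env.get(env_var) or "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before configuring the SDK."
        )
    return key
```

The returned value can then be passed on as genai.configure(api_key=resolve_api_key()).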

List models

You are now ready to call the Gemini API. Use list_models to see the available Gemini models:

  • gemini-1.5-flash: our fastest multimodal model
  • gemini-1.5-pro: our most capable and intelligent multimodal model
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)
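The same filter can be factored into a reusable function. The sketch below runs without network access because it uses stand-in records; the sample names and method lists are illustrative assumptions, and only the two attributes used in the loop above (name and supported_generation_methods) are relied on.

```python
from types import SimpleNamespace


def names_supporting(models, method="generateContent"):
    """Return the names of all models that list `method` as a supported generation method."""
    return [m.name for m in models if method in m.supported_generation_methods]


# Stand-in records that mimic the two attributes read above (illustrative data only).
sample_models = [
    SimpleNamespace(
        name="models/gemini-1.5-flash",
        supported_generation_methods=["generateContent", "countTokens"],
    ),
    SimpleNamespace(
        name="models/embedding-001",
        supported_generation_methods=["embedContent"],
    ),
]

print(names_supporting(sample_models))
```

With the real SDK, the same function could presumably be called as names_supporting(genai.list_models()).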

Generate text from text inputs

For text-only prompts, use a Gemini 1.5 model or the Gemini 1.0 Pro model:

model = genai.GenerativeModel('gemini-1.5-flash')

The generate_content method can handle a wide variety of use cases, including multi-turn chat and multimodal input, depending on what the underlying model supports. The available models accept only text and images as input, and produce text as output.

In the simplest case, you can pass a prompt string to the GenerativeModel.generate_content method:

%%time
response = model.generate_content("What is the meaning of life?")
CPU times: user 110 ms, sys: 12.3 ms, total: 123 ms
Wall time: 8.25 s

In simple cases, the response.text accessor is all you need. To display the text formatted as Markdown, use the to_markdown function:

to_markdown(response.text)
The query of life's purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.

1.  **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one's physical and mental health, and pursuing personal goals and interests.

2.  **Meaningful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.

3.  **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one's boundaries, confronting personal obstacles, and evolving as a person.

4.  **Ethical and Moral Behavior:** Some believe that the goal of life is to act ethically and morally. This might entail adhering to one's moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.

5.  **Spiritual Fulfillment:** For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.

6.  **Experiencing Life to the Fullest:** Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.

7.  **Legacy and Impact:** Others believe that the purpose of life is to leave a lasting legacy and impact on the world. This might entail accomplishing something noteworthy, being remembered for one's contributions, or inspiring and motivating others.

8.  **Finding Balance and Harmony:** For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one's values and beliefs.

Ultimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them.

If the API failed to return a result, use GenerateContentResponse.prompt_feedback to see whether it was blocked because of safety concerns about the prompt.

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

Gemini can generate multiple possible responses for a single prompt. These possible responses are called candidates, and you can review them to select the most suitable one as the response.

View the response candidates with GenerateContentResponse.candidates:

response.candidates
[
  content {
    parts {
      text: "The query of life\'s purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.\n\n1. **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one\'s physical and mental health, and pursuing personal goals and interests.\n\n2. **Meaningful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.\n\n3. **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one\'s boundaries, confronting personal obstacles, and evolving as a person.\n\n4. **Ethical and Moral Behavior:** Some believe that the goal of life is to act ethically and morally. This might entail adhering to one\'s moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.\n\n5. **Spiritual Fulfillment:** For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.\n\n6. **Experiencing Life to the Fullest:** Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.\n\n7. **Legacy and Impact:** Others believe that the purpose of life is to leave a lasting legacy and impact on the world. 
This might entail accomplishing something noteworthy, being remembered for one\'s contributions, or inspiring and motivating others.\n\n8. **Finding Balance and Harmony:** For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one\'s values and beliefs.\n\nUltimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them."
    }
    role: "model"
  }
  finish_reason: STOP
  index: 0
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
  }
]

By default, the model returns a response only after the entire generation process has completed. You can also stream the response as it is being generated; the model then returns chunks of the response as soon as they are available.

To stream responses, use GenerativeModel.generate_content(..., stream=True).

%%time
response = model.generate_content("What is the meaning of life?", stream=True)
CPU times: user 102 ms, sys: 25.1 ms, total: 128 ms
Wall time: 7.94 s
for chunk in response:
  print(chunk.text)
  print("_"*80)
The query of life's purpose has perplexed people across centuries, cultures, and
________________________________________________________________________________
 continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences
________________________________________________________________________________
.

1.  **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one's physical and mental health, and pursuing personal goals and aspirations.

2.  **Meaning
________________________________________________________________________________
ful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.

3.  **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, exploring one's interests and abilities, overcoming obstacles, and becoming the best version of oneself.

4.  **Connection and Relationships:** For many individuals, the purpose of life is found in their relationships with others. This might entail building
________________________________________________________________________________
 strong bonds with family and friends, fostering a sense of community, and contributing to the well-being of those around them.

5.  **Spiritual Fulfillment:** For those with religious or spiritual beliefs, the purpose of life may be centered on seeking spiritual fulfillment or enlightenment. This might entail following religious teachings, engaging in spiritual practices, or seeking a deeper understanding of the divine.

6.  **Experiencing the Journey:** Some believe that the purpose of life is simply to experience the journey itself, with all its joys and sorrows. This perspective emphasizes embracing the present moment, appreciating life's experiences, and finding meaning in the act of living itself.

7.  **Legacy and Impact:** For others, the goal of life is to leave a lasting legacy or impact on the world. This might entail making a significant contribution to a particular field, leaving a positive mark on future generations, or creating something that will be remembered and cherished long after one's lifetime.

Ultimately, the meaning of life is a personal and subjective question, and there is no single, universally accepted answer. It is about discovering what brings you fulfillment, purpose, and meaning in your own life, and living in accordance with those values.
________________________________________________________________________________

When streaming, some response attributes are not available until you have iterated through all the response chunks. This is demonstrated below:

response = model.generate_content("What is the meaning of life?", stream=True)

The prompt_feedback attribute works:

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

But attributes such as text do not:

try:
  response.text
except Exception as e:
  print(f'{type(e).__name__}: {e}')
IncompleteIterationError: Please let the response complete iteration before accessing the final accumulated
attributes (or call `response.resolve()`)
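One way to avoid this error is to consume every chunk before reading the accumulated text. The sketch below shows the pattern with stand-in chunk objects so that it runs without the SDK; collect_stream is a hypothetical helper, and the only attribute it relies on is the per-chunk text shown in the streaming loop earlier.

```python
from types import SimpleNamespace


def collect_stream(chunks, on_chunk=None):
    """Consume every chunk, optionally reporting each piece, and return the full text."""
    pieces = []
    for chunk in chunks:
        if on_chunk is not None:
            on_chunk(chunk.text)  # e.g. print partial output as it arrives
        pieces.append(chunk.text)
    return "".join(pieces)


# Stand-in stream that mimics objects exposing a `.text` attribute (illustrative data only).
fake_stream = (
    SimpleNamespace(text=t)
    for t in ["The query of life's purpose ", "has perplexed people."]
)

full_text = collect_stream(fake_stream, on_chunk=lambda t: print(t, end=""))
```

With a real streamed response, the same loop would run over the response object itself; once iteration has finished, the accumulated attributes become available, as the error message above indicates.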

Generate text from image and text inputs

Gemini provides multimodal models (the Gemini 1.5 models) that accept both text and images as input. Be sure to review the image requirements for prompts.

When the prompt includes both text and images, use a Gemini 1.5 model with the GenerativeModel.generate_content method to generate text output:

Let's include an image:

curl -o image.jpg "https://t0.gstatic.com/licensed-image?q=tbn:ANd9GcQ_Kevbk21QBRy-PgB4kQpS79brbmmEG7m3VOTShAn4PecDU5H5UxrJxE3Dw1JiaG17V88QIol19-3TM2wCHw"
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  405k  100  405k    0     0  6982k      0 --:--:-- --:--:-- --:--:-- 7106k
import PIL.Image

img = PIL.Image.open('image.jpg')
img


Use a Gemini 1.5 model and pass the image to the model with generate_content.

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(img)

to_markdown(response.text)
Chicken Teriyaki Meal Prep Bowls with brown rice, roasted broccoli and bell peppers.

To provide both text and images in a prompt, pass a list containing the strings and images:

response = model.generate_content(["Write a short, engaging blog post based on this picture. It should include a description of the meal in the photo and talk about my journey meal prepping.", img], stream=True)
response.resolve()
to_markdown(response.text)
Meal prepping is a great way to save time and money, and it can also help you to eat healthier. This meal is a great example of a healthy and delicious meal that can be easily prepped ahead of time.

This meal features brown rice, roasted vegetables, and chicken teriyaki. The brown rice is a whole grain that is high in fiber and nutrients. The roasted vegetables are a great way to get your daily dose of vitamins and minerals. And the chicken teriyaki is a lean protein source that is also packed with flavor.

This meal is easy to prepare ahead of time. Simply cook the brown rice, roast the vegetables, and cook the chicken teriyaki. Then, divide the meal into individual containers and store them in the refrigerator. When you're ready to eat, simply grab a container and heat it up.

This meal is a great option for busy people who are looking for a healthy and delicious way to eat. It's also a great meal for those who are trying to lose weight or maintain a healthy weight.

If you're looking for a healthy and delicious meal that can be easily prepped ahead of time, this meal is a great option. Give it a try today!

Chat conversations

Gemini lets you hold freeform conversations across multiple turns. The ChatSession class simplifies the process by managing the state of the conversation, so unlike with generate_content, you do not have to store the conversation history as a list.

Initialize the chat:

model = genai.GenerativeModel('gemini-1.5-flash')
chat = model.start_chat(history=[])
chat
<google.generativeai.generative_models.ChatSession at 0x7b7b68250100>

The ChatSession.send_message method returns the same GenerateContentResponse type as GenerativeModel.generate_content. It also appends your message and the response to the chat history:

response = chat.send_message("In one sentence, explain how a computer works to a young child.")
to_markdown(response.text)
A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!
chat.history
[
  parts {
    text: "In one sentence, explain how a computer works to a young child."
  }
  role: "user",
  parts {
    text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!"
  }
  role: "model"
]

You can keep sending messages to continue the conversation. Use the stream=True argument to stream the chat:

response = chat.send_message("Okay, how about a more detailed explanation to a high schooler?", stream=True)

for chunk in response:
  print(chunk.text)
  print("_"*80)
A computer works by following instructions, called a program, which tells it what to
________________________________________________________________________________
 do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor
________________________________________________________________________________
, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.

To give you a simple analogy, imagine a computer as a
________________________________________________________________________________
 chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).

In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.
________________________________________________________________________________

genai.protos.Content objects contain a list of genai.protos.Part objects, each of which contains either text (a string) or inline_data (a genai.protos.Blob), where a blob contains binary data and a mime_type. The chat history is available as a list of genai.protos.Content objects in ChatSession.history:

for message in chat.history:
  display(to_markdown(f'**{message.role}**: {message.parts[0].text}'))
**user**: In one sentence, explain how a computer works to a young child.

**model**: A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!

**user**: Okay, how about a more detailed explanation to a high schooler?

**model**: A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.

To give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).

In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.

Count tokens

Large language models have a context window, and its length is often measured in tokens. With the Gemini API, you can determine the number of tokens in any genai.protos.Content object. In the simplest case, you can pass a prompt string to the GenerativeModel.count_tokens method as follows:

model.count_tokens("What is the meaning of life?")
total_tokens: 7

Similarly, you can count the tokens in your ChatSession history:

model.count_tokens(chat.history)
total_tokens: 501
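A token count is mostly useful when compared against a model's context window. The sketch below is a plain arithmetic check; fits_in_context is a hypothetical helper, and the window size and output reserve used in the example are illustrative assumptions rather than limits of any particular model.

```python
def fits_in_context(prompt_tokens, context_window, reserved_for_output=0):
    """Return True if the prompt plus a reserve for the reply fits in the context window."""
    if prompt_tokens < 0 or reserved_for_output < 0 or context_window <= 0:
        raise ValueError("token counts must be non-negative and the window positive")
    return prompt_tokens + reserved_for_output <= context_window


# 501 is the chat-history count shown above; 8192 and 1024 are assumed values.
print(fits_in_context(501, context_window=8192, reserved_for_output=1024))
```

In practice, the first argument would come from the total_tokens field of the count_tokens result.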

Use embeddings

Embedding is a technique used to represent information as a list of floating-point numbers in an array. With Gemini, you can represent text (words, sentences, and blocks of text) in vectorized form, which makes embeddings easier to compare. For example, two texts that share a similar subject or sentiment should have similar embeddings, which can be identified through mathematical comparison techniques such as cosine similarity. For more on how and why to use embeddings, refer to the Embeddings guide.
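As a reminder of what cosine similarity computes, here is a minimal standard-library sketch. The vectors in the example are toy values chosen for illustration, not real embeddings, and the function name is not part of the SDK.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors: the first pair points the same way, the second pair is orthogonal.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Values close to 1 indicate vectors pointing in nearly the same direction, which is the sense in which two embeddings are considered similar.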

Use the embed_content method to generate embeddings. The method handles embedding for the following tasks (task_type):

Task type              Description
RETRIEVAL_QUERY        Specifies that the given text is a query in a search/retrieval setting.
RETRIEVAL_DOCUMENT     Specifies that the given text is a document in a search/retrieval setting. Using this task type requires a title.
SEMANTIC_SIMILARITY    Specifies that the given text will be used for Semantic Textual Similarity (STS).
CLASSIFICATION         Specifies that the embeddings will be used for classification.
CLUSTERING             Specifies that the embeddings will be used for clustering.

The following code generates an embedding for a single string for document retrieval:

result = genai.embed_content(
    model="models/embedding-001",
    content="What is the meaning of life?",
    task_type="retrieval_document",
    title="Embedding of single string")

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED]')
[-0.003216741, -0.013358698, -0.017649598, -0.0091 ... TRIMMED]

To handle batches of strings, pass a list of strings in content:

result = genai.embed_content(
    model="models/embedding-001",
    content=[
      'What is the meaning of life?',
      'How much wood would a woodchuck chuck?',
      'How does the brain work?'],
    task_type="retrieval_document",
    title="Embedding of list of strings")

# A list of inputs > A list of vectors output
for v in result['embedding']:
  print(str(v)[:50], '... TRIMMED ...')
[0.0040260437, 0.004124458, -0.014209415, -0.00183 ... TRIMMED ...
[-0.004049845, -0.0075574904, -0.0073463684, -0.03 ... TRIMMED ...
[0.025310587, -0.0080734305, -0.029902633, 0.01160 ... TRIMMED ...

While the genai.embed_content function accepts strings or lists of strings, it is actually built around the genai.protos.Content type (like GenerativeModel.generate_content). genai.protos.Content objects are the primary units of conversation in the API.

While the genai.protos.Content object is multimodal, the embed_content method supports only text embeddings. This design leaves room for the API to expand to multimodal embeddings.

response.candidates[0].content
parts {
  text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
}
role: "model"
result = genai.embed_content(
    model = 'models/embedding-001',
    content = response.candidates[0].content)

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED ...')
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED ...

Similarly, the chat history contains a list of genai.protos.Content objects, which you can pass directly to the embed_content function:

chat.history
[
  parts {
    text: "In one sentence, explain how a computer works to a young child."
  }
  role: "user",
  parts {
    text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!"
  }
  role: "model",
  parts {
    text: "Okay, how about a more detailed explanation to a high schooler?"
  }
  role: "user",
  parts {
    text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
  }
  role: "model"
]
result = genai.embed_content(
    model = 'models/embedding-001',
    content = chat.history)

# A list of inputs > A list of vectors output
for v in result['embedding']:
  print(str(v)[:50], '... TRIMMED...')
[-0.014632266, -0.042202696, -0.015757175, 0.01548 ... TRIMMED...
[-0.010979066, -0.024494737, 0.0092659835, 0.00803 ... TRIMMED...
[-0.010055617, -0.07208932, -0.00011750793, -0.023 ... TRIMMED...
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED...

Advanced use cases

The following sections discuss advanced use cases and lower-level details of the Python SDK for the Gemini API.

Safety settings

The safety_settings argument lets you configure what the model blocks and allows in both prompts and responses. By default, safety settings block content with a medium and/or high probability of being unsafe across all dimensions. Learn more about safety settings.

Enter a questionable prompt and run the model with the default safety settings; it may not return any candidates:

response = model.generate_content('[Questionable prompt here]')
response.candidates
[
  content {
    parts {
      text: "I\'m sorry, but this prompt involves a sensitive topic and I\'m not allowed to generate responses that are potentially harmful or inappropriate."
    }
    role: "model"
  }
  finish_reason: STOP
  index: 0
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
  }
]

The prompt_feedback attribute tells you which safety filter blocked the prompt:

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

Now send the same prompt to the model with newly configured safety settings, and you may get a response.

response = model.generate_content('[Questionable prompt here]',
                                  safety_settings={'HARASSMENT':'block_none'})
response.text

Also note that each candidate has its own safety_ratings, in case the prompt passes but individual responses fail the safety checks.
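As a minimal sketch of checking safety ratings per candidate, the helper below scans a list of ratings and returns the categories at or above a chosen probability. It works on plain (category, probability) string pairs rather than SDK objects, so it runs without an API key. The names flagged_categories and PROBABILITY_ORDER are hypothetical helpers for illustration, not part of the SDK, and the NEGLIGIBLE < LOW < MEDIUM < HIGH ordering is an assumption about how the probability levels rank.

```python
# Hypothetical helper, not part of the SDK: it works on plain strings.
# Assumed ranking of probability levels, from least to most likely harmful.
PROBABILITY_ORDER = ["NEGLIGIBLE", "LOW", "MEDIUM", "HIGH"]


def flagged_categories(ratings, threshold="MEDIUM"):
    """Return the categories whose probability is at or above the threshold."""
    limit = PROBABILITY_ORDER.index(threshold)
    return [
        category
        for category, probability in ratings
        if PROBABILITY_ORDER.index(probability) >= limit
    ]


# Ratings copied from the candidate output shown earlier in this section.
ratings = [
    ("HARM_CATEGORY_SEXUALLY_EXPLICIT", "NEGLIGIBLE"),
    ("HARM_CATEGORY_HATE_SPEECH", "NEGLIGIBLE"),
    ("HARM_CATEGORY_HARASSMENT", "NEGLIGIBLE"),
    ("HARM_CATEGORY_DANGEROUS_CONTENT", "NEGLIGIBLE"),
]

# Every rating is NEGLIGIBLE, so nothing reaches the MEDIUM threshold.
print(flagged_categories(ratings))
```

With real responses, the pairs would presumably be built from each entry in a candidate's safety_ratings, for example from the category and probability enum names; treat that mapping as an assumption to verify against the SDK version you use.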

Encode messages

The previous sections relied on the SDK to make it easy to send prompts to the API. This section offers a fully typed equivalent of the previous example, so you can better understand the lower-level details of how the SDK encodes messages.

The SDK attempts to convert your message to a genai.protos.Content object, which contains a list of genai.protos.Part objects that each contain either:

  1. text (string)
  2. inline_data (genai.protos.Blob), where a blob contains binary data and a mime_type.
  3. or other types of data.

You can also pass any of these classes as an equivalent dictionary.

So, the fully typed equivalent to the previous example is:

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(
    genai.protos.Content(
        parts = [
            genai.protos.Part(text="Write a short, engaging blog post based on this picture."),
            genai.protos.Part(
                inline_data=genai.protos.Blob(
                    mime_type='image/jpeg',
                    data=pathlib.Path('image.jpg').read_bytes()
                )
            ),
        ],
    ),
    stream=True)
response.resolve()

to_markdown(response.text[:100] + "... [TRIMMED] ...")
Meal prepping is a great way to save time and money, and it can also help you to eat healthier. By ... [TRIMMED] ...
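Since these classes can also be passed as equivalent dictionaries, the typed request above can be sketched in dictionary form. The helper below only builds the dictionary and does not call the API. build_content_dict is a hypothetical name introduced for illustration, and the placeholder bytes stand in for the contents of a real image file.

```python
def build_content_dict(prompt, image_bytes, mime_type="image/jpeg"):
    """Build a dictionary shaped like genai.protos.Content (hypothetical helper)."""
    return {
        "parts": [
            # Mirrors genai.protos.Part(text=...)
            {"text": prompt},
            # Mirrors genai.protos.Part(inline_data=genai.protos.Blob(...))
            {"inline_data": {"mime_type": mime_type, "data": image_bytes}},
        ]
    }


# Placeholder bytes stand in for pathlib.Path('image.jpg').read_bytes().
content = build_content_dict(
    "Write a short, engaging blog post based on this picture.",
    b"placeholder-image-bytes",
)

# Show the field names of the image part.
print(list(content["parts"][1]["inline_data"]))
```

Assuming the dictionary form is accepted as described above, content could then be passed to model.generate_content in place of the genai.protos.Content object.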

Multi-turn conversations

While the genai.ChatSession class shown earlier can handle many use cases, it does make some assumptions. If your use case doesn't fit into this chat implementation, remember that genai.ChatSession is just a wrapper around GenerativeModel.generate_content. In addition to single requests, it can handle multi-turn conversations.

The individual messages are genai.protos.Content objects or compatible dictionaries, as seen in previous sections. As a dictionary, the message requires role and parts keys. The role in a conversation can either be user, which provides the prompts, or model, which provides the responses.

Pass a list of genai.protos.Content objects and it will be treated as a multi-turn chat:

model = genai.GenerativeModel('gemini-1.5-flash')

messages = [
    {'role':'user',
     'parts': ["Briefly explain how a computer works to a young child."]}
]
response = model.generate_content(messages)

to_markdown(response.text)
Imagine a computer as a really smart friend who can help you with many things. Just like you have a brain to think and learn, a computer has a brain too, called a processor. It's like the boss of the computer, telling it what to do.

Inside the computer, there's a special place called memory, which is like a big storage box. It remembers all the things you tell it to do, like opening games or playing videos.

When you press buttons on the keyboard or click things on the screen with the mouse, you're sending messages to the computer. These messages travel through special wires, called cables, to the processor.

The processor reads the messages and tells the computer what to do. It can open programs, show you pictures, or even play music for you.

All the things you see on the screen are created by the graphics card, which is like a magic artist inside the computer. It takes the processor's instructions and turns them into colorful pictures and videos.

To save your favorite games, videos, or pictures, the computer uses a special storage space called a hard drive. It's like a giant library where the computer can keep all your precious things safe.

And when you want to connect to the internet to play games with friends or watch funny videos, the computer uses something called a network card to send and receive messages through the internet cables or Wi-Fi signals.

So, just like your brain helps you learn and play, the computer's processor, memory, graphics card, hard drive, and network card all work together to make your computer a super-smart friend that can help you do amazing things!

To continue the conversation, add the response and another message.

messages.append({'role':'model',
                 'parts':[response.text]})

messages.append({'role':'user',
                 'parts':["Okay, how about a more detailed explanation to a high school student?"]})

response = model.generate_content(messages)

to_markdown(response.text)
At its core, a computer is a machine that can be programmed to carry out a set of instructions. It consists of several essential components that work together to process, store, and display information:

**1. Processor (CPU):**
   -   The brain of the computer.
   -   Executes instructions and performs calculations.
   -   Speed measured in gigahertz (GHz).
   -   More GHz generally means faster processing.

**2. Memory (RAM):**
   -   Temporary storage for data being processed.
   -   Holds instructions and data while the program is running.
   -   Measured in gigabytes (GB).
   -   More GB of RAM allows for more programs to run simultaneously.

**3. Storage (HDD/SSD):**
   -   Permanent storage for data.
   -   Stores operating system, programs, and user files.
   -   Measured in gigabytes (GB) or terabytes (TB).
   -   Hard disk drives (HDDs) are traditional, slower, and cheaper.
   -   Solid-state drives (SSDs) are newer, faster, and more expensive.

**4. Graphics Card (GPU):**
   -   Processes and displays images.
   -   Essential for gaming, video editing, and other graphics-intensive tasks.
   -   Measured in video RAM (VRAM) and clock speed.

**5. Motherboard:**
   -   Connects all the components.
   -   Provides power and communication pathways.

**6. Input/Output (I/O) Devices:**
   -   Allow the user to interact with the computer.
   -   Examples: keyboard, mouse, monitor, printer.

**7. Operating System (OS):**
   -   Software that manages the computer's resources.
   -   Provides a user interface and basic functionality.
   -   Examples: Windows, macOS, Linux.

When you run a program on your computer, the following happens:

1.  The program instructions are loaded from storage into memory.
2.  The processor reads the instructions from memory and executes them one by one.
3.  If the instruction involves calculations, the processor performs them using its arithmetic logic unit (ALU).
4.  If the instruction involves data, the processor reads or writes to memory.
5.  The results of the calculations or data manipulation are stored in memory.
6.  If the program needs to display something on the screen, it sends the necessary data to the graphics card.
7.  The graphics card processes the data and sends it to the monitor, which displays it.

This process continues until the program has completed its task or the user terminates it.
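The append steps used in this section can be wrapped in a small helper that also checks the role before adding a turn. This is a minimal sketch: append_turn and VALID_ROLES are hypothetical names, not part of the SDK, and the model reply is a placeholder string rather than a real response.

```python
# The two roles described in this section.
VALID_ROLES = ("user", "model")


def append_turn(messages, role, text):
    """Append one turn to the history, rejecting unknown roles."""
    if role not in VALID_ROLES:
        raise ValueError(f"role must be one of {VALID_ROLES}, got {role!r}")
    messages.append({"role": role, "parts": [text]})
    return messages


messages = []
append_turn(messages, "user",
            "Briefly explain how a computer works to a young child.")
# In a real session this text would come from response.text.
append_turn(messages, "model", "[model reply goes here]")
append_turn(messages, "user",
            "Okay, how about a more detailed explanation to a high school student?")

# Roles alternate between user and model, ending with the new user prompt.
print([message["role"] for message in messages])
```

The resulting list has the same shape as the messages list passed to model.generate_content above.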

Generation configuration

The generation_config argument lets you modify the generation parameters. Every prompt you send to the model includes parameter values that control how the model generates responses.

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(
    'Tell me a story about a magic backpack.',
    generation_config=genai.types.GenerationConfig(
        # Only one candidate for now.
        candidate_count=1,
        stop_sequences=['x'],
        max_output_tokens=20,
        temperature=1.0)
)
text = response.text

if response.candidates[0].finish_reason.name == "MAX_TOKENS":
    text += '...'

to_markdown(text)
Once upon a time, in a small town nestled amidst lush green hills, lived a young girl named...
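The truncation check in the example above can be pulled into a small function so the same logic applies to every response. This is a sketch only: mark_truncated is a hypothetical helper, and it takes the finish reason as a plain string (the value of finish_reason.name in the example) so it runs without calling the API.

```python
def mark_truncated(text, finish_reason_name, marker="..."):
    """Append a marker when generation stopped at the token limit."""
    if finish_reason_name == "MAX_TOKENS":
        return text + marker
    return text


# A response cut off by max_output_tokens gets the marker.
print(mark_truncated("Once upon a time, in a small town", "MAX_TOKENS"))

# A response that ended normally is returned unchanged.
print(mark_truncated("The end.", "STOP"))
```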

What's next

  • Prompt design is the process of creating prompts that elicit the desired response from language models. Writing well-structured prompts is an essential part of ensuring accurate, high-quality responses from a language model. Learn about best practices for prompt writing.
  • Gemini offers several model variations to meet the needs of different use cases, such as input types and complexity, implementations for chat or other dialog language tasks, and size constraints. Learn more about the available Gemini models.