Tutorial: Get started with the Gemini API



This quickstart demonstrates how to use the Python SDK for the Gemini API, which gives you access to Google's Gemini large language models. In this quickstart, you will learn how to:

  1. Set up your development environment and API access to use Gemini.
  2. Generate text responses from text inputs.
  3. Generate text responses from multimodal inputs (text and images).
  4. Use Gemini for multi-turn conversations (chat).
  5. Use embeddings with large language models.

Prerequisites

You can run this quickstart in Google Colab, which executes the notebook directly in the browser and does not require additional environment configuration.

Alternatively, to complete this quickstart locally, make sure that your local environment meets the following requirements:

  • Python 3.9 or later
  • An installation of jupyter to run the notebook.

Setup

Install the Python SDK

The Python SDK for the Gemini API is contained in the google-generativeai package. Install the dependency using pip:

pip install -q -U google-generativeai

Import packages

Import the necessary packages.

import pathlib
import textwrap

import google.generativeai as genai

from IPython.display import display
from IPython.display import Markdown

def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

# Used to securely store your API key
from google.colab import userdata

Set up your API key

Before you can use the Gemini API, you need an API key. If you don't already have one, create a key with one click in Google AI Studio.

Get an API key

In Colab, add the key to the secrets manager under the "🔑" icon in the left panel. Give it the name GOOGLE_API_KEY.

Once you have the API key, pass it to the SDK. You can do this in one of two ways:

  • Put the key in the GOOGLE_API_KEY environment variable (the SDK will automatically pick it up from there).
  • Pass the key to genai.configure(api_key=...)
# Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)
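
If you are running locally rather than in Colab, a minimal sketch of the environment-variable approach (assuming you have exported GOOGLE_API_KEY in your shell) is:

import os

# The SDK can also pick up the key from the GOOGLE_API_KEY environment
# variable automatically; passing it explicitly works as well.
genai.configure(api_key=os.getenv('GOOGLE_API_KEY'))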

List models

You are now ready to call the Gemini API. Use list_models to see the available Gemini models:

  • gemini-1.5-flash: our fastest multimodal model
  • gemini-1.5-pro: our most capable and intelligent multimodal model

for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)
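
If you also plan to use embeddings (covered later in this guide), you can list the models that support them in the same way. A small sketch, assuming embedding support is reported through the same supported_generation_methods field:

# List models that can be used with genai.embed_content
for m in genai.list_models():
  if 'embedContent' in m.supported_generation_methods:
    print(m.name)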

Generate text from text inputs

For text-only prompts, use a Gemini 1.5 model or the Gemini 1.0 Pro model:

model = genai.GenerativeModel('gemini-1.5-flash')

The generate_content method can handle a wide variety of use cases, including multi-turn chat and multimodal input, depending on what the underlying model supports. The available models only accept text and images as input, and text as output.

In the simplest case, you can pass a prompt string to the GenerativeModel.generate_content method:

%%time
response = model.generate_content("What is the meaning of life?")
CPU times: user 110 ms, sys: 12.3 ms, total: 123 ms
Wall time: 8.25 s

In simple cases, the response.text accessor is all you need. To display formatted Markdown text, use the to_markdown function:

to_markdown(response.text)
The query of life's purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.

1.  **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one's physical and mental health, and pursuing personal goals and interests.

2.  **Meaningful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.

3.  **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one's boundaries, confronting personal obstacles, and evolving as a person.

4.  **Ethical and Moral Behavior:** Some believe that the goal of life is to act ethically and morally. This might entail adhering to one's moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.

5.  **Spiritual Fulfillment:** For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.

6.  **Experiencing Life to the Fullest:** Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.

7.  **Legacy and Impact:** Others believe that the purpose of life is to leave a lasting legacy and impact on the world. This might entail accomplishing something noteworthy, being remembered for one's contributions, or inspiring and motivating others.

8.  **Finding Balance and Harmony:** For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one's values and beliefs.

Ultimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them.

If the API fails to return a result, use GenerateContentResponse.prompt_feedback to check whether the request was blocked due to safety concerns.

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

Gemini can generate multiple possible responses for a single prompt. These possible responses are called candidates, and you can review them to select the most suitable one as the response.

View the response candidates with GenerateContentResponse.candidates:

response.candidates
[
  content {
    parts {
      text: "The query of life\'s purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.\n\n1. **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one\'s physical and mental health, and pursuing personal goals and interests.\n\n2. **Meaningful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.\n\n3. **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one\'s boundaries, confronting personal obstacles, and evolving as a person.\n\n4. **Ethical and Moral Behavior:** Some believe that the goal of life is to act ethically and morally. This might entail adhering to one\'s moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.\n\n5. **Spiritual Fulfillment:** For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.\n\n6. **Experiencing Life to the Fullest:** Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.\n\n7. **Legacy and Impact:** Others believe that the purpose of life is to leave a lasting legacy and impact on the world. This might entail accomplishing something noteworthy, being remembered for one\'s contributions, or inspiring and motivating others.\n\n8. **Finding Balance and Harmony:** For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one\'s values and beliefs.\n\nUltimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them."
    }
    role: "model"
  }
  finish_reason: STOP
  index: 0
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
  }
]

By default, the model returns a response only after completing the entire generation process. You can also stream the response as it is being generated, and the model will return chunks of the response as soon as they are generated.

To stream responses, use GenerativeModel.generate_content(..., stream=True).

%%time
response = model.generate_content("What is the meaning of life?", stream=True)
CPU times: user 102 ms, sys: 25.1 ms, total: 128 ms
Wall time: 7.94 s
for chunk in response:
  print(chunk.text)
  print("_"*80)
The query of life's purpose has perplexed people across centuries, cultures, and
________________________________________________________________________________
 continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences
________________________________________________________________________________
.

1.  **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one's physical and mental health, and pursuing personal goals and aspirations.

2.  **Meaning
________________________________________________________________________________
ful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.

3.  **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, exploring one's interests and abilities, overcoming obstacles, and becoming the best version of oneself.

4.  **Connection and Relationships:** For many individuals, the purpose of life is found in their relationships with others. This might entail building
________________________________________________________________________________
 strong bonds with family and friends, fostering a sense of community, and contributing to the well-being of those around them.

5.  **Spiritual Fulfillment:** For those with religious or spiritual beliefs, the purpose of life may be centered on seeking spiritual fulfillment or enlightenment. This might entail following religious teachings, engaging in spiritual practices, or seeking a deeper understanding of the divine.

6.  **Experiencing the Journey:** Some believe that the purpose of life is simply to experience the journey itself, with all its joys and sorrows. This perspective emphasizes embracing the present moment, appreciating life's experiences, and finding meaning in the act of living itself.

7.  **Legacy and Impact:** For others, the goal of life is to leave a lasting legacy or impact on the world. This might entail making a significant contribution to a particular field, leaving a positive mark on future generations, or creating something that will be remembered and cherished long after one's lifetime.

Ultimately, the meaning of life is a personal and subjective question, and there is no single, universally accepted answer. It is about discovering what brings you fulfillment, purpose, and meaning in your own life, and living in accordance with those values.
________________________________________________________________________________

When streaming, some response attributes are not available until you have iterated through all the response chunks. This is demonstrated below:

response = model.generate_content("What is the meaning of life?", stream=True)

The prompt_feedback attribute works:

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

But attributes like text do not:

try:
  response.text
except Exception as e:
  print(f'{type(e).__name__}: {e}')
IncompleteIterationError: Please let the response complete iteration before accessing the final accumulated
attributes (or call `response.resolve()`)
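
Once you have iterated over every chunk (or called response.resolve(), as the error message suggests), the accumulated attributes become available. A minimal sketch:

response = model.generate_content("What is the meaning of life?", stream=True)

# Consume the rest of the stream so the accumulated attributes are populated.
response.resolve()

# The full text is now accessible.
print(response.text[:50])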

Generate text from image and text inputs

Gemini provides various models that can handle multimodal input (Gemini 1.5 models), so you can input both text and images. Be sure to review the image requirements for prompts.

When the prompt input includes both text and images, use a Gemini 1.5 model with the GenerativeModel.generate_content method to generate text output:

Let's include an image:

curl -o image.jpg https://t0.gstatic.com/licensed-image?q=tbn:ANd9GcQ_Kevbk21QBRy-PgB4kQpS79brbmmEG7m3VOTShAn4PecDU5H5UxrJxE3Dw1JiaG17V88QIol19-3TM2wCHw
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  405k  100  405k    0     0  6982k      0 --:--:-- --:--:-- --:--:-- 7106k
import PIL.Image

img = PIL.Image.open('image.jpg')
img


Use a Gemini 1.5 model and pass the image to the model with generate_content:

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(img)

to_markdown(response.text)
Chicken Teriyaki Meal Prep Bowls with brown rice, roasted broccoli and bell peppers.

To provide both text and images in a prompt, pass a list containing the strings and images:

response = model.generate_content(["Write a short, engaging blog post based on this picture. It should include a description of the meal in the photo and talk about my journey meal prepping.", img], stream=True)
response.resolve()
to_markdown(response.text)
Meal prepping is a great way to save time and money, and it can also help you to eat healthier. This meal is a great example of a healthy and delicious meal that can be easily prepped ahead of time.

This meal features brown rice, roasted vegetables, and chicken teriyaki. The brown rice is a whole grain that is high in fiber and nutrients. The roasted vegetables are a great way to get your daily dose of vitamins and minerals. And the chicken teriyaki is a lean protein source that is also packed with flavor.

This meal is easy to prepare ahead of time. Simply cook the brown rice, roast the vegetables, and cook the chicken teriyaki. Then, divide the meal into individual containers and store them in the refrigerator. When you're ready to eat, simply grab a container and heat it up.

This meal is a great option for busy people who are looking for a healthy and delicious way to eat. It's also a great meal for those who are trying to lose weight or maintain a healthy weight.

If you're looking for a healthy and delicious meal that can be easily prepped ahead of time, this meal is a great option. Give it a try today!

Chat conversations

Gemini enables you to have freeform conversations across multiple turns. The ChatSession class simplifies the process by managing the state of the conversation, so unlike with generate_content, you do not have to store the conversation history as a list.

Initialize the chat:

model = genai.GenerativeModel('gemini-1.5-flash')
chat = model.start_chat(history=[])
chat
<google.generativeai.generative_models.ChatSession at 0x7b7b68250100>

ChatSession.send_message returns the same GenerateContentResponse type as GenerativeModel.generate_content. It also appends your message and the response to the chat history:

response = chat.send_message("In one sentence, explain how a computer works to a young child.")
to_markdown(response.text)
A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!
chat.history
[
  parts {
    text: "In one sentence, explain how a computer works to a young child."
  }
  role: "user",
  parts {
    text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!"
  }
  role: "model"
]

You can keep sending messages to continue the conversation. Use the stream=True argument to stream the chat:

response = chat.send_message("Okay, how about a more detailed explanation to a high schooler?", stream=True)

for chunk in response:
  print(chunk.text)
  print("_"*80)
A computer works by following instructions, called a program, which tells it what to
________________________________________________________________________________
 do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor
________________________________________________________________________________
, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.

To give you a simple analogy, imagine a computer as a
________________________________________________________________________________
 chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).

In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.
________________________________________________________________________________

glm.Content objects contain a list of glm.Part objects that each contain either a text (string) or inline_data (glm.Blob), where a blob contains binary data and a mime_type. The chat history is available as a list of glm.Content objects in ChatSession.history:

for message in chat.history:
  display(to_markdown(f'**{message.role}**: {message.parts[0].text}'))
**user**: In one sentence, explain how a computer works to a young child.

**model**: A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!

**user**: Okay, how about a more detailed explanation to a high schooler?

**model**: A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.

To give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).

In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.

Count tokens

Large language models have a context window, and the context length is often measured in terms of the number of tokens. With the Gemini API, you can determine the number of tokens for any genai.protos.Content object. In the simplest case, you can pass a query string to the GenerativeModel.count_tokens method as follows:

model.count_tokens("What is the meaning of life?")
total_tokens: 7

Similarly, you can check token_count for your ChatSession:

model.count_tokens(chat.history)
total_tokens: 501
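
Since the docs note that count_tokens works for any genai.protos.Content, it should also accept the same mixed content as generate_content. A sketch counting tokens for a combined text-and-image prompt, assuming count_tokens accepts the same content types as generate_content (it reuses image.jpg from the earlier section):

import PIL.Image

img = PIL.Image.open('image.jpg')
# Count tokens for a combined text-and-image prompt.
print(model.count_tokens(["Write a blog post based on this picture.", img]))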

Use embeddings

Embedding is a technique used to represent information as a list of floating point numbers in an array. With Gemini, you can represent text (words, sentences, and blocks of text) in a vectorized form, making it easier to compare and contrast embeddings. For example, two texts that share a similar subject or sentiment should have similar embeddings, which can be identified through mathematical comparison techniques such as cosine similarity. For more on how and why you should use embeddings, refer to the Embeddings guide.

Use the embed_content method to generate embeddings. The method handles embedding for the following task types (task_type):

Task type            Description
RETRIEVAL_QUERY      Specifies that the text is a query in a search/retrieval setting.
RETRIEVAL_DOCUMENT   Specifies that the text is a document in a search/retrieval setting. Using this task type requires a title.
SEMANTIC_SIMILARITY  Specifies that the text will be used for semantic textual similarity (STS).
CLASSIFICATION       Specifies that the embeddings will be used for classification.
CLUSTERING           Specifies that the embeddings will be used for clustering.

The following example generates an embedding for a single string for document retrieval:

result = genai.embed_content(
    model="models/embedding-001",
    content="What is the meaning of life?",
    task_type="retrieval_document",
    title="Embedding of single string")

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED]')
[-0.003216741, -0.013358698, -0.017649598, -0.0091 ... TRIMMED]

To handle batches of strings, pass a list of strings in content:

result = genai.embed_content(
    model="models/embedding-001",
    content=[
      'What is the meaning of life?',
      'How much wood would a woodchuck chuck?',
      'How does the brain work?'],
    task_type="retrieval_document",
    title="Embedding of list of strings")

# A list of inputs > A list of vectors output
for v in result['embedding']:
  print(str(v)[:50], '... TRIMMED ...')
[0.0040260437, 0.004124458, -0.014209415, -0.00183 ... TRIMMED ...
[-0.004049845, -0.0075574904, -0.0073463684, -0.03 ... TRIMMED ...
[0.025310587, -0.0080734305, -0.029902633, 0.01160 ... TRIMMED ...
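
Because embeddings are plain vectors, you can compare them mathematically, as noted earlier. The following minimal sketch embeds a hypothetical search query with the retrieval_query task type and ranks the three document embeddings from the batch above by cosine similarity (it assumes numpy is installed):

import numpy as np

# Embed a hypothetical search query. Queries use the retrieval_query
# task type and, unlike documents, do not require a title.
query = genai.embed_content(
    model="models/embedding-001",
    content="Why do people ponder their purpose?",
    task_type="retrieval_query")

# Rank the document embeddings from the batch above by cosine similarity.
doc_vectors = np.array(result['embedding'])
query_vector = np.array(query['embedding'])

scores = doc_vectors @ query_vector / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vector))
print(scores)  # higher score = more similar to the query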

Although the genai.embed_content function accepts strings or lists of strings, it is actually built around the genai.protos.Content type (like GenerativeModel.generate_content). glm.Content objects are the primary units of conversation in the API.

While the genai.protos.Content object is multimodal, the embed_content method only supports text embeddings. This design gives the API the possibility to expand to multimodal embeddings.

response.candidates[0].content
parts {
  text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
}
role: "model"
result = genai.embed_content(
    model = 'models/embedding-001',
    content = response.candidates[0].content)

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED ...')
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED ...

Similarly, the chat history contains a list of genai.protos.Content objects, which you can pass directly to the embed_content function:

chat.history
[
  parts {
    text: "In one sentence, explain how a computer works to a young child."
  }
  role: "user",
  parts {
    text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!"
  }
  role: "model",
  parts {
    text: "Okay, how about a more detailed explanation to a high schooler?"
  }
  role: "user",
  parts {
    text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
  }
  role: "model"
]
result = genai.embed_content(
    model = 'models/embedding-001',
    content = chat.history)

# A list of content objects > A list of vectors output
for i,v in enumerate(result['embedding']):
  print(str(v)[:50], '... TRIMMED...')
[-0.014632266, -0.042202696, -0.015757175, 0.01548 ... TRIMMED...
[-0.010979066, -0.024494737, 0.0092659835, 0.00803 ... TRIMMED...
[-0.010055617, -0.07208932, -0.00011750793, -0.023 ... TRIMMED...
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED...

Advanced use cases

The following sections discuss advanced use cases and lower-level details of the Python SDK for the Gemini API.

Safety settings

The safety_settings argument lets you configure what the model blocks and allows in both prompts and responses. By default, the safety settings block content with medium and/or high probability of being unsafe across all dimensions. Learn more about safety settings.

Enter a questionable prompt and run the model with the default safety settings, and it may not return any candidates:

response = model.generate_content('[Questionable prompt here]')
response.candidates
[
  content {
    parts {
      text: "I\'m sorry, but this prompt involves a sensitive topic and I\'m not allowed to generate responses that are potentially harmful or inappropriate."
    }
    role: "model"
  }
  finish_reason: STOP
  index: 0
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
  }
]

The prompt_feedback will tell you which safety filter blocked the prompt:

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

Now send the same prompt to the model with newly configured safety settings, and you may get a response:

response = model.generate_content('[Questionable prompt here]',
                                  safety_settings={'HARASSMENT':'block_none'})
response.text

Also note that each candidate has its own safety_ratings, in case the prompt passes but the individual responses fail the safety checks.
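
For example, you can inspect the ratings attached to a specific candidate in the response above. A minimal sketch:

# Inspect the safety ratings attached to the first candidate.
candidate = response.candidates[0]
print(candidate.finish_reason)
for rating in candidate.safety_ratings:
  print(rating.category, rating.probability)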

Encode messages

The previous sections relied on the SDK to make it easy to send prompts to the API. This section offers a fully-typed equivalent of the previous example, so that you can better understand the lower-level details of how the SDK encodes messages.

The SDK attempts to convert your message to a genai.protos.Content object, which contains a list of genai.protos.Part objects that each contain either:

  1. a text (string),
  2. an inline_data (genai.protos.Blob), where a blob contains binary data and a mime_type, or
  3. other types of data.

You can also pass any of these classes as an equivalent dictionary.

So, the fully-typed equivalent of the previous example is:

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(
    genai.protos.Content(
        parts = [
            genai.protos.Part(text="Write a short, engaging blog post based on this picture."),
            genai.protos.Part(
                inline_data=genai.protos.Blob(
                    mime_type='image/jpeg',
                    data=pathlib.Path('image.jpg').read_bytes()
                )
            ),
        ],
    ),
    stream=True)
response.resolve()

to_markdown(response.text[:100] + "... [TRIMMED] ...")
Meal prepping is a great way to save time and money, and it can also help you to eat healthier. By ... [TRIMMED] ...

Multi-turn conversations

While the genai.ChatSession class shown earlier can handle many use cases, it does make some assumptions. If your use case doesn't fit into this chat implementation, it's good to remember that genai.ChatSession is just a wrapper around GenerativeModel.generate_content. In addition to single requests, it can handle multi-turn conversations.

The individual messages are genai.protos.Content objects or compatible dictionaries, as seen in the previous sections. As a dictionary, the message requires role and parts keys. The role in a conversation can be either user, which provides the prompts, or model, which provides the responses.

Pass a list of genai.protos.Content objects and it will be treated as multi-turn chat:

model = genai.GenerativeModel('gemini-1.5-flash')

messages = [
    {'role':'user',
     'parts': ["Briefly explain how a computer works to a young child."]}
]
response = model.generate_content(messages)

to_markdown(response.text)
Imagine a computer as a really smart friend who can help you with many things. Just like you have a brain to think and learn, a computer has a brain too, called a processor. It's like the boss of the computer, telling it what to do.

Inside the computer, there's a special place called memory, which is like a big storage box. It remembers all the things you tell it to do, like opening games or playing videos.

When you press buttons on the keyboard or click things on the screen with the mouse, you're sending messages to the computer. These messages travel through special wires, called cables, to the processor.

The processor reads the messages and tells the computer what to do. It can open programs, show you pictures, or even play music for you.

All the things you see on the screen are created by the graphics card, which is like a magic artist inside the computer. It takes the processor's instructions and turns them into colorful pictures and videos.

To save your favorite games, videos, or pictures, the computer uses a special storage space called a hard drive. It's like a giant library where the computer can keep all your precious things safe.

And when you want to connect to the internet to play games with friends or watch funny videos, the computer uses something called a network card to send and receive messages through the internet cables or Wi-Fi signals.

So, just like your brain helps you learn and play, the computer's processor, memory, graphics card, hard drive, and network card all work together to make your computer a super-smart friend that can help you do amazing things!

To continue the conversation, add the response and another message.

messages.append({'role':'model',
                 'parts':[response.text]})

messages.append({'role':'user',
                 'parts':["Okay, how about a more detailed explanation to a high school student?"]})

response = model.generate_content(messages)

to_markdown(response.text)
At its core, a computer is a machine that can be programmed to carry out a set of instructions. It consists of several essential components that work together to process, store, and display information:

**1. Processor (CPU):**
   -   The brain of the computer.
   -   Executes instructions and performs calculations.
   -   Speed measured in gigahertz (GHz).
   -   More GHz generally means faster processing.

**2. Memory (RAM):**
   -   Temporary storage for data being processed.
   -   Holds instructions and data while the program is running.
   -   Measured in gigabytes (GB).
   -   More GB of RAM allows for more programs to run simultaneously.

**3. Storage (HDD/SSD):**
   -   Permanent storage for data.
   -   Stores operating system, programs, and user files.
   -   Measured in gigabytes (GB) or terabytes (TB).
   -   Hard disk drives (HDDs) are traditional, slower, and cheaper.
   -   Solid-state drives (SSDs) are newer, faster, and more expensive.

**4. Graphics Card (GPU):**
   -   Processes and displays images.
   -   Essential for gaming, video editing, and other graphics-intensive tasks.
   -   Measured in video RAM (VRAM) and clock speed.

**5. Motherboard:**
   -   Connects all the components.
   -   Provides power and communication pathways.

**6. Input/Output (I/O) Devices:**
   -   Allow the user to interact with the computer.
   -   Examples: keyboard, mouse, monitor, printer.

**7. Operating System (OS):**
   -   Software that manages the computer's resources.
   -   Provides a user interface and basic functionality.
   -   Examples: Windows, macOS, Linux.

When you run a program on your computer, the following happens:

1.  The program instructions are loaded from storage into memory.
2.  The processor reads the instructions from memory and executes them one by one.
3.  If the instruction involves calculations, the processor performs them using its arithmetic logic unit (ALU).
4.  If the instruction involves data, the processor reads or writes to memory.
5.  The results of the calculations or data manipulation are stored in memory.
6.  If the program needs to display something on the screen, it sends the necessary data to the graphics card.
7.  The graphics card processes the data and sends it to the monitor, which displays it.

This process continues until the program has completed its task or the user terminates it.

Generation configuration

The generation_config argument allows you to modify the generation parameters. Every prompt you send to the model includes parameter values that control how the model generates responses.

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(
    'Tell me a story about a magic backpack.',
    generation_config=genai.types.GenerationConfig(
        # Only one candidate for now.
        candidate_count=1,
        stop_sequences=['x'],
        max_output_tokens=20,
        temperature=1.0)
)
text = response.text

if response.candidates[0].finish_reason.name == "MAX_TOKENS":
    text += '...'

to_markdown(text)
Once upon a time, in a small town nestled amidst lush green hills, lived a young girl named...

What's next

  • Prompt design is the process of creating prompts that elicit the desired response from language models. Writing well-structured prompts is an essential part of ensuring accurate, high-quality responses from a language model. Learn about best practices for prompt writing.
  • Gemini offers several model variations to meet the needs of different use cases, such as input types and complexity, implementations for chat or other dialog language tasks, and size constraints. Learn about the available Gemini models.