Tutorial: inizia a utilizzare l'API Gemini


Visualizza sull'IA di Google Esegui in Google Colab Visualizza il codice sorgente su GitHub

Questa guida rapida dimostra come utilizzare l'SDK Python per l'API Gemini, che ti dà accesso ai modelli linguistici di grandi dimensioni (LLM) Gemini di Google. In questa guida rapida, imparerai

  1. Configura il tuo ambiente di sviluppo e l'accesso all'API per utilizzare Gemini.
  2. Genera risposte di testo da input di testo.
  3. Genera risposte di testo da input multimodali (testo e immagini).
  4. Utilizza Gemini per le conversazioni in più passaggi (chat).
  5. Usare gli incorporamenti per i modelli linguistici di grandi dimensioni.

Prerequisiti

Puoi eseguire questa guida rapida in Google Colab, che esegue il blocco note direttamente nel browser e non richiede la configurazione dell'ambiente di gestione.

In alternativa, per completare questa guida rapida in locale, assicurati che il tuo team di sviluppo soddisfa i seguenti requisiti:

  • Python 3.9 o versioni successive
  • Un'installazione di jupyter per eseguire il blocco note.

Configurazione

Installa l'SDK Python

L'SDK Python per l'API Gemini è contenuto nel google-generativeai. Installa la dipendenza utilizzando pip:

pip install -q -U google-generativeai

Importa pacchetti

Importa i pacchetti necessari.

import pathlib
import textwrap

import google.generativeai as genai

from IPython.display import display
from IPython.display import Markdown

def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))
# Used to securely store your API key
from google.colab import userdata

Configura la chiave API

Prima di poter utilizzare l'API Gemini, devi ottenere una chiave API. Se se non ne hai già una, crea una chiave con un solo clic in Google AI Studio.

Ottenere una chiave API

In Colab, aggiungi la chiave a Secret Manager nella sezione " {/8}" nel riquadro a sinistra. Assegnagli il nome GOOGLE_API_KEY.

Una volta ottenuta la chiave API, passala all'SDK. A tale scopo, puoi procedere in uno dei due seguenti modi:

  • Inserisci la chiave nella variabile di ambiente GOOGLE_API_KEY (l'SDK lo acquisirà automaticamente da lì).
  • Passa la chiave a genai.configure(api_key=...)
# Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)

Elenco modelli

Ora puoi chiamare l'API Gemini. Usa list_models per vedere le opzioni disponibili Modelli Gemini:

  • gemini-1.5-flash: il nostro modello multimodale più veloce
  • gemini-1.5-pro: il nostro modello multimodale più avanzato e intelligente
di Gemini Advanced.
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

Genera testo da input di testo

Per i prompt di solo testo, utilizza un modello Gemini 1.5 o il modello Gemini 1.0 Pro:

model = genai.GenerativeModel('gemini-1.5-flash')

Il metodo generate_content è in grado di gestire un'ampia varietà di casi d'uso, tra cui chat in più passaggi e input multimodale, a seconda del modello sottostante Google Cloud. I modelli disponibili supportano solo testo e immagini come input e testo come output.

Nel caso più semplice, puoi passare una stringa di prompt GenerativeModel.generate_content :

%%time
response = model.generate_content("What is the meaning of life?")
CPU times: user 110 ms, sys: 12.3 ms, total: 123 ms
Wall time: 8.25 s

In casi semplici, la funzione di accesso response.text è tutto quello di cui hai bisogno. Per visualizzare testo Markdown formattato, utilizza la funzione to_markdown:

to_markdown(response.text)
The query of life's purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.

1.  **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one's physical and mental health, and pursuing personal goals and interests.

2.  **Meaningful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.

3.  **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one's boundaries, confronting personal obstacles, and evolving as a person.

4.  **Ethical and Moral Behavior:** Some believe that the goal of life is to act ethically and morally. This might entail adhering to one's moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.

5.  **Spiritual Fulfillment:** For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.

6.  **Experiencing Life to the Fullest:** Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.

7.  **Legacy and Impact:** Others believe that the purpose of life is to leave a lasting legacy and impact on the world. This might entail accomplishing something noteworthy, being remembered for one's contributions, or inspiring and motivating others.

8.  **Finding Balance and Harmony:** For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one's values and beliefs.

Ultimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them.

Se l'API non riesce a restituire un risultato, utilizza GenerateContentResponse.prompt_feedback per verificare se è stato bloccato per problemi di sicurezza relativi alla richiesta.

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

Gemini può generare più possibili risposte per un singolo prompt. Questi le possibili risposte si chiamano candidates e puoi esaminarle per selezionare il più adatto come risposta.

Visualizza i candidati di risposta con GenerateContentResponse.candidates:

response.candidates
[
  content {
    parts {
      text: "The query of life\'s purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.\n\n1. **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one\'s physical and mental health, and pursuing personal goals and interests.\n\n2. **Meaningful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.\n\n3. **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one\'s boundaries, confronting personal obstacles, and evolving as a person.\n\n4. **Ethical and Moral Behavior:** Some believe that the goal of life is to act ethically and morally. This might entail adhering to one\'s moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.\n\n5. **Spiritual Fulfillment:** For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.\n\n6. **Experiencing Life to the Fullest:** Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.\n\n7. **Legacy and Impact:** Others believe that the purpose of life is to leave a lasting legacy and impact on the world. This might entail accomplishing something noteworthy, being remembered for one\'s contributions, or inspiring and motivating others.\n\n8. **Finding Balance and Harmony:** For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one\'s values and beliefs.\n\nUltimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them."
    }
    role: "model"
  }
  finish_reason: STOP
  index: 0
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
  }
]

Per impostazione predefinita, il modello restituisce una risposta dopo aver completato l'intera generazione e il processo di sviluppo. Puoi anche trasmettere in streaming la risposta mentre viene generata restituirà dei blocchi della risposta appena verranno generati.

Per trasmettere le risposte in streaming, utilizza GenerativeModel.generate_content(..., stream=True).

%%time
response = model.generate_content("What is the meaning of life?", stream=True)
CPU times: user 102 ms, sys: 25.1 ms, total: 128 ms
Wall time: 7.94 s
for chunk in response:
  print(chunk.text)
  print("_"*80)
The query of life's purpose has perplexed people across centuries, cultures, and
________________________________________________________________________________
 continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences
________________________________________________________________________________
.

1.  **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one's physical and mental health, and pursuing personal goals and aspirations.

2.  **Meaning
________________________________________________________________________________
ful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.

3.  **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, exploring one's interests and abilities, overcoming obstacles, and becoming the best version of oneself.

4.  **Connection and Relationships:** For many individuals, the purpose of life is found in their relationships with others. This might entail building
________________________________________________________________________________
 strong bonds with family and friends, fostering a sense of community, and contributing to the well-being of those around them.

5.  **Spiritual Fulfillment:** For those with religious or spiritual beliefs, the purpose of life may be centered on seeking spiritual fulfillment or enlightenment. This might entail following religious teachings, engaging in spiritual practices, or seeking a deeper understanding of the divine.

6.  **Experiencing the Journey:** Some believe that the purpose of life is simply to experience the journey itself, with all its joys and sorrows. This perspective emphasizes embracing the present moment, appreciating life's experiences, and finding meaning in the act of living itself.

7.  **Legacy and Impact:** For others, the goal of life is to leave a lasting legacy or impact on the world. This might entail making a significant contribution to a particular field, leaving a positive mark on future generations, or creating something that will be remembered and cherished long after one's lifetime.

Ultimately, the meaning of life is a personal and subjective question, and there is no single, universally accepted answer. It is about discovering what brings you fulfillment, purpose, and meaning in your own life, and living in accordance with those values.
________________________________________________________________________________

Durante il flusso di dati, alcuni attributi di risposta non sono disponibili finché non esegui l'iterazione in tutti i blocchi di risposta. Ciò è dimostrato di seguito:

response = model.generate_content("What is the meaning of life?", stream=True)

L'attributo prompt_feedback funziona:

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

Tuttavia, gli attributi come text non:

try:
  response.text
except Exception as e:
  print(f'{type(e).__name__}: {e}')
IncompleteIterationError: Please let the response complete iteration before accessing the final accumulated
attributes (or call `response.resolve()`)

Genera testo da input di immagini e testo

Gemini fornisce vari modelli in grado di gestire l'input multimodale (Gemini 1.5 modelli) in modo da poter inserire sia testo che immagini. Assicurati di esaminare requisiti relativi alle immagini per i prompt.

Quando l'input del prompt include sia testo che immagini, utilizza Gemini 1.5 con Metodo GenerativeModel.generate_content per generare un output di testo:

Includiamo un'immagine:

curl -o image.jpg https://t0.gstatic.com/licensed-image?q=tbn:ANd9GcQ_Kevbk21QBRy-PgB4kQpS79brbmmEG7m3VOTShAn4PecDU5H5UxrJxE3Dw1JiaG17V88QIol19-3TM2wCHw
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  405k  100  405k    0     0  6982k      0 --:--:-- --:--:-- --:--:-- 7106k
import PIL.Image

img = PIL.Image.open('image.jpg')
img

png

Usa un modello Gemini 1.5 e passa l'immagine al modello con generate_content.

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(img)

to_markdown(response.text)
Chicken Teriyaki Meal Prep Bowls with brown rice, roasted broccoli and bell peppers.

Per fornire sia testo che immagini in un prompt, passa un elenco contenente le stringhe e immagini:

response = model.generate_content(["Write a short, engaging blog post based on this picture. It should include a description of the meal in the photo and talk about my journey meal prepping.", img], stream=True)
response.resolve()
to_markdown(response.text)
Meal prepping is a great way to save time and money, and it can also help you to eat healthier. This meal is a great example of a healthy and delicious meal that can be easily prepped ahead of time.

This meal features brown rice, roasted vegetables, and chicken teriyaki. The brown rice is a whole grain that is high in fiber and nutrients. The roasted vegetables are a great way to get your daily dose of vitamins and minerals. And the chicken teriyaki is a lean protein source that is also packed with flavor.

This meal is easy to prepare ahead of time. Simply cook the brown rice, roast the vegetables, and cook the chicken teriyaki. Then, divide the meal into individual containers and store them in the refrigerator. When you're ready to eat, simply grab a container and heat it up.

This meal is a great option for busy people who are looking for a healthy and delicious way to eat. It's also a great meal for those who are trying to lose weight or maintain a healthy weight.

If you're looking for a healthy and delicious meal that can be easily prepped ahead of time, this meal is a great option. Give it a try today!

Conversazioni in chat

Gemini ti consente di tenere conversazioni in formato libero durante più turni. La La classe ChatSession semplifica il processo gestendo lo stato della conversazione, quindi a differenza di generate_content, non devi archiviare la cronologia delle conversazioni sotto forma di elenco.

Inizializza la chat:

model = genai.GenerativeModel('gemini-1.5-flash')
chat = model.start_chat(history=[])
chat
<google.generativeai.generative_models.ChatSession at 0x7b7b68250100>

La ChatSession.send_message restituisce lo stesso tipo GenerateContentResponse di GenerativeModel.generate_content. Inoltre, aggiunge il tuo messaggio e la risposta alla cronologia chat:

response = chat.send_message("In one sentence, explain how a computer works to a young child.")
to_markdown(response.text)
A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!
chat.history
[
  parts {
    text: "In one sentence, explain how a computer works to a young child."
  }
  role: "user",
  parts {
    text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!"
  }
  role: "model"
]

Puoi continuare a inviare messaggi per continuare la conversazione. Utilizza la stream=True argomento per trasmettere la chat in streaming:

response = chat.send_message("Okay, how about a more detailed explanation to a high schooler?", stream=True)

for chunk in response:
  print(chunk.text)
  print("_"*80)
A computer works by following instructions, called a program, which tells it what to
________________________________________________________________________________
 do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor
________________________________________________________________________________
, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.

To give you a simple analogy, imagine a computer as a
________________________________________________________________________________
 chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).

In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.
________________________________________________________________________________

glm.Content oggetti contengono un elenco di glm.Part oggetti, ciascuno dei quali contiene un testo (stringa) o inline_data (glm.Blob), dove un blob contiene codice binario e un mime_type. La cronologia chat è disponibile come elenco di glm.Content in ChatSession.history:

for message in chat.history:
  display(to_markdown(f'**{message.role}**: {message.parts[0].text}'))
**user**: In one sentence, explain how a computer works to a young child.

**model**: A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!

**user**: Okay, how about a more detailed explanation to a high schooler?

**model**: A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.

To give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).

In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.

Conta token

I modelli linguistici di grandi dimensioni hanno una finestra di contesto e la lunghezza del contesto è spesso misurate in termini di numero di token. Con l'API Gemini puoi determinano il numero di token per qualsiasi oggetto genai.protos.Content. Nella nel caso più semplice, puoi passare una stringa di query GenerativeModel.count_tokens come segue:

model.count_tokens("What is the meaning of life?")
total_tokens: 7

Analogamente, puoi controllare token_count per ChatSession:

model.count_tokens(chat.history)
total_tokens: 501

Utilizza gli incorporamenti

Incorporamento è una tecnica utilizzata per rappresentare le informazioni come un elenco di numeri in virgola mobile in un array. Con Gemini, puoi rappresentare testo (parole, frasi e blocchi di testo) in formato vettoriale, semplificando il confronto e il contrasto incorporamenti. ad esempio due testi che condividono un oggetto simile o dovrebbe avere incorporamenti simili, che possono essere identificati tecniche di confronto matematiche come la somiglianza coseno. Per ulteriori informazioni e perché dovresti usare gli incorporamenti, consulta la sezione Incorporamenti .

Utilizza il metodo embed_content per generare incorporamenti. Il metodo gestisce per le seguenti attività (task_type):

Tipo di attività Descrizione
RETRIEVAL_QUERY Specifica che il testo specificato è una query in un'impostazione di ricerca/recupero.
RETRIEVAL_DOCUMENT Specifica che il testo specificato è un documento in un'impostazione di ricerca/recupero. L'utilizzo di questo tipo di attività richiede un title.
SEMANTIC_SIMILARITY Specifica il testo specificato che verrà utilizzato per la somiglianza testuale semantica (STS).
CLASSIFICAZIONE Specifica che gli incorporamenti verranno utilizzati per la classificazione.
CLUSTERING Specifica che gli incorporamenti verranno utilizzati per il clustering.

Di seguito viene generato un incorporamento per una singola stringa ai fini del recupero dei documenti:

result = genai.embed_content(
    model="models/embedding-001",
    content="What is the meaning of life?",
    task_type="retrieval_document",
    title="Embedding of single string")

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED]')
[-0.003216741, -0.013358698, -0.017649598, -0.0091 ... TRIMMED]

Per gestire batch di stringhe, passa un elenco di stringhe in content:

result = genai.embed_content(
    model="models/embedding-001",
    content=[
      'What is the meaning of life?',
      'How much wood would a woodchuck chuck?',
      'How does the brain work?'],
    task_type="retrieval_document",
    title="Embedding of list of strings")

# A list of inputs > A list of vectors output
for v in result['embedding']:
  print(str(v)[:50], '... TRIMMED ...')
[0.0040260437, 0.004124458, -0.014209415, -0.00183 ... TRIMMED ...
[-0.004049845, -0.0075574904, -0.0073463684, -0.03 ... TRIMMED ...
[0.025310587, -0.0080734305, -0.029902633, 0.01160 ... TRIMMED ...

La funzione genai.embed_content accetta stringhe o elenchi di stringhe, è basato sul tipo genai.protos.Content (come GenerativeModel.generate_content). Gli oggetti glm.Content sono le unità principali della conversazione nell'API.

Mentre l'oggetto genai.protos.Content è multimodale, l'elemento embed_content supporta solo gli incorporamenti di testo. Questa progettazione conferisce all'API possibilità di espandersi agli incorporamenti multimodali.

response.candidates[0].content
parts {
  text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
}
role: "model"
result = genai.embed_content(
    model = 'models/embedding-001',
    content = response.candidates[0].content)

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED ...')
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED ...

Analogamente, la cronologia chat contiene un elenco di genai.protos.Content oggetti, che puoi passare direttamente alla funzione embed_content:

chat.history
[
  parts {
    text: "In one sentence, explain how a computer works to a young child."
  }
  role: "user",
  parts {
    text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!"
  }
  role: "model",
  parts {
    text: "Okay, how about a more detailed explanation to a high schooler?"
  }
  role: "user",
  parts {
    text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
  }
  role: "model"
]
result = genai.embed_content(
    model = 'models/embedding-001',
    content = chat.history)

# 1 input > 1 vector output
for i,v in enumerate(result['embedding']):
  print(str(v)[:50], '... TRIMMED...')
[-0.014632266, -0.042202696, -0.015757175, 0.01548 ... TRIMMED...
[-0.010979066, -0.024494737, 0.0092659835, 0.00803 ... TRIMMED...
[-0.010055617, -0.07208932, -0.00011750793, -0.023 ... TRIMMED...
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED...

Casi d'uso avanzati

Le seguenti sezioni illustrano casi d'uso avanzati e dettagli di livello inferiore SDK Python per l'API Gemini.

Impostazioni di sicurezza

L'argomento safety_settings consente di configurare ciò che il modello blocca sia nei prompt che nelle risposte. Per impostazione predefinita, le impostazioni di sicurezza bloccano i contenuti con probabilità media e/o alta di essere contenuti non sicuri in tutti dimensioni. Scopri di più sulla sicurezza impostazioni.

Inserisci un prompt contestabile ed esegui il modello con le impostazioni di sicurezza predefinite. e non verranno restituiti candidati:

response = model.generate_content('[Questionable prompt here]')
response.candidates
[
  content {
    parts {
      text: "I\'m sorry, but this prompt involves a sensitive topic and I\'m not allowed to generate responses that are potentially harmful or inappropriate."
    }
    role: "model"
  }
  finish_reason: STOP
  index: 0
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
  }
]

prompt_feedback ti indicherà quale filtro di sicurezza ha bloccato la richiesta:

response.prompt_feedback
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

Ora fornisci lo stesso prompt al modello con le nuove impostazioni di sicurezza configurate, e potresti ricevere una risposta.

response = model.generate_content('[Questionable prompt here]',
                                  safety_settings={'HARASSMENT':'block_none'})
response.text

Tieni inoltre presente che ogni candidato ha il proprio safety_ratings, nel caso in cui la richiesta viene superato, ma le singole risposte non superano i controlli di sicurezza.

Codifica dei messaggi

Le sezioni precedenti si basavano sull'SDK per semplificare l'invio dei prompt all'API. Questa sezione offre un equivalente completamente digitato alla versione esempio, in modo da poter comprendere meglio i dettagli di livello inferiore relativi al L'SDK codifica i messaggi.

L'SDK tenta di convertire il messaggio in un oggetto genai.protos.Content, che contiene un elenco di genai.protos.Part oggetti, ognuno dei quali contiene:

  1. a text (stringa)
  2. inline_data (genai.protos.Blob), dove un blob contiene il codice binario data e un mime_type.
  3. o altri tipi di dati.

Puoi anche passare uno qualsiasi di questi corsi come dizionario equivalente.

Quindi, l'equivalente completamente digitato dell'esempio precedente è:

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(
    genai.protos.Content(
        parts = [
            genai.protos.Part(text="Write a short, engaging blog post based on this picture."),
            genai.protos.Part(
                inline_data=genai.protos.Blob(
                    mime_type='image/jpeg',
                    data=pathlib.Path('image.jpg').read_bytes()
                )
            ),
        ],
    ),
    stream=True)
response.resolve()

to_markdown(response.text[:100] + "... [TRIMMED] ...")
Meal prepping is a great way to save time and money, and it can also help you to eat healthier. By ... [TRIMMED] ...

Conversazioni multi-turno

Sebbene la classe genai.ChatSession mostrata in precedenza possa gestire molti casi d'uso, fa alcune ipotesi. Se il tuo caso d'uso non rientra in questa chat implementazione è bene ricordare che genai.ChatSession è solo un wrapper intorno a GenerativeModel.generate_content Oltre alle richieste singole, può gestire le conversazioni in più passaggi.

I singoli messaggi sono oggetti genai.protos.Content o compatibili come mostrato nelle sezioni precedenti. Come dizionario, il messaggio richiede role e parts chiavi. Il role in una conversazione può essere user, che fornisce i prompt, o model, che fornisce le risposte.

Passa un elenco di genai.protos.Content oggetti, che verranno trattati come chat a più turni:

model = genai.GenerativeModel('gemini-1.5-flash')

messages = [
    {'role':'user',
     'parts': ["Briefly explain how a computer works to a young child."]}
]
response = model.generate_content(messages)

to_markdown(response.text)
Imagine a computer as a really smart friend who can help you with many things. Just like you have a brain to think and learn, a computer has a brain too, called a processor. It's like the boss of the computer, telling it what to do.

Inside the computer, there's a special place called memory, which is like a big storage box. It remembers all the things you tell it to do, like opening games or playing videos.

When you press buttons on the keyboard or click things on the screen with the mouse, you're sending messages to the computer. These messages travel through special wires, called cables, to the processor.

The processor reads the messages and tells the computer what to do. It can open programs, show you pictures, or even play music for you.

All the things you see on the screen are created by the graphics card, which is like a magic artist inside the computer. It takes the processor's instructions and turns them into colorful pictures and videos.

To save your favorite games, videos, or pictures, the computer uses a special storage space called a hard drive. It's like a giant library where the computer can keep all your precious things safe.

And when you want to connect to the internet to play games with friends or watch funny videos, the computer uses something called a network card to send and receive messages through the internet cables or Wi-Fi signals.

So, just like your brain helps you learn and play, the computer's processor, memory, graphics card, hard drive, and network card all work together to make your computer a super-smart friend that can help you do amazing things!

Per continuare la conversazione, aggiungi la risposta e un altro messaggio.

messages.append({'role':'model',
                 'parts':[response.text]})

messages.append({'role':'user',
                 'parts':["Okay, how about a more detailed explanation to a high school student?"]})

response = model.generate_content(messages)

to_markdown(response.text)
At its core, a computer is a machine that can be programmed to carry out a set of instructions. It consists of several essential components that work together to process, store, and display information:

**1. Processor (CPU):**
   -   The brain of the computer.
   -   Executes instructions and performs calculations.
   -   Speed measured in gigahertz (GHz).
   -   More GHz generally means faster processing.

**2. Memory (RAM):**
   -   Temporary storage for data being processed.
   -   Holds instructions and data while the program is running.
   -   Measured in gigabytes (GB).
   -   More GB of RAM allows for more programs to run simultaneously.

**3. Storage (HDD/SSD):**
   -   Permanent storage for data.
   -   Stores operating system, programs, and user files.
   -   Measured in gigabytes (GB) or terabytes (TB).
   -   Hard disk drives (HDDs) are traditional, slower, and cheaper.
   -   Solid-state drives (SSDs) are newer, faster, and more expensive.

**4. Graphics Card (GPU):**
   -   Processes and displays images.
   -   Essential for gaming, video editing, and other graphics-intensive tasks.
   -   Measured in video RAM (VRAM) and clock speed.

**5. Motherboard:**
   -   Connects all the components.
   -   Provides power and communication pathways.

**6. Input/Output (I/O) Devices:**
   -   Allow the user to interact with the computer.
   -   Examples: keyboard, mouse, monitor, printer.

**7. Operating System (OS):**
   -   Software that manages the computer's resources.
   -   Provides a user interface and basic functionality.
   -   Examples: Windows, macOS, Linux.

When you run a program on your computer, the following happens:

1.  The program instructions are loaded from storage into memory.
2.  The processor reads the instructions from memory and executes them one by one.
3.  If the instruction involves calculations, the processor performs them using its arithmetic logic unit (ALU).
4.  If the instruction involves data, the processor reads or writes to memory.
5.  The results of the calculations or data manipulation are stored in memory.
6.  If the program needs to display something on the screen, it sends the necessary data to the graphics card.
7.  The graphics card processes the data and sends it to the monitor, which displays it.

This process continues until the program has completed its task or the user terminates it.

Configurazione di generazione

L'argomento generation_config consente di modificare i parametri di generazione. Ogni richiesta inviata al modello include valori parametro che controllano come il modello genera risposte.

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(
    'Tell me a story about a magic backpack.',
    generation_config=genai.types.GenerationConfig(
        # Only one candidate for now.
        candidate_count=1,
        stop_sequences=['x'],
        max_output_tokens=20,
        temperature=1.0)
)
text = response.text

if response.candidates[0].finish_reason.name == "MAX_TOKENS":
    text += '...'

to_markdown(text)
Once upon a time, in a small town nestled amidst lush green hills, lived a young girl named...

Passaggi successivi

  • La progettazione dei prompt è il processo di creazione dei prompt che la risposta dai modelli linguistici. Scrivere prompt ben strutturati è parte essenziale del garantire risposte accurate e di alta qualità da una lingua un modello di machine learning. Scopri le best practice per i prompt scrivere.
  • Gemini offre diverse varianti di modelli per soddisfare le esigenze di usi diversi come la complessità e i tipi di input, le implementazioni per la chat o altri le attività legate al linguaggio delle finestre di dialogo e i vincoli di dimensione. Scopri di più sui modelli Modelli Gemini.