Fine-tuning tutorial

This tutorial will help you get started with the Gemini API tuning service using either the Python SDK or the REST API using curl. The examples show how to tune the text model behind the Gemini API text generation service.

View on ai.google.dev Try a Colab notebook View notebook on GitHub

Limitations

Before tuning a model, you should be aware of the following limitations:

Fine-tuning datasets

Fine-tuning datasets for Gemini 1.5 Flash have the following limitations:

  • The maximum input size per example is 40,000 characters.
  • The maximum output size per example is 5,000 characters.
  • Only input-output pair examples are supported. Chat-style multi-turn conversations are not supported.

Tuned models

Tuned models have the following limitations:

  • The input limit of a tuned Gemini 1.5 Flash model is 40,000 characters.
  • JSON mode is not supported with tuned models.
  • Only text input is supported.

Before you begin: Set up your project and API key

Before calling the Gemini API, you need to set up your project and configure your API key.

List tuned models

You can check your existing tuned models with the tunedModels.list method.

from google import genai
client = genai.Client() # Get the key from the GOOGLE_API_KEY env variable

for model_info in client.models.list():
    print(model_info.name)

Create a tuned model

To create a tuned model, you need to pass your dataset to the model in the tunedModels.create method.

For this example, you will tune a model to generate the next number in the sequence. For example, if the input is 1, the model should output 2. If the input is one hundred, the output should be one hundred one.

# create tuning model
training_dataset =  [
    ["1", "2"],
    ["3", "4"],
    ["-3", "-2"],
    ["twenty two", "twenty three"],
    ["two hundred", "two hundred one"],
    ["ninety nine", "one hundred"],
    ["8", "9"],
    ["-98", "-97"],
    ["1,000", "1,001"],
    ["10,100,000", "10,100,001"],
    ["thirteen", "fourteen"],
    ["eighty", "eighty one"],
    ["one", "two"],
    ["three", "four"],
    ["seven", "eight"],
]
training_dataset=types.TuningDataset(
        examples=[
            types.TuningExample(
                text_input=i,
                output=o,
            )
            for i,o in training_dataset
        ],
    )
tuning_job = client.tunings.tune(
    base_model='models/gemini-1.5-flash-001-tuning',
    training_dataset=training_dataset,
    config=types.CreateTuningJobConfig(
        epoch_count= 5,
        batch_size=4,
        learning_rate=0.001,
        tuned_model_display_name="test tuned model"
    )
)

# generate content with the tuned model
response = client.models.generate_content(
    model=tuning_job.tuned_model.model,
    contents='III',
)

print(response.text)

The optimal values for epoch count, batch size, and learning rate are dependent on your dataset and other constraints of your use case. To learn more about these values, see Advanced tuning settings and Hyperparameters.

Try the model

You can use the tunedModels.generateContent method and specify the name of the tuned model to test its performance.

response = client.models.generate_content(
    model=tuning_job.tuned_model.model,
    contents='III'
)

Not implemented

Some features (progress reporting, updating the description, and deleting tuned models) has not yet been implemented in the new SDK.