Extract structured data using function calling


In this tutorial you'll work through an example of structured data extraction, using the Gemini API to extract the lists of characters, relationships, things, and places from a story.

Setup

pip install -U -q google-generativeai
import pathlib
import textwrap

import google.generativeai as genai
import google.ai.generativelanguage as glm


from IPython.display import display
from IPython.display import Markdown

from google.api_core import retry

def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

Once you have the API key, pass it to the SDK. You can do this in two ways:

  • Put the key in the GOOGLE_API_KEY environment variable. The SDK picks it up automatically from there.
  • Pass the key to genai.configure(api_key=...)
try:
    # Used to securely store your API key
    from google.colab import userdata

    # Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
    GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
except ImportError:
    import os
    GOOGLE_API_KEY = os.environ['GOOGLE_API_KEY']

genai.configure(api_key=GOOGLE_API_KEY)

The example task

For this tutorial you'll extract entities from natural language stories. As an example, below is a story written by Gemini.

new_story = False

if new_story:
  model = genai.GenerativeModel(model_name='models/gemini-1.5-pro-latest')

  response = model.generate_content("""
      Write a long story about a girl with magic backpack, her family, and at
      least one other character. Make sure everyone has names. Don't forget to
      describe the contents of the backpack, and where everyone and everything
      starts and ends up.""", request_options={'retry': retry.Retry()})
  story = response.text
  print(response.candidates[0].citation_metadata)
else:
  story = """In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\n\nHanded down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\n\nAnya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. "Remember, my dear," whispered her mother, "use your magic wisely and for good." Her father added, "Always seek knowledge, and let the backpack be your trusted companion."\n\nWith a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. "Hey, Anya," he called out. "Can I see your backpack?"\n\nAnya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\n\nTogether, they marveled at the backpack's wonders, promising to keep its secrets safe. As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\n\n"What's wrong?" she asked.\n\nA tall, lanky boy stepped forward. "There's a monster in the forest," he stammered. 
"It's been terrorizing the town, attacking animals and even people."\n\nAnya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\n\nWithout a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. "Don't worry," she said, her voice steady. "I'll take care of it."\n\nWith Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\n\nSuddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\n\nFear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\n\nWhen the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\n\nAs she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time."""
to_markdown(story)

In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.

Handed down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.

Anya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. "Remember, my dear," whispered her mother, "use your magic wisely and for good." Her father added, "Always seek knowledge, and let the backpack be your trusted companion."

With a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. "Hey, Anya," he called out. "Can I see your backpack?"

Anya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.

Together, they marveled at the backpack's wonders, promising to keep its secrets safe. As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.

"What's wrong?" she asked.

A tall, lanky boy stepped forward. "There's a monster in the forest," he stammered. "It's been terrorizing the town, attacking animals and even people."

Anya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.

Without a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. "Don't worry," she said, her voice steady. "I'll take care of it."

With Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.

Suddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.

Fear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.

When the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.

As she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time.

Using natural language

Large language models are powerful tools for many tasks. Often you can just ask Gemini for what you want, and it will do a reasonable job.

The request below sets response_mime_type to application/json, but the schema itself is only described in the prompt text, so there are a few things to watch out for when generating data structures this way:

  • Parsing sometimes fails.
  • The schema can't be strictly enforced.

You'll solve those problems in the next section. First, try a simple natural language prompt with the schema written out as text. This prompt has not been optimized:

model = genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest')

response = model.generate_content(
  textwrap.dedent("""\
    Please return JSON describing the people, places, things and relationships from this story using the following schema:

    {"people": list[PERSON], "places":list[PLACE], "things":list[THING], "relationships": list[RELATIONSHIP]}

    PERSON = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
    PLACE = {"name": str, "description": str}
    THING = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
    RELATIONSHIP = {"person_1_name": str, "person_2_name": str, "relationship": str}

    All fields are required.

    Important: Only return a single piece of valid JSON text.

    Here is the story:

    """) + story,
  generation_config={'response_mime_type':'application/json'}
)
response.text
'{"people": [\n    {\n        "name": "Anya",\n        "description": "A young girl who lives in the town of Willow Creek with her parents, Elise and Edward. She possesses a magical backpack that was handed down to her from her grandmother.",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Elise",\n        "description": "Anya\'s kind-hearted mother",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Edward",\n        "description": "Anya\'s wise-bearded father",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Samuel",\n        "description": "Anya\'s best friend, a curious and adventurous boy with a mischievous grin.",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Monster",\n        "description": "A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.",\n        "start_place_name": "Forest",\n        "end_place_name": "Forest"\n    }\n], "places": [\n    {\n        "name": "Willow Creek",\n        "description": "A quaint town nestled amidst rolling hills and whispering willows."\n    },\n    {\n        "name": "Forest",\n        "description": "A shadowy place with rustling undergrowth and whispering trees."\n    },\n    {\n        "name": "Schoolhouse",\n        "description": "The only school in the town of Willow Creek."\n    },\n    {\n        "name": "Anya\'s home",\n        "description": "A modest cottage with a creaky wooden door."\n    }\n], "things": [\n    {\n        "name": "Magic backpack",\n        "description": "A magical backpack that was handed down to Anya from her grandmother. 
Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew.",\n        "start_place_name": "Anya\'s home",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Shimmering sword",\n        "description": "A sword that shimmered in the sunlight and could strike with blinding light.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Book of ancient spells",\n        "description": "A book that contained ancient spells.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Tiny compass",\n        "description": "A compass that always pointed north.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Magical key",\n        "description": "A key that could open any lock.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Shattered crystals",\n        "description": "The remains of the monster after it was defeated by Anya\'s magic backpack.",\n        "start_place_name": "Forest",\n        "end_place_name": "Forest"\n    }\n], "relationships": [\n    {\n        "person_1_name": "Anya",\n        "person_2_name": "Elise",\n        "relationship": "mother-daughter"\n    },\n    {\n        "person_1_name": "Anya",\n        "person_2_name": "Edward",\n        "relationship": "father-daughter"\n    },\n    {\n        "person_1_name": "Anya",\n        "person_2_name": "Samuel",\n        "relationship": "best friends"\n    }\n]}'

That returned a JSON string. Try parsing it:

import json

print(json.dumps(json.loads(response.text), indent=4))
{
    "people": [
        {
            "name": "Anya",
            "description": "A young girl who lives in the town of Willow Creek with her parents, Elise and Edward. She possesses a magical backpack that was handed down to her from her grandmother.",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Elise",
            "description": "Anya's kind-hearted mother",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Edward",
            "description": "Anya's wise-bearded father",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Samuel",
            "description": "Anya's best friend, a curious and adventurous boy with a mischievous grin.",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Monster",
            "description": "A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.",
            "start_place_name": "Forest",
            "end_place_name": "Forest"
        }
    ],
    "places": [
        {
            "name": "Willow Creek",
            "description": "A quaint town nestled amidst rolling hills and whispering willows."
        },
        {
            "name": "Forest",
            "description": "A shadowy place with rustling undergrowth and whispering trees."
        },
        {
            "name": "Schoolhouse",
            "description": "The only school in the town of Willow Creek."
        },
        {
            "name": "Anya's home",
            "description": "A modest cottage with a creaky wooden door."
        }
    ],
    "things": [
        {
            "name": "Magic backpack",
            "description": "A magical backpack that was handed down to Anya from her grandmother. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew.",
            "start_place_name": "Anya's home",
            "end_place_name": "Forest"
        },
        {
            "name": "Shimmering sword",
            "description": "A sword that shimmered in the sunlight and could strike with blinding light.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Book of ancient spells",
            "description": "A book that contained ancient spells.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Tiny compass",
            "description": "A compass that always pointed north.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Magical key",
            "description": "A key that could open any lock.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Shattered crystals",
            "description": "The remains of the monster after it was defeated by Anya's magic backpack.",
            "start_place_name": "Forest",
            "end_place_name": "Forest"
        }
    ],
    "relationships": [
        {
            "person_1_name": "Anya",
            "person_2_name": "Elise",
            "relationship": "mother-daughter"
        },
        {
            "person_1_name": "Anya",
            "person_2_name": "Edward",
            "relationship": "father-daughter"
        },
        {
            "person_1_name": "Anya",
            "person_2_name": "Samuel",
            "relationship": "best friends"
        }
    ]
}
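
Because the schema in this prompt exists only as text, nothing guarantees that a reply matches it. The following is a minimal sketch of a defensive check, using only the standard library. The names check_extraction and REQUIRED_FIELDS are hypothetical and not part of the SDK, and the sample replies are made up for illustration, not real model output.

```python
import json

# Required keys per entity type, copied from the schema text in the prompt.
REQUIRED_FIELDS = {
    "people": {"name", "description", "start_place_name", "end_place_name"},
    "places": {"name", "description"},
    "things": {"name", "description", "start_place_name", "end_place_name"},
    "relationships": {"person_1_name", "person_2_name", "relationship"},
}


def check_extraction(raw_text):
    """Parse a reply and list every way it departs from the prompt schema.

    Returns (data, problems). data is None when the text is not valid JSON.
    """
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as err:
        return None, [f"invalid JSON: {err.msg}"]
    if not isinstance(data, dict):
        return data, ["top-level value is not an object"]

    problems = []
    for key, fields in REQUIRED_FIELDS.items():
        items = data.get(key)
        if not isinstance(items, list):
            problems.append(f"'{key}' is missing or not a list")
            continue
        for index, item in enumerate(items):
            if not isinstance(item, dict):
                problems.append(f"{key}[{index}] is not an object")
                continue
            missing = sorted(fields - set(item))
            if missing:
                problems.append(f"{key}[{index}] is missing {missing}")
    return data, problems


# Hypothetical replies for illustration only.
good = ('{"people": [], "places": [{"name": "Forest", "description": "Dark"}], '
        '"things": [], "relationships": []}')
bad = '{"people": [{"name": "Anya"}], "places": []}'

for reply in (good, bad, "Sure! Here is the JSON:"):
    print(check_extraction(reply)[1])
```

In the tutorial's flow you would pass response.text to check_extraction and decide whether to retry the request when the list of problems is not empty.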

Prompting for JSON this way is relatively simple and often works, but you can make the process stricter and more robust by defining the schema with the API's function calling feature.

Use function calling

If you haven't gone through the Function calling basics tutorial yet, do that first.

With function calling, your function and its parameters are described to the API as a glm.FunctionDeclaration. In basic cases the SDK can build the FunctionDeclaration from the function and its annotations. The SDK doesn't currently handle the description of nested OBJECT (dict) parameters, so for now you need to define them explicitly.

Define the schema

Start by defining person as an object with the string fields name, description, start_place_name, and end_place_name.

person = glm.Schema(
    type = glm.Type.OBJECT,
    properties = {
        'name':  glm.Schema(type=glm.Type.STRING),
        'description':  glm.Schema(type=glm.Type.STRING),
        'start_place_name': glm.Schema(type=glm.Type.STRING),
        'end_place_name': glm.Schema(type=glm.Type.STRING)
    },
    required=['name', 'description', 'start_place_name', 'end_place_name']
)

Next, define people as an ARRAY of person objects:

people = glm.Schema(
    type=glm.Type.ARRAY,
    items=person
)

Then do the same for each of the other entities you're trying to extract:

place = glm.Schema(
    type = glm.Type.OBJECT,
    properties = {
        'name':  glm.Schema(type=glm.Type.STRING),
        'description':  glm.Schema(type=glm.Type.STRING),
    }
)

places = glm.Schema(
    type=glm.Type.ARRAY,
    items=place
)
thing = glm.Schema(
  type = glm.Type.OBJECT,
  properties = {
      'name':  glm.Schema(type=glm.Type.STRING),
      'description':  glm.Schema(type=glm.Type.STRING),
  }
)

things = glm.Schema(
    type=glm.Type.ARRAY,
    items=thing
)
relationship = glm.Schema(
    type = glm.Type.OBJECT,
    properties = {
        'person_1_name':  glm.Schema(type=glm.Type.STRING),
        'person_2_name':  glm.Schema(type=glm.Type.STRING),
        'relationship':  glm.Schema(type=glm.Type.STRING),
    }
)

relationships = glm.Schema(
    type=glm.Type.ARRAY,
    items=relationship
)
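
The four definitions above share one shape: an OBJECT whose properties are all STRING fields, wrapped in an ARRAY. The sketch below factors that pattern into two helpers so the structure is easier to see. It deliberately builds plain Python dicts rather than glm.Schema objects, so it runs without the SDK. The helper names string_object and array_of are hypothetical, and whether the SDK accepts exactly this key spelling is an assumption not confirmed by this tutorial.

```python
def string_object(*fields, required=()):
    """Describe an OBJECT whose properties are all STRING fields (plain dict)."""
    schema = {
        "type": "OBJECT",
        "properties": {name: {"type": "STRING"} for name in fields},
    }
    if required:
        schema["required"] = list(required)
    return schema


def array_of(item_schema):
    """Describe an ARRAY whose items all follow item_schema (plain dict)."""
    return {"type": "ARRAY", "items": item_schema}


# Illustrative mirrors of the person/people and place/places schemas above.
person_fields = ("name", "description", "start_place_name", "end_place_name")
person_dict = string_object(*person_fields, required=person_fields)
people_dict = array_of(person_dict)
places_dict = array_of(string_object("name", "description"))

print(people_dict)
```

Note that only person lists its fields as required in the definitions above; place, thing, and relationship leave every field optional, which the helper reproduces by omitting the required key.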

Now build the FunctionDeclaration:

add_to_database = glm.FunctionDeclaration(
    name="add_to_database",
    description=textwrap.dedent("""\
        Adds entities to the database.
        """),
    parameters=glm.Schema(
        type=glm.Type.OBJECT,
        properties = {
            'people': people,
            'places': places,
            'things': things,
            'relationships': relationships
        }
    )
)
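
The declaration only describes add_to_database to the model; this tutorial never implements the function itself. As an illustration of what a local handler could look like, here is a minimal sketch that writes the extracted entities into an in-memory SQLite database. Everything in it (the table layout, create_tables, and this add_to_database signature) is a hypothetical design, not part of the Gemini SDK, and the sample rows are made up.

```python
import sqlite3

# Column layout per table, mirroring the schema definitions above.
TABLES = {
    "people": ("name", "description", "start_place_name", "end_place_name"),
    "places": ("name", "description"),
    "things": ("name", "description"),
    "relationships": ("person_1_name", "person_2_name", "relationship"),
}


def create_tables(conn):
    """Create one all-text table per entity type."""
    for table, columns in TABLES.items():
        column_sql = ", ".join(f"{column} TEXT" for column in columns)
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({column_sql})")


def add_to_database(conn, people=(), places=(), things=(), relationships=()):
    """Insert extracted entities; return the number of rows written per table."""
    batches = {
        "people": people,
        "places": places,
        "things": things,
        "relationships": relationships,
    }
    written = {}
    with conn:  # commits on success, rolls back if an insert raises
        for table, rows in batches.items():
            columns = TABLES[table]
            placeholders = ", ".join("?" for _ in columns)
            conn.executemany(
                f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})",
                # Fields the model left out are stored as NULL.
                [tuple(row.get(column) for column in columns) for row in rows],
            )
            written[table] = len(rows)
    return written


# Hypothetical arguments, shaped like the parameters declared above.
conn = sqlite3.connect(":memory:")
create_tables(conn)
counts = add_to_database(
    conn,
    people=[{"name": "Anya", "description": "A young girl",
             "start_place_name": "Willow Creek", "end_place_name": "Willow Creek"}],
    places=[{"name": "Willow Creek", "description": "A quaint town"}],
)
print(counts)
```

Using row.get for every column keeps the handler tolerant of optional fields, which matters here because only person marks its fields as required.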

Call the API

As you saw in Function calling basics, you can now pass this FunctionDeclaration to the tools argument of the genai.GenerativeModel constructor. The constructor also accepts an equivalent JSON representation of the function declaration:

model = genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest',
    tools = [add_to_database])

Each time you call the API, the SDK sends the tools along with your prompt, and the model should call the function you defined:

result = model.generate_content(f"""
Please add the people, places, things, and relationships from this story to the database:

{story}
""",
# Force a function call
tool_config={'function_calling_config':'ANY'})

Now there is no text to parse. The result is a data structure.

'text' in result.candidates[0].content.parts[0]
False
'function_call' in result.candidates[0].content.parts[0]
True
fc = result.candidates[0].content.parts[0].function_call
print(type(fc))
<class 'google.ai.generativelanguage_v1beta.types.content.FunctionCall'>

The glm.FunctionCall class is based on Google Protocol Buffers. Convert it to a more familiar JSON-compatible object:

print(json.dumps(type(fc).to_dict(fc), indent=4))
{
    "name": "add_to_database",
    "args": {
        "things": [
            {
                "name": "Magical Backpack",
                "description": "Anya's prized possession, the Magical Backpack, is no ordinary satchel. Its soft, emerald-green fabric shimmers with an ethereal glow, and its leather straps have secrets that only Anya knows. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever."
            },
            {
                "name": "Shimmering Sword",
                "description": "Among the wonders in Anya's Magical Backpack, lies a shimmering sword. With a determined gleam in her eye, she retrieved the shimmering sword and charged towards the monster."
            },
            {
                "description": "Residing within the Magical Backpack, the Book of Ancient Spells holds secrets untold.",
                "name": "Book of Ancient Spells"
            },
            {
                "description": "Tucked away in the Magical Backpack is a tiny compass that always points north.",
                "name": "Tiny Compass that Always Points North"
            },
            {
                "description": "Hidden within the Magical Backpack is a magical key that can open any lock.",
                "name": "Magical Key that Can Open Any Lock"
            }
        ],
        "relationships": [
            {
                "relationship": "Mother-Daughter",
                "person_1_name": "Anya",
                "person_2_name": "Elise"
            },
            {
                "person_2_name": "Edward",
                "relationship": "Father-Daughter",
                "person_1_name": "Anya"
            },
            {
                "person_2_name": "Samuel",
                "person_1_name": "Anya",
                "relationship": "Best Friends"
            }
        ],
        "people": [
            {
                "name": "Anya",
                "description": "Anya, the main character of the story, is a young girl with a magical backpack.",
                "start_place_name": "Willow Creek",
                "end_place_name": "Unknown"
            },
            {
                "name": "Elise",
                "description": "Anya's mother, Elise is a kind-hearted woman.",
                "end_place_name": "Unknown",
                "start_place_name": "Willow Creek"
            },
            {
                "start_place_name": "Willow Creek",
                "end_place_name": "Unknown",
                "name": "Edward",
                "description": "Anya's father, Edward is a wise-bearded man."
            },
            {
                "end_place_name": "Unknown",
                "start_place_name": "Willow Creek",
                "description": "Anya's best friend, Samuel is a curious and adventurous boy with a mischievous grin.",
                "name": "Samuel"
            }
        ],
        "places": [
            {
                "description": "The quaint town of Willow Creek is nestled amidst rolling hills and whispering willows.",
                "name": "Willow Creek"
            },
            {
                "description": "The town's only schoolhouse.",
                "name": "Schoolhouse"
            },
            {
                "description": "A shadowy place filled with secrets and dangers, the Forest is home to a terrifying monster.",
                "name": "Forest"
            }
        ]
    }
}
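
A schema constrains the shape of each entry but not whether the entries agree with each other. In the output above, for example, every person has an end_place_name of "Unknown", which is not one of the extracted places. The sketch below shows one way to catch such dangling references before storing the data. The function find_dangling_references is hypothetical, and the sample is a shortened copy of the output above.

```python
def find_dangling_references(args):
    """Return (field, owner, value) tuples for names that match no extracted entity."""
    place_names = {place.get("name") for place in args.get("places", [])}
    person_names = {person.get("name") for person in args.get("people", [])}

    dangling = []
    # Every place a person starts or ends in should be a known place.
    for person in args.get("people", []):
        for field in ("start_place_name", "end_place_name"):
            value = person.get(field)
            if value is not None and value not in place_names:
                dangling.append((field, person.get("name"), value))
    # Both sides of a relationship should be known people.
    for rel in args.get("relationships", []):
        for field in ("person_1_name", "person_2_name"):
            value = rel.get(field)
            if value not in person_names:
                dangling.append((field, rel.get("relationship"), value))
    return dangling


# Shortened copy of the "args" value printed above.
sample_args = {
    "people": [
        {"name": "Anya", "start_place_name": "Willow Creek", "end_place_name": "Unknown"},
        {"name": "Elise", "start_place_name": "Willow Creek", "end_place_name": "Unknown"},
    ],
    "places": [{"name": "Willow Creek"}, {"name": "Schoolhouse"}, {"name": "Forest"}],
    "relationships": [
        {"person_1_name": "Anya", "person_2_name": "Elise", "relationship": "Mother-Daughter"},
    ],
}

for problem in find_dangling_references(sample_args):
    print(problem)
```

With the real response, the input would be the "args" entry of the dict produced by type(fc).to_dict(fc). How to handle a flagged value, for instance by re-prompting or by storing it as missing, is a design decision this sketch leaves open.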

Conclusion

While the API can handle structured data extraction problems with plain text input and text output, using function calling is likely more reliable, because it lets you define a strict schema and eliminates a potentially error-prone parsing step.