Extract structured data using function calling

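In this tutorial you'll work through a structured data extraction example, using the Gemini API to extract lists of characters, relationships, things, and places from a story.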

Setup

pip install -U -q google-generativeai
import pathlib
import textwrap

import google.generativeai as genai


from IPython.display import display
from IPython.display import Markdown

from google.api_core import retry

def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

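Once you have the API key, pass it to the SDK. You can do this in two ways:

  • Put the key in the GOOGLE_API_KEY environment variable (the SDK picks it up from there automatically).
  • Pass the key to genai.configure(api_key=...)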
try:
    # Used to securely store your API key
    from google.colab import userdata

    # Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
    GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
except ImportError:
    import os
    GOOGLE_API_KEY = os.environ['GOOGLE_API_KEY']

genai.configure(api_key=GOOGLE_API_KEY)

Example task

In this tutorial you'll extract entities from natural language stories. As an example, below is a story written by Gemini.

new_story = False

if new_story:
  model = genai.GenerativeModel(model_name='models/gemini-1.5-pro-latest')

  response = model.generate_content("""
      Write a long story about a girl with a magic backpack, her family, and at
      least one other character. Make sure everyone has names. Don't forget to
      describe the contents of the backpack, and where everyone and everything
      starts and ends up.""", request_options={'retry': retry.Retry()})
  story = response.text
  print(response.candidates[0].citation_metadata)
else:
  story = """In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\n\nHanded down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\n\nAnya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. "Remember, my dear," whispered her mother, "use your magic wisely and for good." Her father added, "Always seek knowledge, and let the backpack be your trusted companion."\n\nWith a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. "Hey, Anya," he called out. "Can I see your backpack?"\n\nAnya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\n\nTogether, they marveled at the backpack's wonders, promising to keep its secrets safe. As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\n\n"What's wrong?" she asked.\n\nA tall, lanky boy stepped forward. "There's a monster in the forest," he stammered. 
"It's been terrorizing the town, attacking animals and even people."\n\nAnya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\n\nWithout a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. "Don't worry," she said, her voice steady. "I'll take care of it."\n\nWith Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\n\nSuddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\n\nFear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\n\nWhen the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\n\nAs she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time."""
to_markdown(story)

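In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.

Handed down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.

Anya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. "Remember, my dear," whispered her mother, "use your magic wisely and for good." Her father added, "Always seek knowledge, and let the backpack be your trusted companion."

With a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. "Hey, Anya," he called out. "Can I see your backpack?"

Anya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.

Together, they marveled at the backpack's wonders, promising to keep its secrets safe. As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.

"What's wrong?" she asked.

A tall, lanky boy stepped forward. "There's a monster in the forest," he stammered. "It's been terrorizing the town, attacking animals and even people."

Anya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.

Without a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. "Don't worry," she said, her voice steady. "I'll take care of it."

With Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.

Suddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.

Fear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.

When the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.

As she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time.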

Using natural language

Large language models are powerful multitasking tools. Often you can simply ask Gemini for what you want and get a usable answer.

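The request below sets response_mime_type to application/json so that the reply is JSON, but the schema itself is only described in the prompt text, so there are a few things to watch for when generating data structures this way: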

  • Parsing can occasionally fail.
  • The schema can't be strictly enforced.
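Both risks can be guarded against with a defensive parse. The following is a minimal sketch using only the standard library. The helper name `parse_entities` and the `REQUIRED_FIELDS` table are assumptions made for illustration (they mirror the PERSON, PLACE, THING, and RELATIONSHIP text schema used in this section's prompt); they are not part of the Gemini SDK.

```python
import json

# Hypothetical table of required fields per top-level key; it mirrors the
# text schema used in this section's prompt.
REQUIRED_FIELDS = {
    "people": {"name", "description", "start_place_name", "end_place_name"},
    "places": {"name", "description"},
    "things": {"name", "description", "start_place_name", "end_place_name"},
    "relationships": {"person_1_name", "person_2_name", "relationship"},
}


def parse_entities(text):
    """Parse model output and return (data, problems).

    `data` is None when the text is not valid JSON. `problems` is a list of
    human-readable strings describing schema violations (empty if none).
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return None, [f"invalid JSON: {err}"]

    if not isinstance(data, dict):
        return data, ["top-level value is not an object"]

    problems = []
    for key, fields in REQUIRED_FIELDS.items():
        items = data.get(key)
        if not isinstance(items, list):
            problems.append(f"'{key}' is missing or not a list")
            continue
        for i, item in enumerate(items):
            if not isinstance(item, dict):
                problems.append(f"{key}[{i}] is not an object")
                continue
            missing = fields - set(item)
            if missing:
                problems.append(f"{key}[{i}] is missing {sorted(missing)}")
    return data, problems
```

One possible use is to call `parse_entities(response.text)` on a model reply and retry the request when `data` is `None` or `problems` is non-empty.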

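You'll solve those problems in the next section. First, try a simple natural language prompt with the schema written out as text. This prompt has not been optimized: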

model = genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest')

response = model.generate_content(
  textwrap.dedent("""\
    Please return JSON describing the people, places, things and relationships from this story using the following schema:

    {"people": list[PERSON], "places":list[PLACE], "things":list[THING], "relationships": list[RELATIONSHIP]}

    PERSON = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
    PLACE = {"name": str, "description": str}
    THING = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
    RELATIONSHIP = {"person_1_name": str, "person_2_name": str, "relationship": str}

    All fields are required.

    Important: Only return a single piece of valid JSON text.

    Here is the story:

    """) + story,
  generation_config={'response_mime_type':'application/json'}
)
response.text
'{"people": [\n    {\n        "name": "Anya",\n        "description": "A young girl who lives in the town of Willow Creek with her parents, Elise and Edward. She possesses a magical backpack that was handed down to her from her grandmother.",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Elise",\n        "description": "Anya\'s kind-hearted mother",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Edward",\n        "description": "Anya\'s wise-bearded father",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Samuel",\n        "description": "Anya\'s best friend, a curious and adventurous boy with a mischievous grin.",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Monster",\n        "description": "A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.",\n        "start_place_name": "Forest",\n        "end_place_name": "Forest"\n    }\n], "places": [\n    {\n        "name": "Willow Creek",\n        "description": "A quaint town nestled amidst rolling hills and whispering willows."\n    },\n    {\n        "name": "Forest",\n        "description": "A shadowy place with rustling undergrowth and whispering trees."\n    },\n    {\n        "name": "Schoolhouse",\n        "description": "The only school in the town of Willow Creek."\n    },\n    {\n        "name": "Anya\'s home",\n        "description": "A modest cottage with a creaky wooden door."\n    }\n], "things": [\n    {\n        "name": "Magic backpack",\n        "description": "A magical backpack that was handed down to Anya from her grandmother. 
Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew.",\n        "start_place_name": "Anya\'s home",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Shimmering sword",\n        "description": "A sword that shimmered in the sunlight and could strike with blinding light.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Book of ancient spells",\n        "description": "A book that contained ancient spells.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Tiny compass",\n        "description": "A compass that always pointed north.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Magical key",\n        "description": "A key that could open any lock.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Shattered crystals",\n        "description": "The remains of the monster after it was defeated by Anya\'s magic backpack.",\n        "start_place_name": "Forest",\n        "end_place_name": "Forest"\n    }\n], "relationships": [\n    {\n        "person_1_name": "Anya",\n        "person_2_name": "Elise",\n        "relationship": "mother-daughter"\n    },\n    {\n        "person_1_name": "Anya",\n        "person_2_name": "Edward",\n        "relationship": "father-daughter"\n    },\n    {\n        "person_1_name": "Anya",\n        "person_2_name": "Samuel",\n        "relationship": "best friends"\n    }\n]}'

That returned a JSON string. Try parsing it:

import json

print(json.dumps(json.loads(response.text), indent=4))
{
    "people": [
        {
            "name": "Anya",
            "description": "A young girl who lives in the town of Willow Creek with her parents, Elise and Edward. She possesses a magical backpack that was handed down to her from her grandmother.",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Elise",
            "description": "Anya's kind-hearted mother",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Edward",
            "description": "Anya's wise-bearded father",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Samuel",
            "description": "Anya's best friend, a curious and adventurous boy with a mischievous grin.",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Monster",
            "description": "A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.",
            "start_place_name": "Forest",
            "end_place_name": "Forest"
        }
    ],
    "places": [
        {
            "name": "Willow Creek",
            "description": "A quaint town nestled amidst rolling hills and whispering willows."
        },
        {
            "name": "Forest",
            "description": "A shadowy place with rustling undergrowth and whispering trees."
        },
        {
            "name": "Schoolhouse",
            "description": "The only school in the town of Willow Creek."
        },
        {
            "name": "Anya's home",
            "description": "A modest cottage with a creaky wooden door."
        }
    ],
    "things": [
        {
            "name": "Magic backpack",
            "description": "A magical backpack that was handed down to Anya from her grandmother. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew.",
            "start_place_name": "Anya's home",
            "end_place_name": "Forest"
        },
        {
            "name": "Shimmering sword",
            "description": "A sword that shimmered in the sunlight and could strike with blinding light.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Book of ancient spells",
            "description": "A book that contained ancient spells.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Tiny compass",
            "description": "A compass that always pointed north.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Magical key",
            "description": "A key that could open any lock.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Shattered crystals",
            "description": "The remains of the monster after it was defeated by Anya's magic backpack.",
            "start_place_name": "Forest",
            "end_place_name": "Forest"
        }
    ],
    "relationships": [
        {
            "person_1_name": "Anya",
            "person_2_name": "Elise",
            "relationship": "mother-daughter"
        },
        {
            "person_1_name": "Anya",
            "person_2_name": "Edward",
            "relationship": "father-daughter"
        },
        {
            "person_1_name": "Anya",
            "person_2_name": "Samuel",
            "relationship": "best friends"
        }
    ]
}
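A successful parse does not guarantee that the names are consistent with each other. In the output above, for example, several things use "Magic backpack" as their start_place_name even though it is listed under things, not places. A minimal sketch of such a cross-reference check follows; the function name `find_unknown_places` is hypothetical, and `sample` is a trimmed copy of the parsed output above.

```python
def find_unknown_places(data):
    """Return sorted place names referenced by people/things but not listed in places."""
    known = {place["name"] for place in data.get("places", [])}
    unknown = set()
    for key in ("people", "things"):
        for item in data.get(key, []):
            for field in ("start_place_name", "end_place_name"):
                name = item.get(field)
                if name is not None and name not in known:
                    unknown.add(name)
    return sorted(unknown)


# Trimmed copy of the parsed output above.
sample = {
    "people": [
        {"name": "Anya", "start_place_name": "Willow Creek",
         "end_place_name": "Willow Creek"},
    ],
    "places": [{"name": "Willow Creek"}, {"name": "Forest"}],
    "things": [
        {"name": "Shimmering sword", "start_place_name": "Magic backpack",
         "end_place_name": "Forest"},
    ],
}

print(find_unknown_places(sample))  # ['Magic backpack']
```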

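Prompting with a schema written as text is relatively simple and often works, but you can make the extraction stricter and more reliable by defining the schema with the API's function calling feature.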

Using function calling

If you haven't gone through the Function calling basics tutorial yet, make sure you do that first.

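With function calling, your function and its parameters are described to the API as a genai.protos.FunctionDeclaration. In basic cases the SDK can build the FunctionDeclaration from the function and its annotations. The schema in this tutorial nests objects inside arrays, which goes beyond that basic case, so for now you need to define the declarations explicitly.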
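To give a rough feel for what building a declaration from a function and its annotations involves, here is a sketch that uses only the standard library. It illustrates the idea and is not the SDK's implementation: `describe_function`, `TYPE_NAMES`, and the `multiply` example are hypothetical, and the result is a plain dict rather than a genai.protos.FunctionDeclaration.

```python
import inspect
import typing

# Hypothetical mapping from simple Python annotations to schema type names.
TYPE_NAMES = {str: "STRING", int: "INTEGER", float: "NUMBER", bool: "BOOLEAN"}


def describe_function(func):
    """Build a plain-dict description of `func` from its signature and docstring."""
    hints = typing.get_type_hints(func)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        annotation = hints.get(name)
        if annotation not in TYPE_NAMES:
            # Nested or unannotated parameters are out of scope for this sketch.
            raise TypeError(f"unsupported annotation for {name!r}: {annotation!r}")
        properties[name] = {"type": TYPE_NAMES[annotation]}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "OBJECT", "properties": properties, "required": required},
    }


def multiply(a: float, b: float) -> float:
    """Returns a * b."""
    return a * b


print(describe_function(multiply))
```

A parameter such as a list of dicts falls outside this simple mapping and raises a TypeError here, which is the kind of case where spelling out the schema by hand becomes necessary.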

Define the schema

Start by defining person as an object with the string fields name, description, start_place_name, and end_place_name.

person = genai.protos.Schema(
    type = genai.protos.Type.OBJECT,
    properties = {
        'name':  genai.protos.Schema(type=genai.protos.Type.STRING),
        'description':  genai.protos.Schema(type=genai.protos.Type.STRING),
        'start_place_name': genai.protos.Schema(type=genai.protos.Type.STRING),
        'end_place_name': genai.protos.Schema(type=genai.protos.Type.STRING)
    },
    required=['name', 'description', 'start_place_name', 'end_place_name']
)

Then define people as an ARRAY of person objects:

people = genai.protos.Schema(
    type=genai.protos.Type.ARRAY,
    items=person
)

Then do the same for each of the other entities you want to extract:

place = genai.protos.Schema(
    type = genai.protos.Type.OBJECT,
    properties = {
        'name':  genai.protos.Schema(type=genai.protos.Type.STRING),
        'description':  genai.protos.Schema(type=genai.protos.Type.STRING),
    }
)

places = genai.protos.Schema(
    type=genai.protos.Type.ARRAY,
    items=place
)
thing = genai.protos.Schema(
  type = genai.protos.Type.OBJECT,
  properties = {
      'name':  genai.protos.Schema(type=genai.protos.Type.STRING),
      'description':  genai.protos.Schema(type=genai.protos.Type.STRING),
  }
)

things = genai.protos.Schema(
    type=genai.protos.Type.ARRAY,
    items=thing
)
relationship = genai.protos.Schema(
    type = genai.protos.Type.OBJECT,
    properties = {
        'person_1_name':  genai.protos.Schema(type=genai.protos.Type.STRING),
        'person_2_name':  genai.protos.Schema(type=genai.protos.Type.STRING),
        'relationship':  genai.protos.Schema(type=genai.protos.Type.STRING),
    }
)

relationships = genai.protos.Schema(
    type=genai.protos.Type.ARRAY,
    items=relationship
)

Now build the FunctionDeclaration:

add_to_database = genai.protos.FunctionDeclaration(
    name="add_to_database",
    description=textwrap.dedent("""\
        Adds entities to the database.
        """),
    parameters=genai.protos.Schema(
        type=genai.protos.Type.OBJECT,
        properties = {
            'people': people,
            'places': places,
            'things': things,
            'relationships': relationships
        }
    )
)

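Call the API

As described in the Function calling basics tutorial, you can now pass this FunctionDeclaration to the tools argument of the genai.GenerativeModel constructor (the constructor also accepts an equivalent JSON representation of the function declaration):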

model = genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest',
    tools = [add_to_database])

Each call to the API sends the tools along with the prompt, and the model should call the function you defined:

result = model.generate_content(f"""
Please add the people, places, things, and relationships from this story to the database:

{story}
""",
# Force a function call
tool_config={'function_calling_config':'ANY'})

Now there is no text to parse. The result is a data structure.

'text' in result.candidates[0].content.parts[0]
False
'function_call' in result.candidates[0].content.parts[0]
True
fc = result.candidates[0].content.parts[0].function_call
print(type(fc))
<class 'google.ai.generativelanguage_v1beta.types.content.FunctionCall'>

The genai.protos.FunctionCall class is based on Google Protocol Buffers. Convert it to a more familiar JSON-compatible object:

print(json.dumps(type(fc).to_dict(fc), indent=4))
{
    "name": "add_to_database",
    "args": {
        "things": [
            {
                "name": "Magical Backpack",
                "description": "Anya's prized possession, the Magical Backpack, is no ordinary satchel. Its soft, emerald-green fabric shimmers with an ethereal glow, and its leather straps have secrets that only Anya knows. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever."
            },
            {
                "name": "Shimmering Sword",
                "description": "Among the wonders in Anya's Magical Backpack, lies a shimmering sword. With a determined gleam in her eye, she retrieved the shimmering sword and charged towards the monster."
            },
            {
                "description": "Residing within the Magical Backpack, the Book of Ancient Spells holds secrets untold.",
                "name": "Book of Ancient Spells"
            },
            {
                "description": "Tucked away in the Magical Backpack is a tiny compass that always points north.",
                "name": "Tiny Compass that Always Points North"
            },
            {
                "description": "Hidden within the Magical Backpack is a magical key that can open any lock.",
                "name": "Magical Key that Can Open Any Lock"
            }
        ],
        "relationships": [
            {
                "relationship": "Mother-Daughter",
                "person_1_name": "Anya",
                "person_2_name": "Elise"
            },
            {
                "person_2_name": "Edward",
                "relationship": "Father-Daughter",
                "person_1_name": "Anya"
            },
            {
                "person_2_name": "Samuel",
                "person_1_name": "Anya",
                "relationship": "Best Friends"
            }
        ],
        "people": [
            {
                "name": "Anya",
                "description": "Anya, the main character of the story, is a young girl with a magical backpack.",
                "start_place_name": "Willow Creek",
                "end_place_name": "Unknown"
            },
            {
                "name": "Elise",
                "description": "Anya's mother, Elise is a kind-hearted woman.",
                "end_place_name": "Unknown",
                "start_place_name": "Willow Creek"
            },
            {
                "start_place_name": "Willow Creek",
                "end_place_name": "Unknown",
                "name": "Edward",
                "description": "Anya's father, Edward is a wise-bearded man."
            },
            {
                "end_place_name": "Unknown",
                "start_place_name": "Willow Creek",
                "description": "Anya's best friend, Samuel is a curious and adventurous boy with a mischievous grin.",
                "name": "Samuel"
            }
        ],
        "places": [
            {
                "description": "The quaint town of Willow Creek is nestled amidst rolling hills and whispering willows.",
                "name": "Willow Creek"
            },
            {
                "description": "The town's only schoolhouse.",
                "name": "Schoolhouse"
            },
            {
                "description": "A shadowy place filled with secrets and dangers, the Forest is home to a terrifying monster.",
                "name": "Forest"
            }
        ]
    }
}
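Once the call is a plain dict, it can be routed to ordinary Python code. The sketch below shows one hypothetical way to do that. The `HANDLERS` table, the in-memory `database` dict, and the local `add_to_database_impl` function are assumptions for illustration (the tutorial only declares add_to_database and never implements it), and `call` is a trimmed stand-in for the dict printed above.

```python
# Hypothetical in-memory "database": one list per entity type.
database = {"people": [], "places": [], "things": [], "relationships": []}


def add_to_database_impl(people=(), places=(), things=(), relationships=()):
    """Append the extracted entities to the in-memory database and return counts."""
    incoming = {"people": people, "places": places,
                "things": things, "relationships": relationships}
    for key, items in incoming.items():
        database[key].extend(items)
    return {key: len(items) for key, items in incoming.items()}


# Map function names the model may call to local implementations.
HANDLERS = {"add_to_database": add_to_database_impl}


def dispatch(call):
    """Run the local handler named in a function-call dict with 'name' and 'args' keys."""
    handler = HANDLERS.get(call["name"])
    if handler is None:
        raise KeyError(f"no handler registered for {call['name']!r}")
    return handler(**call["args"])


# Trimmed stand-in for the dict printed above.
call = {
    "name": "add_to_database",
    "args": {
        "people": [{"name": "Anya", "start_place_name": "Willow Creek",
                    "end_place_name": "Unknown"}],
        "places": [{"name": "Willow Creek"}, {"name": "Forest"}],
        "relationships": [{"person_1_name": "Anya", "person_2_name": "Elise",
                           "relationship": "Mother-Daughter"}],
    },
}

print(dispatch(call))
# {'people': 1, 'places': 2, 'things': 0, 'relationships': 1}
```

With a real response, the dict produced by `type(fc).to_dict(fc)` could be passed to `dispatch` in place of `call`. Defaulting every argument to an empty tuple keeps the handler tolerant of calls that omit one of the entity lists.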

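Conclusion

While the API can handle structured data extraction problems with pure text input and text output, using function calling is likely to be more reliable, since it lets you define a strict schema and removes a potentially error-prone parsing step.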