使用函数调用提取结构化数据

在 Google AI 上查看 在 Google Colab 中运行 查看 GitHub 上的源代码

在本教程中,您将完成一个结构化数据提取示例,使用 Gemini API 从故事中提取角色、关系、事物和地点的列表。

初始设置

pip install -U -q google-generativeai
import pathlib
import textwrap

import google.generativeai as genai
import google.ai.generativelanguage as glm


from IPython.display import display
from IPython.display import Markdown

from google.api_core import retry

def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

获得 API 密钥后,将其传递给 SDK。可以通过以下两种方法实现此目的:

  • 将密钥放在 GOOGLE_API_KEY 环境变量中(SDK 会自动从该变量中获取密钥)。
  • 将密钥传递给 genai.configure(api_key=...)
try:
    # Used to securely store your API key
    from google.colab import userdata

    # Or use `os.getenv('API_KEY')` to fetch an environment variable.
    GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
except ImportError:
    import os
    GOOGLE_API_KEY = os.environ['GOOGLE_API_KEY']

genai.configure(api_key=GOOGLE_API_KEY)

示例任务

在本教程中,您将从自然语言故事中提取实体。例如,下面这个故事是由 Gemini 撰写的。

new_story = False

if new_story:
  model = genai.GenerativeModel(model_name='models/gemini-1.5-pro-latest')

  response = model.generate_content("""
      Write a long story about a girl with magic backpack, her family, and at
      least one other charater. Make sure everyone has names. Don't forget to
      describe the contents of the backpack, and where everyone and everything
      starts and ends up.""", request_options={'retry': retry.Retry()})
  story = response.text
  print(response.candidates[0].citation_metadata)
else:
  story = """In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\n\nHanded down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\n\nAnya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. "Remember, my dear," whispered her mother, "use your magic wisely and for good." Her father added, "Always seek knowledge, and let the backpack be your trusted companion."\n\nWith a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. "Hey, Anya," he called out. "Can I see your backpack?"\n\nAnya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\n\nTogether, they marveled at the backpack's wonders, promising to keep its secrets safe. As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\n\n"What's wrong?" she asked.\n\nA tall, lanky boy stepped forward. "There's a monster in the forest," he stammered. "It's been terrorizing the town, attacking animals and even people."\n\nAnya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\n\nWithout a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. "Don't worry," she said, her voice steady. "I'll take care of it."\n\nWith Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\n\nSuddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\n\nFear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\n\nWhen the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\n\nAs she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time."""
to_markdown(story)

在古朴的柳溪小镇,她坐落在连绵起伏的山丘和低声的柳树之间,她住着一位名叫 Anya 的女孩。走出简朴乡间别墅里发出咔哒声的木门时,她的心跳跳了,既激动又期待。今天是她上学的第一天,她迫不及待地想要炫耀一下自己的好礼 - 一个神奇的背包。

这个背包不是普通的书包,而是从祖母那里传给她。它柔软、翡翠绿色的面料闪烁着空灵的光芒,皮革表带中藏着只有 Anya 才知道的秘密。它宽敞开阔的内饰是一个迷人的世界,里面充满了奇观,可以激发她的想象力,并永远改变她的人生。

Anya 的父母、善良的 Elise 和聪明的爱德华在向她告别时热烈拥抱。“记住,亲爱的,”她妈妈轻声地说,“你应该明智地运用你的魔法,造福福祉。”她的父亲补充道:“始终寻求知识,让背包成为你信赖的伙伴。”

Anya 跳过了一会儿,前往镇上唯一的校舍。在路上,她认识了她最好的朋友 Samuel,这个好奇心强且喜欢冒险的男孩带着调皮的笑脸。他大声喊道,“嗨,Anya”。“我能看见你的背包吗?”

安雅犹豫了一会儿,才解开翻盖,露出里面的内容。Samuel 窥视内部时,眼睛惊呆了。然后,还有一把闪闪发光的利剑、一本古代咒语、一个始终指向北方的小罗盘和一个可以打开任意锁的魔法钥匙,各种物品里散落着这把铅笔和笔记本。

他们齐心协力,惊奇地发现背包的奇妙,并承诺保守其中的秘密。当他们接近校舍时,Anya 注意到一群年长的孩子挤在一起,他们的脸上刻满了恐惧。由于好奇心越来越强,她谨慎地走了过来。

“怎么了?”她问道。

一个瘦长的男孩向前走了。“森林里有个怪物,”他吃了一下。“它一直在袭击小镇、攻击动物,甚至还有人。”

Anya 的心坠落了。柳溪镇面积不大,很平静,一想到怪兽,她脑子里便不禁颤抖。她知道,她必须采取措施保护自己的家人和朋友。

Anya 没有一刻犹豫,打开了背包,拿起了闪闪发光的宝剑。她眼中露出坚定的一丝光芒,转而看着那些害怕的同伴。“别担心,”她说,声音平稳。“我会帮你处理的。”

在 Samuel 的身后,Anya 冒险进入了森林深处。当她经过时,这些树似乎在悄悄地悄悄告诉她秘密,而丛林里有看不见的生物。走进森林深处时,空气变得越来越厚,他们脚下的地面在颤抖。

突然,他们来到一块空地,眼前出现了一个怪物:一只巨大的野兽,牙齿锋利,眼睛发红,爪子可以轻而易举地压倒一个人。怪物发出了吼声,可是震撼到树核的雷声。

Anya 让恐惧愈发强烈,但她拒绝让这种现象吞噬。她从剑中拔出利剑,朝怪物冲击。刀刃在阳光下闪闪发光,在击打怪物的藏身时,一道闪烁的闪光点亮了,笼罩在它的光芒中。

光线昏暗时,怪兽消失了,取而代之的是一堆碎碎的水晶。Anya 借助背包的魔力击败了怪物,证明即使是最小的物体也能拥有最大的力量。

当她和 Samuel 返回小镇时,他们被视作英雄。柳溪的人们感到欣喜若狂,背着魔法背包的女孩安雅的传奇故事世世代代流传下来。就这样,Anya 继续踏上了冒险之旅,利用背包的魔法让世界变得更美好,一步一步地进行。

使用 Natural Language

大语言模型是一种强大的多任务工具。通常,你可以直接问 Gemini 你想要的东西,没关系。

Gemini API 没有 JSON 模式,因此以这种方式生成数据结构时需要注意以下几点:

  • 有时解析会失败。
  • 无法严格强制执行此架构。

您将在下一部分中解决这些问题。首先,尝试使用简单的自然语言提示,以文本形式写出架构。此配置尚未优化:

model = model = model = genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest')

response = model.generate_content(
  textwrap.dedent("""\
    Please return JSON describing the the people, places, things and relationships from this story using the following schema:

    {"people": list[PERSON], "places":list[PLACE], "things":list[THING], "relationships": list[RELATIONSHIP]}

    PERSON = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
    PLACE = {"name": str, "description": str}
    THING = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
    RELATIONSHIP = {"person_1_name": str, "person_2_name": str, "relationship": str}

    All fields are required.

    Important: Only return a single piece of valid JSON text.

    Here is the story:

    """) + story,
  generation_config={'response_mime_type':'application/json'}
)
response.text
'{"people": [\n    {\n        "name": "Anya",\n        "description": "A young girl who lives in the town of Willow Creek with her parents, Elise and Edward. She possesses a magical backpack that was handed down to her from her grandmother.",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Elise",\n        "description": "Anya\'s kind-hearted mother",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Edward",\n        "description": "Anya\'s wise-bearded father",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Samuel",\n        "description": "Anya\'s best friend, a curious and adventurous boy with a mischievous grin.",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Monster",\n        "description": "A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.",\n        "start_place_name": "Forest",\n        "end_place_name": "Forest"\n    }\n], "places": [\n    {\n        "name": "Willow Creek",\n        "description": "A quaint town nestled amidst rolling hills and whispering willows."\n    },\n    {\n        "name": "Forest",\n        "description": "A shadowy place with rustling undergrowth and whispering trees."\n    },\n    {\n        "name": "Schoolhouse",\n        "description": "The only school in the town of Willow Creek."\n    },\n    {\n        "name": "Anya\'s home",\n        "description": "A modest cottage with a creaky wooden door."\n    }\n], "things": [\n    {\n        "name": "Magic backpack",\n        "description": "A magical backpack that was handed down to Anya from her grandmother. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew.",\n        "start_place_name": "Anya\'s home",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Shimmering sword",\n        "description": "A sword that shimmered in the sunlight and could strike with blinding light.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Book of ancient spells",\n        "description": "A book that contained ancient spells.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Tiny compass",\n        "description": "A compass that always pointed north.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Magical key",\n        "description": "A key that could open any lock.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Shattered crystals",\n        "description": "The remains of the monster after it was defeated by Anya\'s magic backpack.",\n        "start_place_name": "Forest",\n        "end_place_name": "Forest"\n    }\n], "relationships": [\n    {\n        "person_1_name": "Anya",\n        "person_2_name": "Elise",\n        "relationship": "mother-daughter"\n    },\n    {\n        "person_1_name": "Anya",\n        "person_2_name": "Edward",\n        "relationship": "father-daughter"\n    },\n    {\n        "person_1_name": "Anya",\n        "person_2_name": "Samuel",\n        "relationship": "best friends"\n    }\n]}'

返回了一个 json 字符串。请尝试解析:

import json

print(json.dumps(json.loads(response.text), indent=4))
{
    "people": [
        {
            "name": "Anya",
            "description": "A young girl who lives in the town of Willow Creek with her parents, Elise and Edward. She possesses a magical backpack that was handed down to her from her grandmother.",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Elise",
            "description": "Anya's kind-hearted mother",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Edward",
            "description": "Anya's wise-bearded father",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Samuel",
            "description": "Anya's best friend, a curious and adventurous boy with a mischievous grin.",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Monster",
            "description": "A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.",
            "start_place_name": "Forest",
            "end_place_name": "Forest"
        }
    ],
    "places": [
        {
            "name": "Willow Creek",
            "description": "A quaint town nestled amidst rolling hills and whispering willows."
        },
        {
            "name": "Forest",
            "description": "A shadowy place with rustling undergrowth and whispering trees."
        },
        {
            "name": "Schoolhouse",
            "description": "The only school in the town of Willow Creek."
        },
        {
            "name": "Anya's home",
            "description": "A modest cottage with a creaky wooden door."
        }
    ],
    "things": [
        {
            "name": "Magic backpack",
            "description": "A magical backpack that was handed down to Anya from her grandmother. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew.",
            "start_place_name": "Anya's home",
            "end_place_name": "Forest"
        },
        {
            "name": "Shimmering sword",
            "description": "A sword that shimmered in the sunlight and could strike with blinding light.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Book of ancient spells",
            "description": "A book that contained ancient spells.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Tiny compass",
            "description": "A compass that always pointed north.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Magical key",
            "description": "A key that could open any lock.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Shattered crystals",
            "description": "The remains of the monster after it was defeated by Anya's magic backpack.",
            "start_place_name": "Forest",
            "end_place_name": "Forest"
        }
    ],
    "relationships": [
        {
            "person_1_name": "Anya",
            "person_2_name": "Elise",
            "relationship": "mother-daughter"
        },
        {
            "person_1_name": "Anya",
            "person_2_name": "Edward",
            "relationship": "father-daughter"
        },
        {
            "person_1_name": "Anya",
            "person_2_name": "Samuel",
            "relationship": "best friends"
        }
    ]
}

这种方法相对简单,并且通常有效,但您可以使用 API 的函数调用功能定义架构,使这种方法更加严格/稳健。

使用函数调用

如果您尚未学完函数调用基础知识教程,请务必先学完。

通过函数调用时,您的函数及其参数会以 glm.FunctionDeclaration 的形式描述给 API。基本情况下,SDK 可以根据函数及其注解构建 FunctionDeclaration。SDK 目前不处理嵌套 OBJECT (dict) 参数的说明。因此,目前您需要明确定义它们。

定义架构

首先,将 person 定义为具有字符串字段 namedescriptionstart_place_nameend_place_name 的对象。

person = glm.Schema(
    type = glm.Type.OBJECT,
    properties = {
        'name':  glm.Schema(type=glm.Type.STRING),
        'description':  glm.Schema(type=glm.Type.STRING),
        'start_place_name': glm.Schema(type=glm.Type.STRING),
        'end_place_name': glm.Schema(type=glm.Type.STRING)
    },
    required=['name', 'description', 'start_place_name', 'end_place_name']
)

然后,将人物定义为 person 对象的 ARRAY

people = glm.Schema(
    type=glm.Type.ARRAY,
    items=person
)

然后,对您尝试提取的每个实体执行相同的操作:

place = glm.Schema(
    type = glm.Type.OBJECT,
    properties = {
        'name':  glm.Schema(type=glm.Type.STRING),
        'description':  glm.Schema(type=glm.Type.STRING),
    }
)

places = glm.Schema(
    type=glm.Type.ARRAY,
    items=place
)
thing = glm.Schema(
  type = glm.Type.OBJECT,
  properties = {
      'name':  glm.Schema(type=glm.Type.STRING),
      'description':  glm.Schema(type=glm.Type.STRING),
  }
)

things = glm.Schema(
    type=glm.Type.ARRAY,
    items=thing
)
relationship = glm.Schema(
    type = glm.Type.OBJECT,
    properties = {
        'person_1_name':  glm.Schema(type=glm.Type.STRING),
        'person_2_name':  glm.Schema(type=glm.Type.STRING),
        'relationship':  glm.Schema(type=glm.Type.STRING),
    }
)

relationships = glm.Schema(
    type=glm.Type.ARRAY,
    items=relationship
)

现在,构建 FunctionDeclaration

add_to_database = glm.FunctionDeclaration(
    name="add_to_database",
    description=textwrap.dedent("""\
        Adds entities to the database.
        """),
    parameters=glm.Schema(
        type=glm.Type.OBJECT,
        properties = {
            'people': people,
            'places': places,
            'things': things,
            'relationships': relationships
        }
    )
)

调用该 API

就像您在函数调用基础知识中看到的一样,现在您可以将此 FunctionDeclaration 传递给 genai.GenerativeModel 构造函数的 tools 参数(该构造函数还将接受函数声明的等效 JSON 表示法):

model = model = genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest',
    tools = [add_to_database])

每次您调用 API 时,SDK 都会将工具随提示一起发送,并且模型应调用您定义的函数:

result = model.generate_content(f"""
Please add the people, places, things, and relationships from this story to the database:

{story}
""",
# Force a function call
tool_config={'function_calling_config':'ANY'})

现在没有要解析的文本。结果就是一个数据结构。

'text' in result.candidates[0].content.parts[0]
False
'function_call' in result.candidates[0].content.parts[0]
True
fc = result.candidates[0].content.parts[0].function_call
print(type(fc))
<class 'google.ai.generativelanguage_v1beta.types.content.FunctionCall'>

glm.FunctionCall 类基于 Google Protocol Buffers,将其转换为更熟悉的兼容 JSON 的对象:

print(json.dumps(type(fc).to_dict(fc), indent=4))
{
    "name": "add_to_database",
    "args": {
        "things": [
            {
                "name": "Magical Backpack",
                "description": "Anya's prized possession, the Magical Backpack, is no ordinary satchel. Its soft, emerald-green fabric shimmers with an ethereal glow, and its leather straps have secrets that only Anya knows. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever."
            },
            {
                "name": "Shimmering Sword",
                "description": "Among the wonders in Anya's Magical Backpack, lies a shimmering sword. With a determined gleam in her eye, she retrieved the shimmering sword and charged towards the monster."
            },
            {
                "description": "Residing within the Magical Backpack, the Book of Ancient Spells holds secrets untold.",
                "name": "Book of Ancient Spells"
            },
            {
                "description": "Tucked away in the Magical Backpack is a tiny compass that always points north.",
                "name": "Tiny Compass that Always Points North"
            },
            {
                "description": "Hidden within the Magical Backpack is a magical key that can open any lock.",
                "name": "Magical Key that Can Open Any Lock"
            }
        ],
        "relationships": [
            {
                "relationship": "Mother-Daughter",
                "person_1_name": "Anya",
                "person_2_name": "Elise"
            },
            {
                "person_2_name": "Edward",
                "relationship": "Father-Daughter",
                "person_1_name": "Anya"
            },
            {
                "person_2_name": "Samuel",
                "person_1_name": "Anya",
                "relationship": "Best Friends"
            }
        ],
        "people": [
            {
                "name": "Anya",
                "description": "Anya, the main character of the story, is a young girl with a magical backpack.",
                "start_place_name": "Willow Creek",
                "end_place_name": "Unknown"
            },
            {
                "name": "Elise",
                "description": "Anya's mother, Elise is a kind-hearted woman.",
                "end_place_name": "Unknown",
                "start_place_name": "Willow Creek"
            },
            {
                "start_place_name": "Willow Creek",
                "end_place_name": "Unknown",
                "name": "Edward",
                "description": "Anya's father, Edward is a wise-bearded man."
            },
            {
                "end_place_name": "Unknown",
                "start_place_name": "Willow Creek",
                "description": "Anya's best friend, Samuel is a curious and adventurous boy with a mischievous grin.",
                "name": "Samuel"
            }
        ],
        "places": [
            {
                "description": "The quaint town of Willow Creek is nestled amidst rolling hills and whispering willows.",
                "name": "Willow Creek"
            },
            {
                "description": "The town's only schoolhouse.",
                "name": "Schoolhouse"
            },
            {
                "description": "A shadowy place filled with secrets and dangers, the Forest is home to a terrifying monster.",
                "name": "Forest"
            }
        ]
    }
}

总结

虽然此 API 可以处理纯文本输入和文本输出的结构化数据提取问题,但使用函数调用可能会更可靠,因为它允许您定义严格的架构,并消除了可能容易出错的解析步骤。