関数呼び出しを使用して構造化データを抽出する

Google AI で表示 Colab ノートブックを試す GitHub のノートブックを表示

このチュートリアルでは、Gemini API を使用してストーリーからキャラクター、関係、もの、場所のリストを抽出する構造化データ抽出の例を見ていきます。

セットアップ

pip install -U -q google-generativeai
import pathlib
import textwrap

import google.generativeai as genai


from IPython.display import display
from IPython.display import Markdown

from google.api_core import retry

def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

API キーを取得したら、SDK に渡します。作成する方法は次の 2 つです。

  • 鍵を GOOGLE_API_KEY 環境変数に設定します(SDK はそこから自動的に取得します)。
  • 鍵を genai.configure(api_key=...) に渡す
genai.configure(api_key=GOOGLE_API_KEY)

タスクの例

このチュートリアルでは、自然言語のストーリーからエンティティを抽出します。として 下は Gemini が執筆した文章です

new_story = False

if new_story:
  model = genai.GenerativeModel(model_name='models/gemini-1.5-pro-latest')

  response = model.generate_content("""
      Write a long story about a girl with magic backpack, her family, and at
      least one other charater. Make sure everyone has names. Don't forget to
      describe the contents of the backpack, and where everyone and everything
      starts and ends up.""", request_options={'retry': retry.Retry()})
  story = response.text
  print(response.candidates[0].citation_metadata)
else:
  story = """In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\n\nHanded down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\n\nAnya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. "Remember, my dear," whispered her mother, "use your magic wisely and for good." Her father added, "Always seek knowledge, and let the backpack be your trusted companion."\n\nWith a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. "Hey, Anya," he called out. "Can I see your backpack?"\n\nAnya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\n\nTogether, they marveled at the backpack's wonders, promising to keep its secrets safe. As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\n\n"What's wrong?" she asked.\n\nA tall, lanky boy stepped forward. "There's a monster in the forest," he stammered. "It's been terrorizing the town, attacking animals and even people."\n\nAnya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\n\nWithout a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. "Don't worry," she said, her voice steady. "I'll take care of it."\n\nWith Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\n\nSuddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\n\nFear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\n\nWhen the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\n\nAs she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time."""
to_markdown(story)

なだらかな丘とささやきのヤナギに囲まれ、ウィロー クリークの趣のある町に、アーニャという少女がいました。簡素な家屋のきしむような木造のドアから出ていくと、彼女は興奮と期待でドキドキしました。今日は学校の初日で、彼女は貴重な財産である魔法のバックパックを披露するのが待ちきれませんでした。

祖母から彼女に伝えられたバックパックは、普通のバッグではなかった。柔らかなエメラルドグリーンの生地は、幻想的な輝きできらめき、革製のストラップには、アニヤだけが知っている秘密が隠されています。その広々とした内部には、彼女の想像力を刺激し、彼女の人生を永遠に変えてしまう不思議に満ちた魔法の世界が広がっています。

アーニャの両親、優しいエリーズと賢いひげのエドワードは、温かい抱きしめながら別れを告げました。「忘れて」母親が「魔法を賢く、良いために使って」とささやきました。彼女の父親は「常に知識を求め、バックパックを信頼できる仲間にしよう」と付け加えました。

アーニャは歩数を足さず、町で唯一の学校へ向かって出発しました。途中で、彼女は親友のサミュエルを通り過ぎました。サミュエルは、好奇心が強く、冒険好きな男の子で、いたずら好きな笑顔の少年です。「こんにちは、アーヤ」と述べました。「バックパックを見せて」

アヤはしばらくためらいがあったが、その前にフラップを開けて中身を見せた。サミュエルは目を見開き、驚きで中を見つめました。そこには、鉛筆とノートの間に、きらめく剣、古代の呪文の本、常に北を指す小さなコンパス、あらゆる鍵を開ける魔法の鍵があった。

2 人は一体となり、バックパックの不思議に驚き、秘密を守ることを約束しました。校舎に近づくと、年上の子どもたちのグループが並んで集まり、その顔が恐怖で刻まれていることに気づきました。好奇心がどんどん上がってきて、彼女は慎重にアプローチした。

「どうしたんですか?」と尋ねました。

背の高い、おしゃれした少年が前に一歩進んでいきました。「森にモンスターがいる」彼は口を開けました。「町を恐怖に陥れ、動物、さらには人を襲っています。」

アーニャの心臓が落ちてしまいました。ウィロー クリークの町は小さく穏やかな街で、怪物を思い浮かべたことで背骨が震えました。彼女は、自分の家族や友人を守るために何かをしなければならないとわかっていました。

Anya はためらうことなくバックパックを開け、きらめく剣を手に入れました。彼女は決意の光を放つ目で、恐れている仲間たちに目を向けました。「ご安心ください」彼女はこう言った。「私が担当します。」

サミュエルを背後に置いて、アーニャは森の奥深くに入り込みました。通り過ぎるとき、木々は秘密をささやいているように見え、下草は見知らぬ生き物であざやかにあらわれました。彼らが森の奥深くに歩いてくると、空気は重くなり、足元の地面は震えました。

突然、彼らは空き地にやってきました。目の前には怪物がありました。鋭い歯、光る赤い目、そして容易に人間を倒すことができる爪を持つ巨大な野獣です。この生き物は、森を真ん中に揺らすような音を立てました。

アニヤは恐怖があふれましたが、彼女はそれを食べることを拒否しました。彼女はさばから剣を抜き、怪物に向かって突っ込んだ。その刃が太陽の光に照らされてきらめき、それが獣の皮に当たると、まばたきの光が輝き、すべてを包んでいました。

光が消えると、怪物は消え、代わりに粉々になった水晶の山がありました。アーニャはリュックサックの魔法でその生き物を倒し、ごく小さな物でも最大の力を持てることを証明しました。

彼女とサミュエルが町に戻ると、彼らは英雄として迎えられました。ウィロー クリークの人々は歓喜し、魔法のバックパックを持つ少女、アーニャの伝説は世代を超えて受け継がれました。それでアーニャは冒険を続け、バックパックの素晴らしさを使って世界をより良い場所にするために、魔法のように一歩ずつ着実に歩みを進めました。

自然言語の使用

大規模言語モデルは、パワフルなマルチタスク ツールです。多くの場合、求めているものを Gemini に尋ねるだけで、うまく機能します。

Gemini API には JSON モードがないため、この方法でデータ構造を生成する際は、以下の点に注意する必要があります。

  • 解析が失敗することがあります。
  • スキーマを厳密に適用することはできません。

次のセクションでは、これらの問題を解決します。まず、スキーマをテキストとして記述した、簡単な自然言語プロンプトを試してみましょう。これは最適化されていません。

model = genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest')

response = model.generate_content(
  textwrap.dedent("""\
    Please return JSON describing the the people, places, things and relationships from this story using the following schema:

    {"people": list[PERSON], "places":list[PLACE], "things":list[THING], "relationships": list[RELATIONSHIP]}

    PERSON = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
    PLACE = {"name": str, "description": str}
    THING = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
    RELATIONSHIP = {"person_1_name": str, "person_2_name": str, "relationship": str}

    All fields are required.

    Important: Only return a single piece of valid JSON text.

    Here is the story:

    """) + story,
  generation_config={'response_mime_type':'application/json'}
)
response.text
'{"people": [\n    {\n        "name": "Anya",\n        "description": "A young girl who lives in the town of Willow Creek with her parents, Elise and Edward. She possesses a magical backpack that was handed down to her from her grandmother.",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Elise",\n        "description": "Anya\'s kind-hearted mother",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Edward",\n        "description": "Anya\'s wise-bearded father",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Samuel",\n        "description": "Anya\'s best friend, a curious and adventurous boy with a mischievous grin.",\n        "start_place_name": "Willow Creek",\n        "end_place_name": "Willow Creek"\n    },\n    {\n        "name": "Monster",\n        "description": "A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.",\n        "start_place_name": "Forest",\n        "end_place_name": "Forest"\n    }\n], "places": [\n    {\n        "name": "Willow Creek",\n        "description": "A quaint town nestled amidst rolling hills and whispering willows."\n    },\n    {\n        "name": "Forest",\n        "description": "A shadowy place with rustling undergrowth and whispering trees."\n    },\n    {\n        "name": "Schoolhouse",\n        "description": "The only school in the town of Willow Creek."\n    },\n    {\n        "name": "Anya\'s home",\n        "description": "A modest cottage with a creaky wooden door."\n    }\n], "things": [\n    {\n        "name": "Magic backpack",\n        "description": "A magical backpack that was handed down to Anya from her grandmother. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew.",\n        "start_place_name": "Anya\'s home",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Shimmering sword",\n        "description": "A sword that shimmered in the sunlight and could strike with blinding light.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Book of ancient spells",\n        "description": "A book that contained ancient spells.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Tiny compass",\n        "description": "A compass that always pointed north.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Magical key",\n        "description": "A key that could open any lock.",\n        "start_place_name": "Magic backpack",\n        "end_place_name": "Forest"\n    },\n    {\n        "name": "Shattered crystals",\n        "description": "The remains of the monster after it was defeated by Anya\'s magic backpack.",\n        "start_place_name": "Forest",\n        "end_place_name": "Forest"\n    }\n], "relationships": [\n    {\n        "person_1_name": "Anya",\n        "person_2_name": "Elise",\n        "relationship": "mother-daughter"\n    },\n    {\n        "person_1_name": "Anya",\n        "person_2_name": "Edward",\n        "relationship": "father-daughter"\n    },\n    {\n        "person_1_name": "Anya",\n        "person_2_name": "Samuel",\n        "relationship": "best friends"\n    }\n]}'

これで JSON 文字列が返されました。これを解析してみましょう。

import json

print(json.dumps(json.loads(response.text), indent=4))
{
    "people": [
        {
            "name": "Anya",
            "description": "A young girl who lives in the town of Willow Creek with her parents, Elise and Edward. She possesses a magical backpack that was handed down to her from her grandmother.",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Elise",
            "description": "Anya's kind-hearted mother",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Edward",
            "description": "Anya's wise-bearded father",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Samuel",
            "description": "Anya's best friend, a curious and adventurous boy with a mischievous grin.",
            "start_place_name": "Willow Creek",
            "end_place_name": "Willow Creek"
        },
        {
            "name": "Monster",
            "description": "A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.",
            "start_place_name": "Forest",
            "end_place_name": "Forest"
        }
    ],
    "places": [
        {
            "name": "Willow Creek",
            "description": "A quaint town nestled amidst rolling hills and whispering willows."
        },
        {
            "name": "Forest",
            "description": "A shadowy place with rustling undergrowth and whispering trees."
        },
        {
            "name": "Schoolhouse",
            "description": "The only school in the town of Willow Creek."
        },
        {
            "name": "Anya's home",
            "description": "A modest cottage with a creaky wooden door."
        }
    ],
    "things": [
        {
            "name": "Magic backpack",
            "description": "A magical backpack that was handed down to Anya from her grandmother. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew.",
            "start_place_name": "Anya's home",
            "end_place_name": "Forest"
        },
        {
            "name": "Shimmering sword",
            "description": "A sword that shimmered in the sunlight and could strike with blinding light.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Book of ancient spells",
            "description": "A book that contained ancient spells.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Tiny compass",
            "description": "A compass that always pointed north.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Magical key",
            "description": "A key that could open any lock.",
            "start_place_name": "Magic backpack",
            "end_place_name": "Forest"
        },
        {
            "name": "Shattered crystals",
            "description": "The remains of the monster after it was defeated by Anya's magic backpack.",
            "start_place_name": "Forest",
            "end_place_name": "Forest"
        }
    ],
    "relationships": [
        {
            "person_1_name": "Anya",
            "person_2_name": "Elise",
            "relationship": "mother-daughter"
        },
        {
            "person_1_name": "Anya",
            "person_2_name": "Edward",
            "relationship": "father-daughter"
        },
        {
            "person_1_name": "Anya",
            "person_2_name": "Samuel",
            "relationship": "best friends"
        }
    ]
}

これは比較的シンプルであり、多くの場合はうまくいきますが、API の関数呼び出し機能を使用してスキーマを定義することで、より厳格かつ堅牢にできる可能性もあります。

関数呼び出しを使用する

関数呼び出しの基本チュートリアルをまだ完了していない場合は、まず完了してください。

関数呼び出しでは、関数とそのパラメータが API に記述される genai.protos.FunctionDeclaration として。基本的なケースでは、SDK はコンテナを 関数とそのアノテーションからの FunctionDeclaration。まず 現時点では、これらを明示的に定義する必要があります。

スキーマを定義する

まず、文字列フィールド namedescriptionstart_place_nameend_place_name を含むオブジェクトとして person を定義します。

person = genai.protos.Schema(
    type = genai.protos.Type.OBJECT,
    properties = {
        'name':  genai.protos.Schema(type=genai.protos.Type.STRING),
        'description':  genai.protos.Schema(type=genai.protos.Type.STRING),
        'start_place_name': genai.protos.Schema(type=genai.protos.Type.STRING),
        'end_place_name': genai.protos.Schema(type=genai.protos.Type.STRING)
    },
    required=['name', 'description', 'start_place_name', 'end_place_name']
)

次に、人を person オブジェクトの ARRAY として定義します。

people = genai.protos.Schema(
    type=genai.protos.Type.ARRAY,
    items=person
)

次に、抽出するエンティティごとに同じことを行います。

place = genai.protos.Schema(
    type = genai.protos.Type.OBJECT,
    properties = {
        'name':  genai.protos.Schema(type=genai.protos.Type.STRING),
        'description':  genai.protos.Schema(type=genai.protos.Type.STRING),
    }
)

places = genai.protos.Schema(
    type=genai.protos.Type.ARRAY,
    items=place
)
thing = genai.protos.Schema(
  type = genai.protos.Type.OBJECT,
  properties = {
      'name':  genai.protos.Schema(type=genai.protos.Type.STRING),
      'description':  genai.protos.Schema(type=genai.protos.Type.STRING),
  }
)

things = genai.protos.Schema(
    type=genai.protos.Type.ARRAY,
    items=thing
)
relationship = genai.protos.Schema(
    type = genai.protos.Type.OBJECT,
    properties = {
        'person_1_name':  genai.protos.Schema(type=genai.protos.Type.STRING),
        'person_2_name':  genai.protos.Schema(type=genai.protos.Type.STRING),
        'relationship':  genai.protos.Schema(type=genai.protos.Type.STRING),
    }
)

relationships = genai.protos.Schema(
    type=genai.protos.Type.ARRAY,
    items=relationship
)

FunctionDeclaration をビルドします。

add_to_database = genai.protos.FunctionDeclaration(
    name="add_to_database",
    description=textwrap.dedent("""\
        Adds entities to the database.
        """),
    parameters=genai.protos.Schema(
        type=genai.protos.Type.OBJECT,
        properties = {
            'people': people,
            'places': places,
            'things': things,
            'relationships': relationships
        }
    )
)

API を呼び出す

関数呼び出しの基本で説明したように、この FunctionDeclarationgenai.GenerativeModel コンストラクタの tools 引数に渡すことができるようになりました(コンストラクタは、関数宣言の同等の JSON 表現も受け入れます)。

model = genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest',
    tools = [add_to_database])

API を呼び出すたびに、SDK はプロンプトとともにツールを送信し、モデルは定義した関数を呼び出します。

result = model.generate_content(f"""
Please add the people, places, things, and relationships from this story to the database:

{story}
""",
# Force a function call
tool_config={'function_calling_config':'ANY'})

これで、解析するテキストはなくなります。結果はデータ構造です。

'text' in result.candidates[0].content.parts[0]
False
'function_call' in result.candidates[0].content.parts[0]
True
fc = result.candidates[0].content.parts[0].function_call
print(type(fc))
<class 'google.ai.generativelanguage_v1beta.types.content.FunctionCall'>

genai.protos.FunctionCall クラスは Google Protocol Buffers に基づいています。 おなじみの JSON 互換オブジェクトに変換します。

print(json.dumps(type(fc).to_dict(fc), indent=4))
{
    "name": "add_to_database",
    "args": {
        "things": [
            {
                "name": "Magical Backpack",
                "description": "Anya's prized possession, the Magical Backpack, is no ordinary satchel. Its soft, emerald-green fabric shimmers with an ethereal glow, and its leather straps have secrets that only Anya knows. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever."
            },
            {
                "name": "Shimmering Sword",
                "description": "Among the wonders in Anya's Magical Backpack, lies a shimmering sword. With a determined gleam in her eye, she retrieved the shimmering sword and charged towards the monster."
            },
            {
                "description": "Residing within the Magical Backpack, the Book of Ancient Spells holds secrets untold.",
                "name": "Book of Ancient Spells"
            },
            {
                "description": "Tucked away in the Magical Backpack is a tiny compass that always points north.",
                "name": "Tiny Compass that Always Points North"
            },
            {
                "description": "Hidden within the Magical Backpack is a magical key that can open any lock.",
                "name": "Magical Key that Can Open Any Lock"
            }
        ],
        "relationships": [
            {
                "relationship": "Mother-Daughter",
                "person_1_name": "Anya",
                "person_2_name": "Elise"
            },
            {
                "person_2_name": "Edward",
                "relationship": "Father-Daughter",
                "person_1_name": "Anya"
            },
            {
                "person_2_name": "Samuel",
                "person_1_name": "Anya",
                "relationship": "Best Friends"
            }
        ],
        "people": [
            {
                "name": "Anya",
                "description": "Anya, the main character of the story, is a young girl with a magical backpack.",
                "start_place_name": "Willow Creek",
                "end_place_name": "Unknown"
            },
            {
                "name": "Elise",
                "description": "Anya's mother, Elise is a kind-hearted woman.",
                "end_place_name": "Unknown",
                "start_place_name": "Willow Creek"
            },
            {
                "start_place_name": "Willow Creek",
                "end_place_name": "Unknown",
                "name": "Edward",
                "description": "Anya's father, Edward is a wise-bearded man."
            },
            {
                "end_place_name": "Unknown",
                "start_place_name": "Willow Creek",
                "description": "Anya's best friend, Samuel is a curious and adventurous boy with a mischievous grin.",
                "name": "Samuel"
            }
        ],
        "places": [
            {
                "description": "The quaint town of Willow Creek is nestled amidst rolling hills and whispering willows.",
                "name": "Willow Creek"
            },
            {
                "description": "The town's only schoolhouse.",
                "name": "Schoolhouse"
            },
            {
                "description": "A shadowy place filled with secrets and dangers, the Forest is home to a terrifying monster.",
                "name": "Forest"
            }
        ]
    }
}

まとめ

API は純粋なテキスト入力とテキスト出力で構造化データ抽出の問題を処理できますが、関数呼び出しを使用すると、厳密なスキーマを定義でき、エラーが発生しやすい解析手順が不要になるため、信頼性が高くなる傾向があります。