Research Agent with Gemini 2.5 Pro and LlamaIndex

LlamaIndex is a framework for building knowledge agents using large language models (LLMs) connected to your data. This example shows how to build a multi-agent workflow for a Research Agent. In LlamaIndex, Workflows are the building blocks of agent and multi-agent systems.

You need a Gemini API key. If you don't have one, you can create one in Google AI Studio. First, install all the required LlamaIndex libraries. LlamaIndex uses the google-genai package under the hood.

pip install llama-index llama-index-utils-workflow llama-index-llms-google-genai llama-index-tools-google

Set up Gemini 2.5 Pro in LlamaIndex

The engine of any LlamaIndex agent is an LLM that handles reasoning and text processing. This example uses Gemini 2.5 Pro. Make sure you set your API key as an environment variable.

from llama_index.llms.google_genai import GoogleGenAI

llm = GoogleGenAI(model="gemini-2.5-pro")
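
As a minimal sketch of the environment-variable check, assuming the key is stored in GOOGLE_API_KEY (the name the google-genai package conventionally reads; verify this against your installed version), a small hypothetical helper named require_api_key can fail early with a readable message instead of failing at the first request:

```python
import os
from typing import Mapping


def require_api_key(
    env: Mapping[str, str] = os.environ,
    name: str = "GOOGLE_API_KEY",  # assumed variable name, not taken from this guide
) -> str:
    """Return the API key found in `env`, or raise with a readable message."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; create a key in Google AI Studio and export it."
        )
    return key
```

If you prefer to pass the key explicitly rather than rely on the environment, the returned value could be handed to the constructor, for example GoogleGenAI(model="gemini-2.5-pro", api_key=require_api_key()).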

Build tools

Agents use tools to interact with the outside world, such as searching the web or storing information. Tools in LlamaIndex can be regular Python functions or can be imported from pre-existing ToolSpecs. Gemini comes with a built-in tool for Google Search, which is used here.

from google.genai import types

google_search_tool = types.Tool(
    google_search=types.GoogleSearch()
)

llm_with_search = GoogleGenAI(
    model="gemini-2.5-pro",
    generation_config=types.GenerateContentConfig(tools=[google_search_tool])
)

Now test the LLM instance with a query that requires a search:

response = llm_with_search.complete("What's the weather like today in Biarritz?")
print(response)

The Research Agent will use Python functions as tools. There are many ways to build a system for this task. In this example, you will use the following tools:

search_web uses Gemini with Google Search to search the web for information on the given topic.
record_notes saves the research found on the web to the state so that the other tools can use it.
write_report writes the report using the information found by the ResearchAgent.
review_report reviews the report and provides feedback.

The Context class passes the state between agents and tools, and each agent has access to the current state of the system.

from llama_index.core.workflow import Context

async def search_web(ctx: Context, query: str) -> str:
    """Useful for searching the web about a specific query or topic"""
    response = await llm_with_search.acomplete(f"""Please research given this query or topic,
    and return the result\n<query_or_topic>{query}</query_or_topic>""")
    # acomplete returns a response object; convert it to match the declared str return type
    return str(response)

async def record_notes(ctx: Context, notes: str, notes_title: str) -> str:
    """Useful for recording notes on a given topic."""
    current_state = await ctx.store.get("state")
    if "research_notes" not in current_state:
        current_state["research_notes"] = {}
    current_state["research_notes"][notes_title] = notes
    await ctx.store.set("state", current_state)
    return "Notes recorded."

async def write_report(ctx: Context, report_content: str) -> str:
    """Useful for writing a report on a given topic."""
    current_state = await ctx.store.get("state")
    current_state["report_content"] = report_content
    await ctx.store.set("state", current_state)
    return "Report written."

async def review_report(ctx: Context, review: str) -> str:
    """Useful for reviewing a report and providing feedback."""
    current_state = await ctx.store.get("state")
    current_state["review"] = review
    await ctx.store.set("state", current_state)
    return "Report reviewed."
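
To see the read-modify-write pattern these tools share without calling an LLM, here is a minimal sketch that runs record_notes against an in-memory stand-in. InMemoryStore and FakeContext are hypothetical classes written for this illustration; they model only the async store.get and store.set calls used above and are not part of LlamaIndex.

```python
import asyncio


class InMemoryStore:
    """Hypothetical stand-in exposing the async get/set calls the tools use."""

    def __init__(self, initial: dict):
        self._data = dict(initial)

    async def get(self, key: str):
        return self._data[key]

    async def set(self, key: str, value) -> None:
        self._data[key] = value


class FakeContext:
    """Hypothetical stand-in for Context; only the `store` attribute is modeled."""

    def __init__(self, initial_state: dict):
        self.store = InMemoryStore({"state": initial_state})


async def record_notes(ctx, notes: str, notes_title: str) -> str:
    # Read the shared state, update it, then write it back.
    current_state = await ctx.store.get("state")
    current_state.setdefault("research_notes", {})[notes_title] = notes
    await ctx.store.set("state", current_state)
    return "Notes recorded."


async def demo() -> dict:
    ctx = FakeContext({"research_notes": {}})
    await record_notes(ctx, "Tim Berners-Lee proposed the web in 1989.", "origins")
    await record_notes(ctx, "Mosaic popularized graphical browsing.", "browsers")
    return await ctx.store.get("state")


final_state = asyncio.run(demo())
print(sorted(final_state["research_notes"]))
```

Each call adds one titled entry under research_notes, which is how a later tool or agent can read what an earlier one recorded.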

Create a multi-agent assistant

To build a multi-agent system, you define the agents and how they interact. Your system will have three agents:

ResearchAgent searches the web for information on the given topic.
WriteAgent writes the report using the information found by the ResearchAgent.
ReviewAgent reviews the report and provides feedback.

This example uses the AgentWorkflow class to create a multi-agent system that executes these agents in order. Each agent takes a system_prompt that tells it what to do and suggests how to work with the other agents.

Optionally, you can help your multi-agent system by using can_handoff_to to specify which other agents each agent can hand off to (if you don't, the system will try to figure this out on its own).

from llama_index.core.agent.workflow import FunctionAgent

research_agent = FunctionAgent(
    name="ResearchAgent",
    description="Useful for searching the web for information on a given topic and recording notes on the topic.",
    system_prompt=(
        "You are the ResearchAgent that can search the web for information on a given topic and record notes on the topic. "
        "Once notes are recorded and you are satisfied, you should hand off control to the WriteAgent to write a report on the topic."
    ),
    llm=llm,
    tools=[search_web, record_notes],
    can_handoff_to=["WriteAgent"],
)

write_agent = FunctionAgent(
    name="WriteAgent",
    description="Useful for writing a report on a given topic.",
    system_prompt=(
        "You are the WriteAgent that can write a report on a given topic. "
        "Your report should be in a markdown format. The content should be grounded in the research notes. "
        "Once the report is written, you should get feedback at least once from the ReviewAgent."
    ),
    llm=llm,
    tools=[write_report],
    can_handoff_to=["ReviewAgent", "ResearchAgent"],
)

review_agent = FunctionAgent(
    name="ReviewAgent",
    description="Useful for reviewing a report and providing feedback.",
    system_prompt=(
        "You are the ReviewAgent that can review a report and provide feedback. "
        "Your feedback should either approve the current report or request changes for the WriteAgent to implement."
    ),
    llm=llm,
    tools=[review_report],
    can_handoff_to=["ResearchAgent","WriteAgent"],
)
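
The can_handoff_to lists above form a small directed graph, and a typo in an agent name would be easy to miss. The sketch below is plain Python and not part of the LlamaIndex API. It copies those lists into a dictionary and checks two things: that every handoff target is a defined agent, and that every agent is reachable from the root. The function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Handoff targets copied from the agent definitions above.
HANDOFFS: Dict[str, List[str]] = {
    "ResearchAgent": ["WriteAgent"],
    "WriteAgent": ["ReviewAgent", "ResearchAgent"],
    "ReviewAgent": ["ResearchAgent", "WriteAgent"],
}


def unknown_handoff_targets(handoffs: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return sorted (source, target) pairs whose target is not a defined agent."""
    known = set(handoffs)
    return sorted(
        (source, target)
        for source, targets in handoffs.items()
        for target in targets
        if target not in known
    )


def reachable_from(root: str, handoffs: Dict[str, List[str]]) -> Set[str]:
    """Return every agent reachable from `root` by following handoffs."""
    seen: Set[str] = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(handoffs.get(name, []))
    return seen


print(unknown_handoff_targets(HANDOFFS))
print(sorted(reachable_from("ResearchAgent", HANDOFFS)))
```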

With the agents defined, you can now create the AgentWorkflow and run it.

from llama_index.core.agent.workflow import AgentWorkflow

agent_workflow = AgentWorkflow(
    agents=[research_agent, write_agent, review_agent],
    root_agent=research_agent.name,
    initial_state={
        "research_notes": {},
        "report_content": "Not written yet.",
        "review": "Review required.",
    },
)

While the workflow runs, you can stream events, tool calls, and updates to the console.

from llama_index.core.agent.workflow import (
    AgentInput,
    AgentOutput,
    ToolCall,
    ToolCallResult,
    AgentStream,
)

research_topic = """Write me a report on the history of the web.
Briefly describe the history of the world wide web, including
the development of the internet and the development of the web,
including 21st century developments"""

handler = agent_workflow.run(
    user_msg=research_topic
)

current_agent = None
current_tool_calls = ""
async for event in handler.stream_events():
    if (
        hasattr(event, "current_agent_name")
        and event.current_agent_name != current_agent
    ):
        current_agent = event.current_agent_name
        print(f"\n{'='*50}")
        print(f"🤖 Agent: {current_agent}")
        print(f"{'='*50}\n")
    elif isinstance(event, AgentOutput):
        if event.response.content:
            print("📤 Output:", event.response.content)
        if event.tool_calls:
            print(
                "🛠️  Planning to use tools:",
                [call.tool_name for call in event.tool_calls],
            )
    elif isinstance(event, ToolCallResult):
        print(f"🔧 Tool Result ({event.tool_name}):")
        print(f"  Arguments: {event.tool_kwargs}")
        print(f"  Output: {event.tool_output}")
    elif isinstance(event, ToolCall):
        print(f"🔨 Calling Tool: {event.tool_name}")
        print(f"  With arguments: {event.tool_kwargs}")

After the workflow completes, you can print the final report as well as the final review state from the review agent.

state = await handler.ctx.store.get("state")
print("Report Content:\n", state["report_content"])
print("\n------------\nFinal Review:\n", state["review"])
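
The streaming loop and the final await above are written at the top level, which works in a notebook or any other environment that already has a running event loop. In a plain Python script, that code would need to live inside a coroutine started with asyncio.run. The sketch below shows only that wrapping pattern. fake_stream_events is a hypothetical stand-in for handler.stream_events(), so the block runs without an API key.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_stream_events() -> AsyncIterator[str]:
    """Hypothetical stand-in for handler.stream_events()."""
    for name in ["ResearchAgent", "WriteAgent", "ReviewAgent"]:
        await asyncio.sleep(0)  # yield control, as a real event stream would
        yield name


async def main() -> List[str]:
    seen: List[str] = []
    # In a script, `async for` and `await` are only valid inside a coroutine.
    async for agent_name in fake_stream_events():
        print(f"Agent: {agent_name}")
        seen.append(agent_name)
    return seen


if __name__ == "__main__":
    asyncio.run(main())
```

In the real workflow, the body of main would hold the agent_workflow.run call, the event loop over handler.stream_events(), and the final state lookup.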

Go further with custom workflows

AgentWorkflow is a great way to get started with multi-agent systems. But what if you need more control? You can build a workflow from scratch. Here are some reasons why you might want to build your own workflow:

More control over the process: You can decide the exact path your agents take. This includes creating loops, making decisions at certain points, or having agents work in parallel on different tasks.
Use complex data: Go beyond simple text. Custom workflows let you use more structured data, such as JSON objects or custom classes, for your inputs and outputs.
Work with different media: Build agents that can understand and process not just text, but also images, audio, and video.
Smarter planning: You can design a workflow that first creates a detailed plan before the agents start working. This is useful for complex tasks that require multiple steps.
Enable self-correction: Create agents that can review their own work. If the output isn't good enough, the agent can try again, creating a loop of improvement until the result is satisfactory.
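
To make the self-correction idea concrete, here is a framework-free sketch of a write, review, and retry loop. It deliberately does not use the LlamaIndex Workflows API. The names write_draft, review_draft, refine, and max_attempts are hypothetical, and the two helper functions are trivial placeholders where an LLM call would go.

```python
from typing import Callable, Optional, Tuple


def write_draft(topic: str, feedback: Optional[str]) -> str:
    """Placeholder writer; a real system would call an LLM here."""
    draft = f"# Report on {topic}\n"
    if feedback:  # on a retry, act on the reviewer's request
        draft += "\n## Sources\n- (to be filled from research notes)\n"
    return draft


def review_draft(draft: str) -> Optional[str]:
    """Placeholder reviewer; return feedback text, or None to approve."""
    return None if "## Sources" in draft else "Add a Sources section."


def refine(
    topic: str,
    writer: Callable[[str, Optional[str]], str] = write_draft,
    reviewer: Callable[[str], Optional[str]] = review_draft,
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Alternate writing and reviewing until approval or max_attempts."""
    feedback: Optional[str] = None
    draft = ""
    for attempt in range(1, max_attempts + 1):
        draft = writer(topic, feedback)
        feedback = reviewer(draft)
        if feedback is None:  # approved, stop looping
            return draft, attempt
    return draft, max_attempts  # give up and return the last draft


report, attempts = refine("the history of the web")
print(attempts)
```

Capping the loop with max_attempts is a design choice for this sketch: it keeps a reviewer that never approves from running indefinitely.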

To learn more about LlamaIndex Workflows, see the LlamaIndex Workflows documentation.