LiteRT-LM Swift API

The LiteRT-LM Swift API lets you integrate large language models natively into iOS and macOS apps. It fully supports features such as multimodality, tool use, and GPU acceleration (via Metal).

Introduction

The following example uses the Swift API to initialize a model and send a message:

import LiteRTLM

// 1. Initialize the Engine with your model
let config = try EngineConfig(
  modelPath: "path/to/model.litertlm",
  backend: .gpu, // Use .cpu() for CPU execution
  cacheDir: NSTemporaryDirectory()
)
let engine = Engine(engineConfig: config)
try await engine.initialize()

// 2. Start a new Conversation
let conversation = try await engine.createConversation()

// 3. Send a message and print the response
let response = try await conversation.sendMessage(Message("What is the capital of France?"))
print(response.toString)

Getting started

This section explains how to integrate the LiteRT-LM Swift API into your app.

Swift Package Manager (SPM)

You can add LiteRT-LM to an Xcode project using Swift Package Manager.

Open your project in Xcode and go to File > Add Package Dependencies...
Enter the package repository URL: https://github.com/google-ai-edge/LiteRT-LM
Select the LiteRTLM library to add it to your app target.

If you are developing a package with Package.swift, add LiteRT-LM to its dependencies:

dependencies: [
  .package(url: "https://github.com/google-ai-edge/LiteRT-LM", from: "0.12.0")
]
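For context, the fragment above could sit in a complete manifest along the following lines. This is a minimal sketch, not the project's official manifest: the package and target name `MyApp` is hypothetical, the tools version is an assumption, and the `package: "LiteRT-LM"` identity is inferred from the repository URL. Minimum platform versions are not stated in this guide, so none are declared here.

```swift
// swift-tools-version:5.9
// Assumption: tools version 5.9; use whatever your toolchain requires.
import PackageDescription

let package = Package(
  name: "MyApp", // hypothetical package name
  dependencies: [
    // Dependency line taken from the guide above.
    .package(url: "https://github.com/google-ai-edge/LiteRT-LM", from: "0.12.0")
  ],
  targets: [
    .target(
      name: "MyApp", // hypothetical target name
      dependencies: [
        // "LiteRTLM" is the library name given in the Xcode steps above;
        // the package identity is assumed to follow the repository name.
        .product(name: "LiteRTLM", package: "LiteRT-LM")
      ]
    )
  ]
)
```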

Core API guide

This section covers the basic components and workflows of the LiteRT-LM Swift API, including initializing the engine, managing conversations, and sending messages.

Initialize the engine

The Engine is responsible for loading the model, allocating resources, and managing the model's lifecycle.

import LiteRTLM

let engineConfig = try EngineConfig(
  modelPath: "path/to/your/model.litertlm",
  backend: .gpu, // Use .gpu for Metal hardware acceleration
  maxNumTokens: 512, // Size of the KV-cache
  cacheDir: NSTemporaryDirectory() // Writable directory for compilation cache
)

let engine = Engine(engineConfig: engineConfig)
try await engine.initialize()
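The `cacheDir` argument must point to a writable directory. The examples here use `NSTemporaryDirectory()`, whose contents the system may purge. If you would rather keep the compilation cache in the per-user Caches directory, a helper like the one below can create it. This is a sketch using only Foundation: the function name `makeCacheDirectory` and the default subdirectory name are hypothetical, and whether LiteRT-LM benefits from a cache that persists across launches is an assumption this guide does not confirm.

```swift
import Foundation

/// Returns the path of a writable cache directory, creating it if needed.
/// The default `name` is a hypothetical subdirectory name chosen for illustration.
func makeCacheDirectory(named name: String = "litertlm-cache") throws -> String {
  let fileManager = FileManager.default
  // Prefer the per-user Caches directory; fall back to the temporary directory.
  let base = fileManager.urls(for: .cachesDirectory, in: .userDomainMask).first
    ?? fileManager.temporaryDirectory
  let directory = base.appendingPathComponent(name, isDirectory: true)
  // Succeeds without error if the directory already exists.
  try fileManager.createDirectory(at: directory, withIntermediateDirectories: true)
  return directory.path
}

// Possible use with the configuration shown above:
// let engineConfig = try EngineConfig(
//   modelPath: "path/to/your/model.litertlm",
//   backend: .gpu,
//   cacheDir: try makeCacheDirectory()
// )
```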

Create a conversation

A Conversation manages the chat history, the system instructions, and the sampler settings.

// Configure custom sampling parameters
let samplerConfig = try SamplerConfig(
  topK: 40,
  topP: 0.95,
  temperature: 0.7
)

// Create the conversation config with system instructions
let config = ConversationConfig(
  systemMessage: Message("You are a helpful assistant."),
  samplerConfig: samplerConfig
)

let conversation = try await engine.createConversation(with: config)

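Send messages

You can interact with the model synchronously, waiting for the complete response, or asynchronously, receiving the response in chunks as it is generated (streaming).

Synchronous example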

let response = try await conversation.sendMessage(Message("Hello!"))
print(response.toString)

Asynchronous (streaming) example

let message = Message("Tell me a long story.")

for try await chunk in conversation.sendMessageStream(message) {
  // Output response chunks in real-time
  print(chunk.toString, terminator: "")
}
print()
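When streaming, you often want both the live chunks (for display) and the complete text (for storage). The helper below sketches one way to collect them. It is written against any `AsyncSequence` of `String` so that it does not depend on LiteRT-LM types; the function name `collectText` is hypothetical, and the commented usage assumes that `toString` yields a `String`, as the `print` calls in this guide suggest.

```swift
/// Accumulates streamed text chunks into one string,
/// forwarding each chunk to `onChunk` as it arrives.
func collectText<S: AsyncSequence>(
  from chunks: S,
  onChunk: (String) -> Void = { _ in }
) async throws -> String where S.Element == String {
  var fullText = ""
  for try await chunk in chunks {
    onChunk(chunk)    // e.g. update the UI or print the chunk
    fullText += chunk // keep the complete response
  }
  return fullText
}

// Possible use with the streaming call shown above:
// let reply = try await collectText(
//   from: conversation.sendMessageStream(message).map { $0.toString }
// ) { print($0, terminator: "") }
```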

Multimodality

To use vision or audio features, configure the dedicated backends when you initialize the engine.

let engineConfig = try EngineConfig(
  modelPath: "path/to/multimodal_model.litertlm",
  backend: .gpu,
  visionBackend: .cpu(), // Enable CPU vision executor
  audioBackend: .cpu(), // Enable CPU audio executor
  cacheDir: NSTemporaryDirectory()
)
let engine = Engine(engineConfig: engineConfig)
try await engine.initialize()

Image input (Vision)

Add an image as a file path or as raw bytes:

let imagePath = Bundle.main.path(forResource: "scenery", ofType: "jpg")!

let message = Message(contents: [
  Content.imageFile(imagePath),
  Content.text("Describe this image.")
])

let response = try await conversation.sendMessage(message)
print(response.toString)
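The snippet above force-unwraps the bundle lookup with `!`, which crashes if the resource is missing from the app bundle. In production code you may prefer to surface that as an error. The sketch below uses only Foundation; the names `MissingResourceError` and `bundledResourcePath` are hypothetical helpers, not part of LiteRT-LM.

```swift
import Foundation

/// Error thrown when a bundled resource cannot be found (hypothetical helper type).
struct MissingResourceError: Error, CustomStringConvertible {
  let name: String
  let type: String
  var description: String { "Missing bundled resource \(name).\(type)" }
}

/// Looks up a bundled resource and throws instead of crashing when it is absent.
func bundledResourcePath(
  _ name: String,
  ofType type: String,
  in bundle: Bundle = .main
) throws -> String {
  guard let path = bundle.path(forResource: name, ofType: type) else {
    throw MissingResourceError(name: name, type: type)
  }
  return path
}

// Possible use with the image example above:
// let imagePath = try bundledResourcePath("scenery", ofType: "jpg")
```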

Audio input

Provide the path to an audio file:

let audioPath = Bundle.main.path(forResource: "recording", ofType: "wav")!

let message = Message(contents: [
  Content.audioFile(audioPath),
  Content.text("Transcribe this recording.")
])

let response = try await conversation.sendMessage(message)
print(response.toString)

New: Multi-Token Prediction (MTP)

Multi-Token Prediction (MTP) is a performance optimization that significantly speeds up decoding. It is recommended for all tasks that run on the GPU/Metal backends.

To use MTP, enable speculative decoding through the experimental flags before you initialize the engine.

import LiteRTLM

// Opt into experimental APIs to configure MTP
ExperimentalFlags.optIntoExperimentalAPIs()
ExperimentalFlags.enableSpeculativeDecoding = true

let engineConfig = try EngineConfig(
  modelPath: "path/to/model.litertlm",
  backend: .gpu,
  cacheDir: NSTemporaryDirectory()
)
let engine = Engine(engineConfig: engineConfig)
try await engine.initialize()
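This guide does not quantify the speedup, so it can be worth timing the same prompt with and without the flag on your own model and device. The helper below is a minimal sketch using only Foundation; the name `timed` is hypothetical. It measures wall-clock time for the whole call, which includes prompt processing as well as decoding.

```swift
import Foundation

/// Runs `work` and returns its result together with the elapsed wall-clock seconds.
func timed<T>(
  _ work: () async throws -> T
) async rethrows -> (value: T, seconds: TimeInterval) {
  let start = Date()
  let value = try await work()
  return (value, Date().timeIntervalSince(start))
}

// Possible use, run once with speculative decoding enabled and once without:
// let (response, seconds) = try await timed {
//   try await conversation.sendMessage(Message("Tell me a long story."))
// }
// print("Response took \(seconds) s")
```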

Define and use tools

You can define Swift structs as tools that the model can invoke automatically to run your logic.

Conform to the Tool protocol.
Define parameters using the @ToolParam property wrapper.
Implement the run() method.

import LiteRTLM

// 1. Define your custom tool
struct GetCurrentWeatherTool: Tool {
  static let name = "get_current_weather"
  static let description = "Get the current weather for a location."

  @ToolParam(description: "The city and state, e.g. San Francisco, CA")
  var location: String

  @ToolParam(description: "The temperature unit to use (celsius or fahrenheit)")
  var unit: String = "celsius"

  func run() async throws -> Any {
    // Call your weather API here
    return [
      "location": location,
      "temperature": "22",
      "unit": unit,
      "condition": "sunny"
    ]
  }
}

// 2. Register the tool in your conversation configuration
let config = ConversationConfig(
  tools: [GetCurrentWeatherTool()]
)

let conversation = try await engine.createConversation(with: config)

// 3. The model will invoke the tool automatically if needed
let response = try await conversation.sendMessage(Message("What is the weather in Paris right now?"))
print(response.toString)