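LiteRT-LM's Swift API lets you integrate large language models natively into iOS and macOS apps. It fully supports multimodality, tool use, and GPU acceleration through Metal.
Introduction
The following example shows how to initialize a model and send a message with the Swift API: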
import LiteRTLM
// 1. Initialize the Engine with your model
let config = try EngineConfig(
    modelPath: "path/to/model.litertlm",
    backend: .gpu, // Use .cpu() for CPU execution
    cacheDir: NSTemporaryDirectory()
)
let engine = Engine(engineConfig: config)
try await engine.initialize()
// 2. Start a new Conversation
let conversation = try await engine.createConversation()
// 3. Send a message and print the response
let response = try await conversation.sendMessage(Message("What is the capital of France?"))
print(response.toString)
Getting started
This section explains how to integrate the LiteRT-LM Swift API into your app.
Swift Package Manager (SPM)
You can add LiteRT-LM to an Xcode project with Swift Package Manager.
- Open your project in Xcode and go to File > Add Package Dependencies…
- Enter the package repository URL:
https://github.com/google-ai-edge/LiteRT-LM
- Select the LiteRTLM library and add it to your app target.
If you are developing a package with Package.swift, add LiteRT-LM to its dependencies:
dependencies: [
    .package(url: "https://github.com/google-ai-edge/LiteRT-LM", from: "0.12.0")
]
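For context, a complete manifest might look like the sketch below. The package and target name `MyApp` and the tools version are hypothetical. The product name `LiteRTLM` is taken from the library named in the Xcode steps above, and the package identity `LiteRT-LM` is assumed to follow from the repository URL.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical package name
    dependencies: [
        .package(url: "https://github.com/google-ai-edge/LiteRT-LM", from: "0.12.0")
    ],
    targets: [
        .executableTarget(
            name: "MyApp", // hypothetical target name
            dependencies: [
                // Assumption: the product is LiteRTLM and the package identity is LiteRT-LM.
                .product(name: "LiteRTLM", package: "LiteRT-LM")
            ]
        )
    ]
)
```

You may also need a `platforms:` entry if the library requires a minimum OS version; the source does not state one.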
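Core API guide
This section describes the basic components and workflow of the LiteRT-LM Swift API: engine initialization, conversation management, and sending messages.
Initializing the engine
Engine handles model loading, resource allocation, and lifecycle management.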
import LiteRTLM
let engineConfig = try EngineConfig(
    modelPath: "path/to/your/model.litertlm",
    backend: .gpu, // Use .gpu for Metal hardware acceleration
    maxNumTokens: 512, // Size of the KV-cache
    cacheDir: NSTemporaryDirectory() // Writable directory for compilation cache
)
let engine = Engine(engineConfig: engineConfig)
try await engine.initialize()
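Files in the temporary directory may be removed by the system, so a compilation cache stored there may need to be rebuilt. One alternative, sketched below with Foundation only, is a dedicated subdirectory under the user's Caches directory. The helper name `makeCacheDirectory` and the folder name `LiteRTLMCache` are hypothetical, and whether the engine benefits from a longer-lived cache is an assumption, not something this guide states.

```swift
import Foundation

/// Returns the path of a writable cache directory, creating it if needed.
/// `folderName` is a hypothetical name chosen for this sketch.
func makeCacheDirectory(folderName: String = "LiteRTLMCache") throws -> String {
    let fileManager = FileManager.default
    // Prefer the per-user Caches directory; fall back to the temporary directory.
    let base = fileManager.urls(for: .cachesDirectory, in: .userDomainMask).first
        ?? fileManager.temporaryDirectory
    let directory = base.appendingPathComponent(folderName, isDirectory: true)
    // No error is thrown if the directory already exists.
    try fileManager.createDirectory(at: directory, withIntermediateDirectories: true)
    return directory.path
}

let cacheDir = try makeCacheDirectory()
print(cacheDir)
```

The resulting string can be passed as the `cacheDir:` argument of `EngineConfig` in place of `NSTemporaryDirectory()`.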
Creating a conversation
Conversation manages the conversation history, system instructions, and sampler configuration.
// Configure custom sampling parameters
let samplerConfig = try SamplerConfig(
    topK: 40,
    topP: 0.95,
    temperature: 0.7
)
// Create the conversation config with system instructions
let config = ConversationConfig(
    systemMessage: Message("You are a helpful assistant."),
    samplerConfig: samplerConfig
)
let conversation = try await engine.createConversation(with: config)
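To make the three sampling parameters concrete, here is a conceptual sketch in plain Swift of how temperature, top-k, and top-p are commonly combined to narrow the candidate tokens. It illustrates the general technique only and is not LiteRT-LM's implementation; the function name `candidateDistribution` and the order of the filtering steps are assumptions.

```swift
import Foundation

/// Conceptual illustration: returns (token index, probability) pairs that
/// survive temperature scaling, top-k, and top-p (nucleus) filtering.
func candidateDistribution(
    logits: [Double],
    topK: Int,
    topP: Double,
    temperature: Double
) -> [(index: Int, probability: Double)] {
    precondition(!logits.isEmpty && topK > 0 && temperature > 0)

    // 1. Temperature: divide the logits, then apply softmax
    //    (shifted by the maximum for numerical stability).
    let scaled = logits.map { $0 / temperature }
    let maxLogit = scaled.max()!
    let exps = scaled.map { exp($0 - maxLogit) }
    let total = exps.reduce(0.0, +)
    let ranked = exps.enumerated()
        .map { (index: $0.offset, probability: $0.element / total) }
        .sorted { $0.probability > $1.probability }

    // 2. Top-k: keep only the k most probable tokens.
    let topKCandidates = ranked.prefix(topK)

    // 3. Top-p: keep the smallest prefix whose cumulative probability reaches topP.
    var kept: [(index: Int, probability: Double)] = []
    var cumulative = 0.0
    for candidate in topKCandidates {
        kept.append(candidate)
        cumulative += candidate.probability
        if cumulative >= topP { break }
    }

    // 4. Renormalize so the surviving probabilities sum to 1.
    let keptTotal = kept.reduce(0.0) { $0 + $1.probability }
    return kept.map { (index: $0.index, probability: $0.probability / keptTotal) }
}

// Toy logits for a four-token vocabulary, with the sampler values used above.
let candidates = candidateDistribution(
    logits: [2.0, 1.0, 0.5, -1.0],
    topK: 40,
    topP: 0.95,
    temperature: 0.7
)
for candidate in candidates {
    print(candidate.index, candidate.probability)
}
```

With these toy logits, the top-p step drops the least likely token and the remaining three are renormalized. Lower temperatures sharpen the distribution, so fewer tokens survive the same `topP` cutoff.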
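Sending messages
You can interact with the model synchronously (await the complete response) or asynchronously (stream the response as it is generated).
Synchronous example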
let response = try await conversation.sendMessage(Message("Hello!"))
print(response.toString)
Asynchronous (streaming) example
let message = Message("Tell me a long story.")
for try await chunk in conversation.sendMessageStream(message) {
    // Output response chunks in real-time
    print(chunk.toString, terminator: "")
}
print()
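If you also need the complete text once streaming finishes, you can accumulate the chunks as they arrive. The sketch below is generic over any `AsyncSequence` of strings so that it runs without a model; the helper name `collectText` and the stand-in `AsyncThrowingStream` are hypothetical. With LiteRT-LM you would presumably append `chunk.toString` for each chunk instead.

```swift
/// Prints each chunk as it arrives and returns the concatenated text.
func collectText<S: AsyncSequence>(from stream: S) async throws -> String
where S.Element == String {
    var fullText = ""
    for try await chunk in stream {
        print(chunk, terminator: "") // Show partial output immediately
        fullText += chunk
    }
    print()
    return fullText
}

// Stand-in for a model stream: yields three chunks, then finishes.
let demoStream = AsyncThrowingStream<String, Error> { continuation in
    for chunk in ["Once ", "upon ", "a time."] {
        continuation.yield(chunk)
    }
    continuation.finish()
}

let story = try await collectText(from: demoStream)
```

Keeping the accumulation in one helper means the UI can show partial output while the caller still receives a single final string.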
Multimodality
To use vision or audio capabilities, you must configure the dedicated backends when you initialize the engine.
let engineConfig = try EngineConfig(
    modelPath: "path/to/multimodal_model.litertlm",
    backend: .gpu,
    visionBackend: .cpu(), // Enable CPU vision executor
    audioBackend: .cpu(), // Enable CPU audio executor
    cacheDir: NSTemporaryDirectory()
)
let engine = Engine(engineConfig: engineConfig)
try await engine.initialize()
Image input (Vision)
Provide the image as a file path or as raw bytes. The following example uses a file path:
let imagePath = Bundle.main.path(forResource: "scenery", ofType: "jpg")!
let message = Message(contents: [
    Content.imageFile(imagePath),
    Content.text("Describe this image.")
])
let response = try await conversation.sendMessage(message)
print(response.toString)
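The force unwrap (`!`) in the example above crashes if the resource is missing from the bundle. A safer pattern, sketched below with Foundation only, is to throw a descriptive error instead. The `ResourceError` type and the `resourcePath` helper are hypothetical names introduced for this example.

```swift
import Foundation

/// Hypothetical error type for this sketch.
struct ResourceError: Error, CustomStringConvertible {
    let name: String
    let type: String
    var description: String { "Missing bundled resource: \(name).\(type)" }
}

/// Returns the path of a bundled resource, or throws if it cannot be found.
func resourcePath(
    named name: String,
    ofType type: String,
    in bundle: Bundle = .main
) throws -> String {
    guard let path = bundle.path(forResource: name, ofType: type) else {
        throw ResourceError(name: name, type: type)
    }
    return path
}

do {
    let imagePath = try resourcePath(named: "scenery", ofType: "jpg")
    print("Found image at \(imagePath)")
} catch {
    // Report the missing file instead of crashing.
    print(error)
}
```

The returned path can then be passed to `Content.imageFile` as in the example above.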
Audio input
Provide the path to an audio file:
let audioPath = Bundle.main.path(forResource: "recording", ofType: "wav")!
let message = Message(contents: [
    Content.audioFile(audioPath),
    Content.text("Transcribe this recording.")
])
let response = try await conversation.sendMessage(message)
print(response.toString)
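🔴 New: Multi-Token Prediction (MTP)
Multi-Token Prediction (MTP) is a performance optimization that substantially increases decoding speed. It is recommended for all workloads that use the GPU/Metal backend.
To use MTP, enable speculative decoding in the experimental flags before you initialize the engine.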
import LiteRTLM
// Opt into experimental APIs to configure MTP
ExperimentalFlags.optIntoExperimentalAPIs()
ExperimentalFlags.enableSpeculativeDecoding = true
let engineConfig = try EngineConfig(
    modelPath: "path/to/model.litertlm",
    backend: .gpu,
    cacheDir: NSTemporaryDirectory()
)
let engine = Engine(engineConfig: engineConfig)
try await engine.initialize()
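Defining and using tools
You can define Swift structs as tools that the model calls automatically to run your logic.
- Conform to the Tool protocol.
- Declare parameters with the @ToolParam property wrapper.
- Implement the run() method.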
import LiteRTLM
// 1. Define your custom tool
struct GetCurrentWeatherTool: Tool {
    static let name = "get_current_weather"
    static let description = "Get the current weather for a location."

    @ToolParam(description: "The city and state, e.g. San Francisco, CA")
    var location: String

    @ToolParam(description: "The temperature unit to use (celsius or fahrenheit)")
    var unit: String = "celsius"

    func run() async throws -> Any {
        // Call your weather API here
        return [
            "location": location,
            "temperature": "22",
            "unit": unit,
            "condition": "sunny"
        ]
    }
}
// 2. Register the tool in your conversation configuration
let config = ConversationConfig(
    tools: [GetCurrentWeatherTool()]
)
let conversation = try await engine.createConversation(with: config)
// 3. The model will invoke the tool automatically if needed
let response = try await conversation.sendMessage(Message("What is the weather in Paris right now?"))
print(response.toString)