
LiteRT-LM CLI

The command-line interface (CLI) lets you test models immediately, without writing any code.

Supported platforms:

Linux
macOS
Windows (via WSL)
Raspberry Pi

Installation

Method 1: `uv` (recommended)

Installs litert-lm as a system-wide binary. Requires uv.

uv tool install litert-lm-nightly

Method 2: `pip`

A standard installation inside a virtual environment.

python3 -m venv .venv
source .venv/bin/activate
pip install litert-lm-nightly

Chat

Download a model from Hugging Face and run it.

litert-lm run \
  --from-huggingface-repo=google/gemma-3n-E2B-it-litert-lm \
  gemma-3n-E2B-it-int4 \
  --prompt="What is the capital of France?"

Function calling / tools

You can run tools by using a preset. Create preset.py:

import datetime

def get_current_time() -> str:
    """Returns the current date and time."""
    return datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")

system_instruction = "You are a helpful assistant with access to tools."
tools = [get_current_time]
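
Before passing the preset to the CLI, it can help to confirm that the file loads as plain Python. The following is a minimal sketch, assuming a preset is an ordinary Python file that exposes `system_instruction` and `tools` as shown above. `check_preset` is a hypothetical helper, not part of LiteRT-LM, and it makes no claim about how the CLI itself loads presets.

```python
import runpy


def check_preset(path):
    """Hypothetical helper: load a preset file and return its tool names.

    Assumes the preset defines `system_instruction` (a string) and
    `tools` (a list of callables), as in the example preset.py.
    """
    # run_path executes the file and returns its global namespace as a dict.
    namespace = runpy.run_path(str(path))

    instruction = namespace.get("system_instruction")
    if not isinstance(instruction, str):
        raise TypeError("system_instruction should be a string")

    tools = namespace.get("tools", [])
    for tool in tools:
        if not callable(tool):
            raise TypeError(f"tool {tool!r} is not callable")

    return [tool.__name__ for tool in tools]
```

For the preset above, `check_preset("preset.py")` would return the single name `get_current_time`. An import error or a typo in the preset then surfaces here rather than in the middle of a chat session.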

Run with the preset:

litert-lm run \
  --from-huggingface-repo=google/gemma-3n-E2B-it-litert-lm \
  gemma-3n-E2B-it-int4 \
  --preset=preset.py

Sample prompt and interactive output:

> what will the time be in two hours?
[tool_call] {"arguments": {}, "name": "get_current_time"}
[tool_response] {"name": "get_current_time", "response": "2026-03-25 21:54:07"}
The current time is 2026-03-25 21:54:07.

In two hours, it will be **2026-03-25 23:54:07**.

What is happening?

When you ask a question that requires external information, such as the current time, the model recognizes that it needs to call a tool.

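The model emits a tool_call: the model outputs a JSON request to call the get_current_time function.
The CLI executes the tool: the LiteRT-LM CLI intercepts the call and runs the corresponding Python function defined in preset.py.
The CLI sends a tool_response: the CLI sends the result back to the model.
The model generates the final answer: the model uses the tool response to compute and generate the final answer for the user.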

This function-calling loop runs automatically inside the CLI, so you can augment a local LLM with Python functionality without writing complex orchestration code.
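
To make the loop concrete, here is a rough sketch of the dispatch step the CLI performs on your behalf. It is an illustration only, not the LiteRT-LM implementation: `run_tool_call` and `add_hours` are hypothetical names, and the message shapes are simply copied from the sample output above.

```python
import datetime
import json


def get_current_time() -> str:
    """Returns the current date and time."""
    return datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")


def run_tool_call(tool_call_json, tools):
    """Hypothetical dispatcher: run one tool_call, build a tool_response.

    Mirrors the JSON shapes in the sample output; the real CLI may
    handle errors and argument validation differently.
    """
    call = json.loads(tool_call_json)
    # Map each function's name to the function so calls can be looked up.
    registry = {tool.__name__: tool for tool in tools}
    name = call["name"]
    if name not in registry:
        return {"name": name, "response": f"unknown tool: {name}"}
    result = registry[name](**call.get("arguments", {}))
    return {"name": name, "response": result}


def add_hours(timestamp, hours):
    """The arithmetic the model does in its final answer (illustrative)."""
    fmt = "%Y-%m-%d %H:%M:%S"
    moment = datetime.datetime.strptime(timestamp, fmt)
    return (moment + datetime.timedelta(hours=hours)).strftime(fmt)


if __name__ == "__main__":
    request = '{"arguments": {}, "name": "get_current_time"}'
    response = run_tool_call(request, [get_current_time])
    print("[tool_response]", json.dumps(response))
    print("In two hours:", add_hours(response["response"], 2))
```

With the timestamp from the sample session, `add_hours("2026-03-25 21:54:07", 2)` gives `2026-03-25 23:54:07`, matching the model's final answer.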

The same capability is also available in the Python, C++, and Kotlin APIs.