LiteRT-LM CLI

The LiteRT-LM Command Line Interface (CLI) lets you run models and interact with them using the terminal.

Installation

Using uv

Follow the uv installation guide to install uv, then install the CLI as a uv tool:

uv tool install litert-lm-nightly

Using pip

python3 -m venv .venv
source .venv/bin/activate
pip install litert-lm-nightly

Chat

Run the model using the CLI:

litert-lm run google/gemma-3n-E2B-it-litert-lm/gemma-3n-E2B-it-int4 --prompt="What is the capital of France?"

Function Calling / Tools

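You can give the model access to tools through a preset, a Python file that defines the tool functions and the system instruction for the session. Create a preset.py: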

import datetime

def get_current_time() -> str:
    """Returns the current date and time."""
    return datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")

system_instruction = "You are a helpful assistant with access to tools."
tools = [get_current_time]

Run with preset:

litert-lm run google/gemma-3n-E2B-it-litert-lm/gemma-3n-E2B-it-int4 --preset=preset.py

Sample prompt and interactive output:

> what will the time be in two hours?
[tool_call] {"arguments": {}, "name": "get_current_time"}
[tool_response] {"name": "get_current_time", "response": "2026-03-25 21:54:07"}
The current time is 2026-03-25 21:54:07.

In two hours, it will be **2026-03-25 23:54:07**.
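
The arithmetic in the final answer can be checked independently. The snippet below is only a verification sketch: it parses the timestamp from the sample tool response with the same format string used in preset.py and adds two hours. The variable names are illustrative and are not part of the CLI.

```python
import datetime

# Same format string that get_current_time() uses in preset.py.
FMT = "%Y-%m-%d %H:%M:%S"

# Timestamp copied from the sample [tool_response] line.
tool_response = "2026-03-25 21:54:07"

now = datetime.datetime.strptime(tool_response, FMT)
later = now + datetime.timedelta(hours=2)
result = later.strftime(FMT)
print(result)  # 2026-03-25 23:54:07
```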

What Is Happening Here?

When you ask a question that requires external information (like the current time), the model recognizes that it needs to call a tool.

  1. Model emits tool_call: The model outputs a JSON request to call the get_current_time function.
  2. CLI executes the tool: The LiteRT-LM CLI intercepts this call and executes the corresponding Python function defined in your preset.py.
  3. CLI sends tool_response: The CLI sends the function's return value back to the model.
  4. Model generates the final answer: The model uses the tool response to compute and generate the final answer for the user.

This "Function Calling" loop happens automatically within the CLI, allowing you to augment local LLMs with Python capabilities without writing any complex orchestration code.
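
The dispatch step of this loop (steps 2 and 3) can be sketched in plain Python. This is a minimal illustration, not the CLI's actual implementation: the `run_tool_call` helper and the `TOOLS` registry are hypothetical names, the `[tool_call]` and `[tool_response]` line formats are assumed from the sample output above, and the tool returns a fixed timestamp so the result is reproducible.

```python
import json


def get_current_time() -> str:
    """Stand-in tool that returns a fixed timestamp for reproducibility."""
    return "2026-03-25 21:54:07"


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"get_current_time": get_current_time}


def run_tool_call(line: str, tools: dict) -> str:
    """Parse one '[tool_call] {...}' line, run the tool, and format the response line."""
    prefix = "[tool_call] "
    if not line.startswith(prefix):
        raise ValueError("not a tool_call line")
    call = json.loads(line[len(prefix):])
    func = tools[call["name"]]  # raises KeyError for an unknown tool
    # Assumption: JSON arguments map directly onto keyword arguments.
    result = func(**call.get("arguments", {}))
    payload = {"name": call["name"], "response": result}
    return "[tool_response] " + json.dumps(payload)


response_line = run_tool_call(
    '[tool_call] {"arguments": {}, "name": "get_current_time"}', TOOLS
)
print(response_line)
```

In a full loop, the response line would be appended to the conversation and the model would be asked to continue generating, which corresponds to step 4.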

The same capabilities are available from the Python, C++, and Kotlin APIs.