Calls the API and returns a types.ChatResponse containing the response.

model Which model to call, as a string or a types.Model.
context Text that should be provided to the model first, to ground the response.

If not empty, this context will be given to the model first before the examples and messages.

This field can be a description of your prompt to the model, to help provide context and guide the responses. Examples:

  • "Translate the phrase from English to French."
  • "Given a statement, classify the sentiment as happy, sad or neutral."

Anything included in this field takes precedence over the history in messages if the total input size exceeds the model's input_token_limit.

examples Examples of what the model should generate.

This includes both the user input and the response that the model should emulate.

These examples are treated identically to conversation messages, except that they take precedence over the history in messages: if the total input size exceeds the model's input_token_limit, the input will be truncated, and items will be dropped from messages before examples.

messages A snapshot of the conversation history sorted chronologically.

Turns alternate between two authors.

If the total input size exceeds the model's input_token_limit, the input will be truncated: the oldest items will be dropped from messages first.
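The truncation order described for context, examples, and messages can be sketched as follows. This is an illustration only, not the service's implementation: truncate_input and count_tokens are hypothetical names, whitespace-separated words stand in for real tokens, examples are simplified to plain strings, and dropping the oldest example first is an assumption the text does not confirm.

```python
def truncate_input(context, examples, messages, limit,
                   count_tokens=lambda text: len(text.split())):
    """Trim the input to fit `limit`, dropping messages before examples.

    Hypothetical helper: word counts approximate token counts.
    """
    examples = list(examples)
    messages = list(messages)

    def total():
        return sum(count_tokens(t) for t in [context, *examples, *messages])

    # The oldest messages are dropped first.
    while total() > limit and messages:
        messages.pop(0)
    # Examples are only touched once no messages remain.
    # Assumption: the oldest example goes first.
    while total() > limit and examples:
        examples.pop(0)
    # The context is never dropped in this sketch.
    return context, examples, messages


context, examples, messages = truncate_input(
    "a b", ["c d", "e f"], ["m1 x", "m2 y", "m3 z"], limit=8
)
print(messages)  # only the newest message survives
```

With a limit of 8 "tokens", the two oldest messages are removed and both examples are kept, which matches the precedence the descriptions give.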

temperature Controls the randomness of the output. Must be non-negative.

Typical values are in the range [0.0, 1.0]. Higher values produce a more random and varied response. A temperature of zero produces deterministic output.
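As a rough illustration of how temperature reshapes a distribution, the sketch below applies the common softmax-with-temperature formulation to a list of logits. Whether the service computes it exactly this way is an assumption; apply_temperature is a hypothetical helper, and a temperature of zero is treated as always picking the most likely token.

```python
import math


def apply_temperature(logits, temperature):
    """Turn logits into probabilities, sharpened or flattened by temperature."""
    if temperature < 0:
        raise ValueError("temperature must not be negative")
    if temperature == 0:
        # Deterministic case: all probability mass on the best token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


logits = [2.0, 1.0, 0.0]
for t in (0.0, 0.5, 1.0):
    print(t, [round(p, 3) for p in apply_temperature(logits, t)])
```

Lower temperatures concentrate probability on the top token, and higher temperatures spread it across more tokens.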

candidate_count The maximum number of generated response messages to return.

This value must be in the range [1, 8], inclusive. If unset, it defaults to 1.

top_k The API uses combined nucleus and top-k sampling.

top_k sets the maximum number of tokens to sample from on each step.
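A minimal sketch of top-k filtering on a probability list, assuming the kept probabilities are renormalized before sampling. top_k_filter is a hypothetical helper for illustration, not part of the API.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = set(order[:k])
    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]


probs = [0.5, 0.2, 0.1, 0.1, 0.05, 0.05]
print([round(p, 3) for p in top_k_filter(probs, 2)])
```

With k = 2, only the two most likely tokens remain candidates for the next step.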

top_p The API uses combined nucleus and top-k sampling.

top_p configures the nucleus sampling. It sets the maximum cumulative probability of tokens to sample from.

For example, if the sorted probabilities are [0.5, 0.2, 0.1, 0.1, 0.05, 0.05], a top_p of 0.8 keeps the first three tokens (cumulative probability 0.8) and renormalizes them, so sampling uses [0.625, 0.25, 0.125, 0, 0, 0].

Typical values are in the [0.9, 1.0] range.
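The top_p example can be reproduced with a short sketch. It follows one reading of the description: tokens are kept in order of decreasing probability while their cumulative probability stays at or below top_p, at least one token is always kept, and the kept probabilities are renormalized. top_p_filter and the eps tolerance are assumptions for illustration; the service's exact cutoff rule may differ.

```python
def top_p_filter(probs, top_p, eps=1e-9):
    """Nucleus filtering: keep the top tokens whose cumulative mass <= top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for i in order:
        # eps absorbs floating-point error in the running sum.
        if kept and cumulative + probs[i] > top_p + eps:
            break
        kept.append(i)
        cumulative += probs[i]
    kept_set = set(kept)
    return [probs[i] / cumulative if i in kept_set else 0.0
            for i in range(len(probs))]


probs = [0.5, 0.2, 0.1, 0.1, 0.05, 0.05]
print([round(p, 3) for p in top_p_filter(probs, 0.8)])
# [0.625, 0.25, 0.125, 0.0, 0.0, 0.0]
```

The small tolerance matters here because 0.5 + 0.2 + 0.1 does not sum to exactly 0.8 in floating point.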

prompt You may pass a types.MessagePromptOptions instead of setting context/examples/messages, but not both.
client If you're not relying on the default client, pass a glm.DiscussServiceClient instead.
request_options Options for the request.

A types.ChatResponse containing the model's reply.