The Semantic Retrieval API provides a hosted question answering service for building Retrieval Augmented Generation (RAG) systems using Google's infrastructure. For a detailed walkthrough, check out the Semantic retrieval guide.
Method: models.generateAnswer
- Endpoint
- Path parameters
- Request body
- Response body
- Authorization scopes
- GroundingPassages
- GroundingPassage
- SemanticRetrieverConfig
- AnswerStyle
- InputFeedback
- BlockReason
Generates a grounded answer from the model given an input GenerateAnswerRequest
.
Endpoint
post https://generativelanguage.googleapis.com/v1beta/{model=models/*}:generateAnswerPath parameters
model
string
Required. The name of the Model
to use for generating the grounded response.
Format: model=models/{model}
. It takes the form models/{model}
.
Request body
The request body contains data with the following structure:
Required. The content of the current conversation with the Model
. For single-turn queries, this is a single question to answer. For multi-turn queries, this is a repeated field that contains conversation history and the last Content
in the list containing the question.
Note: models.generateAnswer
only supports queries in English.
Required. Style in which answers should be returned.
Optional. A list of unique SafetySetting
instances for blocking unsafe content.
This will be enforced on the GenerateAnswerRequest.contents
and GenerateAnswerResponse.candidate
. There should not be more than one setting for each SafetyCategory
type. The API will block any contents and responses that fail to meet the thresholds set by these settings. This list overrides the default settings for each SafetyCategory
specified in the safetySettings. If there is no SafetySetting
for a given SafetyCategory
provided in the list, the API will use the default safety setting for that category. Harm categories HARM_CATEGORY_HATE_SPEECH, HARM_CATEGORY_SEXUALLY_EXPLICIT, HARM_CATEGORY_DANGEROUS_CONTENT, HARM_CATEGORY_HARASSMENT are supported. Refer to the guide for detailed information on available safety settings. Also refer to the Safety guidance to learn how to incorporate safety considerations in your AI applications.
grounding_source
. The sources in which to ground the answer. grounding_source
can be only one of the following:Passages provided inline with the request.
Content retrieved from resources created via the Semantic Retriever API.
temperature
number
Optional. Controls the randomness of the output.
Values can range from [0.0,1.0], inclusive. A value closer to 1.0 will produce responses that are more varied and creative, while a value closer to 0.0 will typically result in more straightforward responses from the model. A low temperature (~0.2) is usually recommended for Attributed-Question-Answering use cases.
Response body
Response from the model for a grounded answer.
If successful, the response body contains data with the following structure:
Candidate answer from the model.
Note: The model always attempts to provide a grounded answer, even when the answer is unlikely to be answerable from the given passages. In that case, a low-quality or ungrounded answer may be provided, along with a low answerableProbability
.
answerableProbability
number
Output only. The model's estimate of the probability that its answer is correct and grounded in the input passages.
A low answerableProbability
indicates that the answer might not be grounded in the sources.
When answerableProbability
is low, you may want to:
- Display a message to the effect of "We couldn’t answer that question" to the user.
- Fall back to a general-purpose LLM that answers the question from world knowledge. The threshold and nature of such fallbacks will depend on individual use cases.
0.5
is a good starting threshold.
Output only. Feedback related to the input data used to answer the question, as opposed to the model-generated response to the question.
The input data can be one or more of the following:
- Question specified by the last entry in
GenerateAnswerRequest.content
- Conversation history specified by the other entries in
GenerateAnswerRequest.content
- Grounding sources (
GenerateAnswerRequest.semantic_retriever
orGenerateAnswerRequest.inline_passages
)
JSON representation |
---|
{ "answer": { object ( |
GroundingPassages
A repeated list of passages.
List of passages.
JSON representation |
---|
{
"passages": [
{
object ( |
GroundingPassage
SemanticRetrieverConfig
Configuration for retrieving grounding content from a Corpus
or Document
created using the Semantic Retriever API.
source
string
Required. Name of the resource for retrieval. Example: corpora/123
or corpora/123/documents/abc
.
Required. Query to use for matching Chunk
s in the given resource by similarity.
Optional. Filters for selecting Document
s and/or Chunk
s from the resource.
maxChunksCount
integer
Optional. Maximum number of relevant Chunk
s to retrieve.
minimumRelevanceScore
number
Optional. Minimum relevance score for retrieved relevant Chunk
s.
JSON representation |
---|
{ "source": string, "query": { object ( |
AnswerStyle
Style for grounded answers.
Enums | |
---|---|
ANSWER_STYLE_UNSPECIFIED |
Unspecified answer style. |
ABSTRACTIVE |
Succint but abstract style. |
EXTRACTIVE |
Very brief and extractive style. |
VERBOSE |
Verbose style including extra details. The response may be formatted as a sentence, paragraph, multiple paragraphs, or bullet points, etc. |
InputFeedback
Feedback related to the input data used to answer the question, as opposed to the model-generated response to the question.
Ratings for safety of the input. There is at most one rating per category.
Optional. If set, the input was blocked and no candidates are returned. Rephrase the input.
JSON representation |
---|
{ "safetyRatings": [ { object ( |
BlockReason
Specifies what was the reason why input was blocked.
Enums | |
---|---|
BLOCK_REASON_UNSPECIFIED |
Default value. This value is unused. |
SAFETY |
Input was blocked due to safety reasons. Inspect safetyRatings to understand which safety category blocked it. |
OTHER |
Input was blocked due to other reasons. |