Search re-ranking using Gemini embeddings

View on Google AI Run in Google Colab View source on GitHub

This notebook demonstrates the use of embeddings to re-rank search results. This walkthrough will focus on the following objectives:

  1. Setting up your development environment and API access to use Gemini.
  2. Using Gemini's function calling support to access the Wikipedia API.
  3. Embedding content via Gemini API.
  4. Re-ranking the search results.

This is how you will implement search re-ranking:

  1. User will query the model.
  2. You will use Wikipedia API to return relevant search results.
  3. The search results will be embedded and their relevance will be evaluated by calculating distance metrics like cosine similarity, dot product, etc.
  4. Most relevant result will be returned as the final answer.

Prerequisites

You can run this quickstart in Google Colab, which runs this notebook directly in the browser and does not require additional environment configuration.

Setup

The Python SDK for the Gemini API, is contained in the google-generativeai package. You will also need to install the Wikipedia API.

pip install -q google-generativeai
pip install -q wikipedia

Import the necessary packages.

import json
import textwrap

import google.generativeai as genai
import google.ai.generativelanguage as glm

import wikipedia
from wikipedia.exceptions import DisambiguationError, PageError

import numpy as np

from IPython.display import Markdown

def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

Grab an API key

Before you can use the Gemini API, you must first obtain an API key. If you don't already have one, create a key with one click in Google AI Studio.

Get an API key

In Colab, add the key to the secrets manager under the "🔑" in the left panel. Give it the name GOOGLE_API_KEY.

Once you have the API key, pass it to the SDK. You can do this in two ways:

  • Put the key in the GOOGLE_API_KEY environment variable (the SDK will automatically pick it up from there).
  • Pass the key to genai.configure(api_key=...)
try:
    from google.colab import userdata
    GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
except ImportError:
    import os
    GOOGLE_API_KEY = os.environ['GOOGLE_API_KEY']

genai.configure(api_key=GOOGLE_API_KEY)

Define tools

As stated earlier, this tutorial uses Gemini's function calling support to access the Wikipedia API. Please refer to the docs to learn more about function calling.

Define the search function

To cater to the search engine needs, you will design this function in the following way:

  • For each search query, the search engine will use the wikipedia.search method to get relevant topics.
  • From the relevant topics, the engine will choose n_topics(int) top candidates and will use gemini-pro to extract relevant information from the page.
  • The engine will avoid duplicate entries by maintaining a search history.
def wikipedia_search(search_queries: list[str]) -> list[str]:
  """Search wikipedia for each query and summarize relevant docs."""
  n_topics=3
  search_history = set() # tracking search history
  search_urls = []
  mining_model = genai.GenerativeModel('gemini-pro')
  summary_results = []

  for query in search_queries:
    print(f'Searching for "{query}"')
    search_terms = wikipedia.search(query)

    print(f"Related search terms: {search_terms[:n_topics]}")
    for search_term in search_terms[:n_topics]: # select first `n_topics` candidates
      if search_term in search_history: # check if the topic is already covered
        continue

      print(f'Fetching page: "{search_term}"')
      search_history.add(search_term) # add to search history

      try:
        # extract the relevant data by using `gemini-pro` model
        page = wikipedia.page(search_term, auto_suggest=False)
        url = page.url
        print(f"Information Source: {url}")
        search_urls.append(url)
        page = page.content
        response = mining_model.generate_content(textwrap.dedent(f"""\
            Extract relevant information
            about user's query: {query}
            From this source:

            {page}

            Note: Do not summarize. Only Extract and return the relevant information
        """))

        urls = [url]
        if response.candidates[0].citation_metadata:
          extra_citations = response.candidates[0].citation_metadata.citation_sources
          extra_urls = [source.url for source in extra_citations]
          urls.extend(extra_urls)
          search_urls.extend(extra_urls)
          print("Additional citations:", response.candidates[0].citation_metadata.citation_sources)
        try:
          text = response.text
        except ValueError:
          pass
        else:
          summary_results.append(text + "\n\nBased on:\n  " + ',\n  '.join(urls))

      except DisambiguationError:
        print(f"""Results when searching for "{search_term}" (originally for "{query}")
        were ambiguous, hence skipping""")

      except PageError:
        print(f'{search_term} did not match with any page id, hence skipping.')

  print(f"Information Sources:")
  for url in search_urls:
    print('    ', url)

  return summary_results
example = wikipedia_search(["What are LLMs?"])
Searching for "What are LLMs?"
Related search terms: ['Large language model', 'Prompt engineering', 'Language model']
Fetching page: "Large language model"
Information Source: https://en.wikipedia.org/wiki/Large_language_model
Fetching page: "Prompt engineering"
Information Source: https://en.wikipedia.org/wiki/Prompt_engineering
Fetching page: "Language model"
Information Source: https://en.wikipedia.org/wiki/Language_model
Information Sources:
     https://en.wikipedia.org/wiki/Large_language_model
     https://en.wikipedia.org/wiki/Prompt_engineering
     https://en.wikipedia.org/wiki/Language_model

Here is what the search results look like:

from IPython.display import display

for e in example:
  display(to_markdown(e))

Relevant information about LLMs:

  • LLMs are language models notable for their ability to achieve general-purpose language generation and understanding.
  • LLMs are artificial neural networks, the largest and most capable of which are built with a decoder-only transformer-based architecture.
  • LLMs can be used for text generation, a form of generative AI, by taking an input text and repeatedly predicting the next token or word.
  • Some notable LLMs are OpenAI's GPT series of models, Google's PaLM and Gemini, Meta's LLaMA family of open-source models, and Anthropic's Claude models.
  • LLMs are trained using statistical relationships from text documents during a computationally intensive self-supervised and semi-supervised training process.
  • LLMs are thought to acquire knowledge about syntax, semantics, and "ontology" inherent in human language corpora, but also inaccuracies and biases present in the corpora.
  • LLMs can be used for a variety of tasks, including text generation, language translation, question answering, and summarization.
  • LLMs have a number of advantages over traditional language models, including their ability to handle longer sequences of text, their ability to learn from unlabeled data, and their ability to generate more coherent and fluent text.
  • LLMs also have a number of limitations, including their tendency to hallucinate facts, their lack of common sense knowledge, and their potential for bias.
  • LLMs are still under development, but they have the potential to revolutionize a wide range of industries, including natural language processing, customer service, and education.

Based on: https://en.wikipedia.org/wiki/Large_language_model

LLMs (Large Language Models) are powerful AI models that can understand and generate text. They are designed to perform various language-related tasks, such as answering questions, summarizing documents, translating languages, writing different forms of text, and generating code. LLMs have been significantly improved through techniques such as in-context learning, which allows them to temporarily learn from specific prompts.

Prompt engineering involves structuring text prompts to optimize the performance of an LLM. Effective prompts can guide the model's reasoning, provide context, and specify the desired output. Various prompt engineering techniques have been developed, including chain-of-thought prompting, generated knowledge prompting, and complexity-based prompting, each tailored to specific tasks and models.

LLMs have also been adapted to generate images and videos. Text-to-image models like DALL-E 2 create art based on textual descriptions, while text-to-video models generate videos from textual prompts. These models require specialized prompting techniques that account for their unique capabilities and limitations.

Non-text prompts are also used to guide LLMs. Image prompting allows users to provide images or image-based information as input, while gradient descent-based techniques enable the optimization of soft prompt tokens to enhance model performance.

Prompt injection is a security concern where malicious users craft prompts to trick LLMs into performing unintended actions or revealing sensitive information. Mitigation strategies include input filtering, output filtering, and prompt engineering techniques to separate user input from instructions.

LLMs are continually evolving, and new techniques and applications are being developed. They have the potential to revolutionize various industries by automating language-related tasks and enabling novel forms of creativity and communication.

Based on: https://en.wikipedia.org/wiki/Prompt_engineering

  • Language models are probabilistic models of natural language.
  • Large language models are a combination of larger datasets, feedforward neural networks, and transformers.
  • Large language models are useful for a variety of tasks, including speech recognition, machine translation, natural language generation, optical character recognition, handwriting recognition, grammar induction, and information retrieval.
  • Evaluation of the quality of language models is mostly done by comparison to human-created sample benchmarks created from typical language-oriented tasks.

Based on: https://en.wikipedia.org/wiki/Language_model

Pass the tools to the model

If you pass a list of functions to the GenerativeModel's tools argument, it will extract a schema from the function's signature and type hints, and then pass schema along to the API calls. In response the model may return a FunctionCall object asking to call the function.

The GenerativeModel will keep a reference to the function inself, so that it can execute the function locally later.

model = genai.GenerativeModel(
    'gemini-pro',
    tools=[wikipedia_search],
    generation_config={'temperature': 0.6})

Generate supporting search queries

In order to have multiple supporting search queries to the user's original query, you will ask the model to generate more such queries. This would help the engine to cover the asked question on comprehensive levels.

instructions = """You have access to the Wikipedia API which you will be using
to answer a user's query. Your job is to generate a list of search queries which
might answer a user's question. Be creative by using various key-phrases from
the user's query. To generate variety of queries, ask questions which are
related to  the user's query that might help to find the answer. The more
queries you generate the better are the odds of you finding the correct answer.
Here is an example:

user: Tell me about Cricket World cup 2023 winners.

function_call: wikipedia_search(['What is the name of the team that
won the Cricket World Cup 2023?', 'Who was the captain of the Cricket World Cup
2023 winning team?', 'Which country hosted the Cricket World Cup 2023?', 'What
was the venue of the Cricket World Cup 2023 final match?', 'Cricket World cup 2023',
'Who lifted the Cricket World Cup 2023 trophy?'])

The search function will return a list of article summaries, use these to
answer the  user's question.

Here is the user's query: {query}
"""

In order to yield creative and a more random variety of questions, you will set the model's temperature parameter to a value higher. Values can range from [0.0,1.0], inclusive. A value closer to 1.0 will produce responses that are more varied and creative, while a value closer to 0.0 will typically result in more straightforward responses from the model.

Enable automatic function calling and call the API

Now start a new chat with enable_automatic_function_calling=True. With it enabled, the genai.ChatSession will handle the back and forth required to call the function, and return the final response:

model = genai.GenerativeModel(
    'gemini-pro', tools=[wikipedia_search], generation_config={'temperature': 0.6})

chat = model.start_chat(enable_automatic_function_calling=True)

query = "Explain how deep-sea life survives."

res = chat.send_message(instructions.format(query=query))
Searching for "How does deep-sea life survive?"
Related search terms: ['Deep sea', 'Deep-sea community', 'Hydrothermal vent']
Fetching page: "Deep sea"
Information Source: https://en.wikipedia.org/wiki/Deep_sea
Fetching page: "Deep-sea community"
Information Source: https://en.wikipedia.org/wiki/Deep-sea_community
Fetching page: "Hydrothermal vent"
Information Source: https://en.wikipedia.org/wiki/Hydrothermal_vent
Searching for "What adaptations have deep-sea life developed to survive?"
Related search terms: ['Deep sea', 'Deep-sea fish', 'Deep-sea community']
Fetching page: "Deep-sea fish"
Information Source: https://en.wikipedia.org/wiki/Deep-sea_fish
Searching for "What are the unique characteristics of deep-sea life?"
Related search terms: ['Deep-sea fish', 'Deep-sea community', 'Deep sea']
Searching for "What are the challenges deep-sea life faces?"
Related search terms: ['Deep sea', 'Marine habitat', 'Sea']
Fetching page: "Marine habitat"
Information Source: https://en.wikipedia.org/wiki/Marine_habitat
Fetching page: "Sea"
Information Source: https://en.wikipedia.org/wiki/Sea
Searching for "How has deep-sea life evolved to cope with the extreme conditions?"
Related search terms: ['Deep-sea fish', 'Marine life', 'Hydrothermal vent microbial communities']
Fetching page: "Marine life"
Information Source: https://en.wikipedia.org/wiki/Marine_life
Fetching page: "Hydrothermal vent microbial communities"
Information Source: https://en.wikipedia.org/wiki/Hydrothermal_vent_microbial_communities
Information Sources:
     https://en.wikipedia.org/wiki/Deep_sea
     https://en.wikipedia.org/wiki/Deep-sea_community
     https://en.wikipedia.org/wiki/Hydrothermal_vent
     https://en.wikipedia.org/wiki/Deep-sea_fish
     https://en.wikipedia.org/wiki/Marine_habitat
     https://en.wikipedia.org/wiki/Sea
     https://en.wikipedia.org/wiki/Marine_life
     https://en.wikipedia.org/wiki/Hydrothermal_vent_microbial_communities
to_markdown(res.text)

Deep-sea life has evolved remarkable adaptations to survive the extreme conditions of the deep ocean. They have adapted to withstand high pressure, cold temperatures, and low oxygen levels. They have also developed unique ways to find food and communicate in the darkness of the deep sea.

Some of the adaptations of deep-sea life include:

  • High pressure tolerance: Deep-sea organisms have evolved strong bodies to withstand the immense pressure of the deep ocean. Their bodies are often filled with a gelatinous substance that helps them to withstand the pressure.
  • Cold tolerance: Deep-sea organisms have adapted to the cold temperatures of the deep ocean. They have enzymes that function at low temperatures, and their bodies are often covered in a thick layer of insulation.
  • Low oxygen tolerance: Deep-sea organisms have adapted to the low oxygen levels of the deep ocean. They have evolved efficient respiratory systems that allow them to extract oxygen from the water.
  • ** Bioluminescence:** Many deep-sea organisms produce their own light, a process called bioluminescence. They use bioluminescence to attract prey, communicate with each other, and defend themselves from predators.
  • Chemosynthesis: Some deep-sea organisms do not rely on sunlight for food. Instead, they use a process called chemosynthesis to create food from chemicals in the water.

These are just a few of the adaptations that deep-sea life has evolved to survive in the extreme conditions of the deep ocean. These adaptations are a testament to the resilience and adaptability of life on Earth.

Check for additional citations:

res.candidates[0].citation_metadata or 'No citations found'
'No citations found'

That looks like it worked. You can go through the chat history to see the details of what was sent and received in the function calls:

for content in chat.history:
  part = content.parts[0]

  print(f'{content.role} -> ', end='')
  print(json.dumps(type(part).to_dict(part), indent=2))
  print('---' * 20)
user -> {
  "text": "You have access to the Wikipedia API which you will be using\nto answer a user's query. Your job is to generate a list of search queries which\nmight answer a user's question. Be creative by using various key-phrases from\nthe user's query. To generate variety of queries, ask questions which are\nrelated to  the user's query that might help to find the answer. The more\nqueries you generate the better are the odds of you finding the correct answer.\nHere is an example:\n\nuser: Tell me about Cricket World cup 2023 winners.\n\nfunction_call: wikipedia_search(['What is the name of the team that\nwon the Cricket World Cup 2023?', 'Who was the captain of the Cricket World Cup\n2023 winning team?', 'Which country hosted the Cricket World Cup 2023?', 'What\nwas the venue of the Cricket World Cup 2023 final match?', 'Cricket World cup 2023',\n'Who lifted the Cricket World Cup 2023 trophy?'])\n\nThe search function will return a list of article summaries, use these to\nanswer the  user's question.\n\nHere is the user's query: Explain how deep-sea life survives.\n"
}
------------------------------------------------------------
model -> {
  "function_call": {
    "name": "wikipedia_search",
    "args": {
      "search_queries": [
        "How does deep-sea life survive?",
        "What adaptations have deep-sea life developed to survive?",
        "What are the unique characteristics of deep-sea life?",
        "What are the challenges deep-sea life faces?",
        "How has deep-sea life evolved to cope with the extreme conditions?"
      ]
    }
  }
}
------------------------------------------------------------
user -> {
  "function_response": {
    "name": "wikipedia_search",
    "response": {
      "result": [
        "**Environmental Characteristics**\n- Pressure: Pressure increases by about 1 atmosphere for every 10 meters of depth. Deep-sea organisms must have adaptations to withstand this pressure.\n- Salinity: Salinity is remarkably constant throughout the deep sea, with no significant ecological differences.\n- Temperature: The two areas of greatest temperature gradient are the transition zone between the surface waters and the deep waters (the thermocline) and the transition between the deep-sea floor and the hot water flows at the hydrothermal vents.\n- Light: Natural light does not penetrate the deep ocean, except for the upper parts of the mesopelagic. Organisms must rely on energy sources from elsewhere, such as organic material drifting down from the photic zone.\n\n**Biology**\n- Regions below the epipelagic are divided into further zones: bathyal zone (200-3000 meters), abyssal zone (3000-6000 meters), and hadal zone (6000-11,000 meters).\n- Food: Deep-sea organisms rely on falling organic matter known as 'marine snow' and carcasses derived from the productive zone above.\n- Adaptations: Deep-sea organisms have various adaptations to survive in extreme conditions, including: \n 1. Jelly-like flesh to provide buoyancy\n 2. Floaters filled with ammonium chloride that are lighter than the surrounding water\n 3. Small size, slow metabolism, and elongated bodies\n 4. Enhanced eyesight, such as larger eyes and rod cells for detecting light in low-light conditions\n 5. Bioluminescence for camouflage and attracting prey\n 6. Modifications in proteins, anatomical structures and metabolic systems to cope with high hydrostatic pressure\n\n**Chemosynthesis**\n- Some species in the deep sea do not rely on dissolved organic matter for their food.\n- These species form communities around hydrothermal vents and rely on chemosynthesis, a process where bacteria use chemical energy to produce organic matter. The tube worm Riftia is an example of an organism that benefits from this process.\n\n**Adaptation to Hydrostatic Pressure:**\n- Deep-sea organisms have developed unique adaptations to survive hydrostatic pressure.\n- Proteins can be affected by hydrostatic pressure, so deep-sea organisms have specific substitutions in the active sites of proteins like actin.\n- These substitutions allow for better stabilization in ATP binding and subunit arrangement.\n- Osmolytes like Trimethylamine N-oxide (TMAO) are adjusted in deep-sea fish to assist in protein stabilization.\n- Molecular adaptations include modified Osteocalcin genes, which lead to open skulls and cartilage-based bone formation in species like the Mariana hadal snailfish. These adaptations are crucial for withstanding high pressure in the deep sea.\n\nBased on:\n  https://en.wikipedia.org/wiki/Deep_sea",
        "Deep-sea life survives due to various adaptations and energy sources:\n\n**Adaptations:**\n- Size: Smaller size to withstand pressure.\n- Gelatinous flesh and minimal skeletal structure.\n- Elimination of excess cavities to prevent collapse.\n- Eyes adapted for low light conditions.\n- Tolerance to cold temperatures and low oxygen levels.\n\n**Energy Sources:**\n\n**Marine Snow:**\n- Repackaged organic matter that sinks quickly, providing food for bottom-dwelling organisms.\n\n**Whale Falls:**\n- Dead whales provide a significant amount of organic matter, supporting a diverse community of scavengers and other organisms.\n- Stages of whale fall progression: mobile scavenger, opportunistic, and sulfophilic.\n\n**Chemosynthesis:**\n\n**Hydrothermal Vents:**\n- Spew forth chemicals that bacteria can transform into energy.\n- Support giant tube worms and other unique species.\n- Entire ecosystems independent from sunlight.\n\n**Cold Seeps:**\n- Hydrogen sulfide, methane, and other hydrocarbon-rich fluids provide energy for chemosynthetic organisms.\n\nBased on:\n  https://en.wikipedia.org/wiki/Deep-sea_community",
        "**How does deep-sea life survive?**\n\n* Life around hydrothermal vents is based on chemosynthesis, where organisms use chemical compounds as energy sources instead of sunlight.\n* Chemosynthetic bacteria form the base of the food chain and support diverse organisms.\n* Specialized adaptations allow organisms to withstand extreme conditions, such as high temperatures and pressure, and toxic chemicals.\n* They have symbiotic relationships with chemoautotrophic microbial symbionts that convert inorganic molecules into organic molecules for nutrition.\n* Their metabolism allows them to survive in environments where sunlight is absent and oxygen is limited.\n\nBased on:\n  https://en.wikipedia.org/wiki/Hydrothermal_vent",
        "**Adaptations of Deep-Sea Fish:**\n\n* **Vision**:\n    * Large, sensitive eyes for low-light environments\n    * Bioluminescence to attract prey or illuminate the area\n\n* **Sensory adaptations**:\n    * Enhanced sensitivity to pressure and smell\n    * Loss of eyesight in some species\n\n* **Buoyancy control**:\n    * Reduction in swim bladders (in bathypelagic fish)\n    * Hydrofoils to provide lift\n    * High fat content and low bone density to reduce buoyancy\n\n* **Metabolic adaptations**:\n    * Slow metabolism\n    * Increased proportion of unsaturated fatty acids in cell membranes for fluidity\n\n* **Adaptations to high pressure**:\n    * Gelatinous layer for buoyancy\n    * Modifications in protein structure and reaction criteria\n    * Rigid proteins to resist pressure\n    * High tolerance of Na+/K+-ATPase to hydrostatic pressure\n\n* **Feeding adaptations**:\n    * Long feelers to locate prey\n    * Large mouths with sharp teeth for consuming large prey\n    * Expandable bodies to accommodate large prey items\n\n* **Mating and reproduction**:\n    * Bioluminescence to attract mates\n    * Hermaphroditism in some species\n    * Extreme sexual dimorphism in anglerfish (male attached to female)\n\nBased on:\n  https://en.wikipedia.org/wiki/Deep-sea_fish",
        "**Challenges deep-sea life faces:**\n\n- Extreme water pressure\n- No sunlight\n- Cold temperatures\n- Limited food resources\n- Pollution\n- Human activities (e.g., fishing, mining)\n\nBased on:\n  https://en.wikipedia.org/wiki/Marine_habitat",
        "**Challenges deep-sea life faces:**\n\n* **Low light:** sunlight only penetrates the top 200 meters, making it difficult for plants to grow and limiting the food available for other organisms.\n* **High pressure:** the pressure increases with depth, making it difficult for organisms to maintain their body structure and function.\n* **Cold temperatures:** the temperature decreases with depth, making it difficult for organisms to regulate their body temperature.\n* **Low oxygen levels:** the oxygen content of the water decreases with depth, making it difficult for organisms to breathe.\n* **Nutrient scarcity:** the availability of nutrients decreases with depth, making it difficult for organisms to find food.\n* **Pollution:** pollutants from human activities can accumulate in the deep sea, harming organisms and disrupting the ecosystem.\n* **Climate change:** climate change is altering the conditions in the deep sea, such as temperature, acidity, and oxygen levels, which can be harmful to organisms.\n\nBased on:\n  https://en.wikipedia.org/wiki/Sea",
        "Deep-sea life has evolved to cope with the extreme conditions of the deep ocean. There is no sunlight, so primary producers must use chemosynthesis to create food. The water is cold and dark, so animals must adapt to the low temperatures and lack of light. The pressure is immense, so animals must develop strong bodies to withstand the crushing force.\nOne of the most striking adaptations of deep-sea life is the use of bioluminescence. This is the ability to produce light, and it is used by many deep-sea animals to attract prey, communicate with each other, and defend themselves from predators. Bioluminescence is often produced by a chemical reaction that involves a luciferase enzyme and a luciferin substrate.\nAnother adaptation of deep-sea life is the use of gigantism. This is the tendency for deep-sea animals to be larger than their shallow-water counterparts. Gigantism is thought to be an adaptation to the low food availability in the deep sea. Larger animals have a greater chance of finding food, and they can also store more energy in their bodies.\nDeep-sea life is a fascinating and diverse group of organisms that have evolved to cope with the extreme conditions of the deep ocean. These animals have developed a range of adaptations that allow them to survive and thrive in this unique environment.\n\nBased on:\n  https://en.wikipedia.org/wiki/Marine_life",
        "**Adaptations** \n\n- Microbes that inhabit hydrothermal vents have adapted to extreme conditions, such as high temperatures, pressure, and chemical concentrations. \n\n- Hyperthermophiles, microorganisms that grow at temperatures above 90 \u00b0C, are found where fluids from the vents are expelled and mixed with the surrounding water. \n\n- Hyperthermophilic microbes are thought to contain proteins that have extended stability at higher temperatures due to intramolecular interactions. \n\n- Microbes are also found in symbiotic relationships with other organisms in the hydrothermal vent environment due to their ability to have a detoxification mechanism that allows them to metabolize the sulfide-rich waters which would otherwise be toxic to the organisms and the microbes.\n\nBased on:\n  https://en.wikipedia.org/wiki/Hydrothermal_vent_microbial_communities"
      ]
    }
  }
}
------------------------------------------------------------
model -> {
  "text": "Deep-sea life has evolved remarkable adaptations to survive the extreme conditions of the deep ocean. They have adapted to withstand high pressure, cold temperatures, and low oxygen levels. They have also developed unique ways to find food and communicate in the darkness of the deep sea.\n\nSome of the adaptations of deep-sea life include:\n\n* **High pressure tolerance:** Deep-sea organisms have evolved strong bodies to withstand the immense pressure of the deep ocean. Their bodies are often filled with a gelatinous substance that helps them to withstand the pressure.\n* **Cold tolerance:** Deep-sea organisms have adapted to the cold temperatures of the deep ocean. They have enzymes that function at low temperatures, and their bodies are often covered in a thick layer of insulation.\n* **Low oxygen tolerance:** Deep-sea organisms have adapted to the low oxygen levels of the deep ocean. They have evolved efficient respiratory systems that allow them to extract oxygen from the water.\n* ** Bioluminescence:** Many deep-sea organisms produce their own light, a process called bioluminescence. They use bioluminescence to attract prey, communicate with each other, and defend themselves from predators.\n* **Chemosynthesis:** Some deep-sea organisms do not rely on sunlight for food. Instead, they use a process called chemosynthesis to create food from chemicals in the water.\n\nThese are just a few of the adaptations that deep-sea life has evolved to survive in the extreme conditions of the deep ocean. These adaptations are a testament to the resilience and adaptability of life on Earth."
}
------------------------------------------------------------

In the chat history you can see all 4 steps:

  1. The user sent the query.
  2. The model replied with a glm.FunctionCall calling the wikipedia_search with a number of relevant searches.
  3. Because you set enable_automatic_function_calling=True when creating the genai.ChatSession, it executed the search function and returned the list of article summaries to the model.
  4. Folliwing the instructions in the prompt, the model generated a final answer based on those summaries.

[Optional] Manually execute the function call

If you want to understand what happened behind the scenes, this section executes the FunctionCall manually to demonstrate.

chat = model.start_chat()
result = chat.send_message(instructions.format(query=query))

Initially the model returns a FunctionCall:

fc = result.candidates[0].content.parts[0].function_call
fc = type(fc).to_dict(fc)
print(json.dumps(fc, indent=2))
{
  "name": "wikipedia_search",
  "args": {
    "search_queries": [
      "How do deep-sea animals survive?",
      "What are the adaptations of deep-sea creatures?",
      "How do deep-sea animals cope with extreme pressure?",
      "What are the unique characteristics of deep-sea organisms?",
      "How do deep-sea animals find food?",
      "How do deep-sea animals reproduce?",
      "What are the challenges faced by deep-sea animals?",
      "What is the role of deep-sea animals in the marine ecosystem?"
    ]
  }
}
fc['name']
'wikipedia_search'

Call the function with generated arguments to get the results.

summaries = wikipedia_search(**fc['args'])
Searching for "How do deep-sea animals survive?"
Related search terms: ['Deep sea', 'Marine life', 'Deep-sea fish']
Fetching page: "Deep sea"
Information Source: https://en.wikipedia.org/wiki/Deep_sea
Fetching page: "Marine life"
Information Source: https://en.wikipedia.org/wiki/Marine_life
Fetching page: "Deep-sea fish"
Information Source: https://en.wikipedia.org/wiki/Deep-sea_fish
Searching for "What are the adaptations of deep-sea creatures?"
Related search terms: ['Deep sea', 'Deep-sea community', 'Deep-sea fish']
Fetching page: "Deep-sea community"
Information Source: https://en.wikipedia.org/wiki/Deep-sea_community
Searching for "How do deep-sea animals cope with extreme pressure?"
Related search terms: ['Deep sea', 'Deep-sea fish', 'Deep-sea community']
Searching for "What are the unique characteristics of deep-sea organisms?"
Related search terms: ['Deep-sea fish', 'Deep sea', 'Deep-sea community']
Searching for "How do deep-sea animals find food?"
Related search terms: ['Deep-sea community', 'Deep-sea fish', 'Marine life']
Searching for "How do deep-sea animals reproduce?"
Related search terms: ['Marine life', 'Sea cucumber', 'Deep-water coral']
Fetching page: "Sea cucumber"
Information Source: https://en.wikipedia.org/wiki/Sea_cucumber
Fetching page: "Deep-water coral"
Information Source: https://en.wikipedia.org/wiki/Deep-water_coral
Searching for "What are the challenges faced by deep-sea animals?"
Related search terms: ['Deep sea', 'Marine life', 'Marine habitat']
Fetching page: "Marine habitat"
Information Source: https://en.wikipedia.org/wiki/Marine_habitat
Searching for "What is the role of deep-sea animals in the marine ecosystem?"
Related search terms: ['Marine ecosystem', 'Deep-sea community', 'Marine life']
Fetching page: "Marine ecosystem"
Information Source: https://en.wikipedia.org/wiki/Marine_ecosystem
Information Sources:
     https://en.wikipedia.org/wiki/Deep_sea
     https://en.wikipedia.org/wiki/Marine_life
     https://en.wikipedia.org/wiki/Deep-sea_fish
     https://en.wikipedia.org/wiki/Deep-sea_community
     https://en.wikipedia.org/wiki/Sea_cucumber
     https://en.wikipedia.org/wiki/Deep-water_coral
     https://en.wikipedia.org/wiki/Marine_habitat
     https://en.wikipedia.org/wiki/Marine_ecosystem

Now send the FunctionResult to the model.

response = chat.send_message(
    glm.Content(
      parts=[glm.Part(
          function_response = glm.FunctionResponse(
            name='wikipedia_search',
            response={'result': summaries}
          )
      )]
    )
)

to_markdown(response.text)

Deep-sea life survives by adapting to the extreme conditions of the deep ocean, including high pressure, low temperatures, and lack of light.

Adaptations for survival include:

  • High internal pressure: Deep-sea animals have high internal pressure that matches the external pressure, preventing them from being crushed.
  • Buoyancy adaptations: Many deep-sea fish have a gelatinous layer below the skin or around the spine for buoyancy and swimming efficiency. They also have low tissue density, achieved through high fat content, reduced skeletal weight, and water accumulation, allowing them to float without a swim bladder.
  • Light and vision: Deep-sea fish lack sunlight, so they rely on other senses, such as sensitivity to pressure changes and smell, for locating prey and mates. Many deep-sea fish are bioluminescent, using light to communicate, attract prey, or camouflage themselves. Some have sensitive eyes with high numbers of Rh1 genes, helping them see in low light conditions.
  • Feeding mechanisms: Deep-sea fish often have large mouths and sharp teeth for consuming prey of similar or larger sizes. They use feelers to locate prey in the darkness.
  • Behavior: Mesopelagic fish make vertical migrations following zooplankton prey, returning to deeper depths during the day. Bathypelagic fish are sedentary, waiting for prey to come close enough or being lured by bioluminescence. Some deep-sea fish are hermaphrodites, increasing their chances of reproduction in the sparse environment.
  • Physiological adaptations: Deep-sea animals have slow metabolisms and unspecialized diets, allowing them to survive with limited food availability. Their proteins are structurally modified to withstand high pressure, ensuring enzymatic reactions and cellular processes function properly. Na+/K+ -ATPase, involved in osmoregulation, is more tolerant of pressure in deep-sea fish compared to shallow-water species.

Re-ranking the search results

Helper function to embed the content:

def get_embeddings(content: list[str]) -> np.ndarray:
  embeddings = genai.embed_content('models/embedding-001', content, 'SEMANTIC_SIMILARITY')
  embds = embeddings.get('embedding', None)
  embds = np.array(embds).reshape(len(embds), -1)
  return embds

Please refer to the embeddings guide for more information on embeddings.

Your next step is to define functions that you can use to calculate similarity scores between two embedding vectors. These scores will help you decide which embedding vector is the most relevant vector to the user's query.

You will now implement cosine similarity as your metric. Here returned embedding vectors will be of unit length and hence their L1 norm (np.linalg.norm()) will be ~1. Hence, calculating cosine similarity is esentially same as calculating their dot product score.

def dot_product(a: np.ndarray, b: np.ndarray):
  return (a @ b.T)

Similarity with user's query

Now it's time to find the most relevant search result returned by the Wikipedia API.

Use Gemini API to get embeddings for user's query and search results.

search_res = get_embeddings(summaries)
embedded_query = get_embeddings([query])

Calculate similarity score:

sim_value = dot_product(search_res, embedded_query)

using np.argmax best candidate is selected.

Users's Input: Explain how deep-sea life survives.

Answer:

print(summaries[np.argmax(sim_value)])
In this document, there is no information about how deep-sea animals survive.

Based on:
  https://en.wikipedia.org/wiki/Marine_life

Similarity with Hypothetical Document Embeddings (HyDE)

Drawing inspiration from [Gao et al] the objective here is to generate a template answer to the user's query using gemini-pro's internal knowledge. This hypothetical answer will serve as a baseline to calculate relevance of all the search results.

hypothetical_ans_model = genai.GenerativeModel('gemini-pro')
res = hypothetical_ans_model.generate_content(f"""Generate a hypothetical answer
to the user's query by using your own knowledge. Assume that you know everything
about the said topic. Do not use factual information, instead use placeholders
to complete your answer. Your answer should feel like it has been written by a human.

query: {query}""")

to_markdown(res.text)

In the enigmatic depths where sunlight surrenders to inky blackness, life perseveres, illuminated by the faintest of luminescent flickers.

Imagine a realm where extreme pressure could crush the mightiest of vessels. Yet, in this unforgiving abyss, creatures have evolved with bodies resilient as the very seafloor. Their flexible exoskeletons or gelatinous tissues withstand the crushing weight gracefully.

Oxygen, a lifeline for most creatures, grows scarce with depth. Enter our deep-sea dwellers, whose bodies have ingeniously adapted. They absorb oxygen directly through their skin or gills, maximizing every molecule they find.

Nutrient scarcity plagues these depths, where sunlight cannot penetrate to foster photosynthesis. Instead, these creatures rely on chemosynthesis, a remarkable process that utilizes chemicals from hydrothermal vents or decaying matter.

In the perpetual darkness, vision becomes obsolete. Instead, sensory organs have evolved to detect minute vibrations, bioluminescence, and heat gradients, guiding them through the shadowy labyrinth.

Temperature fluctuations can be drastic, from freezing cold to scalding heat. But deep-sea creatures have mastered the art of thermoregulation, their internal systems finely tuned to withstand the extremes.

Growth is a slow and arduous process in these unforgiving depths. Many species exhibit extreme longevity, surviving for centuries or even millennia. Their life cycles are meticulously paced, ensuring their survival in this harsh environment.

Reproduction is a perilous task, with offspring often vulnerable and exposed. Some deep-sea creatures protect their young with parental care, nurturing them until they can fend for themselves in this unforgiving realm.

The deep sea, a testament to the resilience and adaptability of life, is a fascinating and mysterious world. Its inhabitants continue to inspire awe and wonder, reminding us of the extraordinary diversity and ingenuity that exists within our planet's watery depths.

Use Gemini API to get embeddings for the baseline answer and compare them with search results

hypothetical_ans = get_embeddings([res.text])

Calculate similarity scores to rank the search results

sim_value = dot_product(search_res, hypothetical_ans)
sim_value
array([[0.72687077],
       [0.73694087],
       [0.77235092],
       [0.75185433],
       [0.63363508],
       [0.62639701],
       [0.71418557],
       [0.70211815]])

using np.argmax best candidate is selected.

Users's Input: Explain how deep-sea life survives.

Answer:

to_markdown(summaries[np.argmax(sim_value)])

How do deep-sea animals survive?

Adaptations to Pressure:

  • Deep-sea animals have high internal pressure that matches the external pressure, preventing them from being crushed.
  • Their cell membranes contain a higher proportion of unsaturated fatty acids, which increases membrane fluidity in high-pressure environments.

Buoyancy Adaptations:

  • Many deep-sea fish have a gelatinous layer below the skin or around the spine for buoyancy and swimming efficiency.
  • They have low tissue density, achieved through high fat content, reduced skeletal weight, and water accumulation, allowing them to float without a swim bladder.

Light and Vision:

  • Deep-sea fish lack sunlight, so they rely on other senses, such as sensitivity to pressure changes and smell, for locating prey and mates.
  • Many deep-sea fish are bioluminescent, using light to communicate, attract prey, or camouflage themselves.
  • Some have sensitive eyes with high numbers of Rh1 genes, helping them see in low light conditions.

Feeding Mechanisms:

  • Deep-sea fish often have large mouths and sharp teeth for consuming prey of similar or larger sizes.
  • They use feelers to locate prey in the darkness.

Behavior:

  • Mesopelagic fish make vertical migrations following zooplankton prey, returning to deeper depths during the day.
  • Bathypelagic fish are sedentary, waiting for prey to come close enough or being lured by bioluminescence.
  • Some deep-sea fish are hermaphrodites, increasing their chances of reproduction in the sparse environment.

Physiological Adaptations:

  • Deep-sea animals have slow metabolisms and unspecialized diets, allowing them to survive with limited food availability.
  • Their proteins are structurally modified to withstand high pressure, ensuring enzymatic reactions and cellular processes function properly.
  • Na+/K+ -ATPase, involved in osmoregulation, is more tolerant of pressure in deep-sea fish compared to shallow-water species.

Based on: https://en.wikipedia.org/wiki/Deep-sea_fish

You have now created a search re-ranking engine using embeddings!

Next steps

To learn how to use other services in the Gemini API, visit the Python quickstart. To learn more about how you can use the embeddings, check out the examples available.