GoogleAIEngine

The GoogleAIEngine makes requests to the Google AI Studio API and the Google Vertex AI API.

TL;DR

import os

# see https://ai.google.dev/gemini-api/docs/models for a list of model IDs
from kani.engines.google import GoogleAIEngine
engine = GoogleAIEngine(api_key=os.getenv("GEMINI_API_KEY"), model="gemini-2.5-flash")

Reference

class kani.engines.google.GoogleAIEngine(
api_key: str = None,
model: str = 'gemini-2.5-flash',
max_context_size: int = None,
*,
retry: int = 2,
api_base: str = None,
headers: dict = None,
client: Client = None,
multimodal_upload_bytes_threshold: int = 512000,
**hyperparams,
)[source]

Engine for using the Google AI Studio API (aka Gemini Developer API, Google AI API) and Google Vertex AI API (aka Google Cloud API).

This engine supports all Google AI models.

See https://ai.google.dev/gemini-api/docs/models for a list of available models.

Multimodal support: images, audio, video.

Message Extras: "google_response": The raw response returned by the Google AI API.

Parameters:
  • api_key – Your Gemini Developer API key. By default, the API key will be read from the GEMINI_API_KEY environment variable.

  • model – The id of the model to use (e.g. “gemini-2.5-flash”). See https://ai.google.dev/gemini-api/docs/models for a list of models.

  • max_tokens – The maximum number of tokens to sample at each generation (defaults to 512). Generally, you should set this to the same number as your Kani’s desired_response_tokens.

  • max_context_size – The maximum number of tokens allowed in the chat prompt. If None, uses the given model’s full context size.

  • retry – How many times the engine should retry failed HTTP calls with exponential backoff (default 2).

  • api_base – The base URL of the Google AI API to use. If not specified, the default URL for the specified API (AI Studio/Vertex) will be used.

  • headers – A dict of HTTP headers to include with each request.

  • client – An instance of genai.Client (for reusing the same client in multiple engines). You must specify exactly one of (api_key, client).

  • multimodal_upload_bytes_threshold – If a multimodal object (audio, image, video) is larger than this number of bytes, upload it as a file instead of passing it inline in a request. Default 512kB.

  • hyperparams – Any additional parameters to pass to the underlying API call (see https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateContentConfig).
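The retry parameter above describes retrying failed HTTP calls with exponential backoff. As a rough illustration of that pattern only (not kani's actual implementation), the sketch below retries a callable and doubles the delay after each failure. The helper name call_with_retry, the base_delay of 1 second, and the injectable sleep argument are all assumptions made for this example.

```python
import time


def call_with_retry(fn, retry=2, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure, retry up to `retry` more times with doubling delays.

    Hypothetical helper for illustration; kani's real delays may differ.
    """
    for attempt in range(retry + 1):
        try:
            return fn()
        except Exception:
            if attempt == retry:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


# Simulate a call that fails twice and then succeeds.
attempts = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient failure")
    return "ok"


# Record the delays instead of actually sleeping.
delays = []
result = call_with_retry(flaky, retry=2, sleep=delays.append)
print(result, delays)
```

With retry=2, a call is attempted at most three times in total, which is why the simulated call above succeeds on its final attempt.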

Notes

Large Multimodal File Handling: When a multimodal message part exceeds the multimodal_upload_bytes_threshold set in the GoogleAIEngine’s constructor, kani will upload the file to the Files API. This allows the file to be reused across multiple requests without uploading the full body with each request.
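The size rule described above can be sketched as a simple comparison. This is an illustration of the documented behavior, not kani's internal code: the function name should_upload_as_file is hypothetical, and the strict "larger than" comparison follows the wording of the multimodal_upload_bytes_threshold parameter.

```python
# Matches the constructor default for multimodal_upload_bytes_threshold.
DEFAULT_UPLOAD_THRESHOLD = 512_000  # bytes


def should_upload_as_file(data: bytes, threshold: int = DEFAULT_UPLOAD_THRESHOLD) -> bool:
    """Return True if the payload is strictly larger than the threshold.

    Hypothetical helper: True means "upload via the Files API",
    False means "pass the bytes inline in the request".
    """
    return len(data) > threshold


small_image = b"\x00" * 100_000    # 100 kB: stays inline
large_video = b"\x00" * 2_000_000  # 2 MB: would be uploaded as a file

print(should_upload_as_file(small_image))
print(should_upload_as_file(large_video))
```

Lowering the threshold trades a one-time upload for smaller request bodies, which mostly pays off when the same file is sent across several requests.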

Recipes

Thinking

import google.genai.types as gai
from kani.engines.google import GoogleAIEngine

# see https://ai.google.dev/gemini-api/docs/thinking for the thinking_budget explanation
thinking_engine = GoogleAIEngine(..., thinking_config=gai.ThinkingConfig(thinking_budget=-1, include_thoughts=True))

Server-Side Tools

To enable server-side tools, add them to the tools argument of the API request. You can do this by overriding GoogleAIEngine._prepare_request.

import google.genai.types as gai
from kani.engines.google import GoogleAIEngine

class GoogleAIServersideToolsEngine(GoogleAIEngine):
    def __init__(self, *args, additional_tools: list[gai.Tool] = None, **kwargs):
        super().__init__(*args, **kwargs)
        self.additional_tools = additional_tools or []

    # override _prepare_request to inject server-side tool configs
    def _prepare_request(self, messages, functions):
        generation_config, prompt_msgs = super()._prepare_request(messages, functions)
        if self.additional_tools:
            generation_config.setdefault("tools", [])
            generation_config["tools"].extend(self.additional_tools)
        return generation_config, prompt_msgs

web_search_engine = GoogleAIServersideToolsEngine(..., additional_tools=[
    gai.Tool(url_context=gai.UrlContext()),
    gai.Tool(google_search=gai.GoogleSearch()),
])