OpenAIEngine

The OpenAIEngine is used to make requests to the OpenAI API.

TL;DR

# see https://platform.openai.com/docs/models for a list of model IDs
import os
from kani.engines.openai import OpenAIEngine
engine = OpenAIEngine(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-5-nano")

Reference

class kani.engines.openai.OpenAIEngine(
    api_key: str = None,
    model='gpt-4.1-nano',
    max_context_size: int = None,
    *,
    api_type: Literal['chat_completions', 'responses'] = None,
    organization: str = None,
    retry: int = 5,
    api_base: str = 'https://api.openai.com/v1',
    headers: dict = None,
    client: AsyncOpenAI = None,
    tokenizer=None,
    **hyperparams,
)

Engine for using the OpenAI API.

This engine supports all chat-based models and fine-tunes.

Multimodal support: images, audio.

Message Extras

"openai_completion": The ChatCompletion (raw response) returned by the OpenAI servers, as a dictionary. Non-streaming responses only.
"openai_usage": The usage data (raw response) returned by the OpenAI servers, as a dictionary.

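Because "openai_usage" is a plain dictionary, token counts can be tallied across a conversation with ordinary dict operations. The sketch below assumes the dictionaries carry the Chat Completions usage fields (prompt_tokens, completion_tokens, total_tokens); the helper name total_usage and the sample values are hypothetical, and usage data produced through the Responses API may use different field names.

```python
# Fields reported in a Chat Completions usage object (assumed shape).
USAGE_FIELDS = ("prompt_tokens", "completion_tokens", "total_tokens")


def total_usage(usages):
    """Sum token counts over an iterable of usage dicts (hypothetical helper)."""
    totals = {field: 0 for field in USAGE_FIELDS}
    for usage in usages:
        for field in USAGE_FIELDS:
            # Missing or null fields count as zero rather than raising.
            totals[field] += usage.get(field) or 0
    return totals


# Illustrative values only, standing in for each message's "openai_usage" extra.
sample = [
    {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42},
    {"prompt_tokens": 50, "completion_tokens": 8, "total_tokens": 58},
]
print(total_usage(sample))
```
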
Parameters:

api_key – Your OpenAI API key. By default, the API key will be read from the OPENAI_API_KEY environment variable.
model – The id of the model to use (e.g. “gpt-4o-mini”, “ft:gpt-3.5-turbo:my-org:custom_suffix:id”).
max_context_size – The maximum number of tokens allowed in the chat prompt. If None, uses the given model’s full context size.
api_type – Whether to use the Chat Completions API (default for most models) or Responses API (default for “deep-reasoning” style models). If unset, the best API type for the given model will be chosen.
organization – The OpenAI organization to use in requests. By default, the org ID is read from the OPENAI_ORG_ID environment variable; if that is not set, the API key’s default org is used.
retry – How many times the engine should retry failed HTTP calls with exponential backoff (default 5).
api_base – The base URL of the OpenAI API to use.
headers – A dict of HTTP headers to include with each request.
client – An instance of openai.AsyncOpenAI (for reusing the same client in multiple engines). You must specify exactly one of (api_key, client). If this is passed, the organization, retry, api_base, and headers params will be ignored.
tokenizer – The tokenizer to use for token estimation; for OpenAI models it is loaded automatically. Otherwise, pass an object with a .encode(text: str) method that returns a list (usually of token ids).
hyperparams – The arguments to pass to the create_chat_completion call with each request. See https://platform.openai.com/docs/api-reference/chat/create for a full list of params.
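
The retry parameter refers to retrying failed calls with exponential backoff. As a rough illustration of that idea only (not kani's or the openai client's actual implementation), the sketch below retries an async call and doubles the wait after each failure. The names retry_with_backoff, base_delay, and flaky are hypothetical.

```python
import asyncio


async def retry_with_backoff(call, retries=5, base_delay=0.5, sleep=asyncio.sleep):
    """Await call(); on failure wait base_delay * 2**attempt, then try again.

    Makes one initial attempt plus up to `retries` retries (hypothetical helper).
    """
    for attempt in range(retries + 1):
        try:
            return await call()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            await sleep(base_delay * 2**attempt)


async def main():
    attempts = 0

    async def flaky():
        # Stand-in for an HTTP request that fails twice, then succeeds.
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise ConnectionError("transient failure")
        return "ok"

    result = await retry_with_backoff(flaky, retries=5, base_delay=0.01)
    print(result, attempts)


asyncio.run(main())
```

With retries=5, a call that keeps failing is attempted six times in total before the last error is raised.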

Recipes

OpenAI-Compatible Server

To use an OpenAI-compatible server hosting a non-OpenAI model, pass both the server's api_base and a tokenizer for the model to the OpenAIEngine constructor.

The tokenizer should be any object with a method of signature .encode(text: str) -> list[Any].

from kani.engines.openai import OpenAIEngine
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("...")
engine = OpenAIEngine(
    model="my-local-model",
    api_key="dummy",
    api_base="http://127.0.0.1:9001/v1",
    tokenizer=tokenizer,
)
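
If no ready-made tokenizer exists for the hosted model, the .encode(text: str) -> list[Any] contract described above can be met by a small custom class. The sketch below is a rough stand-in: the class name ApproxTokenizer and the four-characters-per-token heuristic are assumptions for illustration, so its counts are only estimates and will differ from the model's real tokenization.

```python
import math
from typing import Any


class ApproxTokenizer:
    """Hypothetical tokenizer that estimates token counts from text length."""

    def __init__(self, chars_per_token: int = 4):
        self.chars_per_token = chars_per_token

    def encode(self, text: str) -> list[Any]:
        # Return one placeholder entry per estimated token. This assumes the
        # caller only counts the returned items and never decodes them.
        n_tokens = math.ceil(len(text) / self.chars_per_token)
        return [0] * n_tokens


tokenizer = ApproxTokenizer()
print(len(tokenizer.encode("Hello, kani!")))
```

An instance could then be passed as tokenizer=ApproxTokenizer() in place of the Hugging Face tokenizer in the constructor call shown above.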