API Reference#

Kani#

class kani.Kani(
engine: BaseEngine,
system_prompt: str | None = None,
always_included_messages: list[ChatMessage] | None = None,
desired_response_tokens: int = 450,
chat_history: list[ChatMessage] | None = None,
functions: list[AIFunction] | None = None,
retry_attempts: int = 1,
)[source]#

Base class for all kani.

Entrypoints

chat_round(query: str, **kwargs) -> ChatMessage

chat_round_str(query: str, **kwargs) -> str

chat_round_stream(query: str, **kwargs) -> StreamManager

full_round(query: str, **kwargs) -> AsyncIterable[ChatMessage]

full_round_str(query: str, message_formatter: Callable[[ChatMessage], str], **kwargs) -> AsyncIterable[str]

full_round_stream(query: str, **kwargs) -> AsyncIterable[StreamManager]

Function Calling

Subclass and use @ai_function() to register functions. The schema will be autogenerated from the function signature (see ai_function()).
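
For example, a minimal sketch of such a subclass (the engine is omitted and the weather lookup is a placeholder, not part of this API):

from kani import Kani, ai_function

class WeatherKani(Kani):
    @ai_function()
    def get_weather(self, city: str):
        """Get the current weather in a given city."""
        return f"It is sunny in {city}."  # placeholder implementation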

To perform a chat round with functions, use full_round() as an async iterator:

async for msg in kani.full_round(prompt):
    # responses...

Each response will be a ChatMessage.

Alternatively, you can use full_round_str() and control the format of each yielded message (including function calls) with the message_formatter parameter.

Retry & Model Feedback

If the model makes an error when attempting to call a function (e.g. calling a function that does not exist or passing params with incorrect and non-coercible types) or the function raises an exception, Kani will send the error in a system message to the model, allowing it up to retry_attempts attempts to correct itself and retry the call.

Parameters:
  • engine – The LM engine implementation to use.

  • system_prompt – The system prompt to provide to the LM. The prompt will not be included in chat_history.

  • always_included_messages – A list of messages to always include as a prefix in all chat rounds (i.e., evict newer messages rather than these to manage context length). These will not be included in chat_history.

  • desired_response_tokens – The minimum number of tokens to reserve for the model’s response; the prompt is trimmed so that (max context size - tokens in prompt) is at least this value. To control the maximum number of tokens generated more precisely, you may be able to configure the engine (e.g. OpenAIEngine(..., max_tokens=250)).

  • chat_history

    The chat history to start with (not including system prompt or always included messages), for advanced use cases. By default, each kani starts with a new conversation session.

    Caution

    If you pass another kani’s chat history here without copying it, the same list will be mutated! Use chat_history=mykani.chat_history.copy() to pass a copy.

  • functions – A list of AIFunction to expose to the model (for dynamic function calling). Use ai_function() to define static functions (see Function Calling).

  • retry_attempts – How many attempts the LM may take per full round if any tool call raises an exception.
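
A minimal construction sketch (assuming engine is any already-constructed BaseEngine):

from kani import Kani, ChatMessage

ai = Kani(
    engine,  # any constructed BaseEngine
    system_prompt="You are a helpful assistant.",
    chat_history=[ChatMessage.user("Hi!"), ChatMessage.assistant("Hello! How can I help?")],
)

# to branch a conversation, copy the other kani's history instead of sharing the list
branch = Kani(engine, chat_history=ai.chat_history.copy())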

always_included_messages: list[ChatMessage]#

Chat messages that are always included as a prefix in the model’s prompt. Includes the system message, if supplied.

chat_history: list[ChatMessage]#

All messages in the current chat state, not including system or always included messages.

async chat_round(query: str | Sequence[MessagePart | str] | None, **kwargs) ChatMessage[source]#

Perform a single chat round (user -> model -> user, no functions allowed).

Parameters:
  • query – The contents of the user’s chat message. Can be None to generate a completion without a user prompt.

  • kwargs – Additional arguments to pass to the model engine (e.g. hyperparameters).

Returns:

The model’s reply.

async chat_round_str(query: str | Sequence[MessagePart | str] | None, **kwargs) str[source]#

Like chat_round(), but only returns the text content of the message.
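
For example (assuming ai is a constructed Kani):

import asyncio

async def main():
    reply = await ai.chat_round_str("Hello, what can you do?")
    print(reply)

asyncio.run(main())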

chat_round_stream(
query: str | Sequence[MessagePart | str] | None,
**kwargs,
) StreamManager[source]#

Returns a stream of tokens from the engine as they are generated.

To consume tokens from a stream, use this class like so:

stream = ai.chat_round_stream("What is the airspeed velocity of an unladen swallow?")
async for token in stream:
    print(token, end="")
msg = await stream.message()

Tip

For compatibility and ease of refactoring, awaiting the stream itself will also return the message, i.e.:

msg = await ai.chat_round_stream("What is the airspeed velocity of an unladen swallow?")

(note the await on the call itself, which is not present in the example above).

The arguments are the same as chat_round().

async full_round(
query: str | Sequence[MessagePart | str] | None,
**kwargs,
) AsyncIterable[ChatMessage][source]#

Perform a full chat round (user -> model [-> function -> model -> …] -> user).

Yields each non-user ChatMessage created during the round. A ChatMessage will have at least one of (content, function_call).

Use this in an async for loop, like so:

async for msg in kani.full_round("How's the weather?"):
    print(msg.text)

Parameters:
  • query – The content of the user’s chat message. Can be None to generate a completion without a user prompt.

  • kwargs – Additional arguments to pass to the model engine (e.g. hyperparameters).

async full_round_str(
query: str | Sequence[MessagePart | str] | None,
message_formatter: Callable[[ChatMessage], str | None] = assistant_message_contents,
**kwargs,
) AsyncIterable[str][source]#

Like full_round(), but each yielded element is a str rather than a ChatMessage.

Parameters:
  • query – The content of the user’s chat message.

  • message_formatter – A function that returns a string to yield for each message. By default, full_round_str yields the content of each assistant message.

  • kwargs – Additional arguments to pass to the model engine (e.g. hyperparameters).
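
For example, a sketch using one of the bundled formatters documented under Message Formatters below (assuming ai is a constructed Kani):

from kani.utils.message_formatters import assistant_message_contents_thinking

async for text in ai.full_round_str(
    "How's the weather in Tokyo?",
    message_formatter=assistant_message_contents_thinking,
):
    print(text)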

async full_round_stream(
query: str | Sequence[MessagePart | str] | None,
**kwargs,
) AsyncIterable[StreamManager][source]#

Perform a full chat round (user -> model [-> function -> model -> …] -> user).

Yields a stream of tokens for each non-user ChatMessage created during the round.

To consume tokens from a stream, use this class like so:

async for stream in ai.full_round_stream("What is the airspeed velocity of an unladen swallow?"):
    async for token in stream:
        print(token, end="")
    msg = await stream.message()

Each StreamManager object yielded by this method contains a StreamManager.role attribute that can be used to determine if a message is from the engine or a function call. This attribute will be available before iterating over the stream.

The arguments are the same as full_round().
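
For example, a sketch that streams only assistant messages while still resolving each stream (assuming ai is a constructed Kani):

from kani import ChatRole

async for stream in ai.full_round_stream("How's the weather?"):
    if stream.role == ChatRole.ASSISTANT:
        async for token in stream:
            print(token, end="")
    msg = await stream.message()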

property always_len: int#

Returns the number of tokens that will always be reserved.

(e.g. for system prompts, always included messages, the engine, and the response).

message_token_len(message: ChatMessage)[source]#

Returns the number of tokens used by a given message.

async get_model_completion(include_functions: bool = True, **kwargs) BaseCompletion[source]#

Get the model’s completion with the current chat state.

Compared to chat_round() and full_round(), this lower-level method does not save the model’s reply to the chat history or mutate the chat state; it is intended to help with logging or to repeat a call multiple times.

Parameters:
  • include_functions – Whether to pass this kani’s function definitions to the engine.

  • kwargs – Arguments to pass to the model engine.

async get_model_stream(
include_functions: bool = True,
**kwargs,
) AsyncIterable[str | BaseCompletion][source]#

Get the model’s completion with the current chat state as a stream. This is a low-level method like get_model_completion() but for streams.

async get_prompt() list[ChatMessage][source]#

Called each time before asking the LM engine for a completion to generate the chat prompt. Returns a list of messages such that the total token count in the messages is less than (self.max_context_size - self.desired_response_tokens).

Always includes the system prompt plus any always_included_messages at the start of the prompt.

You may override this to get more fine-grained control over what is exposed in the model’s memory at any given call.
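
For example, a sketch of an override that keeps only the ten most recent messages in addition to the always-included prefix (a real override should still respect self.max_context_size and self.desired_response_tokens):

class ShortMemoryKani(Kani):
    async def get_prompt(self):
        # always-included prefix (system prompt etc.) plus the 10 most recent messages
        return self.always_included_messages + self.chat_history[-10:]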

async do_function_call(
call: FunctionCall,
tool_call_id: str | None = None,
) FunctionCallResult[source]#

Resolve a single function call.

By default, any exception raised from this method will be an instance of FunctionCallException.

You may implement an override to add instrumentation around function calls (e.g. tracking success counts for varying prompts). See Handle a Function Call.
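
For example, a sketch of an override that logs how long each call takes (the timing code is illustrative, not part of the API):

import time

class InstrumentedKani(Kani):
    async def do_function_call(self, call, tool_call_id=None):
        start = time.perf_counter()
        try:
            return await super().do_function_call(call, tool_call_id=tool_call_id)
        finally:
            print(f"{call.name} took {time.perf_counter() - start:.3f}s")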

Parameters:
  • call – The name of the function to call and arguments to call it with.

  • tool_call_id – The tool_call_id to set in the returned FUNCTION message.

Returns:

A FunctionCallResult including whose turn it is next and the message with the result of the function call.

Raises:

FunctionCallException – If the requested function does not exist, the arguments are invalid, or the function raises an exception.

async handle_function_call_exception(
call: FunctionCall,
err: FunctionCallException,
attempt: int,
tool_call_id: str | None = None,
) ExceptionHandleResult[source]#

Called when a function call raises an exception.

By default, returns a message telling the LM about the error and allows a retry if the error is recoverable and there are remaining retry attempts.

You may implement an override to customize the error prompt, log the error, or use custom retry logic. See Handle a Function Call Exception.

Parameters:
  • call – The FunctionCall the model was attempting to make.

  • err – The error the call raised. Usually this is NoSuchFunction or WrappedCallException, although it may be any exception raised by do_function_call().

  • attempt – The attempt number for the current call (0-indexed).

  • tool_call_id – The tool_call_id to set in the returned FUNCTION message.

Returns:

An ExceptionHandleResult detailing whether the model should retry and the message to add to the chat history.

async add_to_history(message: ChatMessage)[source]#

Add the given message to the chat history.

You might want to override this to log messages to an external service or to control how messages are saved to the chat session’s memory. By default, this appends to chat_history.
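
For example, a sketch that writes each message to a JSONL log file as it is saved (assuming ChatMessage is a Pydantic model, as implied by copy_with() below):

class LoggingKani(Kani):
    async def add_to_history(self, message):
        await super().add_to_history(message)
        with open("chatlog.jsonl", "a") as f:
            f.write(message.model_dump_json() + "\n")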

save(fp: str | bytes | PathLike, **kwargs)[source]#

Save the chat state of this kani to a JSON file. This will overwrite the file if it exists!

Parameters:
  • fp – The path to the file to save.

  • kwargs – Additional arguments to pass to Pydantic’s model_dump_json.

load(fp: str | bytes | PathLike, **kwargs)[source]#

Load chat state from a JSON file into this kani. This will overwrite any existing chat state!

Parameters:
  • fp – The path to the file containing the chat state.

  • kwargs – Additional arguments to pass to Pydantic’s model_validate_json.
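
A minimal round-trip sketch (assuming engine is any constructed BaseEngine):

ai.save("convo.json")       # persist the current chat state (overwrites the file)

fresh = Kani(engine)        # later, or in another process
fresh.load("convo.json")    # restore the saved chat state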

Common Models#

class kani.ChatRole(value)[source]#

Represents who said a chat message.

SYSTEM = 'system'#

The message is from the system (usually a steering prompt).

USER = 'user'#

The message is from the user.

ASSISTANT = 'assistant'#

The message is from the language model.

FUNCTION = 'function'#

The message is the result of a function call.

class kani.FunctionCall(*, name: str, arguments: str)[source]#

Represents a model’s request to call a single function.

name: str#

The name of the requested function.

arguments: str#

The arguments to call it with, encoded in JSON.

property kwargs: dict[str, Any]#

The arguments to call the function with, as a Python dictionary.

classmethod with_args(name: str, **kwargs)[source]#

Create a function call with the given arguments (e.g. for few-shot prompting).
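
For example (get_weather is a hypothetical function name used for illustration):

from kani import FunctionCall

call = FunctionCall.with_args("get_weather", city="Tokyo", unit="celsius")
print(call.name)       # get_weather
print(call.arguments)  # JSON-encoded, e.g. '{"city": "Tokyo", "unit": "celsius"}'
print(call.kwargs)     # {'city': 'Tokyo', 'unit': 'celsius'}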

class kani.ToolCall(*, id: str, type: str, function: FunctionCall)[source]#

Represents a model’s request to call a tool with a unique request ID.

See Internal Representation for more information about tool calls vs function calls.

id: str#

The request ID created by the engine. This should be passed back to the engine in ChatMessage.tool_call_id in order to associate a FUNCTION message with this request.

type: str#

The type of tool requested (currently only “function”).

function: FunctionCall#

The requested function call.

classmethod from_function(name: str, *, call_id_: str | None = None, **kwargs)[source]#

Create a tool call request for a function with the given name and arguments.

Parameters:

call_id_ – The ID to assign to the request. If not passed, generates a random ID.

classmethod from_function_call(call: FunctionCall, call_id_: str | None = None)[source]#

Create a tool call request from an existing FunctionCall.

Parameters:

call_id_ – The ID to assign to the request. If not passed, generates a random ID.

class kani.MessagePart[source]#

Base class for a part of a message. Engines should inherit from this class to tag substrings with metadata or provide multimodality to an engine. By default, if coerced to a string, will raise a warning noting that rich message part data was lost. For more information see Message Parts.

__str__()[source]#

Used to define the fallback behaviour when a part is serialized to a string (e.g. via ChatMessage.text). Override this to specify the canonical string representation of your message part.

Engines that support message parts should generally not use this, preferring to iterate over ChatMessage.parts instead.

class kani.ChatMessage(
*,
role: ChatRole,
content: str | list[MessagePart | str] | None,
name: str | None = None,
tool_call_id: str | None = None,
tool_calls: list[ToolCall] | None = None,
is_tool_call_error: bool | None = None,
)[source]#

Represents a message in the chat context.

role: ChatRole#

Who said the message?

content: str | list[MessagePart | str] | None#

The data used to create this message. Generally, you should use text or parts instead.

property text: str | None#

The content of the message, as a string. Can be None only if the message is a requested function call from the assistant. If the message is composed of multiple parts, concatenates the parts.

property parts: list[MessagePart | str]#

The parts of the message that make up its content. Can be empty only if the message is a requested function call from the assistant.

This is a read-only list; changes here will not affect the message’s content. To mutate the message content, use copy_with() and set text, parts, or content.

name: str | None#

The name of the user who sent the message, if set (user/function messages only).

tool_call_id: str | None#

The ID for a requested ToolCall which this message is a response to (function messages only).

tool_calls: list[ToolCall] | None#

The tool calls requested by the model (assistant messages only).

is_tool_call_error: bool | None#

If this is a FUNCTION message containing the results of a function call, whether the function call raised an exception.

property function_call: FunctionCall | None#

If there is exactly one tool call to a function, return that tool call’s requested function.

This is mostly provided for backwards-compatibility purposes; iterating over tool_calls should be preferred.

classmethod system(content: str | Sequence[MessagePart | str], **kwargs)[source]#

Create a new system message.

classmethod user(content: str | Sequence[MessagePart | str], **kwargs)[source]#

Create a new user message.

classmethod assistant(content: str | Sequence[MessagePart | str] | None, **kwargs)[source]#

Create a new assistant message.

classmethod function(
name: str | None,
content: str | Sequence[MessagePart | str],
tool_call_id: str | None = None,
**kwargs,
)[source]#

Create a new function message.

copy_with(**new_values)[source]#

Make a shallow copy of this object, updating the passed attributes (if any) to new values.

This does not validate the updated attributes! This is mostly just a convenience wrapper around .model_copy.

Only one of (content, text, parts) may be passed and will update the other two attributes accordingly.

Only one of (tool_calls, function_call) may be passed and will update the other accordingly.
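
For example, a sketch that redacts a message’s text while keeping its other attributes:

msg = ChatMessage.user("My password is hunter2")
redacted = msg.copy_with(text="[redacted]")  # role, name, and other attributes are preserved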

Exceptions#

exception kani.exceptions.KaniException[source]#

Base class for all Kani exceptions/errors.

exception kani.exceptions.MessageTooLong[source]#

This chat message will never fit in the context window.

exception kani.exceptions.HTTPException[source]#

Base class for all HTTP errors (for HTTP engines).

Deprecated since version 1.0.0.

exception kani.exceptions.HTTPTimeout[source]#

Timeout occurred connecting to or waiting for a response from an HTTP request.

Deprecated since version 1.0.0.

exception kani.exceptions.HTTPStatusException(response: aiohttp.ClientResponse, msg: str)[source]#

The HTTP server returned a non-200 status code.

Deprecated since version 1.0.0.

exception kani.exceptions.FunctionCallException(retry: bool)[source]#

Base class for exceptions that occur when a model calls an @ai_function.

exception kani.exceptions.WrappedCallException(retry, original)[source]#

The @ai_function raised an exception.

exception kani.exceptions.NoSuchFunction(name)[source]#

The model attempted to call a function that does not exist.

exception kani.exceptions.FunctionSpecError[source]#

This @ai_function spec is invalid.

exception kani.exceptions.MissingModelDependencies[source]#

You are trying to use an engine but do not have engine-specific packages installed.

exception kani.exceptions.PromptError[source]#

For some reason, the input to this model is invalid.

exception kani.exceptions.MissingMessagePartType(fqn: str, msg: str)[source]#

During loading a saved kani, a message part has a type which is not currently defined in the runtime.

Parameters:

fqn – The fully qualified name of the type that is missing.

AI Function#

kani.ai_function(
func=None,
*,
after: ChatRole = ChatRole.ASSISTANT,
name: str | None = None,
desc: str | None = None,
auto_retry: bool = True,
json_schema: dict | None = None,
auto_truncate: int | None = None,
)[source]#

Decorator to mark a method of a Kani to expose to the AI.

Parameters:
  • after – Who should speak next after the function call completes (see Next Actor). Defaults to the model.

  • name – The name of the function (defaults to the name of the function in source code).

  • desc – The function’s description (defaults to the function’s docstring).

  • auto_retry – Whether the model should retry calling the function if it gets it wrong (see Retry & Model Feedback).

  • json_schema – A JSON Schema document describing the function’s parameters. By default, kani will automatically generate one, but this can be helpful for overriding it in any tricky cases.

  • auto_truncate – If a function response is longer than this many tokens, truncate it until it is at most this many tokens and add “…” to the end. By default, no responses will be truncated. This uses a smart paragraph-aware truncation algorithm.

class kani.AIFunction(
inner,
after: ChatRole = ChatRole.ASSISTANT,
name: str | None = None,
desc: str | None = None,
auto_retry: bool = True,
json_schema: dict | None = None,
auto_truncate: int | None = None,
)[source]#

Wrapper around a function to expose to a language model.

Parameters:
  • inner – The function implementation.

  • after – Who should speak next after the function call completes (see Next Actor). Defaults to the model.

  • name – The name of the function (defaults to the name of the function in source code).

  • desc – The function’s description (defaults to the function’s docstring).

  • auto_retry – Whether the model should retry calling the function if it gets it wrong (see Retry & Model Feedback).

  • json_schema – A JSON Schema document describing the function’s parameters. By default, kani will automatically generate one, but this can be helpful for overriding it in any tricky cases.

  • auto_truncate – If a function response is longer than this many tokens, truncate it until it is at most this many tokens and add “…” to the end. By default, no responses will be truncated. This uses a smart paragraph-aware truncation algorithm.

create_json_schema() dict[source]#

Create a JSON schema representing this function’s parameters as a JSON object.

class kani.AIParam(desc: str)[source]#

Special tag to annotate types with in order to provide parameter-level metadata to kani.
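
For example, a sketch of an @ai_function using AIParam to describe a parameter (do_search is a placeholder for your own implementation):

from typing import Annotated

from kani import AIParam, Kani, ai_function

class SearchKani(Kani):
    @ai_function(auto_truncate=1024)
    def search(
        self,
        query: Annotated[str, AIParam(desc="The search query, e.g. 'python asyncio tutorial'.")],
    ):
        """Search the web and return the top results."""
        return do_search(query)  # placeholder: your own search implementation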

Streaming#

class kani.streaming.StreamManager(
stream_iter: AsyncIterable[str | BaseCompletion],
role: ChatRole,
*,
after=None,
lock: Lock | None = None,
)[source]#

This class is responsible for managing a stream returned by an engine. It should not be constructed manually.

To consume tokens from a stream, use this class like so:

# CHAT ROUND:
stream = ai.chat_round_stream("What is the airspeed velocity of an unladen swallow?")
async for token in stream:
    print(token, end="")
msg = await stream.message()

# FULL ROUND:
async for stream in ai.full_round_stream("What is the airspeed velocity of an unladen swallow?"):
    async for token in stream:
        print(token, end="")
    msg = await stream.message()

After a stream finishes, its contents will be available as a ChatMessage. You can retrieve the final message or BaseCompletion with:

msg = await stream.message()
completion = await stream.completion()

The final ChatMessage may contain non-yielded tokens (e.g. a request for a function call). If the final message or completion is requested before the stream is iterated over, the stream manager will consume the entire stream.

Tip

For compatibility and ease of refactoring, awaiting the stream itself will also return the message, i.e.:

msg = await ai.chat_round_stream("What is the airspeed velocity of an unladen swallow?")

(note the await on the call itself, which is not present in the examples above).

Parameters:
  • stream_iter – The async iterable that generates elements of the stream.

  • role – The role of the message that will be returned eventually.

  • after – A coro to call with the generated completion as its argument after the stream is fully consumed.

  • lock – A lock to hold for the duration of the stream run.

__await__()[source]#

Awaiting the StreamManager is equivalent to awaiting message().

__aiter__() AsyncIterable[str][source]#

Iterate over tokens yielded from the engine.

role#

The role of the message that this stream will return.

async completion() BaseCompletion[source]#

Get the final BaseCompletion generated by the model.

async message() ChatMessage[source]#

Get the final ChatMessage generated by the model.

Prompting#

This submodule contains utilities to transform a list of Kani ChatMessage into low-level formats to be consumed by an engine (e.g. str, list[dict], or torch.Tensor).

class kani.PromptPipeline(steps: list[PipelineStep] | None = None)[source]#

This class creates a reproducible pipeline for translating a list of ChatMessage into an engine-specific format using fluent-style chaining.

To build a pipeline, create an instance of PromptPipeline() and add steps by calling the step methods documented below. Most pipelines will end with a call to one of the terminals, which translates the intermediate form into the desired output format.

Usage

To use the pipeline, call the created pipeline object with a list of kani chat messages.

To inspect the inputs/outputs of your pipeline, you can use explain() to print a detailed explanation of the pipeline and multiple examples (selected based on the pipeline steps).

Example

Here’s an example using the PromptPipeline to build a LLaMA 2 chat-style prompt:

from kani import PromptPipeline, ChatRole

pipe = (
    PromptPipeline()

    # System messages should be wrapped with this tag. We'll translate them to USER
    # messages since a system and user message go together in a single [INST] pair.
    .wrap(role=ChatRole.SYSTEM, prefix="<<SYS>>\n", suffix="\n<</SYS>>\n")
    .translate_role(role=ChatRole.SYSTEM, to=ChatRole.USER)

    # If we see two consecutive USER messages, merge them together into one with a
    # newline in between.
    .merge_consecutive(role=ChatRole.USER, sep="\n")
    # Similarly for ASSISTANT, but with a space (kani automatically strips whitespace from the ends of
    # generations).
    .merge_consecutive(role=ChatRole.ASSISTANT, sep=" ")

    # Finally, wrap USER and ASSISTANT messages in the instruction tokens. If our
    # message list ends with an ASSISTANT message, don't add the EOS token
    # (we want the model to continue the generation).
    .conversation_fmt(
        user_prefix="<s>[INST] ",
        user_suffix=" [/INST]",
        assistant_prefix=" ",
        assistant_suffix=" </s>",
        assistant_suffix_if_last="",
    )
)

# We can see what this pipeline does by calling explain()...
pipe.explain()

# And use it in our engine to build a string prompt for the LLM.
prompt = pipe(ai.get_prompt())

__call__(
msgs: list[ChatMessage],
functions: list[AIFunction] | None = None,
**kwargs,
) T[source]#

Apply the pipeline to a list of kani messages. The return type will vary based on the steps in the pipeline; if no steps are defined the return type will be a copy of the input messages.

translate_role(
*,
to: ChatRole,
warn: str = None,
role: ChatRole | Collection[ChatRole] = None,
predicate: Callable[[ChatMessage], bool] = None,
) PromptPipeline[source]#

Change the role of the matching messages. (e.g. for models which do not support native function calling, make all FUNCTION messages a USER message)

Parameters:
  • to – The new role to translate the matching messages to.

  • warn – A warning to emit if any messages are translated (e.g. if a model does not support certain roles).

  • role – The role (if a single role is given) or roles (if a list is given) to apply this operation to. If not set, ignores the role of the message.

  • predicate – A function that takes a ChatMessage and returns a boolean specifying whether to operate on this message or not.

If multiple filter params are supplied, this method will only operate on messages that match ALL of the filters.

wrap(
*,
prefix: str = None,
suffix: str = None,
role: ChatRole | Collection[ChatRole] = None,
predicate: Callable[[ChatMessage], bool] = None,
) PromptPipeline[source]#

Wrap the matching messages with a given string prefix and/or suffix.

For more fine-grained control over user/assistant message pairs as the last step in a pipeline, use conversation_fmt() instead.

Parameters:
  • prefix – The prefix to add before each matching message, if any.

  • suffix – The suffix to add after each matching message, if any.

  • role – The role (if a single role is given) or roles (if a list is given) to apply this operation to. If not set, ignores the role of the message.

  • predicate – A function that takes a ChatMessage and returns a boolean specifying whether to operate on this message or not.

If multiple filter params are supplied, this method will only operate on messages that match ALL of the filters.

merge_consecutive(
*,
sep: str = None,
joiner: Callable[[list[ChatMessage]], str | list[MessagePart | str] | None] = None,
out_role: ChatRole = None,
role: ChatRole | Collection[ChatRole] = None,
predicate: Callable[[ChatMessage], bool] = None,
) PromptPipeline[source]#

If multiple messages that match are found consecutively, merge them by either joining their contents with a string or calling a joiner function.

Caution

If multiple roles are specified, this method will merge them as a group (e.g. if role=(USER, ASSISTANT), a USER message followed by an ASSISTANT message will be merged together into one with a role of out_role).

Similarly, if a predicate is specified, this method will merge all consecutive messages which match the given predicate.

Parameters:
  • sep – The string to add between each matching message. Mutually exclusive with joiner. If this is set, this is roughly equivalent to joiner=lambda msgs: sep.join(m.text for m in msgs).

  • joiner – A function that will take a list of all messages in a consecutive group and return the final string. Mutually exclusive with sep.

  • out_role – The role of the merged message to use. This is required if multiple roles are specified or role is not set; otherwise it defaults to the common role of the merged messages.

  • role – The role (if a single role is given) or roles (if a list is given) to apply this operation to. If not set, ignores the role of the message.

  • predicate – A function that takes a ChatMessage and returns a boolean specifying whether to operate on this message or not.

If multiple filter params are supplied, this method will only operate on messages that match ALL of the filters.

function_call_fmt(
func: Callable[[ToolCall], str | None],
*,
prefix: str = '\n',
sep: str = '',
suffix: str = '',
) PromptPipeline[source]#

For each message with one or more requested tool calls, call the provided function on each requested tool call and append the formatted string to the message’s content.

Parameters:
  • func – A function taking a ToolCall and returning a string to append to the content of the message containing the requested call, or None to ignore the tool call.

  • prefix – If at least one tool call is formatted, a prefix to insert after the message’s contents and before the formatted string.

  • sep – If two or more tool calls are formatted, the string to insert between them.

  • suffix – If at least one tool call is formatted, a suffix to insert after the formatted string.

remove(
*,
role: ChatRole | Collection[ChatRole] = None,
predicate: Callable[[ChatMessage], bool] = None,
) PromptPipeline[source]#

Remove all messages that match the filters from the output.

Parameters:
  • role – The role (if a single role is given) or roles (if a list is given) to apply this operation to. If not set, ignores the role of the message.

  • predicate – A function that takes a ChatMessage and returns a boolean specifying whether to operate on this message or not.

If multiple filter params are supplied, this method will only operate on messages that match ALL of the filters.

ensure_start(
*,
role: ChatRole | Collection[ChatRole] = None,
predicate: Callable[[ChatMessage], bool] = None,
) PromptPipeline[source]#

Ensure that the output starts with a message with the given role by removing all messages from the start that do NOT match the given filters, such that the first message in the output matches.

This should NOT be used to ensure that a system prompt is passed; the intent of this step is to prevent an orphaned FUNCTION result or ASSISTANT reply after earlier messages were context-managed out.

Parameters:
  • role – The role (if a single role is given) or roles (if a list is given) to apply this operation to. If not set, ignores the role of the message.

  • predicate – A function that takes a ChatMessage and returns a boolean specifying whether to operate on this message or not.

If multiple filter params are supplied, this method will only operate on messages that match ALL of the filters.

ensure_bound_function_calls() PromptPipeline[source]#

Ensure that each FUNCTION message is preceded by an ASSISTANT message requesting it, and that each FUNCTION message’s tool_call_id matches the request. If a FUNCTION message has no tool_call_id (e.g. a few-shot prompt), bind it to a preceding ASSISTANT message if it is unambiguous.

Will remove hanging FUNCTION messages (i.e. messages where the corresponding request was managed out of the model’s context) from the beginning of the prompt if necessary.

Raises:

PromptError – if it is impossible to bind each function call to a request unambiguously.

apply(
func: Callable[[ChatMessage], ApplyResultT] | Callable[[ChatMessage, ApplyContext], ApplyResultT],
*,
role: ChatRole | Collection[ChatRole] = None,
predicate: Callable[[ChatMessage], bool] = None,
) PromptPipeline[list[ApplyResultT]][source]#

Apply the given function to all matched messages, replacing each message with the function’s return value.

The function may take 1-2 positional parameters: the first will always be the matched message at the current pipeline step, and the second will be the context this operation is occurring in (an ApplyContext).

Parameters:
  • func – A function that takes 1-2 positional parameters (msg, ctx) that will be called on each matching message. If this function does not return a ChatMessage, it should be the last step in the pipeline. If this function returns None, the input message will be removed from the output.

  • role – The role (if a single role is given) or roles (if a list is given) to apply this operation to. If not set, ignores the role of the message.

  • predicate – A function that takes a ChatMessage and returns a boolean specifying whether to operate on this message or not.

If multiple filter params are supplied, this method will only operate on messages that match ALL of the filters.
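
For example, a sketch of a step that strips stray whitespace from USER messages before later formatting steps:

from kani import ChatRole, PromptPipeline

pipe = (
    PromptPipeline()
    .apply(lambda msg: msg.copy_with(text=msg.text.strip()), role=ChatRole.USER)
)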

macro_apply(
func: Callable[[list[ChatMessage], list[AIFunction]], list[MacroApplyResultT]],
) PromptPipeline[list[MacroApplyResultT]][source]#

Apply the given function to the list of all messages in the pipeline. This can effectively be used to create an ad-hoc pipeline step.

The function must take 2 positional parameters: the first is the list of messages, and the second is the list of available functions.

Parameters:

func – A function that takes 2 positional parameters (messages, functions) that will be called on the list of messages. If this function does not return a list[ChatMessage], it should be the last step in the pipeline.

conversation_fmt(
*,
prefix: str = '',
sep: str = '',
suffix: str = '',
generation_suffix: str = '',
user_prefix: str = '',
user_suffix: str = '',
assistant_prefix: str = '',
assistant_suffix: str = '',
assistant_suffix_if_last: str = None,
system_prefix: str = '',
system_suffix: str = '',
function_prefix: str = None,
function_suffix: str = None,
) PromptPipeline[str][source]#

Takes in the list of messages and joins them into a single conversation-formatted string by:

  • wrapping messages with the defined prefixes/suffixes by role

  • joining the messages’ contents with the defined sep

  • adding a generation suffix, if necessary.

This method should be the last step in a pipeline and will cause the pipeline to return a str.

Parameters:
  • prefix – A string to insert once before the rest of the prompt, unconditionally.

  • sep – A string to insert between messages, if any. Similar to sep.join(...).

  • suffix – A string to insert once after the rest of the prompt, unconditionally.

  • generation_suffix – A string to add to the end of the prompt to prompt the model to begin its turn.

  • user_prefix – A prefix to add before each USER message.

  • user_suffix – A suffix to add after each USER message.

  • assistant_prefix – A prefix to add before each ASSISTANT message.

  • assistant_suffix – A suffix to add after each ASSISTANT message.

  • assistant_suffix_if_last – If not None and the prompt ends with an ASSISTANT message, this string will be added to the end of the prompt instead of the assistant_suffix + generation_suffix. This is intended to allow consecutive ASSISTANT messages to continue generation from an unfinished prior message.

  • system_prefix – A prefix to add before each SYSTEM message.

  • system_suffix – A suffix to add after each SYSTEM message.

  • function_prefix – A prefix to add before each FUNCTION message.

  • function_suffix – A suffix to add after each FUNCTION message.

conversation_dict(
*,
system_role: str = 'system',
user_role: str = 'user',
assistant_role: str = 'assistant',
function_role: str = 'tool',
content_transform: Callable[[ChatMessage], Any] = lambda msg: ...,
additional_keys: Callable[[ChatMessage], dict] = lambda msg: ...,
) PromptPipeline[list[dict[str, Any]]][source]#

Takes in the list of messages and returns a list of dictionaries with (“role”, “content”) keys.

By default, the “role” key will be “system”, “user”, “assistant”, or “tool” unless the respective role override is specified.

By default, the “content” key will be message.text unless the content_transform argument is specified.

This method should be the last step in a pipeline and will cause the pipeline to return a list[dict].

Parameters:
  • system_role – The role to give to SYSTEM messages (default “system”).

  • user_role – The role to give to USER messages (default “user”).

  • assistant_role – The role to give to ASSISTANT messages (default “assistant”).

  • function_role – The role to give to FUNCTION messages (default “tool”).

  • content_transform – A function taking in the message and returning the contents of the “content” key (defaults to msg.text).

  • additional_keys – A function taking in the message and returning a dictionary containing any additional keys to add to the message’s dict.
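
For example, a sketch of a pipeline that produces OpenAI-style message dicts:

from kani import ChatMessage, ChatRole, PromptPipeline

pipe = (
    PromptPipeline()
    .merge_consecutive(role=ChatRole.USER, sep="\n")
    .conversation_dict(function_role="tool")
)

msgs = [ChatMessage.system("You are helpful."), ChatMessage.user("Hi!"), ChatMessage.user("How are you?")]
print(pipe(msgs))
# e.g. [{'role': 'system', 'content': 'You are helpful.'},
#       {'role': 'user', 'content': 'Hi!\nHow are you?'}]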

execute(
msgs: list[ChatMessage],
functions: list[AIFunction] | None = None,
*,
deepcopy=False,
for_measurement=False,
) T[source]#

Apply the pipeline to a list of kani messages. The return type will vary based on the steps in the pipeline; if no steps are defined the return type will be a copy of the input messages.

This lower-level method offers more fine-grained control over the steps that are run (e.g. to measure the length of a single message).

Parameters:
  • msgs – The messages to apply the pipeline to.

  • functions – Any functions available to the model.

  • deepcopy – Whether to deep-copy each message before running the pipeline.

  • for_measurement – If the pipeline is being run to measure the length of a single message. In this case, any ensure_start steps will be ignored.

explain(example: list[ChatMessage] | None = None, *, all_cases=False, **kwargs)[source]#

Print out a summary of the pipeline and an example conversation transformation based on the steps in the pipeline.

Caution

This method will run the pipeline on an example constructed based on the steps in this pipeline. You may encounter unexpected side effects if your pipeline uses apply() with a function with side effects.

class kani.prompts.PipelineStep[source]#

The base class for all pipeline steps.

If needed, you can subclass this and manually add steps to a PromptPipeline, but this is generally not necessary (consider using PromptPipeline.apply() instead).

execute(msgs: list[ChatMessage], functions: list[AIFunction])[source]#

Apply this step’s effects on the pipeline.

explain() str[source]#

Return a string explaining what this step does.

explain_example_kwargs() dict[str, bool][source]#

Return a dict of kwargs to pass to examples.build_conversation to ensure relevant examples are included.

class kani.prompts.ApplyContext(
msg: ChatMessage,
is_last: bool,
idx: int,
messages: list[ChatMessage],
functions: list[AIFunction],
)[source]#

Context about where a message lives in the pipeline for an arbitrary Apply operation.

msg: ChatMessage#

The message being operated on.

is_last: bool#

Whether the message being operated on is the last message (of all types) in the chat prompt.

idx: int#

The index of the message in the chat prompt.

messages: list[ChatMessage]#

The list of all messages in the chat prompt.

functions: list[AIFunction]#

The list of functions available in the chat prompt.

property is_last_of_type: bool#

Whether this message is the last one of its role in the chat prompt.

Internals#

class kani.FunctionCallResult(is_model_turn: bool, message: ChatMessage)[source]#

A model requested a function call, and the kani runtime resolved it.

Parameters:
  • is_model_turn – True if the model should immediately react; False if the user speaks next.

  • message – The message containing the result of the function call, to add to the chat history.

class kani.ExceptionHandleResult(should_retry: bool, message: ChatMessage)[source]#

A function call raised an exception, and the kani runtime has prompted the model with exception information.

Parameters:
  • should_retry – Whether the model should be allowed to retry the call that caused this exception.

  • message – The message containing details about the exception and/or instructions to retry, to add to the chat history.

Engines#

See Engine Reference.

Utilities#

kani.chat_in_terminal(
kani: Kani,
*,
rounds: int = 0,
stopword: str = None,
echo: bool = False,
ai_first: bool = False,
width: int = None,
show_function_args: bool = False,
show_function_returns: bool = False,
verbose: bool = False,
stream: bool = True,
)[source]#

Chat with a kani right in your terminal.

Useful for playing with kani, quick prompt engineering, or demoing the library.

If the environment variable KANI_DEBUG is set, debug logging will be enabled.

Warning

This function is only a development utility and should not be used in production.

Parameters:
  • rounds (int) – The number of chat rounds to play (defaults to 0 for infinite).

  • stopword (str) – Break out of the chat loop if the user sends this message.

  • echo (bool) – Whether to echo the user’s input to stdout after they send a message (e.g. to save in interactive notebook outputs; default false).

  • ai_first (bool) – Whether the model should generate a completion before prompting the user for a message (default: the user sends the first message).

  • width (int) – The maximum width of the printed outputs (default unlimited).

  • show_function_args (bool) – Whether to print the arguments the model is calling functions with for each call (default false).

  • show_function_returns (bool) – Whether to print the results of each function call (default false).

  • verbose (bool) – Equivalent to setting echo, show_function_args, and show_function_returns to True.

  • stream (bool) – Whether or not to print tokens as soon as they are generated by the model (default true).

async kani.chat_in_terminal_async(
kani: Kani,
*,
rounds: int = 0,
stopword: str | None = None,
echo: bool = False,
ai_first: bool = False,
width: int | None = None,
show_function_args: bool = False,
show_function_returns: bool = False,
verbose: bool = False,
stream: bool = True,
)[source]#

Async version of chat_in_terminal(). Use in environments when there is already an asyncio loop running (e.g. Google Colab).

async kani.print_stream(stream: StreamManager, width: int | None = None, prefix: str = '')[source]#

Print tokens from a stream to the terminal, with the width of each line less than width. If prefix is provided, indents each line after the first by the length of the prefix.

This is a helper function intended to be used with Kani.chat_round_stream() or Kani.full_round_stream().
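
For example, a sketch combining it with Kani.full_round_stream() (assuming ai is a constructed Kani):

from kani import print_stream

async for stream in ai.full_round_stream("How's the weather?"):
    await print_stream(stream, prefix="AI: ")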

Message Formatters#

A couple of convenience formatters to customize Kani.full_round_str().

You can pass any of these functions in with, e.g., Kani.full_round_str(..., message_formatter=all_message_contents).

kani.utils.message_formatters.all_message_contents(msg: ChatMessage)[source]#

Return the content of any message.

kani.utils.message_formatters.assistant_message_contents(msg: ChatMessage)[source]#

Return the content of any assistant message; otherwise, return None.

kani.utils.message_formatters.assistant_message_contents_thinking(msg: ChatMessage, show_args=False)[source]#

Return the content of any assistant message, and “Thinking…” on function calls.

If show_args is True, include the arguments to each function call. You can use this in full_round_str by using a partial, e.g.: ai.full_round_str(..., message_formatter=functools.partial(assistant_message_contents_thinking, show_args=True))

kani.utils.message_formatters.assistant_message_thinking(msg: ChatMessage, show_args=False)[source]#

Return “Thinking…” on assistant messages with function calls, ignoring any content.

This is useful if you are streaming the message’s contents.

If show_args is True, include the arguments to each function call.