API Reference#
Kani#
- class kani.Kani(
- engine: BaseEngine,
- system_prompt: str | None = None,
- always_included_messages: list[ChatMessage] | None = None,
- desired_response_tokens: int = 450,
- chat_history: list[ChatMessage] | None = None,
- functions: list[AIFunction] | None = None,
- retry_attempts: int = 1,
Base class for all kani.
Entrypoints
chat_round(query: str, **kwargs) -> ChatMessage
chat_round_str(query: str, **kwargs) -> str
chat_round_stream(query: str, **kwargs) -> StreamManager
full_round(query: str, **kwargs) -> AsyncIterable[ChatMessage]
full_round_str(query: str, message_formatter: Callable[[ChatMessage], str], **kwargs) -> AsyncIterable[str]
full_round_stream(query: str, **kwargs) -> AsyncIterable[StreamManager]
Function Calling
Subclass and use @ai_function() to register functions. The schema will be autogenerated from the function signature (see ai_function()).

To perform a chat round with functions, use full_round() as an async iterator:

    async for msg in kani.full_round(prompt):
        ...  # responses

Each response will be a ChatMessage.

Alternatively, you can use full_round_str() and control the format of a yielded function call with function_call_formatter.

Retry & Model Feedback

If the model makes an error when attempting to call a function (e.g. calling a function that does not exist, or passing params with incorrect and non-coercible types) or the function raises an exception, kani will send the error in a system message to the model, allowing it up to retry_attempts attempts to correct itself and retry the call.
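As an illustrative sketch (hypothetical helper names, not kani's internals), the feedback loop looks roughly like this; in the real flow the error is sent back to the model, which may change its arguments before retrying:

```python
# Hypothetical sketch of retry-with-feedback; NOT kani's actual implementation.
def run_tool_call(func, args, retry_attempts=1):
    """Call func(**args); on failure, record an error message per attempt."""
    feedback = []
    for attempt in range(retry_attempts + 1):
        try:
            return func(**args), feedback
        except Exception as e:
            # kani sends this error back to the model so it can correct the call
            feedback.append(f"Error on attempt {attempt}: {e}")
    return None, feedback

result, feedback = run_tool_call(lambda x: 1 / x, {"x": 0}, retry_attempts=1)
```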
- Parameters:
engine – The LM engine implementation to use.
system_prompt – The system prompt to provide to the LM. The prompt will not be included in chat_history.
always_included_messages – A list of messages to always include as a prefix in all chat rounds (i.e., evict newer messages rather than these to manage context length). These will not be included in chat_history.
desired_response_tokens – The minimum amount of space to leave in max context size - tokens in prompt. To control the maximum number of tokens generated more precisely, you may be able to configure the engine (e.g. OpenAIEngine(..., max_tokens=250)).
chat_history – The chat history to start with (not including the system prompt or always included messages), for advanced use cases. By default, each kani starts with a new conversation session.

Caution

If you pass another kani’s chat history here without copying it, the same list will be mutated! Use chat_history=mykani.chat_history.copy() to pass a copy.

functions – A list of AIFunction to expose to the model (for dynamic function calling). Use ai_function() to define static functions (see Function Calling).
retry_attempts – How many attempts the LM may take per full round if any tool call raises an exception.
- always_included_messages: list[ChatMessage]#
Chat messages that are always included as a prefix in the model’s prompt. Includes the system message, if supplied.
- chat_history: list[ChatMessage]#
All messages in the current chat state, not including system or always included messages.
- async chat_round(query: str | Sequence[MessagePart | str] | None, **kwargs) ChatMessage [source]#
Perform a single chat round (user -> model -> user, no functions allowed).
- Parameters:
query – The contents of the user’s chat message. Can be None to generate a completion without a user prompt.
kwargs – Additional arguments to pass to the model engine (e.g. hyperparameters).
- Returns:
The model’s reply.
- async chat_round_str(query: str | Sequence[MessagePart | str] | None, **kwargs) str [source]#
Like chat_round(), but only returns the text content of the message.
- chat_round_stream(
- query: str | Sequence[MessagePart | str] | None,
- **kwargs,
Returns a stream of tokens from the engine as they are generated.
To consume tokens from a stream, use this class like so:

    stream = ai.chat_round_stream("What is the airspeed velocity of an unladen swallow?")
    async for token in stream:
        print(token, end="")
    msg = await stream.message()
Tip
For compatibility and ease of refactoring, awaiting the stream itself will also return the message, i.e.:

    msg = await ai.chat_round_stream("What is the airspeed velocity of an unladen swallow?")

(note the await that is not present in the above example).

The arguments are the same as chat_round().
- async full_round(
- query: str | Sequence[MessagePart | str] | None,
- **kwargs,
Perform a full chat round (user -> model [-> function -> model -> …] -> user).
Yields each non-user ChatMessage created during the round. A ChatMessage will have at least one of (content, function_call).

Use this in an async for loop, like so:

    async for msg in kani.full_round("How's the weather?"):
        print(msg.text)
- Parameters:
query – The content of the user’s chat message. Can be None to generate a completion without a user prompt.
kwargs – Additional arguments to pass to the model engine (e.g. hyperparameters).
- async full_round_str(query: str | Sequence[MessagePart | str] | None, message_formatter: Callable[[ChatMessage], str | None] = <function assistant_message_contents>, **kwargs) AsyncIterable[str] [source]#
Like full_round(), but each yielded element is a str rather than a ChatMessage.
- Parameters:
query – The content of the user’s chat message.
message_formatter – A function that returns a string to yield for each message. By default, full_round_str yields the content of each assistant message.
kwargs – Additional arguments to pass to the model engine (e.g. hyperparameters).
- async full_round_stream(
- query: str | Sequence[MessagePart | str] | None,
- **kwargs,
Perform a full chat round (user -> model [-> function -> model -> …] -> user).
Yields a stream of tokens for each non-user ChatMessage created during the round.
To consume tokens from a stream, use this class like so:

    async for stream in ai.full_round_stream("What is the airspeed velocity of an unladen swallow?"):
        async for token in stream:
            print(token, end="")
        msg = await stream.message()
Each StreamManager object yielded by this method contains a StreamManager.role attribute that can be used to determine if a message is from the engine or a function call. This attribute will be available before iterating over the stream.

The arguments are the same as full_round().
- property always_len: int#
Returns the number of tokens that will always be reserved.
(e.g. for system prompts, always included messages, the engine, and the response).
- message_token_len(message: ChatMessage)[source]#
Returns the number of tokens used by a given message.
- async get_model_completion(include_functions: bool = True, **kwargs) BaseCompletion [source]#
Get the model’s completion with the current chat state.
Compared to chat_round() and full_round(), this lower-level method does not save the model’s reply to the chat history or mutate the chat state; it is intended to help with logging or repeating a call multiple times.
- Parameters:
include_functions – Whether to pass this kani’s function definitions to the engine.
kwargs – Arguments to pass to the model engine.
- async get_model_stream(
- include_functions: bool = True,
- **kwargs,
Get the model’s completion with the current chat state as a stream. This is a low-level method like get_model_completion(), but for streams.
- async get_prompt() list[ChatMessage] [source]#
Called each time before asking the LM engine for a completion to generate the chat prompt. Returns a list of messages such that the total token count in the messages is less than
(self.max_context_size - self.desired_response_tokens).

Always includes the system prompt plus any always_included_messages at the start of the prompt.
You may override this to get more fine-grained control over what is exposed in the model’s memory at any given call.
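A simplified sliding-window version of this behaviour can be sketched as follows (build_prompt and its token_len callback are hypothetical; the real get_prompt() also reserves space for the engine and response):

```python
# Hypothetical sketch of a get_prompt()-style sliding window (not kani's code).
def build_prompt(always_included, history, token_len, budget):
    """Keep the always-included prefix, then as many recent messages as fit."""
    remaining = budget - sum(token_len(m) for m in always_included)
    chosen = []
    for msg in reversed(history):  # walk from newest to oldest
        cost = token_len(msg)
        if cost > remaining:
            break
        chosen.append(msg)
        remaining -= cost
    return always_included + list(reversed(chosen))

# Using string length as a stand-in tokenizer: the oldest message is evicted.
prompt = build_prompt(["sys"], ["a" * 10, "b" * 40, "c" * 20], token_len=len, budget=70)
```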
- async do_function_call(
- call: FunctionCall,
- tool_call_id: str | None = None,
Resolve a single function call.
By default, any exception raised from this method will be an instance of a FunctionCallException.

You may implement an override to add instrumentation around function calls (e.g. tracking success counts for varying prompts). See Handle a Function Call.
- Parameters:
call – The name of the function to call and arguments to call it with.
tool_call_id – The tool_call_id to set in the returned FUNCTION message.
- Returns:
A FunctionCallResult including whose turn it is next and the message with the result of the function call.
- Raises:
NoSuchFunction – The requested function does not exist.
WrappedCallException – The function raised an exception.
- async handle_function_call_exception(
- call: FunctionCall,
- err: FunctionCallException,
- attempt: int,
- tool_call_id: str | None = None,
Called when a function call raises an exception.
By default, returns a message telling the LM about the error and allows a retry if the error is recoverable and there are remaining retry attempts.
You may implement an override to customize the error prompt, log the error, or use custom retry logic. See Handle a Function Call Exception.
- Parameters:
call – The FunctionCall the model was attempting to make.
err – The error the call raised. Usually this is NoSuchFunction or WrappedCallException, although it may be any exception raised by do_function_call().
attempt – The attempt number for the current call (0-indexed).
tool_call_id – The tool_call_id to set in the returned FUNCTION message.
- Returns:
An ExceptionHandleResult detailing whether the model should retry and the message to add to the chat history.
- async add_to_history(message: ChatMessage)[source]#
Add the given message to the chat history.
You might want to override this to log messages externally or to control how messages are saved to the chat session’s memory. By default, this appends to chat_history.
Common Models#
- class kani.ChatRole(value)[source]#
Represents who said a chat message.
- SYSTEM = 'system'#
The message is from the system (usually a steering prompt).
- USER = 'user'#
The message is from the user.
- ASSISTANT = 'assistant'#
The message is from the language model.
- FUNCTION = 'function'#
The message is the result of a function call.
- class kani.FunctionCall(*, name: str, arguments: str)[source]#
Represents a model’s request to call a single function.
- class kani.ToolCall(*, id: str, type: str, function: FunctionCall)[source]#
Represents a model’s request to call a tool with a unique request ID.
See Internal Representation for more information about tool calls vs function calls.
- id: str#
The request ID created by the engine. This should be passed back to the engine in ChatMessage.tool_call_id in order to associate a FUNCTION message with this request.
- function: FunctionCall#
The requested function call.
- classmethod from_function(name: str, *, call_id_: str | None = None, **kwargs)[source]#
Create a tool call request for a function with the given name and arguments.
- Parameters:
call_id_ – The ID to assign to the request. If not passed, generates a random ID.
- classmethod from_function_call(call: FunctionCall, call_id_: str | None = None)[source]#
Create a tool call request from an existing FunctionCall.
- Parameters:
call_id_ – The ID to assign to the request. If not passed, generates a random ID.
- class kani.MessagePart[source]#
Base class for a part of a message. Engines should inherit from this class to tag substrings with metadata or provide multimodality to an engine. By default, if coerced to a string, will raise a warning noting that rich message part data was lost. For more information see Message Parts.
- __str__()[source]#
Used to define the fallback behaviour when a part is serialized to a string (e.g. via ChatMessage.text). Override this to specify the canonical string representation of your message part.

Engines that support message parts should generally not use this, preferring to iterate over ChatMessage.parts instead.
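For illustration, a custom part might define its string fallback like this (ImagePart is a hypothetical example class, not part of kani):

```python
import warnings

# Hypothetical message part with a lossy string fallback, mirroring the
# default MessagePart behaviour of warning when rich data is coerced to str.
class ImagePart:
    def __init__(self, url: str):
        self.url = url

    def __str__(self) -> str:
        warnings.warn("Rich message part data was lost when coercing to str.")
        return f"[image: {self.url}]"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    s = str(ImagePart("https://example.com/cat.png"))
```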
- class kani.ChatMessage(
- *,
- role: ChatRole,
- content: str | list[MessagePart | str] | None,
- name: str | None = None,
- tool_call_id: str | None = None,
- tool_calls: list[ToolCall] | None = None,
- is_tool_call_error: bool | None = None,
Represents a message in the chat context.
- content: str | list[MessagePart | str] | None#
The data used to create this message. Generally, you should use text or parts instead.
- property text: str | None#
The content of the message, as a string. Can be None only if the message is a requested function call from the assistant. If the message is comprised of multiple parts, concatenates the parts.
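The concatenation can be sketched like so (message_text is a hypothetical helper; the real property also handles the None case):

```python
# Hypothetical sketch: join string parts and the str() fallback of rich parts.
def message_text(parts):
    return "".join(p if isinstance(p, str) else str(p) for p in parts)

text = message_text(["Hello, ", "world"])
```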
- property parts: list[MessagePart | str]#
The parts of the message that make up its content. Can be an empty tuple only if the message is a requested function call from the assistant.
This is a read-only list; changes here will not affect the message’s content. To mutate the message content, use copy_with() and set text, parts, or content.
- tool_call_id: str | None#
The ID for a requested ToolCall which this message is a response to (function messages only).
- is_tool_call_error: bool | None#
If this is a FUNCTION message containing the results of a function call, whether the function call raised an exception.
- property function_call: FunctionCall | None#
If there is exactly one tool call to a function, return that tool call’s requested function.
This is mostly provided for backwards-compatibility purposes; iterating over tool_calls should be preferred.
- classmethod system(content: str | Sequence[MessagePart | str], **kwargs)[source]#
Create a new system message.
- classmethod user(content: str | Sequence[MessagePart | str], **kwargs)[source]#
Create a new user message.
- classmethod assistant(content: str | Sequence[MessagePart | str] | None, **kwargs)[source]#
Create a new assistant message.
- classmethod function(
- name: str | None,
- content: str | Sequence[MessagePart | str],
- tool_call_id: str | None = None,
- **kwargs,
Create a new function message.
- copy_with(**new_values)[source]#
Make a shallow copy of this object, updating the passed attributes (if any) to new values.
This does not validate the updated attributes! This is mostly just a convenience wrapper around .model_copy.

Only one of (content, text, parts) may be passed; passing one will update the other two attributes accordingly.
Only one of (tool_calls, function_call) may be passed; passing one will update the other accordingly.
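The semantics are similar to dataclasses.replace from the standard library, shown here purely as an analogy (kani's ChatMessage is a pydantic model, hence .model_copy):

```python
from dataclasses import dataclass, replace

# Analogy only: a shallow copy with updated fields, like copy_with().
@dataclass(frozen=True)
class Msg:
    role: str
    content: str

m = Msg(role="user", content="hi")
m2 = replace(m, content="hello")  # m is unchanged; m2 has the new content
```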
Exceptions#
- exception kani.exceptions.MessageTooLong[source]#
This chat message will never fit in the context window.
- exception kani.exceptions.HTTPException[source]#
Base class for all HTTP errors (for HTTP engines).
Deprecated since version 1.0.0.
- exception kani.exceptions.HTTPTimeout[source]#
Timeout occurred connecting to or waiting for a response from an HTTP request.
Deprecated since version 1.0.0.
- exception kani.exceptions.HTTPStatusException(response: aiohttp.ClientResponse, msg: str)[source]#
The HTTP server returned a non-200 status code.
Deprecated since version 1.0.0.
- exception kani.exceptions.FunctionCallException(retry: bool)[source]#
Base class for exceptions that occur when a model calls an @ai_function.
- exception kani.exceptions.WrappedCallException(retry, original)[source]#
The @ai_function raised an exception.
- exception kani.exceptions.NoSuchFunction(name)[source]#
The model attempted to call a function that does not exist.
AI Function#
- kani.ai_function(
- func=None,
- *,
- after: ChatRole = ChatRole.ASSISTANT,
- name: str | None = None,
- desc: str | None = None,
- auto_retry: bool = True,
- json_schema: dict | None = None,
- auto_truncate: int | None = None,
Decorator to mark a method of a Kani to expose to the AI.
- Parameters:
after – Who should speak next after the function call completes (see Next Actor). Defaults to the model.
name – The name of the function (defaults to the name of the function in source code).
desc – The function’s description (defaults to the function’s docstring).
auto_retry – Whether the model should retry calling the function if it gets it wrong (see Retry & Model Feedback).
json_schema – A JSON Schema document describing the function’s parameters. By default, kani will automatically generate one, but this can be helpful for overriding it in any tricky cases.
auto_truncate – If a function response is longer than this many tokens, truncate it until it is at most this many tokens and add “…” to the end. By default, no responses will be truncated. This uses a smart paragraph-aware truncation algorithm.
- class kani.AIFunction(
- inner,
- after: ChatRole = ChatRole.ASSISTANT,
- name: str | None = None,
- desc: str | None = None,
- auto_retry: bool = True,
- json_schema: dict | None = None,
- auto_truncate: int | None = None,
Wrapper around a function to expose to a language model.
- Parameters:
inner – The function implementation.
after – Who should speak next after the function call completes (see Next Actor). Defaults to the model.
name – The name of the function (defaults to the name of the function in source code).
desc – The function’s description (defaults to the function’s docstring).
auto_retry – Whether the model should retry calling the function if it gets it wrong (see Retry & Model Feedback).
json_schema – A JSON Schema document describing the function’s parameters. By default, kani will automatically generate one, but this can be helpful for overriding it in any tricky cases.
auto_truncate – If a function response is longer than this many tokens, truncate it until it is at most this many tokens and add “…” to the end. By default, no responses will be truncated. This uses a smart paragraph-aware truncation algorithm.
Streaming#
- class kani.streaming.StreamManager(
- stream_iter: AsyncIterable[str | BaseCompletion],
- role: ChatRole,
- *,
- after=None,
- lock: Lock | None = None,
This class is responsible for managing a stream returned by an engine. It should not be constructed manually.
To consume tokens from a stream, use this class like so:

    # CHAT ROUND:
    stream = ai.chat_round_stream("What is the airspeed velocity of an unladen swallow?")
    async for token in stream:
        print(token, end="")
    msg = await stream.message()

    # FULL ROUND:
    async for stream in ai.full_round_stream("What is the airspeed velocity of an unladen swallow?"):
        async for token in stream:
            print(token, end="")
        msg = await stream.message()
After a stream finishes, its contents will be available as a ChatMessage. You can retrieve the final message or BaseCompletion with:

    msg = await stream.message()
    completion = await stream.completion()

The final ChatMessage may contain non-yielded tokens (e.g. a request for a function call). If the final message or completion is requested before the stream is iterated over, the stream manager will consume the entire stream.

Tip

For compatibility and ease of refactoring, awaiting the stream itself will also return the message, i.e.:

    msg = await ai.chat_round_stream("What is the airspeed velocity of an unladen swallow?")

(note the await that is not present in the above examples).
- Parameters:
stream_iter – The async iterable that generates elements of the stream.
role – The role of the message that will be returned eventually.
after – A coro to call with the generated completion as its argument after the stream is fully consumed.
lock – A lock to hold for the duration of the stream run.
- __aiter__() AsyncIterable[str] [source]#
Iterate over tokens yielded from the engine.
- role#
The role of the message that this stream will return.
- async completion() BaseCompletion [source]#
Get the final BaseCompletion generated by the model.
- async message() ChatMessage [source]#
Get the final ChatMessage generated by the model.
Prompting#
This submodule contains utilities to transform a list of kani ChatMessage into low-level formats to be consumed by an engine (e.g. str, list[dict], or torch.Tensor).
- class kani.PromptPipeline(steps: list[PipelineStep] | None = None)[source]#
This class creates a reproducible pipeline for translating a list of ChatMessage into an engine-specific format using fluent-style chaining.

To build a pipeline, create an instance of PromptPipeline() and add steps by calling the step methods documented below. Most pipelines will end with a call to one of the terminals, which translates the intermediate form into the desired output format.

Usage

To use the pipeline, call the created pipeline object with a list of kani chat messages.

To inspect the inputs/outputs of your pipeline, you can use explain() to print a detailed explanation of the pipeline and multiple examples (selected based on the pipeline steps).

Example
Here’s an example using the PromptPipeline to build a LLaMA 2 chat-style prompt:
    from kani import PromptPipeline, ChatRole

    pipe = (
        PromptPipeline()
        # System messages should be wrapped with this tag. We'll translate them to USER
        # messages since a system and user message go together in a single [INST] pair.
        .wrap(role=ChatRole.SYSTEM, prefix="<<SYS>>\n", suffix="\n<</SYS>>\n")
        .translate_role(role=ChatRole.SYSTEM, to=ChatRole.USER)
        # If we see two consecutive USER messages, merge them together into one with a
        # newline in between.
        .merge_consecutive(role=ChatRole.USER, sep="\n")
        # Similarly for ASSISTANT, but with a space (kani automatically strips whitespace
        # from the ends of generations).
        .merge_consecutive(role=ChatRole.ASSISTANT, sep=" ")
        # Finally, wrap USER and ASSISTANT messages in the instruction tokens. If our
        # message list ends with an ASSISTANT message, don't add the EOS token
        # (we want the model to continue the generation).
        .conversation_fmt(
            user_prefix="<s>[INST] ",
            user_suffix=" [/INST]",
            assistant_prefix=" ",
            assistant_suffix=" </s>",
            assistant_suffix_if_last="",
        )
    )

    # We can see what this pipeline does by calling explain()...
    pipe.explain()

    # And use it in our engine to build a string prompt for the LLM.
    prompt = pipe(ai.get_prompt())
- __call__(
- msgs: list[ChatMessage],
- functions: list[AIFunction] | None = None,
- **kwargs,
Apply the pipeline to a list of kani messages. The return type will vary based on the steps in the pipeline; if no steps are defined the return type will be a copy of the input messages.
- translate_role(
- *,
- to: ChatRole,
- warn: str = None,
- role: ChatRole | Collection[ChatRole] = None,
- predicate: Callable[[ChatMessage], bool] = None,
Change the role of the matching messages (e.g. for models which do not support native function calling, translate all FUNCTION messages to USER messages).
- Parameters:
to – The new role to translate the matching messages to.
warn – A warning to emit if any messages are translated (e.g. if a model does not support certain roles).
role – The role (if a single role is given) or roles (if a list is given) to apply this operation to. If not set, ignores the role of the message.
predicate – A function that takes a ChatMessage and returns a boolean specifying whether to operate on this message or not.
If multiple filter params are supplied, this method will only operate on messages that match ALL of the filters.
- wrap(
- *,
- prefix: str = None,
- suffix: str = None,
- role: ChatRole | Collection[ChatRole] = None,
- predicate: Callable[[ChatMessage], bool] = None,
Wrap the matching messages with a given string prefix and/or suffix.
For more fine-grained control over user/assistant message pairs as the last step in a pipeline, use conversation_fmt() instead.
- Parameters:
prefix – The prefix to add before each matching message, if any.
suffix – The suffix to add after each matching message, if any.
role – The role (if a single role is given) or roles (if a list is given) to apply this operation to. If not set, ignores the role of the message.
predicate – A function that takes a ChatMessage and returns a boolean specifying whether to operate on this message or not.
If multiple filter params are supplied, this method will only operate on messages that match ALL of the filters.
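As a sketch of what this step does, here is a hypothetical stand-in operating on plain (role, text) tuples rather than ChatMessage objects:

```python
# Hypothetical sketch of a wrap step (not kani's implementation).
def wrap(msgs, role, prefix="", suffix=""):
    return [(r, f"{prefix}{t}{suffix}" if r == role else t) for r, t in msgs]

wrapped = wrap(
    [("system", "Be terse."), ("user", "hi")],
    role="system", prefix="<<SYS>>\n", suffix="\n<</SYS>>\n",
)
```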
- merge_consecutive(
- *,
- sep: str = None,
- joiner: Callable[[list[ChatMessage]], str | list[MessagePart | str] | None] = None,
- out_role: ChatRole = None,
- role: ChatRole | Collection[ChatRole] = None,
- predicate: Callable[[ChatMessage], bool] = None,
If multiple matching messages are found consecutively, merge them by either joining their contents with a string or calling a joiner function.
Caution

If multiple roles are specified, this method will merge them as a group (e.g. if role=(USER, ASSISTANT), a USER message followed by an ASSISTANT message will be merged into one with a role of out_role).

Similarly, if a predicate is specified, this method will merge all consecutive messages which match the given predicate.
- Parameters:
sep – The string to add between each matching message. Mutually exclusive with joiner. If this is set, this is roughly equivalent to joiner=lambda msgs: sep.join(m.text for m in msgs).
joiner – A function that will take a list of all messages in a consecutive group and return the final string. Mutually exclusive with sep.
out_role – The role of the merged message to use. This is required if multiple roles are specified or role is not set; otherwise it defaults to the common role of the merged messages.
role – The role (if a single role is given) or roles (if a list is given) to apply this operation to. If not set, ignores the role of the message.
predicate – A function that takes a ChatMessage and returns a boolean specifying whether to operate on this message or not.
If multiple filter params are supplied, this method will only operate on messages that match ALL of the filters.
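The single-role case can be sketched with itertools.groupby (a hypothetical stand-in on (role, text) tuples; the real step also supports predicates and multi-role groups):

```python
from itertools import groupby

# Hypothetical sketch of merging consecutive same-role messages (not kani's code).
def merge_consecutive(msgs, role, sep):
    out = []
    for r, group in groupby(msgs, key=lambda m: m[0]):
        group = list(group)
        if r == role and len(group) > 1:
            out.append((r, sep.join(text for _, text in group)))
        else:
            out.extend(group)
    return out

merged = merge_consecutive(
    [("user", "hi"), ("user", "there"), ("assistant", "hello")],
    role="user", sep="\n",
)
```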
- function_call_fmt( ) PromptPipeline [source]#
For each message with one or more requested tool calls, call the provided function on each requested tool call and append it to the message’s content.
- Parameters:
func – A function taking a ToolCall and returning a string to append to the content of the message containing the requested call, or None to ignore the tool call.
prefix – If at least one tool call is formatted, a prefix to insert after the message’s contents and before the formatted string.
sep – If two or more tool calls are formatted, the string to insert between them.
suffix – If at least one tool call is formatted, a suffix to insert after the formatted string.
- remove(
- *,
- role: ChatRole | Collection[ChatRole] = None,
- predicate: Callable[[ChatMessage], bool] = None,
Remove all messages that match the filters from the output.
- Parameters:
role – The role (if a single role is given) or roles (if a list is given) to apply this operation to. If not set, ignores the role of the message.
predicate – A function that takes a ChatMessage and returns a boolean specifying whether to operate on this message or not.
If multiple filter params are supplied, this method will only operate on messages that match ALL of the filters.
- ensure_start(
- *,
- role: ChatRole | Collection[ChatRole] = None,
- predicate: Callable[[ChatMessage], bool] = None,
Ensure that the output starts with a message with the given role by removing all messages from the start that do NOT match the given filters, such that the first message in the output matches.
This should NOT be used to ensure that a system prompt is passed; the intent of this step is to prevent an orphaned FUNCTION result or ASSISTANT reply after earlier messages were context-managed out.
- Parameters:
role – The role (if a single role is given) or roles (if a list is given) to apply this operation to. If not set, ignores the role of the message.
predicate – A function that takes a ChatMessage and returns a boolean specifying whether to operate on this message or not.
If multiple filter params are supplied, this method will only operate on messages that match ALL of the filters.
- ensure_bound_function_calls() PromptPipeline [source]#
Ensure that each FUNCTION message is preceded by an ASSISTANT message requesting it, and that each FUNCTION message’s tool_call_id matches the request. If a FUNCTION message has no tool_call_id (e.g. in a few-shot prompt), bind it to a preceding ASSISTANT message if it is unambiguous.

Will remove hanging FUNCTION messages (i.e. messages whose corresponding request was managed out of the model’s context) from the beginning of the prompt if necessary.
- Raises:
PromptError – if it is impossible to bind each function call to a request unambiguously.
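The hanging-message removal can be sketched as follows (a hypothetical helper on (role, text, call_id) tuples; the real step also binds FUNCTION messages to requests by tool_call_id and raises on ambiguity):

```python
# Hypothetical sketch: drop FUNCTION results whose requesting ASSISTANT message
# was managed out of the context window (not kani's implementation).
def drop_hanging_function_results(msgs):
    out, requested = [], set()
    for role, text, call_id in msgs:
        if role == "assistant" and call_id:
            requested.add(call_id)
        if role == "function" and call_id not in requested:
            continue  # the request for this result is gone; drop it
        out.append((role, text, call_id))
    return out

cleaned = drop_hanging_function_results([
    ("function", "orphan result", "call_0"),  # its request was evicted
    ("assistant", "", "call_1"),
    ("function", "42", "call_1"),
])
```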
- apply(
- func: Callable[[ChatMessage], ApplyResultT] | Callable[[ChatMessage, ApplyContext], ApplyResultT],
- *,
- role: ChatRole | Collection[ChatRole] = None,
- predicate: Callable[[ChatMessage], bool] = None,
Apply the given function to all matched messages. Replace the message with the function’s return value.
The function may take 1-2 positional parameters: the first will always be the matched message at the current pipeline step, and the second will be the context this operation is occurring in (an ApplyContext).
- Parameters:
func – A function that takes 1-2 positional parameters (msg, ctx) that will be called on each matching message. If this function does not return a ChatMessage, it should be the last step in the pipeline. If this function returns None, the input message will be removed from the output.
role – The role (if a single role is given) or roles (if a list is given) to apply this operation to. If not set, ignores the role of the message.
predicate – A function that takes a ChatMessage and returns a boolean specifying whether to operate on this message or not.
If multiple filter params are supplied, this method will only operate on messages that match ALL of the filters.
- macro_apply(
- func: Callable[[list[ChatMessage], list[AIFunction]], list[MacroApplyResultT]],
Apply the given function to the list of all messages in the pipeline. This step can effectively be used to create an ad-hoc step.
The function must take 2 positional parameters: the first is the list of messages, and the second is the list of available functions.
- Parameters:
func – A function that takes 2 positional parameters (messages, functions) that will be called on the list of messages. If this function does not return a list[ChatMessage], it should be the last step in the pipeline.
- conversation_fmt(
- *,
- prefix: str = '',
- sep: str = '',
- suffix: str = '',
- generation_suffix: str = '',
- user_prefix: str = '',
- user_suffix: str = '',
- assistant_prefix: str = '',
- assistant_suffix: str = '',
- assistant_suffix_if_last: str = None,
- system_prefix: str = '',
- system_suffix: str = '',
- function_prefix: str = None,
- function_suffix: str = None,
Takes in the list of messages and joins them into a single conversation-formatted string by:
wrapping messages with the defined prefixes/suffixes by role
joining the messages’ contents with the defined sep
adding a generation suffix, if necessary.
This method should be the last step in a pipeline and will cause the pipeline to return a str.
- Parameters:
prefix – A string to insert once before the rest of the prompt, unconditionally.
sep – A string to insert between messages, if any. Similar to sep.join(...).
suffix – A string to insert once after the rest of the prompt, unconditionally.
generation_suffix – A string to add to the end of the prompt to prompt the model to begin its turn.
user_prefix – A prefix to add before each USER message.
user_suffix – A suffix to add after each USER message.
assistant_prefix – A prefix to add before each ASSISTANT message.
assistant_suffix – A suffix to add after each ASSISTANT message.
assistant_suffix_if_last – If not None and the prompt ends with an ASSISTANT message, this string will be added to the end of the prompt instead of assistant_suffix + generation_suffix. This is intended to allow consecutive ASSISTANT messages to continue generation from an unfinished prior message.
system_prefix – A prefix to add before each SYSTEM message.
system_suffix – A suffix to add after each SYSTEM message.
function_prefix – A prefix to add before each FUNCTION message.
function_suffix – A suffix to add after each FUNCTION message.
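A minimal sketch of this terminal, handling only USER and ASSISTANT roles on (role, text) tuples (hypothetical; the real method supports all roles plus the unconditional prefix/suffix):

```python
# Hypothetical sketch of a conversation_fmt-style terminal (not kani's code).
def conversation_fmt(msgs, user_prefix="", user_suffix="", assistant_prefix="",
                     assistant_suffix="", sep="", generation_suffix=""):
    wrapped = []
    for role, text in msgs:
        if role == "user":
            wrapped.append(f"{user_prefix}{text}{user_suffix}")
        else:
            wrapped.append(f"{assistant_prefix}{text}{assistant_suffix}")
    return sep.join(wrapped) + generation_suffix

prompt = conversation_fmt(
    [("user", "Hi"), ("assistant", "Hello!"), ("user", "How are you?")],
    user_prefix="[INST] ", user_suffix=" [/INST]", generation_suffix=" ",
)
```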
- conversation_dict(
- *,
- system_role: str = 'system',
- user_role: str = 'user',
- assistant_role: str = 'assistant',
- function_role: str = 'tool',
- content_transform: Callable[[ChatMessage], Any] = lambda msg: ...,
- additional_keys: Callable[[ChatMessage], dict] = lambda msg: ...,
Takes in the list of messages and returns a list of dictionaries with (“role”, “content”) keys.
By default, the “role” key will be “system”, “user”, “assistant”, or “tool” unless the respective role override is specified.
By default, the “content” key will be message.text unless the content_transform argument is specified.
This method should be the last step in a pipeline and will cause the pipeline to return a list[dict].
- Parameters:
system_role – The role to give to SYSTEM messages (default “system”).
user_role – The role to give to USER messages (default “user”).
assistant_role – The role to give to ASSISTANT messages (default “assistant”).
function_role – The role to give to FUNCTION messages (default “tool”).
content_transform – A function taking in the message and returning the contents of the “content” key (defaults to msg.text).
additional_keys – A function taking in the message and returning a dictionary containing any additional keys to add to the message’s dict.
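Conceptually, the transformation works like this plain-Python sketch (illustrative only, not kani's implementation; dicts stand in for ChatMessage objects):

```python
# Simplified sketch of conversation_dict's role/content mapping (illustrative).
# Note the default role rename: FUNCTION messages become "tool".
ROLE_MAP = {"system": "system", "user": "user",
            "assistant": "assistant", "function": "tool"}

def to_dicts(messages, content_transform=lambda m: m["text"]):
    # map each message to a {"role", "content"} dict
    return [{"role": ROLE_MAP[m["role"]], "content": content_transform(m)}
            for m in messages]

dicts = to_dicts([{"role": "function", "text": "42"}])
# dicts == [{"role": "tool", "content": "42"}]
```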
- execute(
- msgs: list[ChatMessage],
- functions: list[AIFunction] | None = None,
- *,
- deepcopy=False,
- for_measurement=False,
Apply the pipeline to a list of kani messages. The return type varies based on the steps in the pipeline; if no steps are defined, the pipeline returns a copy of the input messages.
This lower-level method offers more fine-grained control over the steps that are run (e.g. to measure the length of a single message).
- Parameters:
msgs – The messages to apply the pipeline to.
functions – Any functions available to the model.
deepcopy – Whether to deep-copy each message before running the pipeline.
for_measurement – Whether the pipeline is being run to measure the length of a single message. In this case, any ensure_start steps will be ignored.
- explain(example: list[ChatMessage] | None = None, *, all_cases=False, **kwargs)[source]#
Print out a summary of the pipeline and an example conversation transformation based on the steps in the pipeline.
Caution
This method will run the pipeline on an example constructed based on the steps in this pipeline. You may encounter unexpected side effects if your pipeline uses apply() with a function that has side effects.
- class kani.prompts.PipelineStep[source]#
The base class for all pipeline steps.
If needed, you can subclass this and manually add steps to a PromptPipeline, but this is generally not necessary (consider using PromptPipeline.apply() instead).
- execute(msgs: list[ChatMessage], functions: list[AIFunction])[source]#
Apply this step’s effects on the pipeline.
- class kani.prompts.ApplyContext(
- msg: ChatMessage,
- is_last: bool,
- idx: int,
- messages: list[ChatMessage],
- functions: list[AIFunction],
Context about where a message lives in the pipeline for an arbitrary Apply operation.
- msg: ChatMessage#
The message being operated on.
- is_last: bool#
Whether the message being operated on is the last message (of all types) in the chat prompt.
- idx: int#
The index of the message being operated on within the list of messages.
- messages: list[ChatMessage]#
The list of all messages in the chat prompt.
- functions: list[AIFunction]#
The list of functions available in the chat prompt.
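A function passed to an Apply operation can use these context fields to make position-dependent decisions. The following is an illustrative plain-Python sketch (Ctx is a local stand-in for ApplyContext, with dicts in place of ChatMessage and the functions field omitted):

```python
from dataclasses import dataclass

# Minimal stand-in for kani.prompts.ApplyContext (illustrative only).
@dataclass
class Ctx:
    msg: dict
    is_last: bool
    idx: int
    messages: list

def tag_last(ctx):
    # an apply()-style function: annotate only the final message in the prompt
    if ctx.is_last:
        return {**ctx.msg, "content": ctx.msg["content"] + " <END>"}
    return ctx.msg

msgs = [{"content": "a"}, {"content": "b"}]
out = [tag_last(Ctx(m, i == len(msgs) - 1, i, msgs))
       for i, m in enumerate(msgs)]
# out[-1]["content"] == "b <END>"; earlier messages are unchanged
```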
Internals#
- class kani.FunctionCallResult(is_model_turn: bool, message: ChatMessage)[source]#
A model requested a function call, and the kani runtime resolved it.
- Parameters:
is_model_turn – True if the model should immediately react; False if the user speaks next.
message – The message containing the result of the function call, to add to the chat history.
- class kani.ExceptionHandleResult(should_retry: bool, message: ChatMessage)[source]#
A function call raised an exception, and the kani runtime has prompted the model with exception information.
- Parameters:
should_retry – Whether the model should be allowed to retry the call that caused this exception.
message – The message containing details about the exception and/or instructions to retry, to add to the chat history.
Engines#
See Engine Reference.
Utilities#
- kani.chat_in_terminal(
- kani: Kani,
- *,
- rounds: int = 0,
- stopword: str | None = None,
- echo: bool = False,
- ai_first: bool = False,
- width: int | None = None,
- show_function_args: bool = False,
- show_function_returns: bool = False,
- verbose: bool = False,
- stream: bool = True,
Chat with a kani right in your terminal.
Useful for playing with kani, quick prompt engineering, or demoing the library.
If the environment variable KANI_DEBUG is set, debug logging will be enabled.
Warning
This function is only a development utility and should not be used in production.
- Parameters:
rounds (int) – The number of chat rounds to play (defaults to 0 for infinite).
stopword (str) – Break out of the chat loop if the user sends this message.
echo (bool) – Whether to echo the user’s input to stdout after they send a message (e.g. to save inputs in interactive notebook outputs; default false).
ai_first (bool) – If true, the model generates a completion before prompting the user for a message; if false (default), the user sends the first message.
width (int) – The maximum width of the printed outputs (default unlimited).
show_function_args (bool) – Whether to print the arguments the model is calling functions with for each call (default false).
show_function_returns (bool) – Whether to print the results of each function call (default false).
verbose (bool) – Equivalent to setting echo, show_function_args, and show_function_returns to True.
stream (bool) – Whether to print tokens as soon as they are generated by the model (default true).
- async kani.chat_in_terminal_async(
- kani: Kani,
- *,
- rounds: int = 0,
- stopword: str | None = None,
- echo: bool = False,
- ai_first: bool = False,
- width: int | None = None,
- show_function_args: bool = False,
- show_function_returns: bool = False,
- verbose: bool = False,
- stream: bool = True,
Async version of chat_in_terminal(). Use this in environments where an asyncio event loop is already running (e.g. Google Colab).
- async kani.print_stream(stream: StreamManager, width: int | None = None, prefix: str = '')[source]#
Print tokens from a stream to the terminal, keeping each line no wider than width. If prefix is provided, each line after the first is indented by the length of the prefix.
This is a helper function intended to be used with Kani.chat_round_stream() or Kani.full_round_stream().
Message Formatters#
A couple of convenience formatters to customize Kani.full_round_str().
You can pass any of these functions in, e.g., Kani.full_round_str(..., message_formatter=all_message_contents).
- kani.utils.message_formatters.all_message_contents(msg: ChatMessage)[source]#
Return the content of any message.
- kani.utils.message_formatters.assistant_message_contents(msg: ChatMessage)[source]#
Return the content of any assistant message; otherwise don’t return anything.
- kani.utils.message_formatters.assistant_message_contents_thinking(msg: ChatMessage, show_args=False)[source]#
Return the content of any assistant message, and “Thinking…” on function calls.
If show_args is True, include the arguments to each function call. You can use this in full_round_str by using a partial, e.g.:
ai.full_round_str(..., message_formatter=functools.partial(assistant_message_contents_thinking, show_args=True))
- kani.utils.message_formatters.assistant_message_thinking(msg: ChatMessage, show_args=False)[source]#
Return “Thinking…” on assistant messages with function calls, ignoring any content.
This is useful if you are streaming the message’s contents.
If show_args is True, include the arguments to each function call.
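A custom formatter follows the same contract as the built-ins: return a string to have full_round_str yield it, or return nothing (None) to skip the message. The following is an illustrative sketch using a dict as a stand-in for ChatMessage:

```python
# A custom message formatter following the built-ins' contract (illustrative):
# return a string to yield, or None to skip the message entirely.
def user_and_assistant_contents(msg):
    if msg["role"] in ("user", "assistant") and msg.get("text"):
        return f"{msg['role'].upper()}: {msg['text']}"
    return None  # skip system/function messages

assert user_and_assistant_contents({"role": "user", "text": "hi"}) == "USER: hi"
assert user_and_assistant_contents({"role": "function", "text": "42"}) is None
```

In real code you would pass such a function as message_formatter to Kani.full_round_str, just like the built-in formatters above.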