Build the Chat Prompt ===================== Modern language models work by generating a *continuation* to a *prompt*. The prompt contains all the context-specific knowledge the model has access to; if it's not in the prompt, it can only use its pretraining data. While we would love to pass the entire chat history in the prompt, models also contain a *token limit* - the maximum size of the prompt we can give it at one time. .. hint:: "token limit" is also known as "context size". Since chats can be longer than a model's token limit, we have to decide which parts to keep and which parts to omit, creating a sliding window of memory for the LM. .. important:: Language models can't remember what happened in a conversation beyond their token limit. Making them do so is a hot area of research! By default, kani includes the **system prompt** and any messages specified as **always include** (in the initializer), then as many messages as possible fit in the remaining token limit, prioritizing later messages. .. todo: figure demonstrating this To override this behaviour, override :meth:`.Kani.get_prompt` in your subclass: .. automethod:: kani.Kani.get_prompt :noindex: For example, here's how you might override the behaviour to only include the most recent 4 messages (omitting earlier ones to fit in the token length if necessary) and any always included messages: .. seealso:: This example is available in the `GitHub repo `__. .. code-block:: python class LastFourKani(Kani): async def get_prompt(self): # calculate how many tokens we have for the prompt, accounting for the response max_len = self.max_context_size - self.desired_response_tokens # working backwards through history... messages = [] for message in reversed(self.chat_history[-4:]): # if the message fits in the space we have remaining... token_len = await self.prompt_token_len( messages=self.always_included_messages + [message] + messages, functions=self.get_enabled_functions() ) if token_len <= max_len: # add it to the returned prompt! messages.insert(0, message) else: break return self.always_included_messages + messages Chatting with this kani, we can see that it loses any memory of what happened more than 4 messages (2 rounds) ago: .. code-block:: pycon >>> chat_in_terminal(LastFourKani(engine)) USER: Hi kani! My name is Andrew. AI: Hello Andrew! How can I assist you today? USER: What does "kani" mean in Japanese? AI: "Kani" in Japanese means "Crab". USER: How do you pronounce it? AI: Kani is pronounced as "kah-nee" in Japanese. USER: What is my name? AI: As an AI, I don't have access to personal data about individuals unless it has been shared with me in the course of our conversation. I'm designed to respect user privacy and confidentiality.