Retrieval#

Retrieval is the idea that models can be augmented with an external factual database that they can retrieve information from, allowing them to access more relevant and up-to-date information without having to train.

In this example, we demonstrate how kani’s function calling can be used to retrieve information from a data source like Wikipedia.

# let's define a client using kani's BaseClient:
class WikipediaClient(BaseClient):
    SERVICE_BASE = "https://en.wikipedia.org/w/api.php"


class WikipediaRetrievalKani(Kani):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.wikipedia_client = WikipediaClient()

    @ai_function()
    async def wikipedia(
        self,
        title: Annotated[
            str,
            AIParam(desc='The article title on Wikipedia, e.g. "Train_station".')
        ],
    ):
        """Get additional information about a topic from Wikipedia."""
        resp = await self.wikipedia_client.get(
            "/",
            params={
                "action": "query",
                "format": "json",
                "prop": "extracts",
                "titles": title,
                "explaintext": 1,
                "formatversion": 2,
            },
        )
        page = resp["query"]["pages"][0]
        if extract := page.get("extract"):
            return extract
        return f"The page {title!r} does not exist on Wikipedia."

    @ai_function()
    async def search(self, query: str):
        """Find titles of Wikipedia articles similar to the given query."""
        resp = await self.wikipedia_client.get(
            "/",
            params={"action": "opensearch", "format": "json", "search": query}
        )
        return json.dumps(resp[1])
>>> chat_in_terminal(WikipediaRetrievalKani(engine))
USER: Tell me about the Yamanote line in Tokyo.
AI: Thinking (search)...
AI: Thinking (wikipedia)...
AI: The Yamanote Line is a loop service in Tokyo, Japan...

Caution

Wikipedia articles might be longer than can fit in the model’s context window. Try combining this with the sub-kani summarization example above to build a powerful retrieval agent!

You may also use @ai_function(auto_truncate=...) if truncating the response is acceptable.