Generative API Documentation Search

My current generative AI project at work is incorporating our API documentation into our search bot, Penny. Compared to our first project, which searched user documentation, I had to make a few tweaks that I thought were interesting enough to share.

Contextual search works best when you have context

As we discussed in How Vector Search Empowers Generative AI, vectors are great - they let you capture the context of a query in a mathematical (and by extension, easy-to-compute) way. But when you don’t have much context, vector search starts to lose its ability to find related results.

A good example of this is API documentation, where most of the documentation content is code. Compare this to user documentation, where you have swaths of text to work with - so much that you have to chop it into easily-embeddable pieces. By contrast, API documentation has barely any information to work with from a vector perspective, often only having a sentence or two associated with each endpoint.

A simple search like “How do I create an account?” against our vectorized API documentation didn’t even manage to bring up the POST /Account endpoint, instead highlighting other entries related to account packages, account contact information, account notes, account services, and so on. We found that the retrieval step that gathers information for our prompts (the retrieval in retrieval-augmented generation) had to become a hybrid of keyword search (using the regular BM25 algorithm) and vector search. The hybrid result set was much better: it often still included those other endpoints, but the obvious answer was now in it too.
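The post doesn't say how the keyword and vector result lists are merged, so here is a minimal sketch of one common option, reciprocal rank fusion (RRF). It assumes each search has already returned a ranked list of endpoint IDs. The function name, the `k=60` constant, and the sample routes other than POST /Account are illustrative assumptions, not details of our system.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents that rank well in more than one list rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically so output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for "How do I create an account?"
keyword_hits = ["POST /Account", "GET /Account/{id}", "POST /Account/{id}/Note"]
vector_hits = ["POST /Account/{id}/Package", "POST /Account/{id}/Note", "POST /Account"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

With these made-up lists, POST /Account comes out first because both searches returned it. RRF only needs ranks, which sidesteps the fact that BM25 scores and vector similarities are on different scales.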

Users often want to do more than one thing

Queries like “How do I create an account?” are nice and simple. So simple, in fact, that anyone familiar enough with the product to want API documentation will likely never ask them - they’d probably already know to look for a POST /Account endpoint. Most of the queries users are likely to ask are about workflows: how to tie multiple API actions together.

Unfortunately, API documentation usually doesn’t contain that information - that’s more in the realm of user documentation, where there’s a nice paragraph or three explaining the business process and relationship between the entities involved. However, that user documentation often doesn’t include the specific API call details required to take those actions.

Luckily, with a bit of language-modeling magic, we can work with this. We found that we can ask the LLM to break a user’s question into distinct parts and then answer each part separately. For example, a query like “How can I create a new service and associate it with an account?” becomes:

[
    "How can I create a new service?",
    "How can I associate a service with an account?"
]

These are two distinct things we can query and provide a specific API call for, and they’re phrased nicely enough that we’ll likely find a decent API call match for them, if one exists.
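As a rough sketch of that decomposition step, the snippet below builds a prompt, asks the model for a JSON array, and falls back to the original question if the reply can't be parsed. The prompt wording, the `decompose_question` name, and the `call_llm` callable (any function that takes a prompt string and returns the model's text) are assumptions for illustration, not our exact implementation.

```python
import json
from typing import Callable, List

# Hypothetical prompt wording; tune it for your own model.
DECOMPOSE_PROMPT = (
    "Break the following question into distinct, self-contained questions, "
    "one per API action. Return only a JSON array of strings.\n\n"
    "Question: {question}"
)


def decompose_question(question: str, call_llm: Callable[[str], str]) -> List[str]:
    """Ask the model to split a question; fall back to the original on bad output."""
    reply = call_llm(DECOMPOSE_PROMPT.format(question=question))
    try:
        parts = json.loads(reply)
    except json.JSONDecodeError:
        return [question]
    # Only accept a JSON array of strings; anything else is treated as a failure.
    if not isinstance(parts, list) or not all(isinstance(p, str) for p in parts):
        return [question]
    cleaned = [p.strip() for p in parts if p.strip()]
    return cleaned or [question]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, returning a canned reply."""
    return (
        '["How can I create a new service?", '
        '"How can I associate a service with an account?"]'
    )


sub_questions = decompose_question(
    "How can I create a new service and associate it with an account?", fake_llm
)
```

Each entry in `sub_questions` can then go through the search on its own. The fallback matters because models occasionally wrap the array in extra prose, and answering the undivided question is better than failing outright.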

Ask the model which endpoint to use

We initially took a “here are the 10 endpoints that potentially match the question, answer it” approach, but found the model would get confused by a long list of routes and request bodies. Instead, we take the same approach as above: ask the model to do one thing at a time. We give the model a list of potential endpoints and their descriptions (from the hybrid search above) and have it pick one as an intermediate step toward answering the question. Once it has selected one, we provide a more complete snapshot of that API call, including the request and response bodies and any additional details we may need.

Example prompt:

Given these API routes:

Route: POST /Service
Description: Create a new service

Route: PUT /Service/{id}
Description: Update a service

Route: POST /Package
Description: Create a new package

Which API should be used to answer the following question: "How do I create a new service?" Only return the route, nothing else.

Most models will respond with POST /Service. Then you can follow up with:

Given the API route:

Route: POST /Service
Description: Create a new service
Request Body: { json object }
Response Body: { json object }

Answer the question "How do I create a new service?"
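The two-prompt exchange above could be wired together along these lines. This is a sketch under assumptions: the `Endpoint` dataclass, the function names, and the `call_llm` callable are hypothetical, and the request and response bodies are left as the same placeholders used in the example prompt.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Endpoint:
    route: str
    description: str
    request_body: str = "{ json object }"   # placeholder, as in the prompt above
    response_body: str = "{ json object }"  # placeholder, as in the prompt above


def build_selection_prompt(endpoints: List[Endpoint], question: str) -> str:
    """First prompt: routes and descriptions only, so the model just picks one."""
    listing = "\n\n".join(
        f"Route: {e.route}\nDescription: {e.description}" for e in endpoints
    )
    return (
        f"Given these API routes:\n\n{listing}\n\n"
        f'Which API should be used to answer the following question: "{question}" '
        "Only return the route, nothing else."
    )


def build_answer_prompt(endpoint: Endpoint, question: str) -> str:
    """Second prompt: full details for the single selected route."""
    return (
        "Given the API route:\n\n"
        f"Route: {endpoint.route}\n"
        f"Description: {endpoint.description}\n"
        f"Request Body: {endpoint.request_body}\n"
        f"Response Body: {endpoint.response_body}\n\n"
        f'Answer the question "{question}"'
    )


def answer_question(
    endpoints: List[Endpoint], question: str, call_llm: Callable[[str], str]
) -> Optional[str]:
    """Run both steps; return None if the model names a route we did not offer."""
    choice = call_llm(build_selection_prompt(endpoints, question)).strip()
    selected = {e.route: e for e in endpoints}.get(choice)
    if selected is None:
        return None
    return call_llm(build_answer_prompt(selected, question))


def scripted_llm(prompt: str) -> str:
    """Stand-in for a real model: picks a route, then gives a canned answer."""
    if "Which API should be used" in prompt:
        return "POST /Service"
    return "Send a POST request to /Service with the new service in the body."


endpoints = [
    Endpoint("POST /Service", "Create a new service"),
    Endpoint("PUT /Service/{id}", "Update a service"),
    Endpoint("POST /Package", "Create a new package"),
]
answer = answer_question(endpoints, "How do I create a new service?", scripted_llm)
```

Checking the model's choice against the candidate list is a cheap guard: if it replies with a route that wasn't offered, the caller can retry or fall back instead of building a prompt around an endpoint that doesn't exist.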

Of course, all these extra steps add time and cost to the overall query, but the result is better.