Using AWS' Converse API

AWS recently released its Converse API, which offers a programmatic interface to its hosted large language models as a series of messages rather than raw text completion. The unique thing about the API is that it's model-agnostic: it provides a generic interface and translates the format of the messages on your behalf depending on the model you interact with. Let's dive in!

Traditionally I've been working with Semantic Kernel in my personal projects, as it plugs relatively easily into both the OpenAI API and Azure's OpenAI services. There is a project to support AWS Bedrock within Semantic Kernel, but it's still in its early phases. While we wait for that library extension to be proven out, we can fall back on the AWS Bedrock Runtime SDK to engage in conversation-style interactions with language models hosted on AWS.

ConverseRequest

Just like ChatHistory in Semantic Kernel, Bedrock relies on a transport object to represent the conversation that will be used to generate a response, in this case called ConverseRequest. A ConverseRequest is a type of AmazonBedrockRuntimeRequest and contains a few properties for managing guardrails, inference, and tools, but most importantly the ModelId and the Messages. The ModelId tells AWS which language model you want to invoke with your request, and the Messages property is a List<Message> where the conversation is stored.

var conversation = new ConverseRequest
{
    ModelId = "mistral.mistral-small-2402-v1:0"
};
conversation.Messages.Add(new Message
{
    Role = ConversationRole.User,
    Content = new List<ContentBlock> { new ContentBlock { Text = "Write a song about my dog Molly." }}
});

ConverseAsync

Once you have your ConverseRequest set up, you can pass it to an AmazonBedrockRuntimeClient through a new method, ConverseAsync(ConverseRequest request). This, as you can imagine, will send your conversation up to AWS, invoke the appropriate model, and asynchronously return a ConverseResponse object.

var runtime = new AmazonBedrockRuntimeClient();
var response = await runtime.ConverseAsync(conversation);
Console.WriteLine(response.Output.Message.Content.First().Text);

The nice thing here is that the response contains a Message object, which you can stick straight onto the end of your ConverseRequest via Messages.Add(response.Output.Message) if you want to add the response to the conversation state.
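To make that concrete, here's a minimal multi-turn sketch that builds on the conversation and runtime objects from the earlier snippets (the follow-up prompt text is just an example):

```csharp
// Append the model's reply to the running conversation state.
conversation.Messages.Add(response.Output.Message);

// Add a follow-up user message and re-invoke the same model.
conversation.Messages.Add(new Message
{
    Role = ConversationRole.User,
    Content = new List<ContentBlock>
    {
        new ContentBlock { Text = "Now write a second verse." }
    }
});

var followUp = await runtime.ConverseAsync(conversation);
Console.WriteLine(followUp.Output.Message.Content.First().Text);
```

Because the request object carries the whole message list each time, this append-and-resend loop is all there is to maintaining conversation history.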

Optional Configuration

The ConverseRequest is meant to handle conversation requests against any supported language model, so it provides a generic, model-agnostic interface. That said, if there are extra properties you'd like to include in your request, you can do so through the additional properties on the ConverseRequest object.

  • AdditionalModelRequestFields
    • Allows you to specify language-model-specific fields in the request
      var conversation = new ConverseRequest
      {
          AdditionalModelRequestFields = Document.FromObject(new
          {
              max_tokens = 2048
          })
      };
      
  • AdditionalModelResponseFieldPaths
    • If you want the model-specific response fields that the ConverseResponse wouldn’t normally capture, you can ask for them here
      var conversation = new ConverseRequest
      {
          AdditionalModelResponseFieldPaths = new List<string>
          {
              "results.tokenCount"
          }
      };
      
  • GuardrailConfig
    • Lets you apply an Amazon Bedrock guardrail (by identifier and version) to the request
  • InferenceConfig
    • This is where you can specify properties that pertain to most language models, like MaxTokens, StopSequences, Temperature, and TopP
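As a sketch of that last option, the common inference parameters live on an InferenceConfiguration object attached to the request (the values below are arbitrary):

```csharp
var conversation = new ConverseRequest
{
    ModelId = "mistral.mistral-small-2402-v1:0",
    InferenceConfig = new InferenceConfiguration
    {
        MaxTokens = 1024,       // cap on tokens generated in the response
        Temperature = 0.7f,     // higher values = more random output
        TopP = 0.9f,            // nucleus sampling cutoff
        StopSequences = new List<string> { "\n\n" }
    }
};
```

Because these settings are part of the generic request, Bedrock translates them into whatever model-specific parameters the target model expects.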

Major Differences compared to Semantic Kernel

Here are a few things I've found so far working with the AWS Bedrock ConverseRequest pattern compared to Semantic Kernel:

  • System messages
    • With Semantic Kernel, you can append System messages at any point in the conversation history, and I use this to great effect in my projects. With ConverseRequest there's a single System property that gets prepended to the top of the conversation with each inference request. I'm guessing this is meant to preserve compatibility with language models that don't support mid-conversation System messages, but it's definitely less flexible.
  • Tool Config
    • ConverseRequest supports tool integration, where you can pass descriptions of methods that your language model can choose to invoke. Semantic Kernel also supports this functionality - the main difference is that with ConverseRequest you have to manually define your available tools, and you'll simply get a response back from the language model indicating that it wishes to execute a tool. Semantic Kernel will serialize your tagged methods into the request, intercept the tool-invocation response from the model, invoke the method on the model's behalf, stick the function call result back into the conversation, and re-initiate the invocation. I'm sure the AWS Bedrock SDK will eventually get there, but for now it's a much more manual process to register and invoke those methods.
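To give a feel for that manual flow, here's a rough sketch of registering a single tool and detecting when the model asks for it. The GetWeather tool and its JSON schema are made up for illustration, and the type names reflect the AWS SDK for .NET as I understand it:

```csharp
// Describe the tool so the model knows it's available.
conversation.ToolConfig = new ToolConfiguration
{
    Tools = new List<Tool>
    {
        new Tool
        {
            ToolSpec = new ToolSpecification
            {
                Name = "GetWeather",
                Description = "Returns the current weather for a city.",
                InputSchema = new ToolInputSchema
                {
                    Json = Document.FromObject(new
                    {
                        type = "object",
                        properties = new { city = new { type = "string" } },
                        required = new[] { "city" }
                    })
                }
            }
        }
    }
};

var response = await runtime.ConverseAsync(conversation);

// The model signals a tool request through the stop reason; it's then on
// you to execute the tool yourself and send a ToolResult message back.
if (response.StopReason == StopReason.Tool_use)
{
    var toolUse = response.Output.Message.Content
        .First(c => c.ToolUse != null).ToolUse;
    Console.WriteLine($"Model wants to call {toolUse.Name}");
}
```

Compare that with Semantic Kernel, where a single attribute on your method handles the registration, interception, and re-invocation loop for you.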

Like most things in the generative AI space, it's exciting to see the pace of evolution in these tools, and I can't wait to see what AWS Bedrock adds next!