Getting Real with the OpenAI Realtime API
I’ve had a number of generative AI projects floating in the back of my mind that require a verbal interface: a way to communicate with a model using speech. This has been possible for a while now, but the process is a bit janky. Generally you’d transcribe the audio to text, send that text to a generative AI model, receive text in return, and then synthesize speech from that text. With OpenAI’s recently released Realtime API, we can skip a few of these steps and make a true conversational AI!
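To make the "janky" version concrete, here is a minimal sketch of that cascaded loop. Everything in it is a stand-in: `CascadedPipeline` and its three methods are hypothetical names, and each stage is a stub that just passes bytes and strings around rather than calling a real transcription, chat, or text-to-speech service. The point is only to show how many separate hops a single conversational turn takes.

```python
from dataclasses import dataclass, field


@dataclass
class CascadedPipeline:
    """Hypothetical voice loop with three separate hops per turn."""

    hops: list[str] = field(default_factory=list)

    def speech_to_text(self, audio: bytes) -> str:
        # Stub: a real system would call a transcription model here.
        self.hops.append("speech_to_text")
        return audio.decode("utf-8")

    def generate_reply(self, prompt: str) -> str:
        # Stub: a real system would call a text-only generative model here.
        self.hops.append("generate_reply")
        return f"You said: {prompt}"

    def text_to_speech(self, text: str) -> bytes:
        # Stub: a real system would call a speech synthesis model here.
        self.hops.append("text_to_speech")
        return text.encode("utf-8")

    def respond(self, audio: bytes) -> bytes:
        # One conversational turn: audio in, audio out, via two text hand-offs.
        text = self.speech_to_text(audio)
        reply = self.generate_reply(text)
        return self.text_to_speech(reply)


pipeline = CascadedPipeline()
reply_audio = pipeline.respond(b"hello")
print(pipeline.hops)
```

In a real deployment each of those hops would typically be its own network request, so the delays stack up before the user hears anything back. Those intermediate text hand-offs are the steps a speech-in, speech-out interface lets us skip.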