Trying the OpenAI Assistants API

If you have ever tried to build an AI assistant, you know that it is not a simple task. In almost all cases, your assistant needs access to external knowledge such as documents or APIs. You might even want to provide your assistant with a code sandbox to solve user queries with code. When your assistant is accessed via a chat application, you also have to implement chat history.

Although there are several frameworks like LangChain and Semantic Kernel that can help, OpenAI recently released the Assistants API. It is OpenAI's own API, tied to their models. Its primitives are Assistants, Threads, and Runs. Let’s start by creating an assistant.

Note: this post contains code snippets in Python. You can find the full example in this gist: https://gist.github.com/gbaeke/e6e88c0dc68af3aa4a89b1228012ae53

Note: although I expect this API to become available in Azure OpenAI, I am not sure how fast that will happen, if at all. So for now, try it out at OpenAI directly. It is still in beta!

Creating an assistant

You can create an assistant using the portal or from code. An assistant has several parameters:

  • Instructions: how should the assistant behave or respond; think of it as the system message
  • Model: use any supported model, including fine-tuned models; to support retrieval from documents, you need the 1106 version of gpt-3.5-turbo/gpt-4
  • Tools: currently, the API supports Code Interpreter and Retrieval; these are fully hosted by OpenAI
  • Functions: define custom functions to call to integrate with external APIs for instance

Note that the retrieval tool supports uploaded files. There is no need for your own search solution (e.g., vector database with support for vector search, hybrid search, etc…). This is great in simpler scenarios where a full-fledged search system is not required. More control over retrieval will come later.

In this post, we will focus on an assistant that uses Code Interpreter. You can simply create the assistant in the portal. You can see the instructions, model, tools and files:

Assistant with only the Code interpreter tool using the latest gpt-4 model

To create this assistant, make sure you have an account at https://platform.openai.com. Create the assistant from the Assistants section:

Creating an assistant

Assistants have an id. For example, my assistant has this id: asst_VljToh6vQ1Mbu6Ct5L6qgpfy. I can use this id in my code to start creating threads.

Before talking about threads, let’s look at creating the assistant with code:
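Here is a minimal sketch (the exact code in the gist may differ; the name and instructions below are placeholders):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

assistant = client.beta.assistants.create(
    name="Code Assistant",  # placeholder name
    instructions="You answer questions by writing and running Python code.",
    model="gpt-4-1106-preview",
    tools=[{"type": "code_interpreter"}],
)
print(assistant.id)  # e.g. asst_...
```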

To run this code, make sure you use the most recent version of the openai package (>=1.2). Note that if you run this code multiple times, you will create a new assistant on each run. You should save the assistant id after creation and only run the creation code when you do not already have an id.

Above, we create an assistant with one tool: code interpreter.

Threads

After creating an assistant, you can create threads. Although somewhat unintuitive, a thread is not associated with an assistant: threads exist on their own. After a thread is created, you can add messages to it, for instance a user message:
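A sketch of creating a thread and adding a message (the question is just an example):

```python
# threads are independent of assistants; they only hold messages
thread = client.beta.threads.create()

client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Plot y=x^3 + 2x",  # example user question
)
```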

To get a completion from the assistant for our thread, we need to create a run. The run tells the assistant to look at the messages in the thread and provide a response.

Runs

Below, we create the run:
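Along these lines, reusing the client, thread and assistant from before:

```python
# the run ties the thread to the assistant that should answer it
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,  # or a saved id such as "asst_..."
)
```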

Above, both the thread_id and assistant_id are passed to the run, tying both together. If you did not create the assistant in your code, ensure you pass the id of a valid assistant created in your OpenAI account. Note that the run can be passed extra instructions. You can also override the model and tools that the assistant uses.

Creating a run is an asynchronous operation. It returns the metadata of the run immediately. The metadata includes fields like the run’s id, the created_at date and more.

You will need to manually check the run’s status in your code. For example:
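A simple polling loop could look like this (a sketch; the one-second interval is arbitrary, and function calling via the requires_action status is not handled here):

```python
import time

# queued/in_progress are the non-terminal statuses we wait out
while run.status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
```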

When the run is finished, we can retrieve messages:
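Retrieving the messages is a single call:

```python
# returns the thread's messages, newest first
messages = client.beta.threads.messages.list(thread_id=thread.id)
for message in messages.data:
    print(message.role, message.content)
```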

The messages data field contains all messages. Each message has a role like user or assistant. Assistant messages can have different content, like text or image_file.

For example, if I ask “Plot y=x^3 + 2x”, there will be both text and image_file responses. It’s up to the developer to properly display them in the app. Below is a naive approach, which only works with text and image responses, not downloads (Code Interpreter can give download links):
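A sketch of that display loop in Streamlit (get_content is shown further below; the list endpoint returns the newest message first, so we reverse for chat order):

```python
from io import BytesIO
from PIL import Image
import streamlit as st

for message in reversed(messages.data):  # oldest message first
    for content in message.content:
        if content.type == "text":
            st.markdown(content.text.value)
        elif content.type == "image_file":
            file_bytes = get_content(content.image_file.file_id)  # defined below
            st.image(Image.open(BytesIO(file_bytes)))
```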

The above should be pretty clear:

  • if the assistant responds with text, display the text
  • if the assistant responds with an image, there is an image file id; I use a get_content function to download the image from OpenAI; get_content also implements some straightforward caching logic to avoid downloading the same images over and over again in the same thread

The get_content function uses client.files.content(file_id).response.content to retrieve the file (client is the OpenAI client). The returned result can be used by PIL to open the image and subsequently display it with Streamlit’s st.image:
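One possible implementation; the caching below uses Streamlit’s st.cache_data, which is an assumption on my part (the gist’s caching logic may differ):

```python
import streamlit as st

@st.cache_data  # cache per file_id so reruns don't re-download the image
def get_content(file_id: str) -> bytes:
    return client.files.content(file_id).response.content
```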

Assistant in a Streamlit app

Note that I can keep asking questions, which adds messages to the same thread, based on the thread’s id in Streamlit’s session state. When the user refreshes the browser, session state is cleared and a new thread is started. For example, when I ask to change 2x into 3x:

Asking to change the function

In the code, I do not have to worry about chat history at all. I just add messages to the thread, which is managed by OpenAI. At the next run, all those messages are sent to the assistant’s model, which responds appropriately. Note that you do pay for the tokens that all those messages consume.

Conclusion

Compared to the synchronous and stateless ChatCompletion API, the Assistants API is asynchronous and stateful. As a developer, you create an assistant with tools, functions and content for retrieval purposes. Interacting with the assistant is easy: simply add messages to a thread and create a run.

Obviously, it is early days for this API as it is still in beta. Personally, I think it’s a great step forward, making it easier to create quite sophisticated assistants. Most orchestration frameworks and AI tools like LangChain, Semantic Kernel, Flowise, etc. either already support assistants or will soon, adding extra capabilities or ease of use on top of the base functionality.
