
Multi-Turn Conversations

Multi-turn conversations let a model keep context from earlier messages in a conversation, which gives a more in-depth experience. This guide shows how to use Arcee AI models through the Arcee Platform for multi-turn conversations.

The Arcee AI /chat/completions API is stateless: the server does not store the context of the user's earlier requests. The user must therefore concatenate the full conversation history and pass it to the chat API with each request.

The following Python code shows how to append each turn to the message list to achieve a multi-turn conversation.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_ARCEE_API_KEY",  # replace with your own key; do not commit real keys
    base_url="https://api.arcee.ai/api/v1"
)

# Round 1
messages = [{"role": "user", "content": "What is a small language model?"}]
response = client.chat.completions.create(
    model="arcee-ai/trinity-mini-thinking",
    messages=messages
)

# Save the model's reply in the history so the next request includes it
answer = {"role": "assistant", "content": response.choices[0].message.content}
messages.append(answer)
print(f"Messages Round 1: {messages}")

# Round 2
messages.append({"role": "user", "content": "How do they differ from LLMs?"})
response = client.chat.completions.create(
    model="arcee-ai/trinity-mini-thinking",
    messages=messages
)

answer = {"role": "assistant", "content": response.choices[0].message.content}
messages.append(answer)
print(f"Messages Round 2: {messages}")
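Both rounds above follow the same pattern: append the user message, send the whole list, append the reply. As a minimal sketch of that pattern, the helper below wraps one turn. The names `ask`, `send`, and `fake_send` are hypothetical and not part of the Arcee or OpenAI libraries; `fake_send` stands in for the API call so the sketch runs offline.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def ask(send: Callable[[List[Message]], str], history: List[Message], question: str) -> str:
    """Append the question, send the full history, and record the reply."""
    history.append({"role": "user", "content": question})
    reply = send(history)  # the whole history goes out on every call
    history.append({"role": "assistant", "content": reply})
    return reply


def fake_send(messages: List[Message]) -> str:
    # Hypothetical stand-in for the API call so the sketch runs without a network.
    return f"(reply to: {messages[-1]['content']})"


history: List[Message] = []
ask(fake_send, history, "What is a small language model?")
ask(fake_send, history, "How do they differ from LLMs?")
print([m["role"] for m in history])
```

To call the real API instead, `send` could be a function that passes its argument as `messages` to `client.chat.completions.create(...)` with the model used above and returns `response.choices[0].message.content`.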

In the first round of the request, the messages passed to the API are:
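Going by the Round 1 code above, this is a list with a single user message. A small sketch, runnable without an API call, that builds and prints it:

```python
import json

# The same list the Round 1 code builds before its first API call.
messages = [{"role": "user", "content": "What is a small language model?"}]
print(json.dumps(messages, indent=2))
```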

In the second round of the request:

  1. Add the model's output from the first round to the end of the messages.

  2. Add the new question to the end of the messages.

The messages ultimately passed to the API are:
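Following the two steps above, the second request carries three messages in order: user, assistant, user. The sketch below rebuilds that list offline. The assistant text is a placeholder chosen for illustration, since the actual reply depends on the model and varies per call.

```python
import json

# Placeholder for the model's Round 1 reply; the real text varies per call.
first_answer = "<model's answer from round 1>"

messages = [{"role": "user", "content": "What is a small language model?"}]
# Step 1: add the model's output from the first round.
messages.append({"role": "assistant", "content": first_answer})
# Step 2: add the new question.
messages.append({"role": "user", "content": "How do they differ from LLMs?"})
print(json.dumps(messages, indent=2))
```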
