ElevenLabs

ElevenLabs Agents is a conversational voice agent platform that combines automatic speech recognition (ASR), a pluggable language model, human-like TTS, and a turn-taking engine into a complete voice stack.

This tutorial will guide you through how to integrate Arcee AI models as the language model for your ElevenLabs agent. The first section shows how to use our models through Together.ai, and the second covers a self-hosted option.


Using Arcee Models with ElevenLabs Agents via Together.ai

Step 1: Create a Together.ai API Key

  1. Log in to your Together.ai account and click “Create API Key”

  2. Copy the key and store it securely

Step 2: Connect an Arcee model to Your ElevenLabs Agent

  1. In the ElevenLabs dashboard, go to Settings → Workspace Secrets

  2. Click “Add a Secret”

    • Name: together-ai-api-key

    • Value: Paste your Together AI API key

  3. Click “Add a Secret” to save it to your workspace

  4. Go to the Agents tab from the left pane

  5. Select your existing agent or create a new one

  6. Scroll to the LLM section

  7. Beside "Select which provider and model to use for the LLM", select “Custom LLM”

  8. Fill in the fields: the server URL, the model ID of the Arcee model you want to use, and the together-ai-api-key secret for the API key

  9. Click Save to apply the agent configuration
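Before wiring these values into the agent, you can verify them by calling Together's OpenAI-compatible endpoint directly. The base URL below is Together's standard OpenAI-compatible endpoint; the API key and model ID are placeholders — substitute the exact Arcee model ID listed in your Together.ai dashboard.

```shell
# Placeholder values: replace with your own key and the exact Arcee model ID
# from your Together.ai dashboard.
export TOGETHER_API_KEY="sk-..."
BASE_URL="https://api.together.xyz/v1"

# Build an OpenAI-style chat-completions request body.
cat > request.json <<'EOF'
{
  "model": "arcee-ai/AFM-4.5B",
  "messages": [{"role": "user", "content": "Say hello in one sentence."}],
  "max_tokens": 64
}
EOF

# Send it to the OpenAI-compatible endpoint (requires a valid key):
# curl -s "$BASE_URL/chat/completions" \
#      -H "Authorization: Bearer $TOGETHER_API_KEY" \
#      -H "Content-Type: application/json" \
#      -d @request.json
```

If the response contains a `choices` array with a message from the model, the same URL, model ID, and key will work in the agent configuration.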

Step 3: Test the Agent

Click "Test AI Agent" in the ElevenLabs dashboard to chat with the model.


Using a Self-Hosted Model with ElevenLabs Agents

This section explains how to use one of our models as the LLM backbone for your agent by self-hosting it on your own infrastructure. We will use AFM-4.5B in this example.

Deploy the model

Refer to our Quick Deploys section and our Hardware Prerequisites page to select a deployment method based on your use case and hardware. In this example, we'll use llama.cpp:

Launch the OpenAI-Compatible Server

Start llama-server with the correct model and context size. This will expose an OpenAI-compatible /v1/chat/completions endpoint:

bin/llama-server -m ./afm/AFM-4.5B-bf16.gguf \
  --host 0.0.0.0 \
  --port 8000 \
  --jinja \
  --ctx-size 8192

Make sure the --jinja flag is included: it enables the model's chat template, which the OpenAI-compatible chat endpoint needs to format prompts correctly.
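Before exposing the server, you can sanity-check it locally (with the server from the previous step still running). Both paths are part of llama-server's OpenAI-compatible API; the model name in the body is a placeholder, since llama-server serves whichever model it loaded:

```shell
# List the loaded model:
curl -s http://localhost:8000/v1/models

# Request a chat completion:
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "afm-4.5b",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 32
      }'
```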

Expose the Server with ngrok (Required)

To make your server accessible, create a public URL using a tunneling tool like ngrok:

ngrok http 8000

This will generate a public HTTPS URL like:

https://your-subdomain.ngrok-free.dev → http://localhost:8000

Keep this ngrok tunnel open while the agent is active.
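To confirm the tunnel works end to end, repeat the check against the public URL (the subdomain below is the placeholder from above — use the one ngrok printed for you):

```shell
curl -s https://your-subdomain.ngrok-free.dev/v1/models
```

If this returns the same response as the localhost check, ElevenLabs will be able to reach your server.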

Configure ElevenLabs Agent to Use Your Self-Hosted Model

Configure your agent

  1. Go to the Agents tab and open your agent

  2. In the Model Configuration section, enter the ngrok URL with "/v1" appended as the server URL, enter a placeholder model ID, and select "None" for the API key
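These fields map directly onto the request the agent will send, which you can simulate yourself. The model ID is a placeholder (llama-server serves the loaded GGUF regardless of the name sent), and no Authorization header is used because the API key is set to "None":

```shell
# Server URL -> ngrok URL + "/v1"
# Model ID   -> placeholder
# API key    -> None (no Authorization header)
curl -s https://your-subdomain.ngrok-free.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "afm-4.5b", "messages": [{"role": "user", "content": "Hi!"}]}'
```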

Test the Agent

Click "Test AI Agent" in the ElevenLabs dashboard to chat with the model.
