Models Overview

Arcee AI offers models in several sizes to suit different deployment scenarios. Choosing the right model helps you complete tasks more efficiently, accurately, and cost-effectively.

Models

To help you find the best fit for your use case, we’ve created a table outlining the core features and strengths of each model in the Arcee AI family. Note that API-hosted models always run the latest update. Previous versions and post-trains are available for download on Hugging Face.

| Model | Trinity-Nano-6B | Trinity-Mini-26B | Trinity [Coming Soon] |
| --- | --- | --- | --- |
| Strength | Lightweight, ultra-low latency model. | Fast and cost-efficient model for well-defined tasks. | Coming Soon |
| Ideal Deployment | Fully local on consumer GPUs, edge servers, and mobile devices. Tuned for offline operation. | Serve customer-facing apps, agent backends, and high-throughput services in your cloud or VPC. | Coming Soon |
| Active Parameters | 1B per token | 3B per token | Coming Soon |
| Context Window | 128k tokens | 128k tokens | Coming Soon |
| Reasoning Support | Yes | Yes | Coming Soon |
| Knowledge Cutoff | 2024 | 2024 | Coming Soon |
| Intelligence | ⭐⭐ Efficient | ⭐⭐⭐ Good | Coming Soon |
| Speed | ⚡⚡⚡⚡ Very Fast | ⚡⚡⚡ Fast | Coming Soon |
| Max Output Tokens | Non-reasoning: 8k; reasoning: 32k | Non-reasoning: 8k; reasoning: 32k | Coming Soon |
| Endpoints | Chat Completion | Chat Completion | Coming Soon |
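
The limits in the table can be captured in a small lookup so that client code never requests more output tokens than a model allows. The following is a minimal sketch, not an official client: `ModelSpec`, `MODELS`, `max_output_tokens`, and `clamp_max_tokens` are hypothetical names, the dictionary keys are the display names from the table rather than confirmed API model identifiers, "k" is assumed to mean 1,000 tokens, and the shared Context Window value is assumed to apply to both released models.

```python
from dataclasses import dataclass

# Assumption: "k" in the table means 1,000 tokens. The real limits may be
# powers of two (e.g. 8,192), so these values are deliberately conservative.
K = 1_000


@dataclass(frozen=True)
class ModelSpec:
    """Limits for one model, as listed in the overview table."""
    active_params: str
    context_window: int
    max_output_non_reasoning: int
    max_output_reasoning: int


# Hypothetical lookup keyed by the table's display names (not API model IDs).
# Trinity is omitted because its values are listed as "Coming Soon".
MODELS = {
    "Trinity-Nano-6B": ModelSpec("1B per token", 128 * K, 8 * K, 32 * K),
    "Trinity-Mini-26B": ModelSpec("3B per token", 128 * K, 8 * K, 32 * K),
}


def max_output_tokens(model: str, reasoning: bool) -> int:
    """Return the output-token ceiling for the given mode."""
    spec = MODELS[model]
    if reasoning:
        return spec.max_output_reasoning
    return spec.max_output_non_reasoning


def clamp_max_tokens(model: str, requested: int, reasoning: bool = False) -> int:
    """Clamp a requested output budget to the model's listed ceiling."""
    if requested < 1:
        raise ValueError("requested must be a positive token count")
    return min(requested, max_output_tokens(model, reasoning))
```

For example, `clamp_max_tokens("Trinity-Mini-26B", 50_000, reasoning=True)` caps the request at the reasoning ceiling, while a 4,000-token non-reasoning request passes through unchanged.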


