Models Overview

Arcee AI offers models in several sizes to suit different deployment scenarios. Choosing the right model helps you complete tasks more efficiently, accurately, and cost-effectively.

Models

To help you find the best fit for your use case, we’ve created a table outlining the core features and strengths of each model in the Arcee AI family. Note that API-hosted models always run the latest version. Previous versions and post-trains are available for download on Hugging Face.

| Model | Trinity-Nano-6B | Trinity-Mini-26B | Trinity [Coming Soon] |
| --- | --- | --- | --- |
| Strength | Lightweight, ultra-low-latency model. | Fast and cost-efficient model for well-defined tasks. | Coming Soon |
| Ideal Deployment | Fully local on consumer GPUs, edge servers, and mobile devices. Tuned for offline operation. | Serve customer-facing apps, agent backends, and high-throughput services in your cloud or VPC. | Coming Soon |
| Active Parameters | 1B per token | 3B per token | Coming Soon |
| Context Window | 128k tokens | 128k tokens | Coming Soon |
| Reasoning Support | Yes | Yes | Coming Soon |
| Knowledge Cutoff | 2024 | 2024 | Coming Soon |
| Intelligence | Efficient | Good | Coming Soon |
| Speed | Very Fast | Fast | Coming Soon |
| Max Output Tokens | Non-reasoning: 8k; reasoning: 32k | Non-reasoning: 8k; reasoning: 32k | Coming Soon |
| Endpoints | Chat Completion | Chat Completion | Coming Soon |
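The output limits listed above differ between reasoning and non-reasoning modes, so client code may want to cap a requested output length before sending a request. The following is a minimal sketch, not an official Arcee AI client: `ModelSpec`, `MODEL_SPECS`, and `clamp_max_tokens` are hypothetical names, the dictionary keys reuse the display names from this page (the identifiers accepted by the API may differ), and "k" is read conservatively as 1,000 tokens, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Limits transcribed from the comparison on this page (hypothetical type)."""
    active_params: str
    context_window: int
    max_output_non_reasoning: int
    max_output_reasoning: int


# Assumption: "8k", "32k", and "128k" are read as multiples of 1,000 tokens.
# Assumption: keys are display names, not necessarily API model identifiers.
MODEL_SPECS = {
    "Trinity-Nano-6B": ModelSpec("1B", 128_000, 8_000, 32_000),
    "Trinity-Mini-26B": ModelSpec("3B", 128_000, 8_000, 32_000),
}


def clamp_max_tokens(model: str, requested: int, reasoning: bool = False) -> int:
    """Return the requested output length, capped at the model's listed limit."""
    if requested < 1:
        raise ValueError("requested must be a positive integer")
    spec = MODEL_SPECS[model]  # raises KeyError for models not listed yet
    limit = spec.max_output_reasoning if reasoning else spec.max_output_non_reasoning
    return min(requested, limit)


if __name__ == "__main__":
    # A 50,000-token request in reasoning mode is capped at the listed 32k limit.
    print(clamp_max_tokens("Trinity-Mini-26B", 50_000, reasoning=True))
    # A 2,000-token request is under the non-reasoning limit and passes through.
    print(clamp_max_tokens("Trinity-Nano-6B", 2_000))
```

Keeping the limits in one lookup table means a new entry can be added when the third model's figures are published, without touching the clamping logic.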
