Models Overview

Arcee AI offers models in several sizes to suit different deployment scenarios. Choosing the right model helps you complete tasks more efficiently, accurately, and cost-effectively.

Trinity Mini and Trinity Large (Preview) are currently the only models available via API. Trinity-Large-Preview can also be tried on OpenRouter.

| Model | Trinity-Nano-6B | Trinity-Mini-26B | Trinity-Large-400B (Instruct, Preview) |
| --- | --- | --- | --- |
| Strength | Lightweight, ultra-low latency model. | Fast and cost-efficient model for well-defined tasks. | Robust generalist model with strong performance across reasoning, coding, math, and complex task decomposition. |
| Ideal Deployment | Fully local on consumer GPUs, edge servers, and mobile devices. Tuned for offline operation. | Serve customer-facing apps, agent backends, and high-throughput services in cloud or VPC. | Advanced agents, reasoning systems, and developer tools. Deployed via hosted cloud endpoints or self-hosted in multi-GPU configurations. |
| Active Parameters | 1B per token | 3B per token | 13B per token |
| Context Window | 128k tokens | 128k tokens | 512k tokens (hosted at 128k) |
| Knowledge Cutoff | 2024 | 2024 | 2024 |
| Speed | Instant | Very Fast | Very Fast |
| API Model Name | Coming Soon | trinity-mini | trinity-large-preview |
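
As a rough illustration of how the comparison above might be used when choosing a model in code, the sketch below encodes the table as data and picks an API model name for a request. The `ModelSpec` class and the `pick_api_model` helper are hypothetical names introduced here and are not part of any Arcee AI SDK. The sketch assumes that "128k" means 128,000 tokens, that the hosted context limit is the one that applies to API calls, and that fewer active parameters per token roughly means lower cost.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    """One column of the comparison table (hypothetical helper type)."""
    name: str
    active_params_b: int          # active parameters per token, in billions
    hosted_context_tokens: int    # assumption: "128k" is read as 128,000 tokens
    api_name: Optional[str]       # None means "Coming Soon" (not yet on the API)


# Values copied from the table; Trinity Large supports 512k but is hosted at 128k.
MODELS = [
    ModelSpec("Trinity-Nano-6B", 1, 128_000, None),
    ModelSpec("Trinity-Mini-26B", 3, 128_000, "trinity-mini"),
    ModelSpec("Trinity-Large-400B", 13, 128_000, "trinity-large-preview"),
]


def pick_api_model(prompt_tokens: int, complex_task: bool = False) -> str:
    """Return an API model name whose hosted context fits the prompt.

    Assumption: well-defined tasks go to the model with the fewest active
    parameters, and complex reasoning or coding tasks go to the largest one.
    """
    candidates = [
        m for m in MODELS
        if m.api_name is not None and prompt_tokens <= m.hosted_context_tokens
    ]
    if not candidates:
        raise ValueError("no API-available model fits a prompt of this size")
    by_size = sorted(candidates, key=lambda m: m.active_params_b)
    chosen = by_size[-1] if complex_task else by_size[0]
    return chosen.api_name


if __name__ == "__main__":
    print(pick_api_model(2_000))
    print(pick_api_model(2_000, complex_task=True))
```

Under these assumptions, a short, well-defined request resolves to `trinity-mini`, while a request flagged as complex resolves to `trinity-large-preview`. Trinity-Nano-6B is never returned because it has no API model name yet.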
