Models Overview
Arcee AI offers models at various sizes to meet different deployment scenarios. Choosing the right model helps you complete tasks more efficiently, accurately, and cost-effectively.
Trinity Mini and Trinity Large (Preview) are currently the only models available via the API. You can try Trinity-Large-Preview on OpenRouter.
| | (Coming Soon) | Trinity Mini | Trinity Large (Preview) |
|---|---|---|---|
| **Strength** | Lightweight, ultra-low-latency model. | Fast and cost-efficient model for well-defined tasks. | Robust generalist model with strong performance across reasoning, coding, math, and complex task decomposition. |
| **Ideal Deployment** | Fully local on consumer GPUs, edge servers, and mobile devices. Tuned for offline operation. | Serves customer-facing apps, agent backends, and high-throughput services in cloud or VPC. | Advanced agents, reasoning systems, and developer tools. Deployed via hosted cloud endpoints or self-hosted in multi-GPU configurations. |
| **Active Parameters** | 1B per token | 3B per token | 13B per token |
| **Context Window** | 128k tokens | 128k tokens | 512k tokens (hosted at 128k) |
| **Knowledge Cutoff** | 2024 | 2024 | 2024 |
| **Speed** | ⚡⚡⚡⚡⚡ Instant | ⚡⚡⚡ Very Fast | ⚡⚡⚡ Very Fast |
| **API Model Name** | Coming Soon | `trinity-mini` | `trinity-large-preview` |
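Since the two API-available models trade off cost against reasoning depth, one simple pattern is to route requests by task complexity using the API model names from the table above. The sketch below builds an OpenAI-style chat-completions payload; the payload shape is an assumption for illustration, and `pick_model` is a hypothetical helper, not part of any Arcee SDK.

```python
# Hypothetical routing helper: well-defined tasks go to trinity-mini,
# complex reasoning goes to trinity-large-preview. Model names come from
# the table above; the OpenAI-style payload shape is an assumption.

def pick_model(needs_reasoning: bool) -> str:
    """Return the API model name suited to the task."""
    return "trinity-large-preview" if needs_reasoning else "trinity-mini"

def build_request(prompt: str, needs_reasoning: bool = False) -> dict:
    """Assemble a chat-completions request body for the chosen model."""
    return {
        "model": pick_model(needs_reasoning),
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Summarize this support ticket.")
print(payload["model"])  # trinity-mini
```

For example, a summarization backend could call `build_request(text)` for routine tickets and pass `needs_reasoning=True` only for multi-step planning tasks.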