# Models Overview

Arcee AI offers models at various sizes to meet different deployment scenarios. Choosing the right model can help you complete tasks more efficiently, accurately, and cost effectively.&#x20;

{% hint style="info" %}
Trinity Mini and Large (Preview) are currently the only models available via API. Try Trinity-Large-Preview on OpenRouter [here](https://openrouter.ai/arcee-ai/trinity-large-preview:free/).
{% endhint %}

<table><thead><tr><th>Model</th><th>Trinity-Nano-6B</th><th>Trinity-Mini-26B</th><th width="187.4609375">Trinity-Large-400B (Instruct, Preview)</th></tr></thead><tbody><tr><td><strong>Strength</strong></td><td>Lightweight, ultra-low latency model.</td><td>Fast and cost-efficient model for well-defined tasks.</td><td>Robust generalist model with strong performance across reasoning, coding, math, and complex task decomposition.</td></tr><tr><td><strong>Ideal Deployment</strong></td><td>Fully local on consumer GPUs, edge servers, and mobile devices. Tuned for offline operation.</td><td>Serve customer-facing apps, agent backends, and high-throughput services in cloud or VPC.</td><td>Advanced agents, reasoning systems, and developer tools. Deployed via hosted cloud endpoints or self-hosted in multi-GPU configurations.</td></tr><tr><td><strong>Active Parameters</strong></td><td>1B per token</td><td>3B per token</td><td>13B per token</td></tr><tr><td><strong>Context Window</strong></td><td>128k tokens</td><td>128k tokens</td><td>512k tokens (hosted at 128k)</td></tr><tr><td><strong>Knowledge Cutoff</strong></td><td>2024</td><td>2024</td><td>2024</td></tr><tr><td><strong>Speed</strong></td><td><span data-gb-custom-inline data-tag="emoji" data-code="26a1">⚡</span><span data-gb-custom-inline data-tag="emoji" data-code="26a1">⚡</span><span data-gb-custom-inline data-tag="emoji" data-code="26a1">⚡</span><span data-gb-custom-inline data-tag="emoji" data-code="26a1">⚡</span><span data-gb-custom-inline data-tag="emoji" data-code="26a1">⚡</span><br>Instant</td><td><span data-gb-custom-inline data-tag="emoji" data-code="26a1">⚡</span><span data-gb-custom-inline data-tag="emoji" data-code="26a1">⚡</span><span data-gb-custom-inline data-tag="emoji" data-code="26a1">⚡</span><br>Very Fast</td><td><span data-gb-custom-inline data-tag="emoji" data-code="26a1">⚡</span><span data-gb-custom-inline data-tag="emoji" data-code="26a1">⚡</span><span data-gb-custom-inline data-tag="emoji" data-code="26a1">⚡</span><br>Very Fast</td></tr><tr><td><strong>API Model Name</strong></td><td>Coming Soon</td><td><strong>trinity-mini</strong></td><td><strong>trinity-large-preview</strong></td></tr></tbody></table>

> Robust generalist model with strong performance across reasoning, coding, math, and **complex task decomposition**.
