# Trinity-Large-Preview

**Overview**

Trinity Large (Preview) is a 400B-parameter sparse mixture-of-experts language model that activates 13B parameters per token. It is engineered to scale model capacity while maintaining inference efficiency over long contexts, with strong performance on reasoning-heavy workloads including math, coding, and multi-step agent workflows.

**Key Features**

* **Sparse mixture-of-experts architecture:** An extremely sparse MoE design with 400B total parameters and only 13B activated per token. Top-k expert routing constrains per-token activation, enabling efficient inference at scale (see the routing sketch after this list).
* **Long-context training and utilization:** Trained at 256K sequence length with support for 512K inference (hosted at 128K), using architecture and training procedures designed to operate effectively over long inputs and extended multi-turn interactions.
* **High throughput efficiency:** Designed with inference-time efficiency as a primary objective, leveraging both extreme sparsity and optimized attention mechanisms to achieve strong throughput on modern accelerator hardware.
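
As a rough sketch of the top-k routing this design relies on, the following minimal PyTorch example selects 4 of 256 experts per token, matching the expert counts in the model summary below; the hidden size, router weights, and normalization details here are illustrative assumptions, not Trinity's actual implementation.

```python
import torch
import torch.nn.functional as F

# Illustrative sizes: 256 experts with 4 active per token, as in the
# model summary below. HIDDEN is an arbitrary assumption for this sketch.
NUM_EXPERTS, TOP_K, HIDDEN = 256, 4, 1024

def route(hidden_states: torch.Tensor, router_weight: torch.Tensor):
    """Pick TOP_K experts per token and return normalized gate weights."""
    logits = hidden_states @ router_weight        # (tokens, NUM_EXPERTS)
    gate_probs = F.softmax(logits, dim=-1)
    # Keep only the TOP_K highest-scoring experts for each token.
    weights, expert_ids = gate_probs.topk(TOP_K, dim=-1)
    # Renormalize so each token's selected gates sum to 1.
    weights = weights / weights.sum(dim=-1, keepdim=True)
    return weights, expert_ids

tokens = torch.randn(8, HIDDEN)
router_weight = torch.randn(HIDDEN, NUM_EXPERTS)
weights, expert_ids = route(tokens, router_weight)
print(expert_ids.shape)  # torch.Size([8, 4]): 4 expert indices per token
```

Because only the selected experts' weights are touched for a given token, per-token compute stays near the 13B active-parameter budget even though 400B parameters are resident.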

### Deployment Quickstart

To get started deploying Trinity Large, download the model [here](https://huggingface.co/arcee-ai) and proceed to [quick-deploys](https://docs.arcee.ai/quick-deploys).
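
As a minimal sketch, the weights can also be fetched programmatically with `huggingface_hub`; the repo id below is an assumption, so confirm the exact name on the organization page linked above.

```python
from huggingface_hub import snapshot_download

# Hypothetical repo id: check https://huggingface.co/arcee-ai for the
# exact name of the Trinity-Large-Preview repository.
local_dir = snapshot_download(repo_id="arcee-ai/Trinity-Large-Preview")
print(f"Weights downloaded to: {local_dir}")
```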

### Model Summary

| Attribute                        | Value                                 |
| -------------------------------- | ------------------------------------- |
| Name                             | Trinity-Large-Preview                 |
| Architecture                     | Mixture-of-Experts                    |
| Parameters                       | 400 Billion Total, 13 Billion Active  |
| Experts                          | 256 Experts, 4 Active                 |
| Attention Mechanism              | Grouped Query Attention (GQA)         |
| Training Tokens                  | 17 Trillion                           |
| License                          | Apache 2.0                            |
| Recommended Inference Parameters | temperature: 0.8, top\_p: 0.8         |
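
As a minimal sketch of applying the recommended sampling parameters, the snippet below calls an OpenAI-compatible endpoint such as the one a quick-deploy exposes; the base URL, API key, and model id are placeholders for your own deployment.

```python
from openai import OpenAI

# Placeholder endpoint and credentials: point these at your deployment.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="arcee-ai/Trinity-Large-Preview",  # placeholder model id
    messages=[
        {"role": "user", "content": "Explain grouped query attention in two sentences."}
    ],
    temperature=0.8,  # recommended value from the table above
    top_p=0.8,        # recommended value from the table above
)
print(response.choices[0].message.content)
```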

