Models Overview
Arcee AI offers models in a range of sizes to meet different deployment scenarios. Choosing the right model can help you complete tasks more efficiently, accurately, and cost-effectively.
Models
To help you find the best fit for your use case, we’ve created a table outlining the core features and strengths of each model in the Arcee AI family. Note that the API-hosted models always serve the latest update; previous versions and post-trains are available for download on Hugging Face.
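If you want to run one of those downloaded checkpoints locally rather than through the API, the sketch below shows one way to load it with the Hugging Face transformers library. The repository ID is a placeholder, not a real model name; substitute the release you actually downloaded.

```python
# Minimal sketch of running a locally downloaded checkpoint with Hugging Face
# transformers. "arcee-ai/example-model" is a placeholder repository ID; browse
# the Arcee AI organization on Hugging Face for the actual release names.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "arcee-ai/example-model"  # placeholder: substitute a real release
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

prompt = "Write a one-sentence summary of what a context window is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```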
| Feature | | | |
| --- | --- | --- | --- |
| Strength | Lightweight, ultra-low latency model. | Fast and cost-efficient model for well-defined tasks. | Coming Soon |
| Ideal Deployment | Fully local on consumer GPUs, edge servers, and mobile devices. Tuned for offline operation. | Customer-facing apps, agent backends, and high-throughput services in your cloud or VPC. | Coming Soon |
| Active Parameters | 1B per token | 3B per token | Coming Soon |
| Context Window | 128k tokens | 128k tokens | Coming Soon |
| Reasoning Support | Yes | Yes | Coming Soon |
| Knowledge Cutoff | 2024 | 2024 | Coming Soon |
| Intelligence | ⭐⭐ Efficient | ⭐⭐⭐ Good | Coming Soon |
| Speed | ⚡⚡⚡⚡ Very Fast | ⚡⚡⚡ Fast | Coming Soon |
| Max Output Tokens | 8k (non-reasoning) / 32k (reasoning) | 8k (non-reasoning) / 32k (reasoning) | Coming Soon |
| Endpoints | Chat Completion | Chat Completion | Coming Soon |
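Both available models expose a Chat Completion endpoint. As a minimal sketch, assuming the hosted API is OpenAI-compatible, a request might look like the following; the base URL, environment variable name, and model identifier here are placeholders, so check the API reference for the real values.

```python
# Minimal sketch of calling an API-hosted Arcee AI model through an
# OpenAI-compatible chat completions endpoint. The base URL, env var, and
# model identifier are placeholders; use the values from the API reference.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://example.arcee.ai/v1",  # placeholder base URL
    api_key=os.environ["ARCEE_API_KEY"],     # placeholder env var name
)

response = client.chat.completions.create(
    model="your-model-name",  # placeholder model identifier
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the benefits of small language models."},
    ],
    max_tokens=8000,  # up to 8k without reasoning; up to 32k with reasoning enabled
)

print(response.choices[0].message.content)
```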