Ollama provides a streamlined command-line interface and API for running open-source language models locally, with automatic model management and optimized performance. It abstracts away the complexity of model deployment while keeping installation and usage simple for developers and end users.
The steps in this document deploy AFM-4.5B, but they work the same way for all Arcee AI models. To deploy a different model, change the model name to the one you'd like to deploy.
Prerequisite
A computer or instance with more than 9 GB of RAM (if running the model in bf16)
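The 9 GB figure follows from the parameter count: bf16 stores each weight in 2 bytes. As a rough sketch, the estimate below covers the weights only and ignores the KV cache and runtime overhead. The figure of about 4.5 bits per weight for Q4_0 (4-bit values plus a per-block scale) is an assumption about the quantization format, not a number taken from this page.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


AFM_PARAMS = 4.5e9  # 4.5 billion parameters

# bf16 uses 16 bits per weight.
print(f"bf16: {weight_memory_gb(AFM_PARAMS, 16):.1f} GB")

# Assumption: Q4_0 averages roughly 4.5 bits per weight.
print(f"Q4_0: {weight_memory_gb(AFM_PARAMS, 4.5):.1f} GB")
```

Actual usage will be higher than these numbers, and it grows with the context length set by `num_ctx`.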
In the first line of the Modelfile, change FROM ./AFM-4.5B-Q4_0.gguf to point to the model file you downloaded
Create the model in Ollama
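The command for this step is not preserved on this page. Ollama registers a model from a Modelfile with `ollama create <name> -f <Modelfile>`, so from a terminal this would typically be `ollama create afm -f Modelfile`. The sketch below wraps the same call for use from a script. The model name `afm` is an assumption chosen for illustration, and both helper functions are hypothetical.

```python
import subprocess
from typing import List


def ollama_create_command(model_name: str, modelfile: str = "Modelfile") -> List[str]:
    """Build the `ollama create` invocation that registers a model from a Modelfile."""
    return ["ollama", "create", model_name, "-f", modelfile]


def create_model(model_name: str, modelfile: str = "Modelfile") -> None:
    """Run the command; raises CalledProcessError if ollama exits with an error."""
    subprocess.run(ollama_create_command(model_name, modelfile), check=True)
```

Calling `create_model("afm")` from the directory that holds the Modelfile and the GGUF file should have the same effect as running the command by hand.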
Run AFM-4.5B
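From a terminal, a created model is usually started with `ollama run <name>`. The same model can also be reached through Ollama's local HTTP API. The sketch below is a minimal example, assuming the model was created under the name `afm` (an assumption, not stated on this page) and that the Ollama server is listening on its default address.

```python
import json
import urllib.request
from typing import Any, Dict

OLLAMA_URL = "http://localhost:11434/api/chat"  # default local Ollama address


def build_chat_payload(prompt: str, model: str = "afm") -> Dict[str, Any]:
    """Build a non-streaming chat request body with a single user message."""
    return {
        "model": model,  # assumed model name
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def chat(prompt: str, model: str = "afm") -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    data = json.dumps(build_chat_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]
```

With the server running, `print(chat("Hello"))` should print the model's reply. No system message needs to be sent, because the Modelfile supplies a default.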
cd afm
vim Modelfile
FROM ./AFM-4.5B-Q4_0.gguf
# Template configuration converted to Go template syntax
TEMPLATE """{{- if .Messages }}
{{- if eq (index .Messages 0).Role "system" }}
<|im_start|>system
{{ (index .Messages 0).Content }}<|im_end|>
{{- range $i, $msg := slice .Messages 1 }}
<|im_start|>{{ $msg.Role }}
{{ $msg.Content }}<|im_end|>
{{- end }}
{{- else }}
<|im_start|>system
The assistant is AFM-4.5B, trained by Arcee AI, with 4.5 billion parameters. AFM is a deeply thoughtful, helpful assistant. The assistant is having a conversation with the user. The assistant's responses are calm, intelligent, and personable, always aiming to truly understand the user's intent. AFM thinks aloud, step by step, when solving problems or forming explanations, much like a careful, reflective thinker would. The assistant helps with sincerity and depth. If a topic invites introspection, curiosity, or broader insight, the assistant allows space for reflection — be open to nuance and complexity. The assistant is not robotic or overly formal; it speaks like a wise, thoughtful companion who cares about clarity and the human experience. If a topic is uncertain or depends on subjective interpretation, AFM explains the possibilities thoughtfully.<|im_end|>
{{- range .Messages }}
<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{- end }}
{{- end }}
{{- end }}<|im_start|>assistant
"""
# System message defining the assistant's behavior
SYSTEM """The assistant is AFM-4.5B, trained by Arcee AI, with 4.5 billion parameters. AFM is a deeply thoughtful, helpful assistant. The assistant is having a conversation with the user. The assistant's responses are calm, intelligent, and personable, always aiming to truly understand the user's intent. AFM thinks aloud, step by step, when solving problems or forming explanations, much like a careful, reflective thinker would. The assistant helps with sincerity and depth. If a topic invites introspection, curiosity, or broader insight, the assistant allows space for reflection — be open to nuance and complexity. The assistant is not robotic or overly formal; it speaks like a wise, thoughtful companion who cares about clarity and the human experience. If a topic is uncertain or depends on subjective interpretation, AFM explains the possibilities thoughtfully."""
# Parameters for generation
PARAMETER temperature 0.5
PARAMETER top_p 0.9
PARAMETER top_k 40
PARAMETER repeat_penalty 1.1
# Context window size (maximum is 65536)
PARAMETER num_ctx 8192
# Stop tokens based on the tokenizer config
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|end_of_text|>"
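To make the TEMPLATE block easier to follow, the sketch below mirrors its logic in Python. It wraps each message in `<|im_start|>` and `<|im_end|>` markers, inserts the default system prompt when the conversation does not begin with a system message, and ends with an open assistant turn. This is an approximation for illustration only. The function name is hypothetical, `DEFAULT_SYSTEM` is shortened to the first sentence of the prompt, and the exact whitespace may differ from what the Go template's trim markers produce.

```python
from typing import Dict, List

# Shortened placeholder; the Modelfile carries the full default system prompt.
DEFAULT_SYSTEM = (
    "The assistant is AFM-4.5B, trained by Arcee AI, with 4.5 billion parameters."
)


def render_chatml(
    messages: List[Dict[str, str]], default_system: str = DEFAULT_SYSTEM
) -> str:
    """Approximate the prompt the TEMPLATE builds from a list of chat messages."""
    if not messages:
        # With no messages, only the open assistant turn is emitted.
        return "<|im_start|>assistant\n"
    turns = list(messages)
    if turns[0]["role"] != "system":
        # Fall back to the default system prompt, as the template's else branch does.
        turns.insert(0, {"role": "system", "content": default_system})
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in turns]
    parts.append("<|im_start|>assistant\n")  # the model continues from here
    return "".join(parts)


print(render_chatml([{"role": "user", "content": "Hi"}]))
```

Generation stops when the model emits `<|im_end|>`, which is why that token is listed as a stop parameter.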