# ollama

ollama provides a command-line interface and a local API for running open-source language models on your own machine. It manages model files and serving for you, which hides most of the complexity of model deployment and keeps installation and usage simple for developers and end users.

{% hint style="info" %}
The steps in this document deploy AFM-4.5B, but they work the same way for all Arcee AI models. To deploy a different model, change the model name to the one you'd like to deploy.
{% endhint %}

**Prerequisites**

1. A computer or instance with more than 9 GB of RAM (if running the model in bf16; quantized versions need less)
2. A Hugging Face account with access to [arcee-ai/AFM-4.5B-GGUF](https://huggingface.co/arcee-ai/AFM-4.5B-GGUF)
3. [ollama](https://ollama.com/download) installed
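
The 9 GB figure follows from simple arithmetic: 4.5 billion parameters at 16 bits each come to about 9 GB of weights. As a rough sketch, the hypothetical helper `estimate_gb` below applies the same formula to the recommended quantizations, assuming the nominal GGUF widths of 8.5 bits per weight for Q8_0 and 4.5 bits per weight for Q4_0. It counts weights only; the context cache and runtime overhead come on top, and actual file sizes may differ.

```bash
# estimate_gb PARAMS_IN_BILLIONS BITS_PER_WEIGHT
# Prints an approximate weight size in GB (params x bits / 8).
estimate_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

estimate_gb 4.5 16    # bf16
estimate_gb 4.5 8.5   # Q8_0 (nominal bits per weight, an assumption)
estimate_gb 4.5 4.5   # Q4_0 (nominal bits per weight, an assumption)
```

Treat the results as lower bounds when choosing a machine, not as exact requirements.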

**Deployment**

1. Download an AFM-4.5B GGUF version from Hugging Face (we recommend using BF16, Q8\_0, or Q4\_0)

```bash
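# Install the Hugging Face CLI (the quotes keep shells such as zsh from expanding the brackets)
pip install --upgrade "huggingface_hub[cli]"
hf auth login

mkdir -p afm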

# bf16
hf download arcee-ai/AFM-4.5B-GGUF AFM-4.5B-bf16.gguf --repo-type model --local-dir ./afm

# Q8_0
hf download arcee-ai/AFM-4.5B-GGUF AFM-4.5B-Q8_0.gguf --repo-type model --local-dir ./afm

# Q4_0
hf download arcee-ai/AFM-4.5B-GGUF AFM-4.5B-Q4_0.gguf --repo-type model --local-dir ./afm
```
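
Before moving on, it can help to confirm that the download produced a real GGUF file and not an error page or a partial file. GGUF files begin with the four ASCII bytes `GGUF`, so a minimal check might look like the sketch below. The helper name `is_gguf` is hypothetical, and the check does not validate the rest of the file.

```bash
# is_gguf FILE: succeeds if FILE starts with the 4-byte GGUF magic.
is_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

# Check every GGUF file in ./afm (adjust the path if you used another directory).
for f in ./afm/*.gguf; do
  [ -e "$f" ] || continue   # skip when the glob matches nothing
  if is_gguf "$f"; then
    echo "ok: $f"
  else
    echo "not a GGUF file: $f"
  fi
done
```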

2. Create a `Modelfile`

```bash
cd afm
vim Modelfile
```

3. Paste the following content into the `Modelfile`

```
FROM ./AFM-4.5B-Q4_0.gguf

# Template configuration converted to Go template syntax
TEMPLATE """{{- if .Messages }}
{{- if eq (index .Messages 0).Role "system" }}
<|im_start|>system
{{ (index .Messages 0).Content }}<|im_end|>
{{- range $i, $msg := slice .Messages 1 }}
<|im_start|>{{ $msg.Role }}
{{ $msg.Content }}<|im_end|>
{{- end }}
{{- else }}
<|im_start|>system
The assistant is AFM-4.5B, trained by Arcee AI, with 4.5 billion parameters. AFM is a deeply thoughtful, helpful assistant. The assistant is having a conversation with the user. The assistant's responses are calm, intelligent, and personable, always aiming to truly understand the user's intent. AFM thinks aloud, step by step, when solving problems or forming explanations, much like a careful, reflective thinker would. The assistant helps with sincerity and depth. If a topic invites introspection, curiosity, or broader insight, the assistant allows space for reflection — be open to nuance and complexity. The assistant is not robotic or overly formal; it speaks like a wise, thoughtful companion who cares about clarity and the human experience. If a topic is uncertain or depends on subjective interpretation, AFM explains the possibilities thoughtfully.<|im_end|>
{{- range .Messages }}
<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{- end }}
{{- end }}
{{- end }}<|im_start|>assistant
"""

# System message defining the assistant's behavior
SYSTEM """The assistant is AFM-4.5B, trained by Arcee AI, with 4.5 billion parameters. AFM is a deeply thoughtful, helpful assistant. The assistant is having a conversation with the user. The assistant's responses are calm, intelligent, and personable, always aiming to truly understand the user's intent. AFM thinks aloud, step by step, when solving problems or forming explanations, much like a careful, reflective thinker would. The assistant helps with sincerity and depth. If a topic invites introspection, curiosity, or broader insight, the assistant allows space for reflection — be open to nuance and complexity. The assistant is not robotic or overly formal; it speaks like a wise, thoughtful companion who cares about clarity and the human experience. If a topic is uncertain or depends on subjective interpretation, AFM explains the possibilities thoughtfully."""

# Parameters for generation
PARAMETER temperature 0.5
PARAMETER top_p 0.9
PARAMETER top_k 40
PARAMETER repeat_penalty 1.1
# Context window in tokens (maximum is 65536)
PARAMETER num_ctx 8192

# Stop tokens based on the tokenizer config
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|end_of_text|>"
```

{% hint style="info" %}
In the first line, change `FROM ./AFM-4.5B-Q4_0.gguf` to point to the GGUF file you downloaded.
{% endhint %}
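
If you would rather not edit the file by hand, the `FROM` line can be rewritten from the shell. This is a minimal sketch: the helper `set_modelfile_from` is hypothetical, it assumes the GGUF file sits in the same directory as the `Modelfile`, and it keeps a `.bak` copy of the original.

```bash
# set_modelfile_from MODELFILE GGUF_NAME
# Rewrites the FROM line to point at ./GGUF_NAME; saves the old file as MODELFILE.bak.
set_modelfile_from() {
  sed -i.bak "s|^FROM .*|FROM ./$2|" "$1"
}

# Example: switch to the Q8_0 file (run inside the afm directory).
if [ -f Modelfile ]; then
  set_modelfile_from Modelfile AFM-4.5B-Q8_0.gguf
fi
```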

4. Create the model in ollama

```bash
# Run from the afm directory, where the Modelfile and the GGUF file live
ollama create afm-4.5b -f Modelfile
```
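
To confirm that the model was registered, you can look for it in the output of `ollama list`. The sketch below assumes the usual table layout, with a header row followed by one `name:tag` entry per line. The helper `model_listed` is hypothetical.

```bash
# model_listed NAME: reads `ollama list` output on stdin and
# succeeds if a model called NAME (any tag) appears after the header row.
model_listed() {
  awk -v m="$1" 'NR > 1 { split($1, a, ":"); if (a[1] == m) found = 1 } END { exit !found }'
}

# Only run the check when the ollama CLI is on PATH.
if command -v ollama >/dev/null 2>&1; then
  if ollama list | model_listed afm-4.5b; then
    echo "afm-4.5b is available"
  else
    echo "afm-4.5b not found; re-run ollama create"
  fi
fi
```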

5. Run AFM-4.5B

```bash
ollama run afm-4.5b
```
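
Besides the interactive prompt, the model can be queried through ollama's local HTTP API. The following is a sketch under a few assumptions: the server is listening on the default address `http://localhost:11434`, the model was created as `afm-4.5b`, and the prompt contains no quotes or backslashes, because the hypothetical helper `build_chat_payload` does no JSON escaping.

```bash
# build_chat_payload MODEL PROMPT
# Prints a JSON body for the /api/chat endpoint (no escaping of PROMPT).
build_chat_payload() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}],"stream":false}\n' "$1" "$2"
}

# Send the request only if a local server answers on the default port.
if curl -s -o /dev/null --max-time 2 http://localhost:11434/; then
  build_chat_payload afm-4.5b "Hello" | curl -s http://localhost:11434/api/chat -d @-
fi
```

With `"stream": false`, the server returns a single JSON object instead of a stream of partial responses.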

