LlamaIndex

LlamaIndex is an open-source data framework designed to help LLMs connect with external data sources in a structured, efficient, and context-aware way. It provides a powerful suite of tools for ingesting, indexing, querying, and retrieving data from diverse formats such as PDFs, databases, APIs, and more. With modular components like custom indices, retrievers, and agents, LlamaIndex enables developers to build scalable Retrieval-Augmented Generation (RAG) pipelines and LLM-powered applications.

This tutorial will guide you through integrating Arcee AI models into LlamaIndex using an OpenAI-compatible endpoint.

The first example shows how to run simple inference with LlamaIndex, while the second shows how to set up a local RAG pipeline.


Model Inference

Prerequisites

  • Python: >=3.9

  • An Arcee AI model served through an OpenAI-compatible endpoint, either running locally or accessible via API

Quickstart

Environment and project setup:

# Create project folder
mkdir arceeai_llamaindex && cd arceeai_llamaindex

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env

# Create and activate virtual environment
uv venv --python 3.12 --seed
source .venv/bin/activate

# Install LlamaIndex OpenAI-compatible client
uv pip install llama-index-llms-openai-like

Create a new Python file called arceeai_llamaindex.py with the following:
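
The sketch below assumes a local OpenAI-compatible endpoint at http://localhost:8000/v1; the model name and API key are placeholders, so substitute the values for your Arcee AI deployment.

from llama_index.llms.openai_like import OpenAILike

# Point the client at your OpenAI-compatible endpoint.
# The model name, base URL, and API key below are placeholders;
# replace them with the values for your Arcee AI deployment.
llm = OpenAILike(
    model="arcee-ai/your-model-name",
    api_base="http://localhost:8000/v1",
    api_key="not-needed-for-local",
    is_chat_model=True,
)

# Run a simple completion request against the model
response = llm.complete("What is retrieval-augmented generation?")
print(response)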

Test your script:
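
With the virtual environment still active:

python arceeai_llamaindex.py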

Retrieval-Augmented Generation

This example sets up a RAG pipeline with LlamaIndex using an Arcee AI model for text generation and an OpenAI Embeddings model for document embeddings. It uses an in-memory vector database that is cleared after execution. For persistent storage and more advanced pipelines, see the LlamaIndex documentation.

Prerequisites

  • Python: >=3.9

  • An Arcee AI model served through an OpenAI-compatible endpoint (as in the inference example above)

  • An OpenAI API key for the embeddings model

Environment Setup for RAG Pipeline
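
A minimal setup, reusing the project and virtual environment from the Quickstart above. The packages below are the core LlamaIndex distribution plus the OpenAI embeddings integration; the API key value and the sample document are placeholders.

# Install LlamaIndex core and the OpenAI embeddings integration
uv pip install llama-index llama-index-embeddings-openai

# The embedding model reads your OpenAI API key from the environment
export OPENAI_API_KEY="sk-..."

# Create a folder with the documents you want to index
mkdir data
echo "LlamaIndex is a data framework for LLM applications." > data/sample.txt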

Create a new Python file called arceeai_llamaindex_rag.py with the following:
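
A minimal sketch of the pipeline. The endpoint, model name, and question are placeholders; it assumes your documents live in the local data/ folder created above, and the embedding model choice (text-embedding-3-small) is likewise an assumption you can swap.

from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai_like import OpenAILike

# The Arcee AI model (via an OpenAI-compatible endpoint) handles generation;
# the values below are placeholders for your deployment.
Settings.llm = OpenAILike(
    model="arcee-ai/your-model-name",
    api_base="http://localhost:8000/v1",
    api_key="not-needed-for-local",
    is_chat_model=True,
)

# OpenAI handles document embeddings (reads OPENAI_API_KEY from the environment)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Load documents from ./data and build an in-memory vector index
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Query the index: retrieval and generation happen in a single call
query_engine = index.as_query_engine()
response = query_engine.query("Summarize the key points of these documents.")
print(response)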

Run your script:
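
With the virtual environment still active:

python arceeai_llamaindex_rag.py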
