Aligning

Start Alignment

Overview

In Arcee, you can start alignment training from a pretrained model, an aligned model, a merged model, or a HuggingFace model. Choose the alignment dataset and the base generator model to begin.

Start Alignment Training

Step-by-step
  1. Open Arcee UI

  2. Select the model to align (pretrained, aligned, merged, or HuggingFace)

  3. Choose alignment dataset

  4. Pick base generator model

  5. Launch alignment training

  6. Monitor training progress

Dataset Selection and Training

Choosing Alignment Dataset

To select an alignment dataset in the Arcee UI, navigate to the Datasets section. Here, you will find a list of available datasets that you can use for alignment.

Choosing the correct dataset matters because it determines how well your small language model adapts to its target context. A dataset that is accurate and representative of your use case helps the model perform better and meet your requirements.

  • Choose datasets that closely match your target domain for better alignment.

  • Ensure the dataset size is appropriate. Larger datasets may improve performance but require more resources.

  • Check the quality of the data. Clean and well-structured datasets yield better results.
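The size and quality checks above can be roughed out before you upload anything. The following is a minimal sketch that assumes the dataset is a CSV file with a header row; the `summarize_csv` helper and the `prompt` and `response` column names in the usage note are hypothetical and are not taken from Arcee's dataset guidelines.

```python
import csv
import io
from collections import Counter


def summarize_csv(text: str) -> dict:
    """Return simple size and quality indicators for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    columns = reader.fieldnames or []

    # A cell counts as empty when it is missing or contains only whitespace.
    empty_cells = sum(
        1 for row in rows for col in columns if not (row.get(col) or "").strip()
    )

    # A row is a duplicate when every column value matches an earlier row.
    counts = Counter(tuple(row.get(col) for col in columns) for row in rows)
    duplicate_rows = sum(n - 1 for n in counts.values())

    return {
        "rows": len(rows),
        "columns": columns,
        "empty_cells": empty_cells,
        "duplicate_rows": duplicate_rows,
    }
```

To use it, read the file contents (for example with `open(path, newline="", encoding="utf-8").read()`) and pass the text to `summarize_csv`. A high count of empty cells or duplicate rows suggests the dataset needs cleaning before it is used for alignment.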

Alignment training can begin once the dataset and the base generator model are selected in the UI.

Start Alignment From Arcee Python

pip install arcee-py

%env ARCEE_API_KEY=YOUR-ARCEE-KEY

import arcee
arcee.start_alignment("my-align", hf_model="meta-llama/Meta-Llama-3-8B")
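The `%env` line above is a notebook magic, so it only works inside Jupyter or IPython. In a plain Python script, the key has to come from the process environment instead. The sketch below checks for it up front; `get_arcee_api_key` is a hypothetical helper, not part of arcee-py, and how the SDK itself reads `ARCEE_API_KEY` is not confirmed here.

```python
import os


def get_arcee_api_key(env=None) -> str:
    """Return ARCEE_API_KEY from the environment, or raise if it is unusable."""
    # Accept an explicit mapping so the check is easy to test; default to os.environ.
    env = os.environ if env is None else env
    key = env.get("ARCEE_API_KEY", "").strip()

    # Reject a missing key and the placeholder value used in the example above.
    if not key or key == "YOUR-ARCEE-KEY":
        raise RuntimeError(
            "ARCEE_API_KEY is not set. Export it in your shell before running this script."
        )
    return key
```

Calling `get_arcee_api_key()` at the top of a script fails early with a clear message, rather than partway through starting an alignment job.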

Video Walkthrough

This video demonstrates model alignment with the Arcee UI and the Arcee Python SDK. The video description includes a link to the companion notebook.

Frequently Asked Questions

  • You can align pretrained models, aligned models, merged models, and HuggingFace models on the Arcee platform.

  • Training cannot be paused once it has started on the Arcee platform.

  • The duration of the training depends on the size and complexity of your data. It can range from a few hours to several days.

  • The data for alignment should be in CSV format, structured according to the guidelines provided by Arcee.