Start Alignment
Overview
You can start alignment training in Arcee from a pretrained model, an aligned model, a merged model, or a HuggingFace model. To begin, choose an alignment dataset and a base generator model.
Start Alignment Training
Open Arcee UI
Select the model to align (pretrained, aligned, merged, or HuggingFace)
Choose alignment dataset
Pick base generator model
Launch alignment training
Monitor training progress
Dataset Selection and Training
Choosing Alignment Dataset
To select an alignment dataset in the Arcee UI, navigate to the Datasets section. Here, you will find a list of available datasets that you can use for alignment.
The dataset you choose determines how well your small language model adapts to its target context. A dataset that accurately reflects your use case helps the model perform better and meet your requirements.
Choose datasets that closely match your target domain for better alignment.
Ensure the dataset size is appropriate. Larger datasets may improve performance but require more resources.
Check the quality of the data. Clean and well-structured datasets yield better results.
Once you have selected the dataset and the base generator model in the UI, you can begin alignment training.
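The size and quality checks listed above can be run locally before you upload a dataset. The following is a minimal sketch, not part of the Arcee SDK: the function name `summarize_csv` and the column names `prompt` and `response` are hypothetical and chosen only for illustration. It uses the Python standard library to count rows, rows with missing or blank cells, and exact duplicate rows in CSV text.

```python
import csv
import io
from collections import Counter


def _is_blank(value):
    # Missing cells arrive as None, surplus cells as a list; both count as problems.
    return not isinstance(value, str) or not value.strip()


def summarize_csv(text):
    """Return basic quality statistics for CSV text that has a header row."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    # A row is incomplete if any cell is missing, blank, or malformed.
    incomplete = sum(1 for row in rows if any(_is_blank(v) for v in row.values()))
    # Exact duplicates are rows whose cells all match an earlier row.
    counts = Counter(tuple(str(v) for v in row.values()) for row in rows)
    duplicates = sum(n - 1 for n in counts.values())
    return {
        "columns": list(reader.fieldnames or []),
        "rows": len(rows),
        "incomplete_rows": incomplete,
        "duplicate_rows": duplicates,
    }


# Hypothetical sample data; real column names must follow Arcee's guidelines.
sample = (
    "prompt,response\n"
    "What is X?,X is Y.\n"
    "What is X?,X is Y.\n"
    "What is Z?,\n"
)
print(summarize_csv(sample))
```

A high count of incomplete or duplicate rows suggests cleaning the file before using it for alignment. What counts as an appropriate dataset size still depends on your domain and available resources.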
Start Alignment From the Arcee Python SDK
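# Run in a Jupyter/IPython notebook; %env is a notebook magic, not plain Python.
pip install arcee-py
# Set the API key before importing the SDK. Replace the placeholder with your own key.
%env ARCEE_API_KEY=YOUR-ARCEE-KEY
import arcee
# Start an alignment named "my-align" from a HuggingFace model.
arcee.start_alignment("my-align", hf_model="meta-llama/Meta-Llama-3-8B")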
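The snippet above is written for a notebook. Outside a notebook, one possible approach is to export `ARCEE_API_KEY` in your shell and check it before calling the SDK. The sketch below assumes the SDK reads the key from the `ARCEE_API_KEY` environment variable, as the `%env` line suggests. The helper names `get_arcee_api_key` and `launch_alignment` are hypothetical and not part of arcee-py; the only SDK call used is the `arcee.start_alignment` call already shown.

```python
import os


def get_arcee_api_key(env=None):
    """Return the API key from ARCEE_API_KEY, or raise if it is unset."""
    env = os.environ if env is None else env
    key = env.get("ARCEE_API_KEY", "").strip()
    # "YOUR-ARCEE-KEY" is the placeholder used in this guide, not a real key.
    if not key or key == "YOUR-ARCEE-KEY":
        raise RuntimeError("Set ARCEE_API_KEY to your Arcee API key first.")
    return key


def launch_alignment(name, hf_model):
    """Check the key, then make the same SDK call shown in this guide."""
    get_arcee_api_key()
    import arcee  # imported late so the key check runs first
    return arcee.start_alignment(name, hf_model=hf_model)
```

With the key exported, calling `launch_alignment("my-align", "meta-llama/Meta-Llama-3-8B")` would start the same alignment as the notebook snippet. The early check is a design choice that gives a clear error message before any SDK code runs.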
Video Walkthrough
This video demonstrates model alignment with the Arcee UI and the Arcee Python SDK. The video description includes a link to the companion notebook.
Frequently Asked Questions
Which models can I align?
You can align pretrained models, aligned models, merged models, and HuggingFace models on the Arcee platform.
Can I pause alignment training?
No, you cannot pause training once it has started on the Arcee platform.
How long does alignment training take?
The duration depends on the size and complexity of your data. It can range from a few hours to several days.
What format should the alignment data be in?
The alignment data should be in CSV format, structured according to the guidelines provided by Arcee.