Start Deployment
Overview
Start a deployment in Arcee by selecting one of your pretrained, merged, or aligned models. Once the deployment is live, test the endpoint to confirm the model performs as expected.
Free Endpoints Last 2 Hours
Due to GPU shortages, free deployment endpoints on Arcee will remain active for only 2 hours for testing before shutting down automatically.
Expect Deployment in 5 Minutes
Deploying a 7-8B model in Arcee typically takes about 5 minutes, after which you can start testing your model.
Mind the Token Limit
Arcee generation windows are capped at 2048 tokens. If you need a longer context, please download your model.
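As a rough illustration of the token limit, the check below tests whether a request fits in a 2048-token window. It assumes the cap covers prompt tokens plus generated tokens combined, which the note does not spell out, and `fits_generation_window` is a hypothetical helper rather than part of the Arcee client. Token counts are passed in as plain integers, so any tokenizer can supply them.

```python
# Hypothetical helper; not part of arcee-py.
ARCEE_TOKEN_LIMIT = 2048  # cap stated in the note above


def fits_generation_window(prompt_tokens: int, max_new_tokens: int,
                           limit: int = ARCEE_TOKEN_LIMIT) -> bool:
    """Return True if prompt plus requested output stays within the limit.

    Assumption: the cap applies to prompt and generated tokens combined.
    """
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= limit


if __name__ == "__main__":
    print(fits_generation_window(1500, 512))  # 2012 tokens in total
    print(fits_generation_window(1800, 512))  # 2312 tokens in total
```

If the check fails, shorten the prompt, request fewer new tokens, or download the model and run it where a longer context is available.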
Starting a Deployment on the Arcee Platform
Select the Deployments tab and click on the Create Deployment button.
Enter the deployment name in the provided text field.
Select the model to deploy from the dropdown menu.
Configure the deployment settings as needed.
Click on the Create Deployment button.
Verify the deployment status.
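The final verification step can also be scripted. The sketch below polls a status check until the deployment reports ready or a timeout passes. `wait_until_ready` and the `check_status` callable are hypothetical, and the status string `"ready"` is an assumption; substitute whatever status lookup your client or dashboard offers. The default timeout allows some margin over the roughly 5 minutes a 7-8B model takes to deploy.

```python
import time
from typing import Callable


def wait_until_ready(check_status: Callable[[], str],
                     timeout_s: float = 600.0,
                     interval_s: float = 15.0,
                     ready_value: str = "ready",
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll check_status() until it returns ready_value or the timeout passes.

    check_status is a caller-supplied function (hypothetical here) that
    returns the current deployment status as a string.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if check_status() == ready_value:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Fake status source that becomes ready on the third poll.
    statuses = iter(["pending", "pending", "ready"])
    print(wait_until_ready(lambda: next(statuses), interval_s=0.0))
```

Passing the sleep function as a parameter keeps the loop easy to test without real waiting.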
Setting Up Deployment
Deployment Requirements
To start a deployment in Arcee, provide the following arguments. They define which model is deployed and where it runs.
Deployment Name: The unique name for your deployment. It distinguishes this deployment from your others.
Alignment: Specifies the model alignment to be used in the deployment.
Merging: Specifies the merged model to be used in the deployment.
Pretraining: Points to the pretrained model version you want to deploy.
Target Instance: Specifies the instance type where the deployment will be hosted.
Deploying with Arcee Python
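# Install the Arcee Python client (notebook syntax)
!pip install -q arcee-py
# Set your API key before importing the client
%env ARCEE_API_KEY=[MY-ARCEE-API-KEY]
import arcee
# Deploy an aligned model under the name "mydeploy"
arcee.start_deployment("mydeploy", alignment="Qwen2-7B-Instruct")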
Parameters:
- `deployment_name` (str): The name of the deployment (required).
- `alignment` (Optional[str]): The alignment configuration name.
- `merging` (Optional[str]): The merging configuration name.
- `pretraining` (Optional[str]): The pretraining configuration name.
- `retriever` (Optional[str]): The retriever configuration name.
- `target_instance` (Optional[str]): The target instance for deployment. Default is ml.g5.2xlarge (a single A10G GPU).
- `openai_compatability` (Optional[bool]): Flag indicating OpenAI compatibility, defaults to False. Not advised unless you need the messages API.
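One way to keep calls tidy is to assemble the keyword arguments first and pass them on in a single call. The sketch below uses only the parameter names listed above; `build_deployment_kwargs` is a hypothetical helper, not part of arcee-py. It drops unset options so that the library's own defaults apply.

```python
from typing import Any, Dict, Optional


def build_deployment_kwargs(deployment_name: str,
                            alignment: Optional[str] = None,
                            merging: Optional[str] = None,
                            pretraining: Optional[str] = None,
                            retriever: Optional[str] = None,
                            target_instance: Optional[str] = None,
                            openai_compatability: Optional[bool] = None,
                            ) -> Dict[str, Any]:
    """Collect start_deployment arguments, omitting any left unset."""
    if not deployment_name or not deployment_name.strip():
        raise ValueError("deployment_name is required")
    options = {
        "alignment": alignment,
        "merging": merging,
        "pretraining": pretraining,
        "retriever": retriever,
        "target_instance": target_instance,
        "openai_compatability": openai_compatability,
    }
    kwargs: Dict[str, Any] = {"deployment_name": deployment_name}
    # Keep only the options the caller actually set.
    kwargs.update({k: v for k, v in options.items() if v is not None})
    return kwargs


if __name__ == "__main__":
    kwargs = build_deployment_kwargs("mydeploy", alignment="Qwen2-7B-Instruct")
    print(kwargs)
    # With the client installed and ARCEE_API_KEY set, and assuming the
    # function accepts these names as keywords, this could be passed on as:
    #   import arcee
    #   arcee.start_deployment(**kwargs)
```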
Free Tier Limitation
Deployment endpoints on the free tier are automatically closed within two hours after deployment.
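To plan a test session around this limit, you can compute the latest time a free-tier endpoint is expected to stay up. The helpers below are hypothetical conveniences, not an Arcee API; they simply add the two-hour window to a start time you record yourself.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

FREE_TIER_WINDOW = timedelta(hours=2)  # limit stated above


def latest_shutdown(started_at: datetime) -> datetime:
    """Return the latest time the endpoint is expected to remain active."""
    return started_at + FREE_TIER_WINDOW


def time_remaining(started_at: datetime,
                   now: Optional[datetime] = None) -> timedelta:
    """Return the time left in the window, never less than zero."""
    if now is None:
        now = datetime.now(timezone.utc)
    return max(latest_shutdown(started_at) - now, timedelta(0))


if __name__ == "__main__":
    started = datetime.now(timezone.utc)  # record this when you deploy
    print("Endpoint closes by:", latest_shutdown(started).isoformat())
    print("Time remaining:", time_remaining(started))
```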
Production Recommendation
For production deployments, we recommend downloading your model weights.
Choosing the Model to Deploy
To select a model for deployment in Arcee, start by accessing the Create Deployment screen. This screen features a modal window where you can begin the process.
First, enter a name for your deployment in the Deployment Name text field. Next, use the Model to Deploy dropdown menu to choose the model you wish to deploy. Clicking the dropdown shows a list of available pretrained models: Qwen2-7B-Instruct, Qwen2-7B, Mistral-7B-Instruct-V0.2, Mistral-7B-Instruct-V0.1, and Mistral-7B-V0.1. Note that these are not Arcee platform models; they are models that we support off the shelf.
Choose the model that best fits your deployment goals and requirements, and review the available options carefully before deploying.
Available Model Types for Deployment
| Model Type | Example Models |
|---|---|
| Pretrained | Your pretrained models, e.g., llama-base |
| Merged | Your merges |
| Aligned | Your alignments, e.g., llama-chat |
Frequently Asked Questions
You can deploy models that have undergone alignment, pretraining, or merging within the Arcee platform.
To select a model for deployment, choose from the list of available pretrained models in the Model to Deploy dropdown menu in the Create Deployment modal.
Verifying deployment is essential to ensure your model's performance meets your expectations.
On the free tier, deployment endpoints are automatically closed within two hours after deployment.