Deployment

Generating Response

Overview

Learn how to generate responses using Arcee, from deploying your model to querying it through the deployment interface or the Arcee Python client.

Generating Response via Interface

Step-by-step
  1. Open the deployment page.

  2. Enter your query in the text input field.

  3. Click the send button with the paper plane icon.

  4. View the response generated by the model.

Generating Responses

Deployment Setup

Before you can generate responses, you need to deploy your model. Arcee can deploy several types of models: pretrainings, merges, and alignments.

Deployments are started from the Arcee Python client, which you point at the alignment, pretraining, or merge you want to serve.

Here are the basic steps to deploy a model using the Arcee Python client:

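  1. Install the Arcee Python client by running !pip install -q arcee-py in a notebook cell (in a terminal, drop the leading !).

  2. Set up your Arcee API key with %env ARCEE_API_KEY=[YOUR-ARCEE-KEY], replacing the placeholder with your own key. %env is a notebook command; outside a notebook, set the ARCEE_API_KEY environment variable in your shell instead.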

  3. Deploy your model by pointing to the model you want to use.

Once your model is deployed, you can test and query it from the deployment interface or your Python client.
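
The install and key steps above use notebook syntax (! and %env). As a minimal sketch for a plain Python script, the key can be exported through os.environ instead. The helper name configure_arcee_key and its placeholder check are hypothetical and not part of the Arcee client; only the variable name ARCEE_API_KEY comes from this guide.

```python
import os
from typing import Optional

# The literal placeholder used in this guide; it must be replaced with a real key.
PLACEHOLDER = "[YOUR-ARCEE-KEY]"


def configure_arcee_key(key: Optional[str] = None) -> str:
    """Export the key as ARCEE_API_KEY (hypothetical helper, not an Arcee API).

    If no key is passed, fall back to a value already set in the environment.
    """
    value = key if key is not None else os.environ.get("ARCEE_API_KEY", "")
    value = value.strip()
    if not value or value == PLACEHOLDER:
        raise ValueError("Set ARCEE_API_KEY to your real Arcee API key.")
    os.environ["ARCEE_API_KEY"] = value
    return value
```

Calling a helper like this before importing the client is the cautious order, on the assumption that the client may read the variable when it is imported.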

Start Your Deployment

Stand up an endpoint to begin serving models in Arcee. Use the Arcee Python client to point the deployment at a pretraining, merge, or alignment.

Querying Deployed Models

To generate a response from your model, you must first deploy it.

Once your model is deployed, go to the deployment page. On the left, you will see deployment details, including the name of your deployment and any associated alignments. On the right, there is a large chat interface.

Type your query into the text input field at the bottom of the chat interface. Click the pink send button with the paper plane icon to submit your question or command. The response from the model will appear above the input field.

Generating Response from Python

If you prefer to generate a response using the Python client, first install the Arcee Python package and set your API key:

!pip install -q arcee-py
%env ARCEE_API_KEY=[YOUR-ARCEE-KEY]

Then, you can generate a response by using the following code:

import arcee
# Query the deployment named "mydeploy" (replace with your own deployment name)
response = arcee.generate(deployment_name="mydeploy", query="what is the meaning of life")
print(response)

Note: Multi-turn chat is not yet supported; it is on the roadmap.
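
Because multi-turn chat is not supported, each call is an independent single-turn request. The sketch below sends several unrelated questions one at a time and collects the results. The helper ask_each is hypothetical; it takes the generate function as a parameter, so you would pass arcee.generate, and it stores whatever that call returns without assuming a particular response format.

```python
from typing import Any, Callable, Dict, Iterable


def ask_each(
    generate: Callable[..., Any],
    deployment_name: str,
    queries: Iterable[str],
) -> Dict[str, Any]:
    """Send each query as its own single-turn request (hypothetical helper).

    No conversation history is carried over between calls.
    """
    answers: Dict[str, Any] = {}
    for query in queries:
        # Same keyword arguments as the arcee.generate call shown in this guide.
        answers[query] = generate(deployment_name=deployment_name, query=query)
    return answers


# Intended usage, assuming the client is installed and the key is set:
#   import arcee
#   answers = ask_each(arcee.generate, "mydeploy", ["first question", "second question"])
```

Any context a follow-up question needs has to be included in that query's own text.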

Frequently Asked Questions

  • You can deploy all model types from the Arcee platform: pretrainings, merges, and alignments.

  • To query a model from the interface, first deploy your model. Then, go to the deployment page and type your query into the chat widget.

  • To query a model from Python, install the Arcee Python client using !pip install -q arcee-py, set the ARCEE_API_KEY environment variable, and call arcee.generate(deployment_name="mydeploy", query="your question").

  • Multi-turn chat is not yet supported, but it is on the roadmap.

  • Response times vary based on model size, generation length, and the hardware you have deployed on.
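
Since response times depend on your model and hardware, one simple way to see what to expect is to time a request yourself. The sketch below wraps any callable with a wall-clock timer from the standard library. The helper timed_call is hypothetical and not part of the Arcee client; you would pass arcee.generate as the function to measure.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed seconds). Hypothetical helper."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Intended usage, assuming the client is installed and the key is set:
#   import arcee
#   response, seconds = timed_call(
#       arcee.generate, deployment_name="mydeploy", query="your question"
#   )
#   print(f"Response took {seconds:.2f} s")
```

The measured time includes network latency as well as generation, so repeating the call a few times gives a more representative figure than a single run.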