Merging

Start YAML Merge

Overview

Configure and execute a YAML merge using Arcee's mergekit to combine multiple transformer models into a single, optimized model.

Starting a YAML Merge on the Arcee Platform

Step-by-step
  1. Go to the Merging tab and click on the Create Merging button.

  2. Click the Yaml tab.

  3. Enter a name for the merging.

  4. Edit the YAML configuration as needed; a minimal example follows this list. You can learn more about merging algorithms and parameters in the mergekit repository.

  5. Click on the Create Merging button to launch the merge operation.

  6. Once the merging operation is complete, you can deploy and test the merged model by clicking on Deploy your merging. For detailed instructions, please refer to the Deployment section in the documentation.
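
For reference, here is a minimal configuration of the kind you might paste into the YAML editor. It uses mergekit's linear merge method; the model names and weights are illustrative, not a recommendation:

yaml
merge_method: linear
models:
  - model: psmathur/orca_mini_v3_13b
    parameters:
      weight: 0.5
  - model: garage-bAInd/Platypus2-13B
    parameters:
      weight: 0.5
dtype: float16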

Use the Same Base Architecture

Currently, models must share the same base architecture for merging, but we're working on cross-architecture support!
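
If you're not sure whether two checkpoints share an architecture, you can compare their configs before launching a merge. Below is a minimal sketch using the Hugging Face transformers library (an assumption on our part; it is not part of the Arcee client):

python
# Minimal sketch: compare base architectures via Hugging Face transformers.
from transformers import AutoConfig

def same_architecture(model_a: str, model_b: str) -> bool:
    # model_type identifies the base architecture (e.g. "llama").
    return (AutoConfig.from_pretrained(model_a).model_type
            == AutoConfig.from_pretrained(model_b).model_type)

print(same_architecture("psmathur/orca_mini_v3_13b", "garage-bAInd/Platypus2-13B"))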

Advanced Configuration and Execution

Arcee Python Client Setup

Follow these steps to perform a YAML merge using the Arcee Python client. First, install the client and set up your API key.

  • Install the Arcee Python client by running the command:

    pip install -q arcee-py

  • Set your Arcee API key with the following command:

    %env ARCEE_API_KEY=MY-ARCEE-KEY

Next, write your YAML merge configuration in a file. An example configuration might look like this:

yaml
%%writefile newmerge.yaml
slices:
  - sources:
      - model: psmathur/orca_mini_v3_13b
        layer_range: [0, 40]
      - model: garage-bAInd/Platypus2-13B
        layer_range: [0, 40]
merge_method: slerp
base_model: psmathur/orca_mini_v3_13b
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: float16
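
Here `t` is the slerp interpolation weight (0 keeps the base model's tensor, 1 takes the other model's), and a list of values defines a gradient that mergekit spreads across the layer range. As a rough illustration of how such a gradient could map to per-layer values (an approximation for intuition only, not mergekit's actual implementation):

python
import numpy as np

# The self_attn gradient from the config above, spread over 40 layers.
values = [0, 0.5, 0.3, 0.7, 1]
num_layers = 40
anchors = np.linspace(0, num_layers - 1, num=len(values))
per_layer_t = np.interp(np.arange(num_layers), anchors, values)
print(per_layer_t.round(2))  # t goes from 0 at the first layer to 1 at the last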

Finally, start the merge by running the command:

python
import arcee
arcee.mergekit_yaml("new_merge", "newmerge.yaml")
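
If a merge is rejected because of a malformed configuration, you can parse the file locally to pinpoint the problem. Below is a minimal sketch using the PyYAML package (our own suggestion; it is not part of arcee-py and may need to be installed with pip install pyyaml):

python
import yaml  # PyYAML

# Parse the config locally to surface YAML syntax errors before submitting.
with open("newmerge.yaml") as f:
    config = yaml.safe_load(f)

# A couple of sanity checks on fields the merge needs; adjust to your config.
assert "merge_method" in config, "merge_method is required"
print(f"Config OK: merge_method={config['merge_method']}")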

Merge Arcee-Platform Models

If you would like to merge models hosted on the Arcee platform, reference them in your YAML configuration using the following name prefixes:

plaintext
"arcee-platform/pretraining/{pretraining_name}",
"arcee-platform/merging/{merging_name}",
"arcee-platform/alignment/{alignment_name}"

Example Arcee Platform YAML Configuration

Here is an example YAML file for merging an Arcee continually pretrained Llama 3 model with Llama3-Instruct:

yaml
merge_method: ties
base_model: meta-llama/Meta-Llama-3-8B
models:
  - model: arcee-platform/pretraining/mypretrain
    parameters:
      weight:
        - filter: mlp
          value: [0.25, 0.5, 0.5, 0.25]
        - filter: self_attn
          value: [0.25, 0.5, 0.5, 0]
        - value: [0.25, 0.5, 0.5, 0.25]
      density: 0.75
  - model: meta-llama/Meta-Llama-3-8B-Instruct
    parameters:
      weight:
        - filter: mlp
          value: [0.75, 0.5, 0.5, 0.75]
        - filter: self_attn
          value: [0.75, 0.5, 0.5, 1]
        - value: [0.75, 0.5, 0.5, 0.75]
      density: 1.0
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
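
To launch this merge programmatically, save the configuration to a file and submit it with the same arcee.mergekit_yaml call shown earlier (the merge and file names below are illustrative):

python
import arcee

# Assumes the TIES configuration above has been saved to ties_merge.yaml.
arcee.mergekit_yaml("llama3_ties_merge", "ties_merge.yaml")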

Video Walkthrough

This video demonstrates model merging with the Arcee UI and the Arcee Python SDK. The video description includes a link to the companion notebook.

Frequently Asked Questions

  • My merge failed. What should I check? Check the YAML file for syntax errors, and ensure that all model names and layer ranges are correct. Also confirm that the merge method and its parameters are defined properly.

  • Which models can be merged? Ensure the models you want to merge have compatible layer ranges and share the same base architecture; incompatible models may result in faulty merges or errors. A quick way to compare layer counts is shown in the sketch after this list.

  • How long does a merge take? Merging time varies with the size and complexity of the models: larger models generally take longer, and merges may also be delayed if there is a backlog in our GPU queues.
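
For the layer-range check above, here is a minimal sketch using the Hugging Face transformers library (an assumption on our part; not part of the Arcee client):

python
from transformers import AutoConfig

# Compare layer counts before choosing layer_range values; a mismatch
# usually means the chosen ranges cannot line up across models.
for name in ("psmathur/orca_mini_v3_13b", "garage-bAInd/Platypus2-13B"):
    print(name, AutoConfig.from_pretrained(name).num_hidden_layers)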