Tutorial: Create and deploy a Foundation Model Fine-tuning run

Preview

This feature is in Public Preview in us-east-1 and us-west-2.

This article describes how to create and configure a run using the Foundation Model Fine-tuning (now part of Mosaic AI Model Training) API, and then review the results and deploy the model using the Databricks UI and Mosaic AI Model Serving.

Requirements

Step 1: Prepare your data for training

See Prepare data for Foundation Model Fine-tuning.

Step 2: Install the databricks_genai SDK

Run the following command to install the databricks_genai SDK:

%pip install databricks_genai

Next, restart the Python process and import the foundation_model library:

dbutils.library.restartPython()
from databricks.model_training import foundation_model as fm

Step 3: Create a training run

Create a training run using the Foundation Model Fine-tuning create() function. The following parameters are required:

  • model: the model you want to train.

  • train_data_path: the location of your training data.

  • register_to: the Unity Catalog catalog and schema where you want the model checkpoints saved.

For example:

run = fm.create(model='meta-llama/Meta-Llama-3.1-8B-Instruct',
                train_data_path='dbfs:/Volumes/main/my-directory/ift/train.jsonl', # UC Volume with JSONL formatted data
                register_to='main.my-directory',
                training_duration='1ep')

run
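
If you want evaluation metrics and prompt outputs to appear in Step 5, you can also pass an evaluation dataset and evaluation prompts when you create the run. The following is a minimal sketch; the eval_data_path and eval_prompts parameters and the paths shown are illustrative, so check the create() reference for your SDK version:

run = fm.create(model='meta-llama/Meta-Llama-3.1-8B-Instruct',
                train_data_path='dbfs:/Volumes/main/my-directory/ift/train.jsonl', # UC Volume with JSONL formatted data
                register_to='main.my-directory',
                training_duration='1ep',
                eval_data_path='dbfs:/Volumes/main/my-directory/ift/eval.jsonl',   # optional evaluation dataset
                eval_prompts=['What is Unity Catalog?'])                           # optional prompts used to generate sample outputs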

Step 4: View the status of a run

The time it takes to complete a training run depends on the number of tokens, the model, and GPU availability. For faster training, Databricks recommends that you use reserved compute. Reach out to your Databricks account team for details.

After you launch your run, you can monitor its status using get_events().

run.get_events()
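
get_events() returns a snapshot of the run's event log, so re-run it periodically to follow progress. The loop below is a minimal polling sketch only; the event fields (type, message) and terminal states it assumes may differ from what your SDK version returns, so inspect the event objects in your workspace first.

import time

terminal_states = {"COMPLETED", "FAILED", "STOPPED"}  # assumed terminal event types
while True:
    events = run.get_events()
    if events:
        last = events[-1]
        # type and message are assumed attribute names; fall back to printing the raw object
        print(getattr(last, "type", last), getattr(last, "message", ""))
        if getattr(last, "type", None) in terminal_states:
            break
    time.sleep(300)  # check again in five minutes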

Step 5: View metrics and outputs

Follow these steps to view the results in the Databricks UI:

  1. In the Databricks workspace, click Experiments in the left nav bar.

  2. Select your experiment from the list.

  3. Review the metrics charts in the Charts tab. Training metrics are generated for each training run and evaluation metrics are only generated if an evaluation data path is provided.

    1. The primary training metric showing progress is loss. Evaluation loss can be used to check whether your model is overfitting to your training data. However, don't rely on loss alone, because in supervised training tasks the evaluation loss can appear to indicate overfitting while the model continues to improve.

    2. The higher the accuracy, the better your model performs, but keep in mind that accuracy close to 100% might indicate overfitting.

    3. The following metrics appear in MLflow after your run (a sketch for querying them programmatically follows this list):

      • LanguageCrossEntropy computes cross entropy on language modeling outputs. A lower score is better.

      • LanguagePerplexity measures how well a language model predicts the next word or character in a block of text based on previous words or characters. A lower score is better.

      • TokenAccuracy computes token-level accuracy for language modeling. A higher score is better.

    4. In this tab, you can also view the output of your evaluation prompts if you specified them.
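
If you prefer to pull these metrics programmatically instead of through the UI, you can query the MLflow experiment that backs the run. The following is a minimal sketch; the experiment name is a placeholder that you should replace with the experiment path shown for your run on the Experiments page.

import mlflow

# Placeholder: copy the actual experiment path from the Experiments page.
experiment_name = "/Users/<your-user>/<your-experiment>"

runs = mlflow.search_runs(experiment_names=[experiment_name])

# Inspect the logged metrics, such as loss, LanguageCrossEntropy,
# LanguagePerplexity, and TokenAccuracy.
metric_cols = [c for c in runs.columns if c.startswith("metrics.")]
print(runs[metric_cols])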

Step 6: Evaluate multiple customized models with Mosaic AI Agent Evaluation before deployment

See What is Mosaic AI Agent Evaluation?.

Step 7: Deploy your model

The training run automatically registers your model in Unity Catalog after it completes. The model is registered to the catalog and schema you specified in the register_to field of the create() call.

To deploy the model for serving from the UI, follow these steps (a programmatic alternative is sketched after the list):

  1. Navigate to the model in Unity Catalog.

  2. Click Serve this model.

  3. Click Create serving endpoint.

  4. In the Name field, provide a name for your endpoint.

  5. Click Create.
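
If you prefer to create the endpoint programmatically, the Databricks SDK for Python exposes a serving endpoints API. The following is a minimal sketch only; the endpoint name, Unity Catalog model path, version, and sizing values are placeholders to replace with your own, and models served with provisioned throughput may need additional configuration not shown here.

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import EndpointCoreConfigInput, ServedEntityInput

w = WorkspaceClient()

w.serving_endpoints.create(
    name="my-finetuned-model-endpoint",  # endpoint name (placeholder)
    config=EndpointCoreConfigInput(
        served_entities=[
            ServedEntityInput(
                entity_name="main.my-directory.my-model",  # UC model produced by the run (placeholder)
                entity_version="1",                        # registered model version (placeholder)
                workload_size="Small",
                scale_to_zero_enabled=True,
            )
        ]
    ),
)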

Additional resources