Use prompts in deployed applications
This feature is in Beta.
This guide shows you how to use prompts from the MLflow Prompt Registry in your production GenAI applications.
When deploying GenAI applications, configure them to load prompts from the MLflow Prompt Registry using aliases rather than hard-coded versions. This approach enables dynamic updates without redeployment.
Prerequisites

- Install MLflow and required packages:

  ```bash
  pip install --upgrade "mlflow[databricks]>=3.1.0"
  ```

- Create an MLflow experiment by following the setup your environment quickstart.
- Make sure you have access to a Unity Catalog schema with the `CREATE FUNCTION`, `EXECUTE`, and `MANAGE` permissions to use the prompt registry.
Step 1. Create a new prompt

You can create prompts programmatically using the Python SDK with `mlflow.genai.register_prompt()`. Prompts use double-brace syntax (`{{variable}}`) for template variables.
```python
import mlflow

# Replace with a Unity Catalog schema where you have CREATE FUNCTION permission
uc_schema = "workspace.default"
# This table will be created in the above UC schema
prompt_name = "summarization_prompt"

# Define the prompt template with variables
initial_template = """\
Summarize content you are provided with in {{num_sentences}} sentences.
Content: {{content}}
"""

# Register a new prompt
prompt = mlflow.genai.register_prompt(
    name=f"{uc_schema}.{prompt_name}",
    template=initial_template,
    # All parameters below are optional
    commit_message="Initial version of summarization prompt",
    tags={
        "author": "data-science-team@company.com",
        "use_case": "document_summarization",
        "task": "summarization",
        "language": "en",
        "model_compatibility": "gpt-4",
    },
)

print(f"Created prompt '{prompt.name}' (version {prompt.version})")
```
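MLflow fills in the double-brace variables for you when the prompt is used, so you normally never implement this yourself. Purely to illustrate the substitution semantics, here is a minimal stand-alone sketch; the `fill_template` helper is hypothetical and not part of the MLflow API:

```python
import re


def fill_template(template: str, **values) -> str:
    """Replace each {{variable}} placeholder in the template with its value."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(values[m.group(1)]),
        template,
    )


template = (
    "Summarize content you are provided with in {{num_sentences}} sentences.\n"
    "Content: {{content}}"
)
print(fill_template(template, num_sentences=2, content="MLflow manages the ML lifecycle."))
```

Whitespace inside the braces is tolerated here (`{{ name }}` works the same as `{{name}}`), which keeps templates forgiving to hand-edit.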
Step 2. Add an alias to the prompt version
Aliases allow you to assign a static string tag to a specific prompt version, making it easier to reference prompts in production applications. Instead of hardcoding version numbers, you can use meaningful aliases like production
, staging
, or development
. When you need to update your production prompt, simply reassign the production
alias to point to a newer version without changing or redeploying your application code.
```python
import mlflow

mlflow.genai.set_prompt_alias(
    name=f"{uc_schema}.{prompt_name}",
    alias="production",
    version=1,
)
```
Step 3. Reference the prompt in your app
Once you've registered your prompt and assigned an alias, you can reference it in your deployed applications using the prompt URI format. The recommended approach is to use environment variables to make your application flexible and avoid hardcoding prompt references.
The prompt URI format is: `prompts:/{catalog}.{schema}.{prompt_name}@{alias}`
Using the prompt we registered in Step 1, the URI would be:
`prompts:/workspace.default.summarization_prompt@production`
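Because the URI is just a formatted string, a small helper can keep its construction in one place and out of business logic. A minimal sketch; the `prompt_uri` helper is an illustration, not an MLflow API:

```python
def prompt_uri(catalog: str, schema: str, prompt_name: str, alias: str) -> str:
    """Build a prompt registry URI: prompts:/{catalog}.{schema}.{prompt_name}@{alias}."""
    return f"prompts:/{catalog}.{schema}.{prompt_name}@{alias}"


uri = prompt_uri("workspace", "default", "summarization_prompt", "production")
print(uri)  # prompts:/workspace.default.summarization_prompt@production
```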
Here's how to reference the prompt in your application:
```python
import mlflow
import os


class ProductionApp:
    def __init__(self):
        # Use environment variables for flexibility
        self.prompt_alias = os.getenv("PROMPT_ALIAS", "production")
        self.prompt_name = os.getenv("PROMPT_NAME", "workspace.default.summarization_prompt")

    def get_prompt(self):
        """Load prompt from registry using alias."""
        uri = f"prompts:/{self.prompt_name}@{self.prompt_alias}"
        prompt = mlflow.genai.load_prompt(uri)
        return prompt

    # Rest of your application's code


# Example usage
app = ProductionApp()
prompt = app.get_prompt()
print(f"Loaded prompt: {prompt}")
```
Use the prompt registry with a deployed agent using Mosaic AI Agent Framework
To access prompts in the prompt registry from a deployed agent, you must set up manual authentication, either with your own PAT or with a service principal's PAT. Note that once you configure manual authentication, automatic system authentication for the agent's other logged resources no longer works. Follow these steps:
1. Get a personal access token (PAT):
   - Use a service principal (recommended for security):
     - Create a service principal.
     - Grant the service principal `CREATE FUNCTION`, `EXECUTE`, and `MANAGE` permissions on the Unity Catalog schema in which to store prompts.
     - Grant the service principal `CAN USE` access to other resources (such as Unity Catalog functions and Vector Search indexes) used by your agent.
     - Create a PAT for the service principal.
   - Use your personal account.
2. Access the Databricks secret within your agent code:

   ```python
   import os

   # TODO: set secret_scope_name and secret_key_name to access your PAT
   secret_scope_name = ""
   secret_key_name = ""

   # TODO: fill in your workspace URL, for example https://host.databricks.com
   os.environ["DATABRICKS_HOST"] = ""
   assert os.environ["DATABRICKS_HOST"], "DATABRICKS_HOST was not set to the workspace URL"

   os.environ["DATABRICKS_TOKEN"] = dbutils.secrets.get(scope=secret_scope_name, key=secret_key_name)
   assert os.environ["DATABRICKS_TOKEN"], "DATABRICKS_TOKEN was not set to the PAT secret"
   ```
3. When deploying your model using `agents.deploy()`, include the secret as an environment variable:

   ```python
   agents.deploy(
       UC_MODEL_NAME,
       uc_registered_model_info.version,
       environment_vars={
           "DATABRICKS_HOST": os.environ.get("DATABRICKS_HOST"),
           "DATABRICKS_TOKEN": f"{{{{secrets/{secret_scope_name}/{secret_key_name}}}}}",
       },
   )
   ```
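The quadruple braces in the `DATABRICKS_TOKEN` value are easy to misread: inside an f-string, `{{` and `}}` are escapes for literal braces, so the expression yields a `{{secrets/...}}` placeholder that the serving infrastructure resolves at runtime rather than the secret value itself. A quick check of what the f-string evaluates to, using hypothetical example values for the scope and key:

```python
secret_scope_name = "my_scope"  # example values for illustration only
secret_key_name = "my_pat"

token_ref = f"{{{{secrets/{secret_scope_name}/{secret_key_name}}}}}"
print(token_ref)  # {{secrets/my_scope/my_pat}}
```

This is why the secret itself never appears in your deployment code: only the placeholder string does.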
Next Steps
- Link production traces to app versions - Track which prompt versions are used in production
- Run scorers in production - Monitor the quality of your deployed prompts
- Evaluate prompts - Test new prompt versions before promoting to production