Log and register AI agents
Preview
This feature is in Public Preview.
Log AI agents using Mosaic AI Agent Framework. Logging an agent is the foundation of the development process: it captures a "point in time" snapshot of the agent's code and configuration so you can evaluate the quality of that configuration.
Requirements
Create an AI agent before logging it.
Code-based vs. serialization-based logging
You can use code-based MLflow logging or serialization-based MLflow logging. Databricks recommends that you use code-based logging.
Code-based MLflow logging: The chain’s code is captured as a Python file. The Python environment is captured as a list of packages. When the chain is deployed, the Python environment is restored, and the chain’s code is executed to load the chain into memory so it can be invoked when the endpoint is called.
Serialization-based MLflow logging: The chain's code and current state in the Python environment are serialized to disk, often using libraries such as `pickle` or `joblib`. When the chain is deployed, the Python environment is restored, and the serialized object is loaded into memory so it can be invoked when the endpoint is called.
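The practical difference shows up in what you pass to `log_model`. The following is a minimal sketch of the two call styles, assuming a LangChain chain; the file path and object names are illustrative:

import mlflow

# Code-based logging: pass the *path* to the file that defines the chain.
# That file must call mlflow.models.set_model(lc_chain) so MLflow knows which
# object to load when the code is re-executed at serving time.
mlflow.langchain.log_model(
    lc_model="/path/to/chain.py",  # illustrative path
    artifact_path="chain",
)

# Serialization-based logging: pass the in-memory chain object itself, so MLflow
# serializes the object and its current state to disk.
# mlflow.langchain.log_model(lc_model=lc_chain, artifact_path="chain")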
The following table shows the advantages and disadvantages of each method.

Method | Advantages | Disadvantages
---|---|---
Code-based MLflow logging | Avoids serialization, which many popular generative AI libraries don't support. Saves a copy of the original code for later reference. No need to restructure your code into a single serializable object. | You must add `mlflow.models.set_model()` to the chain code, and the logging code must live in a separate driver notebook.
Serialization-based MLflow logging | You can call `log_model()` with the in-memory chain object from the same notebook where it is defined. | Doesn't work for many popular generative AI libraries that don't support serialization. Doesn't save a human-readable copy of the original code.
For code-based logging, the code that logs your agent or chain must be in a separate notebook from your chain code. This notebook is called a driver notebook. For an example notebook, see Example notebooks.
Code-based logging with LangChain
1. Create a notebook or Python file with your code. For the purposes of this example, the notebook or file is named `chain.py`. The notebook or file must contain a LangChain chain, referred to here as `lc_chain`. (A minimal sketch of such a file follows this list.)
2. Include `mlflow.models.set_model(lc_chain)` in the notebook or file.
3. Create a new notebook to serve as the driver notebook (called `driver.py` in this example).
4. In the driver notebook, use `mlflow.langchain.log_model(lc_model="/path/to/chain.py")` to run `chain.py` and log the results to an MLflow model.
5. Deploy the model. See Deploy an agent for generative AI application. The deployment of your agent might depend on other Databricks resources, such as a vector search index and model serving endpoints. For LangChain agents:
   - MLflow `log_model` infers the dependencies required by the chain and logs them to the `MLmodel` file in the logged model artifact.
   - During deployment, `databricks.agents.deploy` automatically creates the M2M OAuth tokens required to access and communicate with these inferred resource dependencies.
6. When the serving environment is loaded, `chain.py` is executed.
7. When a serving request comes in, `lc_chain.invoke(...)` is called.
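The following is a minimal sketch of what `chain.py` might contain. The prompt, endpoint name, and chain structure are illustrative assumptions; the only requirements from the steps above are that the file defines a LangChain chain and calls `mlflow.models.set_model` on it.

# chain.py (illustrative sketch)
from operator import itemgetter

import mlflow
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda
from databricks_langchain import ChatDatabricks  # assumed chat-model integration

prompt = ChatPromptTemplate.from_template("Answer this question: {question}")
llm = ChatDatabricks(endpoint="databricks-dbrx-instruct")  # illustrative endpoint name

# Extract the latest user message, prompt the LLM, and parse the response to a string.
lc_chain = (
    {"question": itemgetter("messages") | RunnableLambda(lambda msgs: msgs[-1]["content"])}
    | prompt
    | llm
    | StrOutputParser()
)

# Tell MLflow which object to load when this file is executed at serving time.
mlflow.models.set_model(lc_chain)

The driver notebook then logs this file, as in the following example.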
import mlflow
code_path = "/Workspace/Users/first.last/chain.py"
config_path = "/Workspace/Users/first.last/config.yml"
input_example = {
"messages": [
{
"role": "user",
"content": "What is Retrieval-augmented Generation?",
}
]
}
# example using LangChain
with mlflow.start_run():
logged_chain_info = mlflow.langchain.log_model(
lc_model=code_path,
        model_config=config_path,  # If specified, this configuration is used by the chain at serving time. It overrides the development_config.
artifact_path="chain", # This string is used as the path inside the MLflow model where artifacts are stored
input_example=input_example, # Must be a valid input to your chain
example_no_conversion=True, # Required
)
print(f"MLflow Run: {logged_chain_info.run_id}")
print(f"Model URI: {logged_chain_info.model_uri}")
# To verify that the model has been logged correctly, load the chain and call `invoke`:
model = mlflow.langchain.load_model(logged_chain_info.model_uri)
model.invoke(input_example)
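The `config.yml` passed as `model_config` is an arbitrary YAML file of key-value pairs that the chain code reads through `mlflow.models.ModelConfig`. The following sketch shows how `chain.py` might consume it; the keys shown are illustrative assumptions, not a required schema.

# In chain.py: read configuration values. At logging time, the file passed as
# model_config overrides the development_config used during development.
import mlflow

model_config = mlflow.models.ModelConfig(development_config="/Workspace/Users/first.last/config.yml")
llm_endpoint = model_config.get("llm_endpoint")  # assumes config.yml contains: llm_endpoint: databricks-dbrx-instruct
temperature = model_config.get("temperature")    # assumes config.yml contains: temperature: 0.1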
Code-based logging with PyFunc
1. Create a notebook or Python file with your code. For the purposes of this example, the notebook or file is named `chain.py`. The notebook or file must contain a PyFunc class, referred to here as `PyFuncClass`. (A minimal sketch of such a file follows this list.)
2. Include `mlflow.models.set_model(PyFuncClass)` in the notebook or file.
3. Create a new notebook to serve as the driver notebook (called `driver.py` in this example).
4. In the driver notebook, use `mlflow.pyfunc.log_model(python_model="/path/to/chain.py", resources="/path/to/resources.yaml")` to run `chain.py` and log the results to an MLflow model. The `resources` parameter declares any resources needed to serve the model, such as a vector search index or a serving endpoint that serves a foundation model. See an example resources file for PyFunc.
5. Deploy the model. See Deploy an agent for generative AI application.
6. When the serving environment is loaded, `chain.py` is executed.
7. When a serving request comes in, `PyFuncClass.predict(...)` is called.
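The following is a rough sketch of what the PyFunc version of `chain.py` might look like. The class body is a placeholder that simply echoes the latest user message; a real agent would call an LLM, a retriever, tools, and so on. Note that this sketch passes an instance of the class to `set_model`.

# chain.py (illustrative sketch)
import mlflow

class PyFuncClass(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input, params=None):
        # Placeholder logic, assuming the input arrives as a dictionary with a
        # "messages" key like the input_example below.
        messages = model_input["messages"]
        return f"You asked: {messages[-1]['content']}"

# Tell MLflow which model to load when this file is executed at serving time.
mlflow.models.set_model(PyFuncClass())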
import mlflow
code_path = "/Workspace/Users/first.last/chain.py"
config_path = "/Workspace/Users/first.last/config.yml"
input_example = {
"messages": [
{
"role": "user",
"content": "What is Retrieval-augmented Generation?",
}
]
}
# example using PyFunc model
resources_path = "/Workspace/Users/first.last/resources.yml"
with mlflow.start_run():
logged_chain_info = mlflow.pyfunc.log_model(
        python_model=code_path,
artifact_path="chain",
input_example=input_example,
resources=resources_path,
example_no_conversion=True,
)
print(f"MLflow Run: {logged_chain_info.run_id}")
print(f"Model URI: {logged_chain_info.model_uri}")
# To verify that the model has been logged correctly, load the model and call `predict`:
model = mlflow.pyfunc.load_model(logged_chain_info.model_uri)
model.predict(input_example)
Specify resources for PyFunc agent
You can specify resources, such as a vector search index and a serving endpoint, that are required to serve the model. For LangChain, resources are automatically picked up and logged along with the model.
When deploying a `pyfunc` flavored agent, you must manually add any resource dependencies of the deployed agent. An M2M OAuth token with access to all the resources specified in the `resources` parameter is created and provided to the deployed agent.
Note
You can override the resources your endpoint has permission to access by manually specifying the resources when you log the chain.
The following example shows how to add serving endpoint and vector search index dependencies by specifying them in the `resources` parameter.
import mlflow
from mlflow.models.resources import DatabricksServingEndpoint, DatabricksVectorSearchIndex

with mlflow.start_run():
logged_chain_info = mlflow.pyfunc.log_model(
        python_model=code_path,
artifact_path="chain",
input_example=input_example,
example_no_conversion=True,
resources=[
DatabricksServingEndpoint(endpoint_name="databricks-mixtral-8x7b-instruct"),
DatabricksServingEndpoint(endpoint_name="databricks-bge-large-en"),
DatabricksVectorSearchIndex(index_name="rag.studio_bugbash.databricks_docs_index")
]
)
You can also add resources by specifying them in a `resources.yaml` file and referencing that file path in the `resources` parameter. An M2M OAuth token with access to all the resources specified in `resources.yaml` is created and provided to the deployed agent.

The following is an example `resources.yaml` file that defines model serving endpoints and a vector search index.
api_version: "1"
databricks:
  vector_search_index:
    - name: "catalog.schema.my_vs_index"
  serving_endpoint:
    - name: databricks-dbrx-instruct
    - name: databricks-bge-large-en
Register the chain to Unity Catalog
Before you deploy the chain, you must register it to Unity Catalog. When you register the chain, it is packaged as a model in Unity Catalog, and you can use Unity Catalog permissions to manage authorization for the resources the chain uses.
import mlflow
mlflow.set_registry_uri("databricks-uc")
catalog_name = "test_catalog"
schema_name = "schema"
model_name = "chain_name"
model_name = catalog_name + "." + schema_name + "." + model_name
uc_model_info = mlflow.register_model(model_uri=logged_chain_info.model_uri, name=model_name)
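After registration, you can deploy the Unity Catalog model with `databricks.agents.deploy`, which creates the serving endpoint and the M2M OAuth tokens described earlier. The following is a minimal sketch; see Deploy an agent for generative AI application for the full workflow.

from databricks import agents

# Deploy the model version that mlflow.register_model just created.
# model_name here is the fully qualified catalog.schema.model name from the previous example.
deployment = agents.deploy(model_name, uc_model_info.version)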