Model deployment patterns

This article describes two common patterns for moving ML artifacts through staging and into production. The asynchronous nature of changes to models and code means that there are multiple possible patterns that an ML development process might follow.

Models are created by code, but the resulting model artifacts and the code that created them can operate asynchronously. That is, new model versions and code changes might not happen at the same time. For example, consider the following scenarios:

  • To detect fraudulent transactions, you develop an ML pipeline that retrains a model weekly. The code may not change very often, but the model might be retrained every week to incorporate new data.

  • You might create a large, deep neural network to classify documents. In this case, training the model is computationally expensive and time-consuming, and retraining the model is likely to happen infrequently. However, the code that deploys, serves, and monitors this model can be updated without retraining the model.
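The two scenarios above can be sketched as a production release that pairs a code version with a model version, where either one can advance without the other. This is a minimal illustration only: the `Release` class, its fields, and the version strings are hypothetical names and do not come from any particular tool.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Release:
    """What is running in production: a code version plus a model version."""
    code_version: str   # hypothetical identifier, e.g. a commit hash or tag
    model_version: int  # hypothetical counter for trained model artifacts


def retrain(release: Release) -> Release:
    """Scheduled retraining: a new model artifact, unchanged code."""
    return replace(release, model_version=release.model_version + 1)


def update_code(release: Release, new_code_version: str) -> Release:
    """Serving or monitoring code change: new code, unchanged model artifact."""
    return replace(release, code_version=new_code_version)


current = Release(code_version="v1.0", model_version=1)
after_retrain = retrain(current)                  # fraud-detection scenario
after_code_change = update_code(current, "v1.1")  # document-classifier scenario

print(after_retrain)
print(after_code_change)
```

Because the two fields change independently, a deployment process has to decide which of them it promotes through staging, which is the distinction the patterns below draw.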

[Figure: deploy patterns]

The two patterns differ in whether the model artifact or the training code that produces the model artifact is promoted towards production.

Deploy models

In this pattern, the model artifact is generated by training code in the development environment. The artifact is tested in the staging environment before being deployed into production.
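As a rough sketch of this flow, the example below moves a single trained artifact from development to staging and then to production without retraining it. The `Stage` enum, the `ModelArtifact` class, and the `promote` function are hypothetical and stand in for whatever model registry is actually used; the staging tests are passed in as a plain callable.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Stage(Enum):
    DEVELOPMENT = "development"
    STAGING = "staging"
    PRODUCTION = "production"


@dataclass
class ModelArtifact:
    """A trained model as produced in the development environment."""
    name: str
    version: int
    stage: Stage = Stage.DEVELOPMENT


def promote(
    artifact: ModelArtifact,
    staging_tests: Callable[[ModelArtifact], bool],
) -> ModelArtifact:
    """Move the artifact to staging, then to production if its tests pass."""
    artifact.stage = Stage.STAGING
    if staging_tests(artifact):
        artifact.stage = Stage.PRODUCTION
    return artifact  # stays in staging when the tests fail


# The same artifact object is promoted; nothing is retrained along the way.
model = ModelArtifact(name="fraud_detector", version=3)
promote(model, staging_tests=lambda a: True)
print(model.stage)
```

The point of the sketch is that the training code never leaves development: only the artifact it produced changes stage.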

Consider this option if one or more of the following apply:

  • Model training is very expensive or hard to reproduce.

  • All work is done in a single Databricks workspace.

  • You are not working with external repos or a CI/CD process.

Advantages:

  • A simpler handoff for data scientists.

  • When model training is expensive, the model needs to be trained only once.

Disadvantages:

  • If production data is not accessible from the development environment (which may be true for security reasons), this architecture may not be viable.

  • Automated model retraining is tricky in this pattern. You could automate retraining in the development environment, but the team responsible for deploying the model in production might not accept the resulting model as production-ready.

  • Supporting code, such as pipelines used for featurization, inference, and monitoring, needs to be deployed to production separately.
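One way a production team might decide whether an automatically retrained model is acceptable is an explicit acceptance check that compares it against the model currently in production. The sketch below is an assumption, not part of the pattern as described: the function name, the `max_regression` tolerance, and the convention that a higher score is better are all hypothetical, and both scores are assumed to come from the same held-out evaluation data.

```python
def accept_for_production(
    candidate_score: float,
    production_score: float,
    max_regression: float = 0.01,
) -> bool:
    """Accept the retrained model unless it is clearly worse than production.

    Assumes higher scores are better and that both scores were computed
    on the same held-out dataset.
    """
    return candidate_score >= production_score - max_regression


# A retrained model that improves on production is accepted.
print(accept_for_production(candidate_score=0.91, production_score=0.90))

# A retrained model that regresses well beyond the tolerance is rejected.
print(accept_for_production(candidate_score=0.85, production_score=0.90))
```

A check like this does not remove the handoff problem, but it makes the acceptance criterion explicit so that retraining in development and approval for production can be automated against the same rule.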

The diagram below contrasts the code lifecycle for the above deployment patterns across the different execution environments.

[Figure: deploy patterns lifecycle]