This article describes how to deploy custom models with Model Serving.
Custom models provide flexibility to deploy custom logic alongside your models. The following are example scenarios where you might want to use this guide:

- Your model requires preprocessing before inputs can be passed to the model's predict function.
- Your application requires the model's raw outputs to be post-processed for consumption.
- The model itself has per-request branching logic.
- You are looking to deploy fully custom code as a model.
MLflow supports logging custom models written with its custom Python model format. Databricks recommends constructing custom models by writing a PythonModel class. There are two required functions:

- load_context - anything that needs to be loaded just one time for the model to operate should be defined in this function. This is critical so that the system minimizes the number of artifacts loaded during the predict function, which speeds up inference.
- predict - this function houses all the logic that runs every time an input request is made.
Even though you write your model with custom code, you can still use shared modules of code from your organization. With the code_path parameter, model authors can log full code references that are loaded alongside the model and are usable from within it.
For example, if a model is logged with:
mlflow.pyfunc.log_model("model", python_model=CustomModel(), code_path=["preprocessing_utils/"])
Code from preprocessing_utils is available in the loaded context of the model. The following is an example model that uses this code.
import mlflow
import torch


class CustomModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        self.model = torch.load(context.artifacts["model-weights"])
        from preprocessing_utils.my_custom_tokenizer import CustomTokenizer
        self.tokenizer = CustomTokenizer(context.artifacts["tokenizer_cache"])

    def format_inputs(self, model_input):
        # insert some code that formats your inputs
        pass

    def format_outputs(self, outputs):
        predictions = (torch.sigmoid(outputs)).data.numpy()
        return predictions

    def predict(self, context, model_input):
        model_input = self.format_inputs(model_input)
        outputs = self.model.predict(model_input)
        return self.format_outputs(outputs)
After you log your custom model, you can register it in the MLflow Model Registry and serve it with a Model Serving endpoint.