Foundation Model APIs model maintenance policy

This article describes the model maintenance policy for the Foundation Model APIs pay-per-token offering.

The Foundation Model APIs pay-per-token offering lets customers experiment with current state-of-the-art models. To keep supporting the newest models, Databricks might deprecate older models or update supported models.

If you require long-term support for a specific model version, Databricks recommends using provisioned throughput.

Model deprecation

The following deprecation policy applies only to chat and completion models.

If a model is set for deprecation, Databricks takes the following steps to notify customers:

  • The model card on the Serving page of your Databricks workspace displays a warning message indicating that the model is deprecated.

  • The documentation contains a notice that indicates the model is deprecated.

Databricks retires the model 3 months after customers are notified of the upcoming deprecation. During this period, customers can migrate to a provisioned throughput endpoint to continue using the model past its end-of-life date.

Model updates

Databricks might ship incremental updates to pay-per-token models to deliver optimizations. When a model is updated, the endpoint URL remains the same, but the model ID in the response object changes to reflect the date of the update. For example, if an update is shipped to llama-2-70b-chat on 3/4/2024, the model ID in the response object changes to llama-2-70b-chat-030424. Databricks maintains a version history of the updates that customers can refer to.
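
Because the endpoint URL does not change, a client that wants to notice an update has to compare the model ID returned in successive responses. The following Python sketch shows one way to do that. It assumes that an updated model ID is the base model name followed by a hyphen and six digits, which is inferred only from the llama-2-70b-chat-030424 example above and is not a documented format. The helper names split_model_id and update_changed are hypothetical and are not part of any Databricks SDK.

```python
import re
from typing import Optional, Tuple

# Assumption: an updated model is reported as "<base-name>-<six digits>",
# as in the llama-2-70b-chat-030424 example. This pattern is inferred from
# that single example and is not a documented format.
_UPDATE_SUFFIX = re.compile(r"^(?P<base>.+)-(?P<stamp>\d{6})$")


def split_model_id(model_id: str) -> Tuple[str, Optional[str]]:
    """Return (base_name, update_stamp); update_stamp is None if absent."""
    match = _UPDATE_SUFFIX.match(model_id)
    if match is None:
        return model_id, None
    return match.group("base"), match.group("stamp")


def update_changed(previous_id: str, current_id: str) -> bool:
    """Return True when both IDs share a base name but differ in update stamp."""
    prev_base, prev_stamp = split_model_id(previous_id)
    cur_base, cur_stamp = split_model_id(current_id)
    return prev_base == cur_base and prev_stamp != cur_stamp
```

In practice, a client could store the model ID from the last response it received and call update_changed with the ID from each new response, logging a message when the result is True. The sketch keeps the six-digit stamp as an opaque string instead of parsing it as a date.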