Serve DeepSeek R1 (Distilled Llama 70B) using provisioned throughput
This notebook demonstrates how to download and register the DeepSeek R1 distilled Llama model in Unity Catalog and deploy it using a Foundation Model APIs provisioned throughput endpoint.
Install the transformers library from Hugging Face
Install Hugging Face transformers
Download DeepSeek R1 distilled Llama 70B
The following code downloads the DeepSeek R1 distilled Llama 70B model to your local machine.
Set the Hugging Face cache folder on the local SSD drive.
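As a minimal sketch of this step, the function below points the Hugging Face cache at a directory on local disk by setting the `HF_HOME` environment variable. The path `/local_disk0/hf_cache` and the helper name `configure_hf_cache` are assumptions for illustration, not values taken from this notebook. Adjust the path to wherever the local SSD is mounted in your environment.

```python
import os

# Assumed location: /local_disk0 is commonly the local SSD mount on Databricks
# cluster nodes. Change it if your environment differs.
DEFAULT_CACHE_DIR = "/local_disk0/hf_cache"


def configure_hf_cache(cache_dir: str = DEFAULT_CACHE_DIR) -> str:
    """Create cache_dir if needed and point the Hugging Face cache at it.

    Hypothetical helper: the name and default path are illustrative only.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # HF_HOME is the root directory Hugging Face libraries use for cached files.
    os.environ["HF_HOME"] = cache_dir
    return cache_dir
```

Call this before importing `transformers` or `huggingface_hub`, because those libraries read the cache location when they are first imported.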
First, download the checkpoint to deploy.
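One possible way to fetch the checkpoint is `snapshot_download` from `huggingface_hub`, which is installed as a dependency of `transformers`. This is a sketch under assumptions: the repository ID below is assumed to be the distilled model's name on the Hugging Face Hub, and `download_checkpoint` is a hypothetical helper. The notebook's own cell may load the model differently.

```python
# Assumed Hugging Face Hub repository for the distilled model.
MODEL_REPO_ID = "deepseek-ai/DeepSeek-R1-Distill-Llama-70B"


def download_checkpoint(repo_id: str = MODEL_REPO_ID) -> str:
    """Download every file in the repository and return the local snapshot path.

    Hypothetical helper. Requires network access and enough free disk space
    for a 70B-parameter checkpoint.
    """
    # Imported lazily so that a cache location set through HF_HOME beforehand
    # is picked up when the library is first imported.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)
```

The returned path can then be passed to `AutoModelForCausalLM.from_pretrained` and `AutoTokenizer.from_pretrained` to load the model and tokenizer for registration.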
Register the downloaded model to Unity Catalog
The following code shows how to start and log a run that registers the downloaded model to Unity Catalog.
Register your downloaded model in Unity Catalog
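The registration step might look like the sketch below, which logs the model with the MLflow transformers flavor and registers it under a three-level Unity Catalog name (`catalog.schema.model`). The helper names and the `llm/v1/chat` task value are assumptions. Confirm the task type and any additional metadata against the provisioned throughput documentation for your workspace.

```python
def uc_model_name(catalog: str, schema: str, model: str) -> str:
    """Build a three-level Unity Catalog model name (hypothetical helper)."""
    parts = (catalog, schema, model)
    if any(not part or "." in part for part in parts):
        raise ValueError("catalog, schema, and model must be non-empty and contain no dots")
    return ".".join(parts)


def register_model(model, tokenizer, registered_name: str):
    """Log a loaded transformers model and register it in Unity Catalog.

    `model` and `tokenizer` are assumed to be already loaded, for example
    with AutoModelForCausalLM and AutoTokenizer.
    """
    import mlflow

    # Send model registrations to Unity Catalog rather than the workspace registry.
    mlflow.set_registry_uri("databricks-uc")
    with mlflow.start_run():
        return mlflow.transformers.log_model(
            transformers_model={"model": model, "tokenizer": tokenizer},
            # Newer MLflow releases prefer `name` over `artifact_path`.
            artifact_path="model",
            task="llm/v1/chat",  # assumed task type for a chat-style model
            registered_model_name=registered_name,
        )
```

For example, `register_model(model, tokenizer, uc_model_name("main", "default", "deepseek_r1_distilled_llama_70b"))` would register the model under placeholder catalog and schema names.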
Create a provisioned throughput endpoint for model serving
The following code shows how to create a provisioned throughput model serving endpoint to serve the DeepSeek R1 distilled Llama 70B model that you downloaded and registered in Unity Catalog.
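A hedged sketch of endpoint creation through the Databricks serving endpoints REST API follows. The payload fields reflect my understanding of that API and should be checked against the current reference. The function names, and the idea of passing the workspace URL and token as arguments, are assumptions for illustration. The throughput values are expected to be multiples of the chunk size that Databricks reports for the registered model.

```python
import json
import urllib.request


def build_endpoint_payload(endpoint_name: str, model_name: str,
                           model_version: str, throughput: int) -> dict:
    """Build the request body for a provisioned throughput endpoint (hypothetical helper)."""
    return {
        "name": endpoint_name,
        "config": {
            "served_entities": [
                {
                    "entity_name": model_name,        # three-level Unity Catalog name
                    "entity_version": model_version,  # registered model version
                    "min_provisioned_throughput": throughput,
                    "max_provisioned_throughput": throughput,
                }
            ]
        },
    }


def create_endpoint(host: str, token: str, payload: dict) -> dict:
    """POST the payload to the workspace and return the parsed JSON response."""
    request = urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/serving-endpoints",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping payload construction separate from the HTTP call makes it easy to inspect the request body before sending it to a workspace.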