provisioned-throughput-bge-serving(Python)

Provisioned Throughput BGE serving example

Provisioned Throughput provides optimized inference for Foundation Models with performance guarantees for production workloads.

This example walks through:

  1. Downloading the model from Hugging Face transformers
  2. Logging the model in a provisioned throughput supported format into the Databricks Unity Catalog or Workspace Registry
  3. Enabling optimized serving on the model

Prerequisites

  • Attach a cluster with sufficient memory to the notebook
  • Make sure to have MLflow version 2.11 or later installed
  • Make sure to enable Models in UC, especially when working with models larger than 7B in size
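The MLflow prerequisite can be checked before running the rest of the notebook. The following is a minimal sketch using only the standard library; the helper names (parse_version, meets_minimum, installed_mlflow_version) are illustrative and not part of MLflow.

```python
from importlib import metadata


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="2.11"):
    """True when `installed` is numerically at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


def installed_mlflow_version():
    """Return the installed MLflow version string, or None if MLflow is absent."""
    try:
        return metadata.version("mlflow")
    except metadata.PackageNotFoundError:
        return None


version = installed_mlflow_version()
if version is None:
    print("MLflow is not installed")
else:
    print(f"MLflow {version}: meets the 2.11 minimum = {meets_minimum(version)}")
```

If the check fails, upgrade MLflow in the notebook environment and restart the Python process before continuing.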

Step 1: Log the model for optimized LLM serving

Successfully installed alembic-1.13.1 docker-7.0.0 mlflow-2.10.0 querystring-parser-1.2.4
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. petastorm 0.12.1 requires pyspark>=2.1.0, which is not installed. datasets 2.14.5 requires fsspec[http]<2023.9.0,>=2023.1.0, but you have fsspec 2023.12.2 which is incompatible.
Successfully installed fsspec-2023.12.2 huggingface-hub-0.20.3 safetensors-0.4.2 tokenizers-0.15.1 transformers-4.37.2
Successfully installed accelerate-0.26.1
Note: you may need to restart the kernel using dbutils.library.restartPython() to use updated packages.

Downloading model files from Hugging Face: config.json (779 B), model.safetensors (1.34 GB), tokenizer_config.json (366 B), vocab.txt (232 kB), tokenizer.json (711 kB), special_tokens_map.json (125 B)

To enable optimized serving, include an extra metadata dictionary when calling mlflow.transformers.log_model. Because BGE is an embedding model and the endpoint queried later in this example returns embeddings, the task is llm/v1/embeddings:

metadata = {"task": "llm/v1/embeddings"}

This specifies the API signature used for the model serving endpoint.
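The logging step described above can be sketched as follows. This is an illustration under stated assumptions, not the notebook's original cell: the Hugging Face model ID "BAAI/bge-large-en-v1.5", the input example strings, and the helper names are assumptions, and the task value is llm/v1/embeddings on the assumption that an embedding model such as BGE is served through the embeddings API. Adjust it to the task your endpoint expects.

```python
def build_log_model_kwargs(registered_model_name):
    """Keyword arguments for mlflow.transformers.log_model, minus the model itself."""
    return {
        "artifact_path": "model",
        # Assumption: BGE is served as an embedding model.
        "metadata": {"task": "llm/v1/embeddings"},
        "registered_model_name": registered_model_name,
        # Hypothetical example inputs, used by MLflow to record an input example.
        "input_example": ["Hello world", "I am Databricks"],
    }


def log_bge_model(registered_model_name, hf_model_id="BAAI/bge-large-en-v1.5"):
    """Download the model and tokenizer, then log and register them with MLflow."""
    # Imported lazily so the helper above works without these packages installed.
    import mlflow
    from transformers import AutoModel, AutoTokenizer

    # Register into Unity Catalog rather than the Workspace Registry.
    mlflow.set_registry_uri("databricks-uc")

    components = {
        "model": AutoModel.from_pretrained(hf_model_id),
        "tokenizer": AutoTokenizer.from_pretrained(hf_model_id),
    }
    with mlflow.start_run():
        return mlflow.transformers.log_model(
            transformers_model=components,
            **build_log_model_kwargs(registered_model_name),
        )


# Example call on a Databricks cluster (downloads about 1.3 GB of weights):
# log_bge_model("ml.models.bge-large")
```

Keeping the keyword arguments in a separate helper makes it easy to inspect the metadata that will be attached before any weights are downloaded.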

FutureWarning: The 'transformers' MLflow Models integration is known to be compatible with the following package version ranges: 4.25.1 - 4.37.1. MLflow Models integrations with transformers may not succeed when used with package versions outside of this range.
2024/02/03 03:45:13 WARNING mlflow.utils.environment: Encountered an unexpected error while inferring pip requirements (model URI: /local_disk0/repl_tmp_data/ReplId-76ca8-086a3-cc6fe-b/tmp32g7gqxv/model, flavor: transformers), fall back to return ['transformers==4.37.2', 'torch==2.0.1', 'torchvision==0.15.2', 'accelerate==0.26.1']. Set logging level to DEBUG to see the full traceback.
Successfully registered model 'ml.models.bge-large'.
Created version '1' of model 'ml.models.bge-large'.

Step 2: View optimization information for your model

Modify the cell below to change the model name. Calling the model optimization information API returns the throughput chunk size for your model: the number of tokens per second that corresponds to one throughput unit for that specific model.
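A request to the optimization information API might look like the sketch below. The URL path follows the Databricks serving-endpoints REST API as I understand it; api_root and token are placeholders you would supply from your workspace, and the helper names are illustrative.

```python
import json
import urllib.request


def optimization_info_request(api_root, token, model_name, model_version):
    """Build a GET request for a registered model's optimization information."""
    url = (
        f"{api_root.rstrip('/')}/api/2.0/serving-endpoints/"
        f"get-model-optimization-info/{model_name}/{model_version}"
    )
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}, method="GET"
    )


def fetch_json(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example with placeholder values; nothing is sent until fetch_json is called.
request = optimization_info_request(
    "https://<workspace-host>", "<token>", "ml.models.bge-large", "1"
)
print(request.full_url)
```

The response is expected to include a throughput_chunk_size field; treat that field name as an assumption and inspect the returned JSON to confirm it.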

Step 3: Configure and create your model serving GPU endpoint

Modify the cell below to change the endpoint name. After calling the create endpoint API, the logged BGE model is automatically deployed with optimized serving.
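The create endpoint call can be sketched as follows. The payload fields mirror the served_entities fields that appear in the endpoint response in this example; api_root, token, and the helper names are placeholders, and the throughput value should be chosen based on the chunk size reported in Step 2.

```python
import json
import urllib.request


def build_endpoint_payload(endpoint_name, model_name, model_version, throughput):
    """Request body for creating a provisioned throughput serving endpoint."""
    entity = {
        "entity_name": model_name,
        "entity_version": str(model_version),
        # Same value for min and max gives a fixed provisioned throughput.
        "min_provisioned_throughput": throughput,
        "max_provisioned_throughput": throughput,
    }
    return {"name": endpoint_name, "config": {"served_entities": [entity]}}


def create_endpoint_request(api_root, token, payload):
    """Build a POST request against the serving-endpoints API."""
    return urllib.request.Request(
        f"{api_root.rstrip('/')}/api/2.0/serving-endpoints",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_endpoint_payload("bge-large", "ml.models.bge-large", 1, 128000)
request = create_endpoint_request("https://<workspace-host>", "<token>", payload)
print(json.dumps(payload, indent=2))
# To send it: urllib.request.urlopen(request)
```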

{
  "name": "bge-large",
  "creator": "ankit.mathur@databricks.com",
  "creation_timestamp": 1706931925000,
  "last_updated_timestamp": 1706931925000,
  "state": {
    "ready": "NOT_READY",
    "config_update": "IN_PROGRESS"
  },
  "pending_config": {
    "start_time": 1706931925000,
    "served_models": [
      {
        "name": "bge-large-1",
        "model_name": "ml.models.bge-large",
        "model_version": "1",
        "workload_size": "Small",
        "workload_type": "GPU_MEDIUM",
        "min_provisioned_throughput": 128000,
        "max_provisioned_throughput": 128000,
        "dbus": 48.0,
        "state": {
          "deployment": "DEPLOYMENT_CREATING",
          "deployment_state_message": "Creating resources for served entity."
        },
        "creator": "ankit.mathur@databricks.com",
        "creation_timestamp": 1706931925000
      }
    ],
    "served_entities": [
      {
        "name": "bge-large-1",
        "entity_name": "ml.models.bge-large",
        "entity_version": "1",
        "workload_size": "Small",
        "workload_type": "GPU_MEDIUM",
        "min_provisioned_throughput": 128000,
        "max_provisioned_throughput": 128000,
        "dbus": 48.0,
        "state": {
          "deployment": "DEPLOYMENT_CREATING",
          "deployment_state_message": "Creating resources for served entity."
        },
        "creator": "ankit.mathur@databricks.com",
        "creation_timestamp": 1706931925000
      }
    ],
    "config_version": 1,
    "traffic_config": {
      "routes": [
        {
          "served_model_name": "bge-large-1",
          "traffic_percentage": 100,
          "served_entity_name": "bge-large-1"
        }
      ]
    }
  },
  "id": "36c5af8e5b0448d3afda0fc071977d5b",
  "permission_level": "CAN_MANAGE",
  "route_optimized": false
}

View your endpoint

To see more information about your endpoint, go to the Serving UI section on the left navigation bar and search for your endpoint name.
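As an alternative to the UI, the endpoint state can be read from the serving-endpoints API. This is a minimal sketch: the GET path and the "READY" value are based on my understanding of the Databricks REST API, api_root and token are placeholders, and the helper names are illustrative.

```python
import urllib.request


def endpoint_status_request(api_root, token, endpoint_name):
    """Build a GET request that returns the endpoint description as JSON."""
    return urllib.request.Request(
        f"{api_root.rstrip('/')}/api/2.0/serving-endpoints/{endpoint_name}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def endpoint_is_ready(endpoint):
    """True when the endpoint description reports a ready state."""
    return endpoint.get("state", {}).get("ready") == "READY"


# State block as it appears right after endpoint creation in this example.
just_created = {"state": {"ready": "NOT_READY", "config_update": "IN_PROGRESS"}}
print(endpoint_is_ready(just_created))
```

Polling this check at a modest interval avoids querying the endpoint before deployment has finished.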

Step 4: Query your endpoint

Once your endpoint is ready, you can query it by making an API request. Depending on the model size and complexity, it can take 30 minutes or more for the endpoint to become ready.
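A query against the endpoint might be built and parsed as in the sketch below. The invocations path and the {"input": [...]} request body are assumptions based on the embeddings-style response shown in this example; api_root, token, and the helper names are placeholders.

```python
import json
import urllib.request


def build_query_request(api_root, token, endpoint_name, texts):
    """Build a POST request that asks the endpoint to embed a list of texts."""
    url = f"{api_root.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations"
    body = json.dumps({"input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_embeddings(response_json):
    """Return one embedding vector per input from an embeddings-style response."""
    return [item["embedding"] for item in response_json["data"]]


# Parsing a shortened response with the same shape as the endpoint output.
sample = {
    "object": "list",
    "data": [{"object": "embedding", "embedding": [-0.25390625, 0.23193359375]}],
}
vectors = extract_embeddings(sample)
print(len(vectors), len(vectors[0]))
```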

{"object": "list", "data": [{"object": "embedding", "embedding": [-0.25390625, 0.23193359375, -0.09454345703125, -0.5009765625, -0.17333984375, 0.2076416015625, 0.78125, -0.01076507568359375, 0.16845703125, 0.18212890625, -0.46826171875, 0.04534912109375, 0.92333984375, -0.8173828125, -0.392822265625, -0.07244873046875, -0.68505859375, 0.03460693359375, -0.6396484375, -0.154541015625, 0.37890625, 0.544921875, -1.1337890625, -0.409423828125, -0.00724029541015625, 0.8037109375, 0.51171875, 0.1689453125, 1.763671875, 0.767578125, -0.4853515625, -0.80859375, 0.467041015625, -1.0390625, -0.46826171875, 0.135986328125, 0.411865234375, -0.1678466796875, -0.60107421875, -0.4599609375, 0.2890625, 0.311767578125, 0.3447265625, -1.03125, -0.82373046875, -0.193603515625, -0.49951171875, -0.1519775390625, 0.1708984375, -0.95654296875, -0.4140625, -0.232421875, 0.61328125, -0.27587890625, -0.43359375, -0.04901123046875, -0.368408203125, 0.3369140625, -0.83447265625, -0.008758544921875, 0.2275390625, 0.44677734375, 0.69140625, -1.232421875, 1.0947265625, 0.163330078125, -0.45458984375, -0.80126953125, -0.0625, -0.07171630859375, -1.015625, 0.568359375, -0.041412353515625, -0.61767578125, -0.64453125, -0.002773284912109375, 0.2174072265625, 0.14306640625, -0.6767578125, -0.09100341796875, 0.39794921875, 0.4501953125, 0.378662109375, 0.787109375, -0.361083984375, -0.9208984375, 1.091796875, -0.29833984375, -0.45947265625, -0.332275390625, -0.0133056640625, 0.390869140625, 0.8525390625, 0.311767578125, 0.978515625, 0.51806640625, -1.1982421875, 0.85595703125, 0.129638671875, 0.191650390625, 0.275634765625, 1.0263671875, -0.3408203125, 0.364013671875, 0.00872802734375, 0.3193359375, 0.256591796875, -0.421142578125, -0.09490966796875, 0.05157470703125, 0.2120361328125, -0.17919921875, 0.77099609375, 0.1553955078125, 0.345947265625, 0.521484375, 0.275390625, 0.736328125, -0.2098388671875, -0.4736328125, 0.325927734375, 0.70703125, 0.51513671875, -1.158203125, -0.459228515625, 
-0.92431640625, -0.447265625, 0.869140625, 0.09857177734375, -0.53564453125, -0.0626220703125, -0.2086181640625, -0.2159423828125, 0.7177734375, -0.0819091796875, -0.16650390625, 0.38134765625, 1.0830078125, 0.59765625, -0.499267578125, 0.335693359375, -0.496826171875, 0.27978515625, 1.9677734375, -0.501953125, 0.0244140625, -0.1849365234375, -1.056640625, -1.1474609375, 0.9892578125, -0.332763671875, -0.043304443359375, 0.7998046875, 0.030853271484375, -0.29345703125, 0.051483154296875, 0.14501953125, 0.349365234375, 0.17626953125, -0.03936767578125, -0.407958984375, -0.137939453125, -0.62548828125, 0.77685546875, 0.422607421875, 0.61865234375, -0.353515625, 0.16552734375, -0.29296875, -0.73828125, -0.340576171875, 0.11248779296875, -0.0665283203125, 0.73876953125, 0.403564453125, 0.5830078125, -0.05023193359375, 0.51611328125, 0.443115234375, 0.8681640625, -0.36328125, -0.57470703125, 0.4658203125, 0.0689697265625, -0.12335205078125, 0.1307373046875, -0.06268310546875, -0.58154296875, -0.06915283203125, 0.2215576171875, -0.27392578125, 0.97314453125, -0.60302734375, 0.60498046875, -0.60205078125, -0.2281494140625, -0.85693359375, -0.06048583984375, -0.2474365234375, -1.0068359375, -0.75, 0.5166015625, -0.09619140625, 0.2498779296875, 0.27783203125, -0.359375, -0.2183837890625, 0.94775390625, -0.27001953125, -0.51318359375, 0.64306640625, -0.03472900390625, 0.544921875, -0.490234375, 0.8798828125, -0.3837890625, -0.371337890625, 0.744140625, 0.086669921875, -0.677734375, 0.34375, -0.1282958984375, 0.724609375, 0.96923828125, -0.056304931640625, 0.046630859375, -0.5830078125, 0.712890625, -0.2413330078125, -0.237060546875, -0.2471923828125, 0.8486328125, 0.04193115234375, 1.107421875, 1.09765625, 0.77734375, 1.560546875, 0.7861328125, 0.0133514404296875, 0.133544921875, 0.023345947265625, -0.26904296875, -0.1492919921875, 0.544921875, 0.20947265625, -0.11663818359375, 0.50537109375, 0.015625, 0.483154296875, 0.828125, -0.91259765625, 0.55859375, 0.35107421875, 
0.2415771484375, -0.92822265625, -0.60302734375, 0.39208984375, 0.734375, -0.326904296875, -0.050018310546875, 0.09307861328125, 0.7705078125, 0.26220703125, -0.0170745849609375, 0.332763671875, 0.67724609375, 0.041412353515625, 0.255126953125, -0.373779296875, -0.358642578125, -1.0234375, -1.1943359375, -0.357177734375, -0.0577392578125, -0.716796875, 0.45263671875, -0.150390625, -0.892578125, 0.353515625, -0.94677734375, -0.50048828125, -0.10595703125, 0.436767578125, 0.7001953125, 1.3388671875, 0.1295166015625, -1.0517578125, 0.42578125, -0.7421875, 0.2259521484375, -0.498779296875, 0.2205810546875, 0.2236328125, -0.1302490234375, -0.0804443359375, 0.29296875, -0.51953125, -0.109130859375, -0.75634765625, -0.033843994140625, 0.162109375, 0.5888671875, -0.388916015625, -0.69091796875, -0.59375, 0.53271484375, 0.1849365234375, -0.46337890625, 0.8876953125, 0.281494140625, -0.96142578125, 0.7158203125, 0.95751953125, -0.40234375, -0.775390625, 1.1708984375, 1.029296875, 0.311279296875, -0.724609375, -0.60791015625, -0.6826171875, 0.5322265625, -0.0256805419921875, -0.413330078125, -0.734375, 0.91455078125, -0.368408203125, -0.84375, 0.254150390625, -0.158203125, -0.97509765625, -0.7001953125, -0.349609375, 0.44140625, 0.321533203125, 0.7626953125, -0.8701171875, -0.32666015625, 0.447021484375, 0.5859375, 0.5185546875, -0.4013671875, 0.5556640625, 0.61669921875, -0.52734375, 0.115234375, 0.3466796875, -0.2958984375, -0.24951171875, -0.2435302734375, 0.272216796875, -0.8232421875, 0.4033203125, 0.31396484375, 0.304443359375, 0.46728515625, -1.66796875, 0.152099609375, 0.53857421875, 0.09918212890625, 0.369873046875, 0.60009765625, 0.459716796875, 0.003688812255859375, -0.50048828125, -0.47265625, -0.060577392578125, 0.212646484375, 0.73291015625, -1.2705078125, 1.34375, -0.08343505859375, -0.434814453125, 0.5478515625, -0.56787109375, -0.476806640625, 0.61865234375, -0.736328125, 0.84375, -0.6630859375, 0.1558837890625, 0.03521728515625, 0.178955078125, 0.5869140625, 
1.40625, 0.46533203125, 0.21875, 0.22265625, -0.103271484375, 0.18408203125, -0.75341796875, 0.36767578125, -0.2490234375, 0.2265625, -0.6240234375, -0.728515625, 0.97607421875, 0.611328125, 0.9833984375, 0.054962158203125, 0.91015625, 0.032684326171875, -0.389892578125, 0.364501953125, 0.78857421875, 0.224853515625, -0.552734375, 0.07110595703125, 0.488037109375, -0.280517578125, -0.2064208984375, 0.06097412109375, -0.51025390625, 0.428955078125, -0.34521484375, 0.12152099609375, -0.6943359375, -0.058013916015625, -0.58837890625, 0.54638671875, -0.85693359375, 0.31005859375, -0.495361328125, -0.1826171875, 0.1824951171875, -0.68115234375, -0.6015625, -0.225830078125, 0.1650390625, 0.4677734375, -0.95263671875, -0.94775390625, -0.393310546875, 0.032379150390625, -0.5556640625, 0.458984375, 0.243896484375, 0.312255859375, 0.3779296875, -1.1865234375, 0.662109375, 0.306396484375, 0.46435546875, 0.01157379150390625, 0.1461181640625, 0.7646484375, -0.11688232421875, 0.1070556640625, 0.72900390625, -0.258056640625, 0.2091064453125, -0.1522216796875, 0.03564453125, -0.1634521484375, -0.08807373046875, -0.26220703125, -0.349853515625, 0.365966796875, 0.489501953125, -0.1112060546875, 0.386474609375, -0.097412109375, 0.62060546875, -0.1444091796875, -1.0283203125, 0.68798828125, 0.114990234375, -0.31201171875, 0.703125, 0.353271484375, -0.449462890625, 0.2467041015625, 0.2744140625, -0.2078857421875, 0.8115234375, -0.1585693359375, 0.288818359375, -0.189208984375, -0.344970703125, 0.10443115234375, -0.408447265625, 0.65673828125, -0.2919921875, 0.06231689453125, -0.385986328125, -1.53125, 0.0115203857421875, 0.0687255859375, -0.53564453125, 0.720703125, -0.123046875, -0.242919921875, -0.463134765625, -0.325927734375, -0.1630859375, -0.267333984375, -0.1082763671875, -0.0230560302734375, -0.36328125, 0.09539794921875, 0.1630859375, -0.4453125, -0.88720703125, 0.041778564453125, -0.0904541015625, 0.10589599609375, -0.148193359375, 0.096435546875, -0.128173828125, 
-0.276611328125, -0.5205078125, -0.3349609375, -0.87939453125, 0.2294921875, -0.432373046875, 0.485595703125, 0.4873046875, 0.1898193359375, -0.56103515625, 0.732421875, -0.0194854736328125, -1.04296875, -0.599609375, 0.64306640625, -0.1448974609375, 0.3349609375, -0.71533203125, -0.34814453125, 0.045806884765625, -0.48876953125, 0.8603515625, -0.59619140625, 0.2166748046875, -0.11627197265625, -1.046875, -0.019134521484375, 0.2481689453125, 1.0478515625, -0.87451171875, 0.267578125, -0.483154296875, -0.54931640625, -1.0966796875, -0.82470703125, 0.060333251953125, 0.2435302734375, -0.1541748046875, 0.86767578125, -0.73779296875, 0.11224365234375, -0.08599853515625, 0.3603515625, -0.398193359375, 1.0380859375, -0.81640625, -0.50244140625, -0.06768798828125, 0.211669921875, 0.133056640625, 0.77587890625, -0.8076171875, 0.1390380859375, -0.4697265625, 0.032501220703125, -0.33447265625, -0.25341796875, -0.33203125, -0.08526611328125, 0.57470703125, -1.025390625, 0.76611328125, -0.07037353515625, 0.79052734375, 0.32568359375, 0.281982421875, -0.256103515625, -0.474853515625, -0.888671875, -1.03125, -0.300537109375, -0.5205078125, 0.13037109375, -0.2012939453125, -0.09332275390625, 0.446044921875, -0.096435546875, 1.3642578125, 0.353515625, 0.8857421875, 0.2381591796875, -0.01514434814453125, -0.468994140625, 0.44921875, -0.779296875, -0.01422119140625, -0.955078125, -0.9560546875, -0.1322021484375, -1.03515625, -0.68701171875, -0.48486328125, 0.10491943359375, 0.77490234375, -0.3828125, 0.86376953125, 0.71875, -0.1578369140625, -1.142578125, 0.283935546875, -0.468505859375, 0.2308349609375, 0.49365234375, -0.0232086181640625, 0.296142578125, 1.3193359375, -0.17626953125, -0.140380859375, -0.4716796875, 1.1201171875, 0.06573486328125, -0.89501953125, -0.09515380859375, 0.45263671875, -1.224609375, -1.0224609375, 0.272705078125, -0.6103515625, 0.247314453125, -0.6044921875, 0.6025390625, -0.77734375, -1.466796875, 0.1248779296875, 0.24560546875, 0.265869140625, 
0.1942138671875, 0.8134765625, 0.498291015625, -0.265380859375, 1.1083984375, 0.79052734375, -0.787109375, -0.03912353515625, -0.2841796875, -0.26025390625, -0.318115234375, -0.0911865234375, 0.351806640625, 0.03173828125, 0.47021484375, 0.73291015625, 0.6103515625, 0.1317138671875, 0.0175018310546875, -0.34716796875, 0.235595703125, -0.411865234375, -1.267578125, -0.83642578125, -0.27685546875, -0.471923828125, 0.60107421875, -0.2191162109375, 0.5380859375, 0.8525390625, 0.433349609375, -0.32080078125, -1.484375, -0.2445068359375, -0.8017578125, -0.5068359375, -0.190185546875, -0.39599609375, -0.31591796875, 0.44189453125, 0.281494140625, -1.0703125, -0.09527587890625, 1.041015625, -0.322998046875, -0.0269012451171875, -0.76708984375, -0.1680908203125, -0.9951171875, -0.261474609375, -0.63525390625, 0.1806640625, 0.767578125, 0.8896484375, -0.290283203125, 0.251220703125, -0.21923828125, 0.525390625, 0.4208984375, 0.2423095703125, -0.82763671875, -0.6513671875, -0.151611328125, 0.02532958984375, -0.541015625, 0.1671142578125, 0.269775390625, -1.046875, 0.1593017578125, 0.08251953125, 0.176025390625, -0.33935546875, 0.333740234375, 0.91357421875, 0.2225341796875, -0.437255859375, -0.6171875, -0.06634521484375, -0.461181640625, 0.10174560546875, 0.003810882568359375, -0.3212890625, -0.4228515625, 0.1484375, 0.09674072265625, 0.5927734375, 0.1478271484375, 0.52685546875, -0.2042236328125, -0.2880859375, -0.86767578125, -0.09112548828125, 0.5849609375, -0.345947265625, -0.82568359375, -0.1763916015625, 0.8583984375, 0.73388671875, -0.36669921875, 0.087890625, 0.11297607421875, -0.8017578125, -0.56884765625, 0.2325439453125, -0.2108154296875, 1.7548828125, 0.295166015625, 0.51025390625, 0.32275390625, -0.64111328125, -0.73095703125, 0.73291015625, -1.1171875, -0.5166015625, 0.466796875, -0.7177734375, 0.053192138671875, 0.18994140625, -0.6748046875, 0.27294921875, 0.3466796875, -0.121337890625, -0.78857421875, 0.129150390625, 0.85009765625, -0.308837890625, 
-0.6845703125, -0.9482421875, 0.199951171875, 0.12200927734375, -0.6513671875, -0.3427734375, 0.054290771484375, 0.487548828125, 0.406005859375, -0.260009765625, -0.0965576171875, -0.1534423828125, 0.46533203125, 0.395751953125, -0.00939178466796875, 0.79443359375, -0.203369140625, 0.2469482421875, -0.1212158203125, 0.37841796875, -1.359375, 0.491943359375, 0.45654296875, 0.8046875, -0.84326171875, 1.2265625, 0.263916015625, -0.1502685546875, 0.67626953125, 0.96337890625, -0.07061767578125, 0.116455078125, -0.3134765625, 0.9033203125, 1.3291015625, 1.1396484375, 0.853515625, -0.08343505859375, 0.66259765625, 0.4169921875, 0.4072265625, -0.205078125, 0.466796875, 0.6259765625, -0.251708984375, 0.36962890625, -0.487060546875, 0.1636962890625, 0.45361328125, 0.29052734375, -0.170654296875, 0.587890625, 0.06591796875, -0.31982421875, -0.299072265625, 0.334228515625, -0.29345703125, -0.60107421875, -0.256103515625, 0.59912109375, 0.09375, 0.517578125, 0.5849609375, 0.4912109375, 0.208740234375, 0.54248046875, -0.240966796875, 0.64599609375, 0.1217041015625, 0.16943359375, -0.56591796875, 0.0506591796875, -0.5830078125, 0.68505859375, -0.339111328125, -0.134033203125, -0.52685546875, 1.302734375, 0.21533203125, -0.5458984375, 0.33740234375, -0.0628662109375, -0.70166015625, -0.78955078125, 0.59619140625, -1.064453125, -0.1253662109375, 0.2294921875, -0.62353515625, 0.11334228515625, 0.66015625, -0.0131072998046875, 0.65283203125, 0.2861328125, 0.650390625, 0.1419677734375, 0.50244140625, 1.150390625, 0.068603515625, 0.1512451171875, 0.3525390625, -0.63134765625, -0.1248779296875, -0.57666015625, -0.93212890625, -0.44287109375, -0.10821533203125, -0.276611328125, 0.23828125, 0.60888671875, 0.181640625, -0.134765625, 0.79736328125, 0.6728515625, -0.705078125, 0.556640625, -0.0579833984375, 0.8994140625, -0.06317138671875, -0.321533203125, 0.1883544921875, -0.4912109375, 0.179931640625, -1.42578125, 0.498046875, 0.290283203125, -0.52001953125, -0.40234375, -0.8251953125, 
-0.0064697265625, 0.0396728515625, -0.353515625, -0.205322265625, 0.03594970703125, 0.43408203125, 0.5615234375, 0.211181640625, -0.6015625, -0.6591796875, 0.43212890625, 0.9013671875, -0.337890625, 0.61083984375, 0.331787109375, 0.70751953125, 0.64697265625, -0.07427978515625, 0.51611328125, -0.228515625, -0.192626953125, 0.442138671875, 0.299560546875, -0.30615234375, -1.3291015625, 0.24462890625, 0.437255859375, -0.302734375, 0.288818359375, -0.0284423828125, 0.25537109375, -1.1328125, -0.67724609375, -0.95751953125, 0.55810546875, 0.051971435546875, -0.751953125, 0.2314453125, -0.943359375, 3.30078125, 1.2431640625, 0.97607421875, -0.12890625, 0.388427734375, 0.890625, 0.348388671875, -0.27978515625, 0.7509765625, -1.6787109375, -0.56005859375, 0.458251953125, 0.37646484375, 0.3955078125, -0.1302490234375, 1.052734375, -0.77587890625, -0.1639404296875, 0.6298828125, -0.6826171875, -1.6787109375, 0.5458984375, 0.50244140625, 0.89599609375, 0.07769775390625, 0.1519775390625, -0.2498779296875, -0.83984375, -0.08544921875, 0.059722900390625, 0.291015625, -1.099609375, 0.0506591796875, -0.41455078125, -0.0628662109375, 0.701171875, -0.08477783203125, -0.416259765625, -0.00249481201171875, 0.26806640625, -0.40576171875, -0.41015625, 0.0305328369140625, -0.47607421875, -0.3359375, 0.900390625, -0.5302734375, -0.53466796875, 0.38720703125, -0.95849609375, 0.496826171875, -0.9658203125, -0.1380615234375, -1.107421875, -0.88330078125, -0.63330078125, -0.5390625, -0.85595703125, -0.57470703125, 0.3271484375, 0.45947265625, 0.490234375, -0.46142578125, -0.1304931640625, -0.76708984375, -0.5654296875, 0.493896484375, -0.2142333984375, -0.34814453125, -0.345458984375, -0.78564453125, -0.130126953125, -0.1981201171875, -0.341064453125, 0.46337890625, 0.814453125, -0.11590576171875, 0.609375, -0.62109375, 0.0665283203125, 0.111328125, -0.07086181640625, -0.119140625, -0.422119140625, 0.330810546875, 0.87353515625, -0.93408203125, -0.58203125, 0.186767578125, 0.23779296875, 
0.9140625, 0.5576171875, -0.05426025390625, 0.164306640625, 0.89501953125], "index": 0}], "usage": {"prompt_tokens": 8, "total_tokens": 8}}
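
The response above is a single JSON object whose tail carries an "index" for each returned vector and a "usage" block with token counts. A minimal sketch of how such a payload could be unpacked follows. The top-level "data" key and the per-item "embedding" key are assumptions based on the common embeddings response layout, since only "index" and "usage" are visible in this part of the output. The helper names `extract_embeddings` and `cosine_similarity` are hypothetical, and the three-element vector is a stand-in, not real model output.

```python
import json
import math


def extract_embeddings(response):
    """Return the embedding vectors ordered by their "index" field.

    Assumes a payload shaped like {"data": [{"embedding": [...], "index": 0}], ...}.
    """
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Stand-in payload with the same outer shape as the response shown above.
sample = json.loads(
    '{"data": [{"embedding": [0.5, -0.25, 1.0], "index": 0}],'
    ' "usage": {"prompt_tokens": 8, "total_tokens": 8}}'
)

vectors = extract_embeddings(sample)
print("vectors returned:", len(vectors))
print("dimensions:", len(vectors[0]))
print("tokens billed:", sample["usage"]["total_tokens"])
```

Sorting by "index" keeps the vectors aligned with the order of the input strings when a request carries more than one input. For the real response, the same helper would be applied to the parsed JSON body instead of `sample`.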