bert-batch-inference-ai-query(Python)

Batch inference using a BERT model for named entity recognition

This notebook demonstrates how to do the following tasks:

  • Build a pyfunc model encapsulating a BERT language model for named entity recognition (NER).
  • Deploy the pyfunc model to a Mosaic AI Model Serving endpoint.
  • Perform batch inference using the ai_query function on the Mosaic AI Model Serving endpoint.

To test the model before deploying, run this notebook on a cluster with a GPU.

Download and import libraries

Download and install the latest versions of torch, torchvision, transformers, and mlflow.
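The original install cells are not preserved in this export; a minimal equivalent, assuming no pinned versions, would be:

```python
%pip install --upgrade torch torchvision transformers mlflow
```

```python
# Run in a separate cell: restart Python so the upgraded packages are picked up.
dbutils.library.restartPython()
```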

Set registry for the model

The following sets the model registry to use the Unity Catalog model registry.
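With MLflow this is a one-liner:

```python
import mlflow

# Point the MLflow client at the Unity Catalog model registry.
mlflow.set_registry_uri("databricks-uc")
```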

Define PyFunc to load and create pipeline

Define an MLflow pyfunc to take input text and return the NER results from the model.
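A minimal sketch of such a wrapper follows; the exact checkpoint is not preserved in this export, so `dslim/bert-base-NER` is an assumption:

```python
import mlflow
import torch
from transformers import pipeline


class NERModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Build a token-classification pipeline, on GPU when one is available.
        device = 0 if torch.cuda.is_available() else -1
        self.pipeline = pipeline("ner", model="dslim/bert-base-NER", device=device)

    def predict(self, context, model_input):
        # model_input is a pandas DataFrame with a "text" column; return one
        # dict of NER results per row, matching the sample output further down.
        return [{"ner": self.pipeline(text)} for text in model_input["text"]]
```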

Test the BERT model
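Instantiate the wrapper directly and run it on a couple of sentences; the inputs below are reconstructed to match the sample endpoint output shown later:

```python
import pandas as pd

test_df = pd.DataFrame(
    {
        "text": [
            "My name is Wolfgang and I live in Berlin",
            "My name is Colton and I'm at Databricks",
        ]
    }
)

ner_model = NERModel()
ner_model.load_context(None)  # No MLflow context is needed for a local test.
ner_model.predict(None, test_df)
```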

Register the BERT model

First, create an MLflow signature to tell MLflow what inputs and outputs the model expects.
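For example, derive the signature from the test inputs and outputs with `infer_signature` (depending on your MLflow version, deeply nested outputs may need a simplified schema):

```python
from mlflow.models.signature import infer_signature

signature = infer_signature(test_df, ner_model.predict(None, test_df))
```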

Create the dependencies list so that your endpoint has all of the necessary libraries at runtime.
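One way to do this is to pin the versions installed on the cluster:

```python
import mlflow
import torch
import torchvision
import transformers

# Pin the endpoint's dependencies to the versions used in this notebook.
pip_requirements = [
    f"torch=={torch.__version__}",
    f"torchvision=={torchvision.__version__}",
    f"transformers=={transformers.__version__}",
    f"mlflow=={mlflow.__version__}",
]
```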

Register the model in the model registry
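A sketch of the logging call; the three-level Unity Catalog name is a placeholder to adjust for your own catalog and schema:

```python
with mlflow.start_run():
    model_info = mlflow.pyfunc.log_model(
        artifact_path="ner_model",
        python_model=NERModel(),
        signature=signature,
        pip_requirements=pip_requirements,
        registered_model_name="main.default.bert_ner",  # hypothetical UC name
    )
```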

Deploy model to Mosaic AI Model Serving endpoint

Use the Databricks Python SDK to create the model serving endpoint. Building the container and bringing it online can take some time, so the timeout is set to 30 minutes.
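A sketch with the SDK; the endpoint name `ner` is inferred from the sample response below (`served_model_name='ner'`), and the workload settings are assumptions:

```python
from datetime import timedelta

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import (
    EndpointCoreConfigInput,
    ServedEntityInput,
)

w = WorkspaceClient()

w.serving_endpoints.create_and_wait(
    name="ner",
    config=EndpointCoreConfigInput(
        served_entities=[
            ServedEntityInput(
                entity_name="main.default.bert_ner",  # registered model from above
                entity_version="1",
                workload_size="Small",
                workload_type="GPU_SMALL",  # serve on a small GPU
                scale_to_zero_enabled=True,
            )
        ]
    ),
    timeout=timedelta(minutes=30),  # container build and startup can be slow
)
```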

Now, you can run a test query.
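For example, query the endpoint with the same two sentences used earlier:

```python
w.serving_endpoints.query(
    name="ner",
    dataframe_records=[
        {"text": "My name is Wolfgang and I live in Berlin"},
        {"text": "My name is Colton and I'm at Databricks"},
    ],
)
```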

```
QueryEndpointResponse(choices=[], created=None, data=[], id=None, model=None, object=None, predictions=[{'ner': [{'entity': 'B-PER', 'score': 0.9939519762992859, 'index': 4, 'word': 'wolfgang', 'start': 11, 'end': 19}, {'entity': 'B-LOC', 'score': 0.9978950023651123, 'index': 9, 'word': 'berlin', 'start': 34, 'end': 40}]}, {'ner': [{'entity': 'B-PER', 'score': 0.995381772518158, 'index': 4, 'word': 'colton', 'start': 11, 'end': 17}, {'entity': 'B-ORG', 'score': 0.9861130118370056, 'index': 9, 'word': 'data', 'start': 29, 'end': 33}, {'entity': 'I-ORG', 'score': 0.9936231970787048, 'index': 10, 'word': '##brick', 'start': 33, 'end': 38}, {'entity': 'I-ORG', 'score': 0.9867315888404846, 'index': 11, 'word': '##s', 'start': 38, 'end': 39}]}], served_model_name='ner', usage=None)
```

Batch inference using ai_query

To run inference over many records at a time, you can perform batch inference with the ai_query function.
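A sketch of the batch inference query; the source table `main.default.ner_text` and its `text` column are assumptions for illustration:

```python
result_df = spark.sql(
    """
    SELECT
      text,
      -- On some runtime versions a custom model may also need an explicit returnType.
      ai_query('ner', text) AS ner
    FROM main.default.ner_text
    LIMIT 1000
    """
)
```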

Call the display function to execute the Spark code above.
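Because Spark evaluates the query lazily, inference runs when the result is displayed:

```python
display(result_df)
```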

Now you have performed batch inference for 1000 data points on a BERT NER model serving endpoint!