Load the required libraries
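A typical dependency set for the metrics computed below might look like this (an assumed package list, not the demo's pinned requirements: `textstat` for readability, `evaluate`/`transformers` for toxicity and perplexity):

```shell
# Assumed dependencies for the text-evaluation metrics below;
# adjust versions/packages to match your environment.
pip install textstat evaluate transformers torch
```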
Define the JSON path to extract the input and output values
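As a minimal sketch of what these JSON paths select, here is the equivalent extraction in plain Python. The payload shapes are hypothetical (chat-completion style); the real request/response schemas depend on your serving endpoint:

```python
import json

# Hypothetical request/response payloads as stored in the inference table.
request = json.dumps({"messages": [{"role": "user", "content": "What is MLflow?"}]})
response = json.dumps({"choices": [{"message": {"content": "MLflow is an open-source ML platform."}}]})

# JSON paths you would hand to Spark when unpacking the table, e.g.:
#   input:  $.messages[0].content
#   output: $.choices[0].message.content
input_text = json.loads(request)["messages"][0]["content"]
output_text = json.loads(response)["choices"][0]["message"]["content"]
```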
Compute the input/output text evaluation metrics (e.g., toxicity, perplexity, readability)
Now that our input and output are unpacked and available as strings, we can compute their metrics. These will be analyzed by Lakehouse Monitoring so that we can understand how they change over time.
Feel free to add your own custom evaluation metrics here.
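For illustration, here is a stdlib-only sketch of one such metric, a Flesch reading ease approximation. Toxicity and perplexity would typically come from pretrained models (e.g., via Hugging Face `evaluate`), and a library like `textstat` computes readability more carefully; this crude version just shows the shape of a custom metric function:

```python
import re

def _count_syllables(word: str) -> int:
    # Crude vowel-group heuristic; dedicated libraries are more accurate.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    # Flesch formula: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(_count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (syllables / n_words)
```

Any function with this signature (text in, number out) can be applied to the unpacked input/output columns, e.g. through a pandas UDF.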
We can now incrementally consume new payloads from the inference table, unpack them, compute metrics, and save them to our final processed table:
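In the actual pipeline this incremental consumption is done with Spark Structured Streaming (`spark.readStream.table(...)` into `writeStream` with a checkpoint location). The stdlib sketch below only illustrates the pattern, with a dictionary standing in for the streaming checkpoint and a trivial word-count metric standing in for the real ones:

```python
# Stand-ins for the inference table, the processed table, and the checkpoint.
inference_table = [
    {"id": 1, "output": "short answer"},
    {"id": 2, "output": "a much longer generated answer"},
]
processed_table = []
checkpoint = {"last_id": 0}  # Structured Streaming tracks this for you

def process_new_payloads():
    # Only pick up rows that arrived since the last checkpoint.
    new_rows = [r for r in inference_table if r["id"] > checkpoint["last_id"]]
    for row in new_rows:
        metrics = {"output_length": len(row["output"].split())}  # plug real metrics here
        processed_table.append({**row, **metrics})
        checkpoint["last_id"] = row["id"]

process_new_payloads()  # processes both rows
process_new_payloads()  # no-op: nothing new since the checkpoint
```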
Our table is now monitored
Databricks Lakehouse Monitoring automatically builds a dashboard to track your metrics and their evolution over time.
You can leverage your metrics table to track your LLM's behavior over time, and set up alerts to detect potential changes in model perplexity or toxicity.