Mosaic AI Agent Evaluation example
This example shows how to call and test Agent Evaluation on previously generated outputs. The call returns a dataframe of evaluation scores computed by the LLM judges that are part of Agent Evaluation.
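The code for this example did not survive on this page, so what follows is a minimal sketch rather than the original notebook cell. It assumes a Databricks environment with the `mlflow` and `databricks-agents` packages installed. The two sample records and the helper name `run_agent_evaluation` are illustrative and are not taken from the source.

```python
# Two previously generated request/response pairs.
# The contents are illustrative sample data, not from the original example.
EVAL_RECORDS = [
    {
        "request": "What is MLflow?",
        "response": "MLflow is an open source platform for managing the machine learning lifecycle.",
        "expected_response": "MLflow is an open source platform for managing the machine learning lifecycle.",
    },
    {
        "request": "What is Apache Spark?",
        "response": "Apache Spark is a database.",
        "expected_response": "Apache Spark is an open source engine for large-scale data processing.",
    },
]


def run_agent_evaluation(records):
    """Score pre-generated outputs with the Agent Evaluation LLM judges.

    Hypothetical helper. Assumes it runs on Databricks with `mlflow` and
    `databricks-agents` installed; the imports are deferred so the sample
    data above can be inspected without those packages.
    """
    import mlflow
    import pandas as pd

    # No model is passed: the "response" column already holds the outputs
    # to be judged, so only the evaluation step runs.
    result = mlflow.evaluate(
        data=pd.DataFrame(records),
        model_type="databricks-agent",
    )
    # Per-row judge scores are returned as a dataframe.
    return result.tables["eval_results"]
```

In a Databricks notebook, the scores could then be shown with `display(run_agent_evaluation(EVAL_RECORDS))`. Installing the packages first (for example with `%pip install mlflow databricks-agents`) is assumed.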