Synthesize evaluations from documents
This notebook shows how you can synthesize evaluations for an agent that uses document retrieval. It uses the `generate_evals_df` method that is part of the `databricks-agents` Python package.
Documentation
The API is shown below. For more details, see the documentation (AWS | Azure).
API:
def generate_evals_df(
    docs: Union[pd.DataFrame, "pyspark.sql.DataFrame"],  # noqa: F821
    *,
    num_evals: int,
    agent_description: Optional[str] = None,
    question_guidelines: Optional[str] = None,
) -> pd.DataFrame:
    """
    Generate an evaluation dataset with questions and expected answers.
    The generated evaluation set can be used with Databricks Agent Evaluation
    (AWS: https://docs.databricks.com/en/generative-ai/agent-evaluation/evaluate-agent.html,
    Azure: https://learn.microsoft.com/azure/databricks/generative-ai/agent-evaluation/evaluate-agent).

    :param docs: A pandas/Spark DataFrame with a text column `content` and a `doc_uri` column.
    :param num_evals: The number of questions (and corresponding answers) to generate in total.
    :param agent_description: Optional, a task description of the agent.
    :param question_guidelines: Optional guidelines to help guide the synthetic question generation.
        This is a free-form string that will be used to prompt the generation. The string can be
        formatted in markdown and may include sections like:
        - User Personas: Types of users the agent should support
        - Example Questions: Sample questions to guide generation
        - Additional Guidelines: Extra rules or requirements
    """
The following code block synthesizes evaluations from a DataFrame of documents.
- The input can be a pandas DataFrame or a Spark DataFrame.
- The output DataFrame can be used directly with `mlflow.evaluate()` (see the second sketch below).
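A minimal sketch of such a call, assuming `generate_evals_df` is imported from `databricks.agents.evals` and using a small illustrative pandas DataFrame (the `content` text and `doc_uri` values below are placeholders):

```python
import pandas as pd

# Assumed import path for the databricks-agents package.
from databricks.agents.evals import generate_evals_df

# Illustrative documents; in practice, load your own corpus
# (for example, from a Delta table or parsed PDFs).
docs = pd.DataFrame(
    {
        "content": [
            "MLflow Tracking lets you log parameters, metrics, and artifacts for each run.",
            "Delta Lake adds ACID transactions and schema enforcement to data lakes.",
        ],
        "doc_uri": [
            "https://example.com/docs/mlflow-tracking",  # placeholder URIs
            "https://example.com/docs/delta-lake",
        ],
    }
)

agent_description = """
The agent is a RAG chatbot that answers questions about the Databricks platform
using the supplied documentation.
"""

question_guidelines = """
# User personas
- A data engineer new to Databricks
- An experienced ML practitioner

# Example questions
- How do I log metrics with MLflow?
- What guarantees does Delta Lake provide?
"""

evals = generate_evals_df(
    docs,
    num_evals=10,
    agent_description=agent_description,
    question_guidelines=question_guidelines,
)

# `display` is the Databricks notebook helper; use `evals.head()` outside a notebook.
display(evals)
```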
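To evaluate an agent against the synthesized set, the output DataFrame can be passed to `mlflow.evaluate()`. A sketch, assuming `my_agent` is a hypothetical callable (or logged model URI) for the agent under test and `evals` is the DataFrame produced above:

```python
import mlflow

# Run Databricks Agent Evaluation over the synthesized evaluation set.
results = mlflow.evaluate(
    model=my_agent,                 # hypothetical agent callable or model URI
    data=evals,                     # DataFrame returned by generate_evals_df
    model_type="databricks-agent",  # selects the Agent Evaluation judges
)

# Per-row judge results; aggregate metrics are available in `results.metrics`.
display(results.tables["eval_results"])
```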