synthetic-evals-notebook (Python)


Synthesize evaluations from documents

This notebook shows how to synthesize evaluations for an agent that uses document retrieval. It uses the generate_evals_df method from the databricks-agents Python package.
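If the package is not already available in the notebook environment, a typical install cell looks like the following sketch (this assumes a Databricks notebook, where %pip and dbutils are available):

%pip install -U databricks-agents
dbutils.library.restartPython()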


Documentation

The API is shown below. For more details, see the documentation (AWS | Azure).

API:

def generate_evals_df(
    docs: Union[pd.DataFrame, "pyspark.sql.DataFrame"],  # noqa: F821
    *,
    num_evals: int,
    agent_description: Optional[str] = None,
    question_guidelines: Optional[str] = None,
) -> pd.DataFrame:
    """
    Generate an evaluation dataset with questions and expected answers.
    The generated evaluation set can be used with Databricks Agent Evaluation:
    AWS: (https://docs.databricks.com/en/generative-ai/agent-evaluation/evaluate-agent.html)
    Azure: (https://learn.microsoft.com/azure/databricks/generative-ai/agent-evaluation/evaluate-agent).

    :param docs: A pandas/Spark DataFrame with a text column `content` and a `doc_uri` column.
    :param num_evals: The number of questions (and corresponding answers) to generate in total.
    :param agent_description: Optional, a task description of the agent.
    :param question_guidelines: Optional guidelines to help guide the synthetic question generation. This is a free-form string that will
        be used to prompt the generation. The string can be formatted in markdown and may include sections like:
        - User Personas: Types of users the agent should support
        - Example Questions: Sample questions to guide generation
        - Additional Guidelines: Extra rules or requirements
    """

The following code block synthesizes evaluations from a DataFrame of documents.

  • The input can be a pandas DataFrame or a Spark DataFrame.
  • The output DataFrame can be directly used with mlflow.evaluate().
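The sketch below is a minimal, illustrative example: the sample documents, agent description, and question guidelines are placeholders, and the import path databricks.agents.evals is assumed.

import pandas as pd
from databricks.agents.evals import generate_evals_df

# Placeholder documents; in practice these would come from your own corpus
# (for example, a Delta table of parsed documents read with Spark).
docs = pd.DataFrame(
    {
        "content": [
            "Agent Evaluation helps you measure the quality of agentic applications.",
            "generate_evals_df synthesizes questions and expected answers from your documents.",
        ],
        "doc_uri": [
            "docs/agent-evaluation.md",
            "docs/synthesize-evals.md",
        ],
    }
)

agent_description = """
The agent is a RAG chatbot that answers questions about Databricks Agent Evaluation
using the supplied documentation.
"""

question_guidelines = """
# User personas
- A developer who is new to Agent Evaluation
- An experienced data scientist tuning an existing agent

# Example questions
- What is Agent Evaluation?
- How do I generate an evaluation set from my documents?

# Additional Guidelines
- Questions should be succinct and phrased the way a real user would ask them.
"""

evals = generate_evals_df(
    docs,
    num_evals=10,
    agent_description=agent_description,
    question_guidelines=question_guidelines,
)
display(evals)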
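As a follow-up sketch, the synthesized evaluation set can then be passed to mlflow.evaluate against a deployed agent; the endpoint name below is a placeholder.

import mlflow

# "endpoints:/my-agent-endpoint" is a placeholder; substitute the serving endpoint
# of your deployed agent, or pass a Python callable that wraps your agent instead.
results = mlflow.evaluate(
    data=evals,
    model="endpoints:/my-agent-endpoint",
    model_type="databricks-agent",  # run the Databricks Agent Evaluation judges
)
print(results.metrics)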