
Optimize your migrated DSPy program

After you migrate your LangChain code to DSPy, you can improve your application's quality with DSPy's automatic prompt optimization. To demonstrate, this notebook walks you through the training phase and compares the optimized RAG program's performance against the original LangChain LCEL chain.

To optimize a DSPy program, you need:

  1. A scoring function or metric that measures how your program performs.
  2. A small training set (trainset) of examples that supervises the automatic prompt engineering.
  3. A small validation set (valset) of examples that validates the optimization result.

This notebook assumes you have migrated your LangChain code to DSPy.

Prepare training dataset

This RAG example is built on top of the article https://lilianweng.github.io/posts/2023-06-23-agent/. You can manually select a few questions that the article can answer and write the answers yourself. Or, if your goal is to save cost by using a small language model (LM) to match the performance of a large LM, you can have the large LM generate the question-label pairs for you.

The notebook prepares six question-answer pairs and converts them to a DSPy dataset. To build a DSPy dataset, wrap each question-answer pair in a dspy.Example and specify the name of the input field: question.
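
As a minimal sketch of the conversion pattern (the two question-answer pairs below are illustrative placeholders, not the notebook's actual six pairs):

```python
import dspy

# Illustrative question-answer pairs about the article (hypothetical; replace
# with the pairs you prepared).
qa_pairs = [
    {
        "question": "What are the three main components of an LLM-powered autonomous agent?",
        "answer": "Planning, memory, and tool use.",
    },
    {
        "question": "What is task decomposition used for?",
        "answer": "Breaking a complex task into smaller, manageable subtasks.",
    },
]

# Wrap each pair in a dspy.Example and mark `question` as the input field.
trainset = [
    dspy.Example(question=pair["question"], answer=pair["answer"]).with_inputs("question")
    for pair in qa_pairs
]
```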

Define metrics

Now that you have a training set, you can define how to score your program's performance.

A common way to evaluate a RAG application is to use an LLM to judge its responses against specific criteria. This example uses gpt-4o-mini as the judge:

  • If the answer is faithful to the retrieved document context (that is, it contains no hallucinations), faithfulness is 1; otherwise it is 0.
  • If the answer is correct, correctness is 1; otherwise it is 0.

The total score is the sum of faithfulness and correctness, which can be 0, 1, or 2.

Next, define an LLM-judge evaluation function, metric(), that scores your RAG application.
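
The sketch below shows one way to write such a judge. It assumes the migrated RAG program's prediction exposes answer and context fields and that each example carries a reference answer; the signature and field names are illustrative, not the notebook's exact code:

```python
import dspy

# Separate judge LM so scoring does not depend on the program's own LM.
judge_lm = dspy.LM("openai/gpt-4o-mini")

class Assess(dspy.Signature):
    """Judge a predicted answer for faithfulness and correctness."""

    context = dspy.InputField(desc="Retrieved document context")
    question = dspy.InputField()
    answer = dspy.InputField(desc="Predicted answer")
    gold_answer = dspy.InputField(desc="Reference answer")
    faithful = dspy.OutputField(desc="True if the answer is grounded in the context, else False")
    correct = dspy.OutputField(desc="True if the answer matches the reference answer, else False")

def metric(example, pred, trace=None):
    # Run the judge under the judge LM instead of the default LM.
    with dspy.settings.context(lm=judge_lm):
        assessment = dspy.Predict(Assess)(
            context=pred.context,
            question=example.question,
            answer=pred.answer,
            gold_answer=example.answer,
        )
    faithfulness = int(str(assessment.faithful).strip().lower().startswith("true"))
    correctness = int(str(assessment.correct).strip().lower().startswith("true"))
    # Total score: 0, 1, or 2.
    return faithfulness + correctness
```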

Optimize your DSPy program with DSPy Optimizers

Now that you have your training dataset and metric, the last step is to put the pieces together. Similar to PyTorch training, you create an optimizer that manages the optimization process. This example uses dspy.teleprompt.BootstrapFewShotWithRandomSearch as the optimizer. See the DSPy optimizer guide for more information.
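
A minimal configuration might look like the following; the demo and candidate counts are illustrative, not the notebook's exact values:

```python
from dspy.teleprompt import BootstrapFewShotWithRandomSearch

# The optimizer bootstraps few-shot demonstrations from the trainset and
# randomly searches over candidate programs, scoring each with metric().
optimizer = BootstrapFewShotWithRandomSearch(
    metric=metric,
    max_bootstrapped_demos=2,   # demos generated by running the program itself
    max_labeled_demos=2,        # demos copied directly from the trainset
    num_candidate_programs=4,   # number of candidate programs to evaluate
)
```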

Call the compile() method to kick off training.
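
Assuming the migrated program lives in a variable named rag (a hypothetical name) and your examples are split into trainset and valset, the call looks roughly like this:

```python
# Compile (optimize) the program: bootstrap demos on the trainset and select
# the best candidate using metric() scores on the valset.
optimized_rag = optimizer.compile(rag, trainset=trainset, valset=valset)
```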

Finally, test the optimized RAG application and compare it with the original LCEL chain. Because text generation is randomized, the output can vary from call to call, but the optimized RAG application's output is generally more concise and informative.
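
For example, assuming the original chain was kept in a variable named lcel_chain (a hypothetical name), a side-by-side check might look like:

```python
question = "What is task decomposition?"

# Optimized DSPy RAG program.
print(optimized_rag(question=question).answer)

# Original LangChain LCEL chain, for comparison.
print(lcel_chain.invoke(question))
```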

Congratulations! Now you have an optimized RAG application. For more DSPy examples and tutorials, see the DSPy documentation.