migrate-langchain-dspy(Python)

Migrate LangChain model code to DSPy

In this notebook, you learn how to convert your LangChain model code into DSPy using a LangChain RAG example.

      LangChain RAG example

      The following code is from the official LangChain RAG example (https://python.langchain.com/v0.2/docs/tutorials/rag/). It builds a vector store from a blog page to serve as the retriever and uses a one-hop language model to answer article-based questions. The code first retrieves the Databricks secret that contains the OpenAI credentials. See the Databricks secrets (AWS | Azure) documentation for more information about retrieving keys.

      Convert your LangChain code to DSPy

      The following is the code to convert to DSPy:

      rag_chain = (
          {"context": retriever | format_docs, "question": RunnablePassthrough()}
          | prompt
          | llm
          | StrOutputParser()
      )
      

      This standard RAG chain consists of 3 parts:

      1. A prompt template that organizes the LLM query.
      2. A retriever module that fetches the relevant context.
      3. An LLM to answer the query.

      The following sections show how to convert each of these parts into DSPy:

      1. The prompt template is converted to a dspy.Signature.
      2. The retriever can be as simple as a callable, or it can use a retrieval model configured globally in DSPy together with the dspy.Retrieve module.
      3. The LLM call can be wrapped in a dspy.Module. You can use a basic dspy.Predict or a more advanced module such as dspy.ChainOfThought.

      Convert prompt template to DSPy signature

      A DSPy signature is an abstraction of a prompt that consists of 3 components:

      1. Input fields to define the input.
      2. Output fields to define the output.
      3. Instruction to help the language model understand expected behavior.

      These components are combined into the actual prompt sent to the language model. The DSPy signature does not contain every piece of information included in the actual prompt sent to the language model. For example, signatures do not include information regarding "training" (optimizer.compile() calls). See the DSPy signature documentation for more information.

      The easiest way to construct the input and output fields in a dspy.Signature is to write them in the string syntax "{input_field_name_1}, {input_field_name_2}, ... -> {output_field_name_1}, {output_field_name_2}..." and pass that string to the dspy.signatures.make_signature utility function. The instructions argument is optional, but it can help the LLM perform better.

      The following creates your signature and instructions for LLM calls. For this RAG use case, you can simply write instructions="Answer the question based on context".

        Define the DSPy program

        Next, you can convert the LangChain chain into a DSPy program. Both are callables, expressed in different frameworks. Similar to writing a PyTorch model, you can write a DSPy program in two parts:

        1. Define all submodules of your program inside the __init__() method.
        2. Implement the forward() method to define prediction / inference logic.

        Let's take a look at your code!

        The following directly passes the vector store-based LangChain retriever to DSPy as the retriever and wraps the language model calls in dspy.ChainOfThought along with the signature you previously created.

        Now you can tell DSPy which language model you want to use, and instantiate your RAG application that is now rewritten in DSPy format.

        Now that you have all the components together, call your converted DSPy program for RAG!

        Summary

        You have successfully converted your LangChain LCEL chain into a DSPy program. To recap:

        1. You created a class that subclasses dspy.Module to represent your program.
        2. You defined a dspy.ChainOfThought module to generate responses using a configured LLM.
        3. You translated other parts of the LangChain LCEL chain into Python function calls within the DSPy forward() method. For example, self.retriever() is directly called inside the forward() method, and the retrieved documents are formatted. This is equivalent to the "context": retriever | format_docs part of the LCEL chain.
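
The equivalence in step 3 can be illustrated in plain Python, without either library. The Doc class and the fixed retriever below are hypothetical stand-ins for a LangChain Document and a vector store retriever.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str


def retriever(question):
    """Hypothetical retriever that returns fixed documents."""
    return [
        Doc("DSPy programs are modules."),
        Doc("Signatures describe inputs and outputs."),
    ]


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


def build_inputs(question):
    # LCEL: {"context": retriever | format_docs, "question": RunnablePassthrough()}
    # The pipe becomes a nested call; the passthrough becomes the argument itself.
    return {"context": format_docs(retriever(question)), "question": question}


inputs = build_inputs("What is a DSPy signature?")
```

Each pipe in the LCEL expression turns into an ordinary function call, which is why the same translation carries over to chains that are not RAG pipelines.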

        These 3 steps apply to other LCEL chains, not just RAG use cases.

        If you are only interested in the conversion part, then that's it! To use DSPy to improve the performance of your application, see how to optimize a migrated RAG chain.