PySpark DataFrame Loader and MLflow in LangChain

This notebook showcases the integration between PySpark and LangChain. It shows how to:

  1. Create a LangChain document loader from a PySpark DataFrame
  2. Create a LangChain RetrievalQA chain that uses the documents from that loader
  3. Use MLflow to log and reload the RetrievalQA chain

Requirements

  • Databricks Runtime 13.3 ML and above
  • MLflow 2.5 and above
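
One way to confirm the MLflow requirement before running the rest of the notebook is to compare the installed version against the minimum. The sketch below uses only the standard library. The helper names (parse_version, meets_minimum, check_package) are illustrative, and the parser only understands plain dotted versions, ignoring any suffix such as "rc1" or "+db1".

```python
from importlib import metadata


def parse_version(version):
    """Turn '2.5.0' into (2, 5, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted versions after padding them to the same length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_package(name, minimum):
    """Describe whether an installed package satisfies a minimum version."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return f"{name}: not installed (need >= {minimum})"
    status = "OK" if meets_minimum(installed, minimum) else "too old"
    return f"{name}: {installed} ({status}, need >= {minimum})"


print(check_package("mlflow", "2.5"))
```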

Install dependencies

%pip install --upgrade langchain faiss-cpu mlflow

# For GPU clusters, use the following instead:
# %pip install --upgrade langchain faiss-gpu mlflow
Note: you may need to restart the kernel using dbutils.library.restartPython() to use updated packages.
dbutils.library.restartPython()

Creating a RetrievalQA chain with PySpark-based document loading

Let's use the Wikipedia datasets within /databricks-datasets/. In the following cell, add your OpenAI API key.

import os

os.environ["OPENAI_API_KEY"] = ""
number_of_articles = 20

wikipedia_dataframe = spark.read.parquet("/databricks-datasets/wikipedia-datasets/data-001/en_wikipedia/articles-only-parquet/*").limit(number_of_articles)
display(wikipedia_dataframe)
 
# | title | id | revisionId | revisionTimestamp | revisionUsername | revisionUsernameId | text
1 | History of physics | 13758 | 679701588 | 2015-09-06T07:20:54.000+0000 | Thony C. | 3940951 | [[File:Newtons cradle animation book 2.gif|thumb|"If I have seen further, it is only by standing on the shoulders of giants." &ndash;&nbsp;[[Isaac Newton]]&#8201;<ref>Letter to [[Robert Hooke]] (15 February 1676 by Gregorian reckonings with January 1 as New Year's Day). equivalent to 5 February 1675 using the [[Julian calendar]] with March 25 as New Year's Day</ref>]] [[Physics]] (from the [[Ancient Greek]] φύσις ''[[physis]]'' meaning "[[nature]]") is the fundamental branch of [[science]] that...
2 | Hydrofoil | 13761 | 683250181 | 2015-09-29T02:46:44.000+0000 | Fountains of Bryn Mawr | 2700175 | {{about| Hydrofoils|other types of foil|Foil (fluid mechanics)}} {{Use dmy dates|date=July 2012}} [[File:Carl XCH-4.jpg|thumb|The [[United States Navy|U.S. Navy's]] ''XCH-4'', with hydrofoils clearly lifting the hull out of the water]] A '''hydrofoil''' is a lifting surface, or [[foil (fluid mechanics)|foil]], which operates in water. They are similar in appearance and purpose to [[aerofoil]]s used by [[aeroplane]]s. [[Boat]]s using hydrofoil technology are also simply termed hydrofoils. As spee...
3 | Henri Chopin | 13763 | 679361892 | 2015-09-04T02:58:58.000+0000 | Ser Amantio di Nicolao | 753665 | {{Use dmy dates|date=December 2013}} {{Refimprove|date=January 2011}} {{See also|Chopin (disambiguation)}} '''Henri Chopin''' (18 June 1922 – 3 January 2008) was an avant-garde poet and musician. ==Life== Henri Chopin was born in Paris,18 June 1922, one of three brothers, and the son of an accountant. Both his siblings died during the war. One was shot by a German soldier the day after an armistice was declared in Paris, the other while sabotaging a train (Acquaviva 2008). Chopin was a French ...
4 | Hassium | 13764 | 683759652 | 2015-10-02T09:21:19.000+0000 | Double sharp | 10274643 | {{infobox hassium}} '''Hassium''' is a [[chemical element]] with symbol '''Hs''' and [[atomic number]] 108, named after the German state of [[Hesse]]. It is a [[synthetic element]] (an element that can be created in a laboratory but is not found in nature) and [[radioactive]]; the most stable known [[isotope]], <sup>269</sup>Hs, has a [[half-life]] of approximately 9.7&nbsp;seconds, although an unconfirmed [[metastable state]], <sup>277m</sup>Hs, may have a longer half-life of about 11&nbsp;minu...
5 | Hydrus | 13768 | 681684774 | 2015-09-18T20:20:20.000+0000 | Marcocapelle | 14965160 | {{distinguish2|[[Hydra (constellation)]]. For other uses, see [[Hydrus (disambiguation)]]}} {{featured article}} {{Use dmy dates|date=July 2012}} {{Infobox Constellation | name = Hydrus | abbreviation = Hyi | genitive = Hydri | pronounce = {{IPAc-en|ˈ|h|aɪ|d|r|ə|s}}, genitive {{IPAc-en|ˈ|h|aɪ|d|r|aɪ}} | symbolism = the water snake | RA = {{RA|00|06.1}} to {{RA|04|35.1}}<ref name="boundary"/> | dec= −57.85° to −82.06°<ref name="boundary"/> | family = [[Bayer Family|Bayer]] | quadrant = SQ1 | a...
6 | Hercules | 13770 | 681213961 | 2015-09-15T21:36:46.000+0000 | Djkeddie | 1884088 | {{About|Hercules in classical mythology|the Greek divine hero from which Hercules was adapted|Heracles|other uses|Hercules (disambiguation)}} {{pp-semi-indef|small=yes}}{{Infobox deity | type = Roman | name = Hercules | image = Pieter paul rubens, ercole e i leone nemeo, 02.JPG | image_size = | alt = | birth_place = | death_place = | caption = ''Hercules fighting the Nemean lion''{{br}}by [[Peter Paul Rubens]] | god_of = | abode = | symbol = | consort = [[Juventas]] | parents = [[Jupiter (my...
7 | History of Poland | 13772 | 681480608 | 2015-09-17T13:31:08.000+0000 | Orczar | 704319 | {{History of Poland}} The '''history of [[Poland]]''' results from the [[Poland in the Early Middle Ages|migrations of Slavs]] who established permanent settlements on the [[geography of Poland|Polish lands]] during the [[Early Middle Ages]]. In 966 AD, Duke [[Mieszko I]] of the [[Piast dynasty]] [[Baptism of Poland|adopted Western Christianity]]; in 1025 Mieszko's son [[Bolesław I Chrobry]] formally established a [[High Middle Ages|medieval kingdom]]. The period of the [[Jagiellonian dynasty]] ...
8 | Hradčany | 13773 | 670738476 | 2015-07-09T22:01:34.000+0000 | DagErlingSmørgrav | 904663 | {{For|other meanings|Hradčany (disambiguation)}} {{See also|Prague Castle}} [[Image:Hradcany2.jpg|thumb|Hradčany from the Petřín Tower]] '''Hradčany''' (common {{IPA-cs|ˈɦrat͡ʃanɪ|-|Cs-Hradcany.ogg}}; {{lang-de|[[:de:Hradschin|''Hradschin'']]}}), the '''Castle District''', is the [[Districts of Prague|district]] of the city of [[Prague]], [[Czech Republic]], surrounding the [[Prague Castle]]. The castle is said to be the biggest castle in the world<ref>{{cite web|url=http://www.prague-wiki.com...
9 | Houston | 13774 | 683856784 | 2015-10-02T23:38:55.000+0000 | 187Ernest | 18740411 | {{About|the U.S. city}} {{Hatnote|Note that the city is unrelated to [[Houston County, Texas]], which is located in another part of the state.}} :''Houstonian redirects here. For other uses see: [[The Houstonian (disambiguation)]].'' {{Use mdy dates|date=June 2015}} {{Infobox settlement |name = Houston, Texas |official_name = City of Houston |settlement_type = [[City]] |nickname = <!--DO NOT CHANGE! -->Space City (OFFICIAL) <small>[[Nick...
10 | Hard disk drive | 13777 | 683469505 | 2015-09-30T14:08:32.000+0000 | Dsimic | 6479630 | {{Redirect|Hard drive}} {{infobox computer hardware | image = Laptop-hard-drive-exposed.jpg|300px | caption = A 2.5-inch SATA hard drive | invent-date = 24 December 1954{{Efn|This is the original filing date of the application which led to US Patent 3,503,060, generally accepted as the definitive [[disk drive]] patent.<ref>Kean, David W., "IBM San Jose, A quarter century of innovation", 1977.</ref>}} | invent-name = [[IBM]] team led by [[Reynold B. Johnson|Rey Johnson]] }} [[File:Hard...
11 | Hermetic Order of the Golden Dawn | 13787 | 679828629 | 2015-09-07T01:35:00.000+0000 | Quick and Dirty User Account | 14161215 | {{About|the historical organization of the late 19th century|other meanings|Golden Dawn (disambiguation){{!}}Golden Dawn}} {{Hermeticism}} {{Golden Dawn}} The '''Hermetic Order of the Golden Dawn''' (or, more commonly, '''The Golden Dawn''') was an organization devoted to the study and practice of the [[occult]], [[metaphysics]], and [[paranormal]] activities during the late 19th and early 20th centuries. Known as a [[Magical organization|magical order]], the Hermetic Order of the Golden Dawn wa...
12 | High jump | 13791 | 682739222 | 2015-09-25T18:10:37.000+0000 | L1975p | 13761122 | {{other uses}} {{Use mdy dates|date=January 2013}} {{Infobox athletics event |event= High jump |image= [[File:Yelena Slesarenko failing 2007.jpg|240px]] |caption= [[Yelena Slesarenko]] using the [[Fosbury Flop]] technique at [[2004 Summer Olympics]]. |WRmen= [[Javier Sotomayor]] {{T&Fcalc|2.45}} (1993) |ORmen= [[Charles Austin]] {{T&Fcalc|2.39}} (1996) |WRwomen= [[Stefka Kostadinova]] {{T&Fcalc|2.09}} (1987) |ORwomen= [[Yelena Slesarenko]] {{T&Fcalc|2.06}} (2004) }} The '''high jump''' is a...
13 | Heraclitus | 13792 | 681338233 | 2015-09-16T16:55:59.000+0000 | Widr | 13975403 | {{Other people|Heracleitus}} {{Infobox philosopher |region = Western Philosophy |era = [[Ancient philosophy]] |image = Utrecht Moreelse Heraclite.JPG |caption = ''Heraclitus'' by [[Johannes Moreelse]]. The image depicts him as "the weeping philosopher" wringing his hands over the world, and as "the obscure" dressed in dark clothing—both traditional motifs |name = Heraclitus |birth_date = c. 535 BCE |birth_place = [[Ephesus]] |death_date = c. 475 BCE (aged around 60) |death_place = |mai...
14 | Harrison Schmitt | 13793 | 680759307 | 2015-09-13T00:27:17.000+0000 | KConWiki | 1994682 | {{Infobox officeholder | birth_name = Harrison Hagan Schmitt | image = Sen Harrison Schmitt.jpg | imagesize = 180px | jr/sr = United States Senator | state = [[New Mexico]] | party = [[Republican Party (United States)|Republican]] | term_start = January 3, 1977 | term_end = January 3, 1983 | preceded = [[Joseph Montoya]] | succeeded = [[Jeff Bingaman]] | birth_date = {{birth date and age|1935|7|3}} | birth_place = [[Santa Rita, New Mexico|Santa Rita]], [[New Mexico]], USA | death_date = | death_...
20 rows (first 14 shown)
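
Pasting the key directly into the cell above makes it easy to leak when the notebook is shared. A common alternative, sketched here, is to read the key from an existing environment variable or a secret store and to fail early if it is missing. The helper resolve_openai_key and its secret_getter argument are hypothetical. On Databricks the callable could wrap dbutils.secrets.get with a scope and key name of your own.

```python
import os


def resolve_openai_key(env=None, secret_getter=None):
    """Return an OpenAI API key from the environment or a secret store.

    env: mapping to read from (defaults to os.environ).
    secret_getter: optional zero-argument callable returning the key, e.g.
        lambda: dbutils.secrets.get(scope="my-scope", key="openai-key")
        (the scope and key names here are placeholders).
    """
    env = os.environ if env is None else env
    key = (env.get("OPENAI_API_KEY") or "").strip()
    if not key and secret_getter is not None:
        key = (secret_getter() or "").strip()
    if not key:
        raise ValueError("OPENAI_API_KEY is not set; add it before building the chain.")
    return key


# Demo with an in-memory mapping instead of the real environment.
demo_env = {"OPENAI_API_KEY": ""}
key = resolve_openai_key(demo_env, secret_getter=lambda: "sk-demo-placeholder")
print(key)
```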

The following lines are all that is needed to load data from a PySpark DataFrame into LangChain:

from langchain.document_loaders import PySparkDataFrameLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = PySparkDataFrameLoader(spark, wikipedia_dataframe, page_content_column="text")
documents = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=3000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)
print(f"Number of documents: {len(texts)}")
Number of documents: 421
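
The jump from 20 articles to 421 documents comes from the splitting step: each row becomes one document, and each document is then cut into chunks of at most 3000 characters. The sketch below imitates both steps in plain Python. It is a simplified stand-in, not the library code: Doc, rows_to_docs, and split_docs are made-up names, the article text is placeholder data, and the split is a naive fixed-width cut rather than the separator-aware recursive strategy the real splitter uses.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Minimal stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def rows_to_docs(rows, page_content_column):
    """One Doc per row: the chosen column is the text, the rest is metadata."""
    docs = []
    for row in rows:
        metadata = {k: v for k, v in row.items() if k != page_content_column}
        docs.append(Doc(page_content=row[page_content_column], metadata=metadata))
    return docs


def split_docs(docs, chunk_size):
    """Naive fixed-width split; every chunk keeps a copy of its parent's metadata."""
    chunks = []
    for doc in docs:
        for start in range(0, len(doc.page_content), chunk_size):
            piece = doc.page_content[start:start + chunk_size]
            chunks.append(Doc(page_content=piece, metadata=dict(doc.metadata)))
    return chunks


# Placeholder rows; the real DataFrame holds full wiki markup in "text".
rows = [
    {"title": "Hydrofoil", "id": 13761, "text": "A" * 7000},
    {"title": "Hassium", "id": 13764, "text": "B" * 2500},
]
docs = rows_to_docs(rows, page_content_column="text")
chunks = split_docs(docs, chunk_size=3000)
print(len(docs), len(chunks))
print(chunks[0].metadata)
```

Because the metadata travels with every chunk, a retrieved chunk can still be traced back to the article it came from.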

Create a FAISS vector store using OpenAIEmbeddings

The FAISS vector store is an intermediate step: it holds the embedded chunks for retrieval, and it is saved to disk later so that MLflow can log the chain together with its retriever.

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

embeddings = OpenAIEmbeddings()
db = FAISS.from_documents(texts, embeddings)
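
Conceptually, the vector store embeds each chunk once and answers a query by returning the chunks whose vectors lie closest to the query vector. The sketch below shows that idea with a brute-force search in plain Python. The toy_embed function (word counts over a tiny fixed vocabulary) is invented for the illustration and is unrelated to OpenAI embeddings or to FAISS index internals. A real store would also embed the chunks once up front rather than on every query.

```python
import math

# Tiny fixed vocabulary for the toy embedding (illustrative only).
VOCAB = ["moon", "astronaut", "hydrofoil", "boat", "element", "isotope"]


def toy_embed(text):
    """Count vocabulary words; a toy replacement for a real embedding model."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query, chunks, k=2):
    """Return the k chunks whose vectors are nearest to the query vector."""
    query_vec = toy_embed(query)
    scored = [(l2_distance(query_vec, toy_embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0])
    return [chunk for _, chunk in scored[:k]]


chunks = [
    "the astronaut walked on the moon",
    "a hydrofoil lifts the boat hull",
    "hassium is a synthetic element and its best known isotope is radioactive",
]
print(top_k("which astronaut visited the moon", chunks, k=1))
```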

Create a RetrievalQA chain

from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

retrieval_qa = RetrievalQA.from_chain_type(llm=OpenAI(), chain_type="stuff", retriever=db.as_retriever())
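
The "stuff" chain type places all retrieved chunks into a single prompt and sends it to the LLM in one call. A rough sketch of that assembly step follows. The prompt wording and the build_stuff_prompt helper are illustrative and are not LangChain's actual template.

```python
def build_stuff_prompt(question, retrieved_chunks):
    """Concatenate every retrieved chunk into one context block."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_stuff_prompt(
    "Who is Harrison Schmitt",
    ["Chunk one about Harrison Schmitt.", "Chunk two about the Apollo program."],
)
print(prompt)
```

Since everything goes into one prompt, the retrieved chunks together must fit in the model's context window, which is one reason to cap the chunk size when splitting.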

Query the RetrievalQA chain

query = "Who is Harrison Schmitt"
result = retrieval_qa({"query": query})
print("Result:", result["result"])
Result: Harrison Hagan "Jack" Schmitt is an American geologist, retired NASA astronaut, university professor and former U.S. senator from New Mexico. He was the twelfth person to set foot on the Moon, and the second-to-last person to step off of the Moon. He is also the first and only professional scientist to have flown beyond low Earth orbit and to have visited the Moon.

Logging the chain with MLflow

import mlflow

# Save the FAISS index to disk so MLflow can package it with the chain
persist_directory = "langchain/faiss_index"
db.save_local(persist_directory)

# MLflow calls this function with the persisted directory when the model is loaded
def load_retriever(persist_directory):
    embeddings = OpenAIEmbeddings()
    db = FAISS.load_local(persist_directory, embeddings)
    return db.as_retriever()

# Log the RetrievalQA chain
with mlflow.start_run() as mlflow_run:
    logged_model = mlflow.langchain.log_model(
        retrieval_qa,
        "retrieval_qa_chain",
        loader_fn=load_retriever,
        persist_dir=persist_directory,
    )
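
The loader_fn and persist_dir arguments work as a pair: the retriever's state is written to a directory before logging, and at load time the loader function is called with a directory path to rebuild the retriever. The sketch below imitates that round trip with a JSON file and a temporary directory. The names save_index and load_index and the file name index.json are made up for the illustration and are not part of MLflow or FAISS.

```python
import json
import os
import tempfile


def save_index(persist_dir, entries):
    """Write the toy index to persist_dir (plays the role of db.save_local)."""
    os.makedirs(persist_dir, exist_ok=True)
    with open(os.path.join(persist_dir, "index.json"), "w") as fh:
        json.dump(entries, fh)


def load_index(persist_dir):
    """Plays the role of loader_fn: takes a directory, returns the rebuilt object."""
    with open(os.path.join(persist_dir, "index.json")) as fh:
        return json.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    persist_dir = os.path.join(tmp, "toy_index")
    save_index(persist_dir, {"Harrison Schmitt": [0.1, 0.9]})
    restored = load_index(persist_dir)

print(restored)
```

The loader has to recreate everything it needs from the directory alone, which is why load_retriever builds its own OpenAIEmbeddings instance instead of relying on a variable defined earlier in the notebook.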

Loading the chain using MLflow

model_uri = f"runs:/{mlflow_run.info.run_id}/retrieval_qa_chain"

loaded_pyfunc_model = mlflow.pyfunc.load_model(model_uri)
langchain_input = {"query": "Who is Harrison Schmitt"}
loaded_pyfunc_model.predict([langchain_input])
Out[72]: [' Harrison Schmitt is an American geologist, retired NASA astronaut, university professor and former U.S. senator from New Mexico. He was the twelfth person to set foot on the Moon and is the second-to-last person to step off the Moon. He was influential within the community of geologists supporting the Apollo program and, before starting his own preparations for an Apollo mission, had been one of the scientists training those Apollo astronauts chosen to visit the lunar surface. He was appointed as Secretary of the New Mexico Energy, Minerals and Natural Resources Department in the cabinet of Governor Susana Martinez, but was forced to give up the appointment the following month after refusing to submit to a required background investigation.']
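
Because predict takes a list of input dictionaries and, as the output above shows, returns a list of answer strings, several questions can be sent in one call. A small sketch of that batching pattern follows. Here fake_predict stands in for loaded_pyfunc_model.predict so the example runs without a logged model, and the assumption that answers come back one per input, in order, is not verified here.

```python
def ask_many(predict, questions):
    """Send all questions in one predict call and pair them with the answers."""
    inputs = [{"query": q} for q in questions]
    answers = predict(inputs)
    if len(answers) != len(questions):
        raise ValueError("expected one answer per question")
    # Answers may start with a space, as in the output above, so strip them.
    return dict(zip(questions, (a.strip() for a in answers)))


def fake_predict(inputs):
    # Placeholder for loaded_pyfunc_model.predict; it just echoes the query.
    return [f" echo: {item['query']}" for item in inputs]


results = ask_many(fake_predict, ["Who is Harrison Schmitt", "What is hassium"])
for question, answer in results.items():
    print(question, "->", answer)
```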