API Reference
This page provides a comprehensive index of MLflow APIs used in GenAI applications, with direct links to the official MLflow documentation.
MLflow features marked as "Databricks only" are only available on Databricks-managed MLflow.
Official documentation links
- All MLflow Python APIs
- MLflow Core APIs
- MLflow GenAI Module
- MLflow Tracing APIs
- MLflow Client APIs
- MLflow Entities
Some of the APIs referenced on this page are currently in the Beta or Experimental stage. These APIs are subject to change or removal in future releases. Experimental APIs are available to all customers; Beta APIs are enabled automatically for most customers. If you do not have access to a Beta API and need to request access, contact your Databricks support representative.
Experiment management
Manage MLflow experiments and runs for tracking GenAI application development:
SDKs
- mlflow.search_runs() - Search and filter runs by criteria
- mlflow.set_experiment() - Set the active MLflow experiment
- mlflow.start_run() - Start a new MLflow run for tracking
Entities
- mlflow.entities.Experiment - Experiment metadata and configuration
- mlflow.entities.Run - Run metadata, metrics, and parameters
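A minimal usage sketch tying these together (the experiment path, parameter, and metric are illustrative):

```python
import mlflow

# Point subsequent runs at a named experiment (created if it does not exist).
mlflow.set_experiment("/Shared/genai-app-dev")

# Track one development iteration as a run.
with mlflow.start_run(run_name="prompt-v2"):
    mlflow.log_param("model", "gpt-4o-mini")
    mlflow.log_metric("avg_latency_s", 1.3)

# Query finished runs, filtered by a logged parameter.
runs = mlflow.search_runs(filter_string="params.model = 'gpt-4o-mini'")
```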
Prompt management
Version control and lifecycle management for prompts used in GenAI applications:
SDKs
- mlflow.genai.load_prompt() - Load a versioned prompt from the registry
- mlflow.genai.optimize_prompt() - Automatically improve prompts using optimization algorithms
- mlflow.genai.register_prompt() - Register a new prompt to the registry
- mlflow.genai.search_prompts() - Search for prompts by name or tags
- mlflow.genai.delete_prompt_alias() - Remove an alias from a prompt version
- mlflow.genai.set_prompt_alias() - Assign an alias to a prompt version
Entities
- mlflow.entities.Prompt - Prompt metadata and version information
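A minimal sketch of the registry round trip (the prompt name, template, and alias are illustrative):

```python
import mlflow

# Register version 1 of a prompt; variables use double-brace syntax.
prompt = mlflow.genai.register_prompt(
    name="summarize",
    template="Summarize the following text in {{num_sentences}} sentences: {{text}}",
)

# Point the 'production' alias at this version, then load it by alias.
mlflow.genai.set_prompt_alias("summarize", alias="production", version=prompt.version)
loaded = mlflow.genai.load_prompt("prompts:/summarize@production")
print(loaded.format(num_sentences=2, text="MLflow is an open source MLOps platform."))
```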
Evaluation and monitoring
Scorer lifecycle management (Databricks only)
This feature is in Beta.
Scorer lifecycle management for continuous quality tracking in production:
Scorer instance methods
- Scorer.register() - Register custom scorer with server
- Scorer.start() - Begin online evaluation with sampling
- Scorer.update() - Modify sampling configuration
- Scorer.stop() - Stop online evaluation
- Scorer.delete() - Remove scorer entirely
Scorer registry functions
- mlflow.genai.scorers.get_scorer() - Retrieve a registered scorer by name
- mlflow.genai.scorers.list_scorers() - List all registered scorers
- mlflow.genai.scorers.delete_scorer() - Delete a registered scorer by name
Scorer properties
- Scorer.sample_rate - Current sampling rate (0.0-1.0)
- Scorer.filter_string - Current trace filter
Configuration classes
- mlflow.genai.ScorerSamplingConfig - Sampling configuration data class
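A sketch of the lifecycle, assuming the Beta register/start/update flow above (the scorer name and sample rates are illustrative):

```python
from mlflow.genai.scorers import Safety, ScorerSamplingConfig

# Register a built-in scorer with the server under a chosen name.
safety = Safety().register(name="prod_safety")

# Begin online evaluation, scoring 20% of production traces.
safety = safety.start(sampling_config=ScorerSamplingConfig(sample_rate=0.2))

# Later: raise the sampling rate, then stop or remove the scorer.
safety = safety.update(sampling_config=ScorerSamplingConfig(sample_rate=0.5))
safety = safety.stop()
safety.delete()
```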
Core evaluation APIs
Core APIs for offline evaluation and production monitoring:
- mlflow.genai.evaluate() - Evaluation harness to orchestrate offline evaluation with scorers and datasets
- mlflow.genai.to_predict_fn() - Convert model output to standardized prediction function format
- mlflow.genai.Scorer - Custom scorer class for object-oriented implementation with state management
- mlflow.genai.scorer() - Scorer decorator for scorer creation and evaluation logic
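A minimal sketch of the harness with a decorator-based custom scorer (the data and word threshold are illustrative):

```python
import mlflow
from mlflow.genai.scorers import scorer

# Any function decorated with @scorer can grade a record; it may accept
# inputs, outputs, expectations, and/or trace, and return a bool, number,
# string, or Feedback.
@scorer
def is_concise(outputs) -> bool:
    return len(str(outputs).split()) <= 50

eval_data = [
    {
        "inputs": {"question": "What is MLflow?"},
        "outputs": "MLflow is an open source MLOps platform.",
    },
]

# Run offline evaluation over static predictions with the custom scorer.
results = mlflow.genai.evaluate(data=eval_data, scorers=[is_concise])
```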
Predefined scorers
Quality assessment scorers ready for immediate use:
- mlflow.genai.scorers.Safety - Content safety evaluation
- mlflow.genai.scorers.Correctness - Answer accuracy assessment
- mlflow.genai.scorers.RelevanceToQuery - Query relevance scoring
- mlflow.genai.scorers.Guidelines - Custom guideline compliance
- mlflow.genai.scorers.ExpectationsGuidelines - Guideline evaluation with expectations
- mlflow.genai.scorers.RetrievalGroundedness - RAG grounding assessment
- mlflow.genai.scorers.RetrievalRelevance - Retrieved context relevance
- mlflow.genai.scorers.RetrievalSufficiency - Context sufficiency evaluation
Scorer helpers
- mlflow.genai.scorers.get_all_scorers() - Retrieve all built-in scorers
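Predefined scorers plug into the same evaluation harness; a brief sketch (the data is illustrative, and on open source MLflow the LLM-based scorers may additionally need a configured judge model):

```python
import mlflow
from mlflow.genai.scorers import RelevanceToQuery, Safety, get_all_scorers

data = [
    {
        "inputs": {"question": "What is MLflow?"},
        "outputs": "MLflow is an open source MLOps platform.",
    },
]

# Pick specific built-in scorers ...
mlflow.genai.evaluate(data=data, scorers=[Safety(), RelevanceToQuery()])

# ... or run every built-in scorer at once.
mlflow.genai.evaluate(data=data, scorers=get_all_scorers())
```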
Judge functions
LLM-based assessment functions for direct use or scorer wrapping:
- mlflow.genai.judges.is_safe() - Safety assessment
- mlflow.genai.judges.is_correct() - Correctness evaluation
- mlflow.genai.judges.is_grounded() - Grounding verification
- mlflow.genai.judges.is_context_relevant() - Context relevance
- mlflow.genai.judges.is_context_sufficient() - Context sufficiency
- mlflow.genai.judges.meets_guidelines() - Custom guideline assessment
- mlflow.genai.make_judge() - Create custom judges (recommended for MLflow 3.4.0 and above)
- mlflow.genai.judges.custom_prompt_judge() - Custom prompt-based evaluation (deprecated in MLflow 3.4.0; use make_judge() instead)
Judge output entities
- mlflow.genai.judges.CategoricalRating - Enum for categorical judge responses
- mlflow.genai.judges.CategoricalRating.YES - Positive rating
- mlflow.genai.judges.CategoricalRating.NO - Negative rating
- mlflow.genai.judges.CategoricalRating.UNKNOWN - Uncertain rating
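A sketch of a custom judge built with make_judge (the judge name, instructions, and model URI are illustrative; the call returns a Feedback whose value is typically a CategoricalRating):

```python
import mlflow

# Build a judge from natural-language instructions; the {{ inputs }} and
# {{ outputs }} template variables are filled in at call time.
helpful = mlflow.genai.make_judge(
    name="helpfulness",
    instructions=(
        "Given the request {{ inputs }} and the response {{ outputs }}, "
        "answer 'yes' if the response is helpful, otherwise 'no'."
    ),
    model="openai:/gpt-4o-mini",  # illustrative judge model URI
)

feedback = helpful(
    inputs={"question": "What is MLflow?"},
    outputs="MLflow is an open source MLOps platform.",
)
print(feedback.value)
```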
Evaluation datasets
Create and manage versioned test datasets for systematic evaluation:
SDKs
- mlflow.genai.create_dataset() - Create a new evaluation dataset
- mlflow.genai.delete_dataset() - Delete an evaluation dataset
- mlflow.genai.get_dataset() - Retrieve an existing evaluation dataset
Entities
- mlflow.genai.datasets.EvaluationDataset - Versioned test data container
  - merge_records() - Combine records from multiple sources
  - set_profile() - Configure dataset profile settings
  - to_df() - Convert dataset to pandas DataFrame
  - to_evaluation_dataset() - Convert to evaluation dataset format
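A minimal sketch (on Databricks, datasets are backed by Unity Catalog tables; the three-level table name and records are illustrative):

```python
import mlflow.genai

# Create a versioned evaluation dataset.
dataset = mlflow.genai.create_dataset(uc_table_name="main.eval.qa_dataset")

# Add test records: inputs plus optional ground-truth expectations.
dataset.merge_records(
    [
        {
            "inputs": {"question": "What is MLflow?"},
            "expectations": {"expected_facts": ["open source", "MLOps platform"]},
        }
    ]
)

# Inspect as a DataFrame, or pass the dataset directly to mlflow.genai.evaluate().
df = dataset.to_df()
```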
Human labeling and review app (Databricks only)
Human feedback collection and review workflows for systematic quality assessment:
Entities
- mlflow.genai.Agent - Agent configuration for review app testing
- mlflow.genai.LabelingSession - Human labeling workflow manager
  - add_dataset() - Add evaluation dataset to labeling session
  - add_traces() - Add traces for human review
  - set_assigned_users() - Assign reviewers to session
  - sync() - Synchronize session state
- mlflow.genai.ReviewApp - Interactive review application
  - add_agent() - Add agent for testing
  - remove_agent() - Remove agent from review app
Labeling session SDKs
- mlflow.genai.create_labeling_session() - Create a new labeling session
- mlflow.genai.delete_labeling_session() - Delete a labeling session
- mlflow.genai.get_labeling_session() - Retrieve labeling session by ID
- mlflow.genai.get_labeling_sessions() - List all labeling sessions
- mlflow.genai.get_review_app() - Retrieve review app instance
Label schema types
- mlflow.genai.label_schemas.InputCategorical - Categorical input field type
- mlflow.genai.label_schemas.InputCategoricalList - Multi-select categorical input
- mlflow.genai.label_schemas.InputNumeric - Numeric input field type
- mlflow.genai.label_schemas.InputText - Text input field type
- mlflow.genai.label_schemas.InputTextList - Multi-text input field type
- mlflow.genai.label_schemas.LabelSchema - Label schema definition
- mlflow.genai.label_schemas.LabelSchemaType - Schema type enum
- mlflow.genai.label_schemas.LabelSchemaType.EXPECTATION - Expectation schema type
- mlflow.genai.label_schemas.LabelSchemaType.FEEDBACK - Feedback schema type
Label schema SDKs
- mlflow.genai.label_schemas.create_label_schema() - Create a new label schema
- mlflow.genai.label_schemas.delete_label_schema() - Delete an existing label schema
- mlflow.genai.label_schemas.get_label_schema() - Retrieve label schema by name
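A sketch of creating a schema and a session (the schema name, options, session name, and reviewer are illustrative):

```python
import mlflow
from mlflow.genai.label_schemas import (
    InputCategorical,
    LabelSchemaType,
    create_label_schema,
)

# Define the question reviewers will answer.
create_label_schema(
    name="response_quality",
    type=LabelSchemaType.FEEDBACK,
    title="Rate the response quality",
    input=InputCategorical(options=["good", "bad"]),
)

# Create a session, assign reviewers, and queue traces for review.
session = mlflow.genai.create_labeling_session(
    name="weekly_review",
    assigned_users=["reviewer@example.com"],
    label_schemas=["response_quality"],
)
session.add_traces(mlflow.search_traces())
```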
Prompt optimization
This feature is in Beta.
Automated prompt improvement using data-driven optimization algorithms:
Entities
- mlflow.genai.optimize.LLMParams - LLM configuration parameters
- mlflow.genai.optimize.OptimizerConfig - Optimization algorithm configuration
- mlflow.genai.optimize.PromptOptimizationResult - Optimization results and metrics
SDKs
- mlflow.genai.optimize.optimize_prompt() - Run prompt optimization process
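A sketch of an optimization run, assuming a registered prompt and labeled training data (the model name, prompt URI, training records, and exact-match scorer are illustrative):

```python
import mlflow
from mlflow.genai.optimize import LLMParams, OptimizerConfig
from mlflow.genai.scorers import scorer

# A simple scorer to grade candidate prompts during optimization.
@scorer
def exact_match(outputs, expectations) -> bool:
    return outputs == expectations["summary"]

result = mlflow.genai.optimize_prompt(
    target_llm_params=LLMParams(model_name="openai/gpt-4o-mini"),
    prompt="prompts:/summarize/1",  # URI of a registered prompt version
    train_data=[
        {"inputs": {"text": "MLflow is ..."}, "expectations": {"summary": "..."}},
    ],
    scorers=[exact_match],
    optimizer_config=OptimizerConfig(),
)
print(result.prompt.template)  # the optimized template
```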
Tracing
Instrument and capture execution traces from GenAI applications:
SDKs
- mlflow.delete_trace_tag() - Remove a tag from a trace
- mlflow.get_current_active_span() - Get the currently active span
- mlflow.get_last_active_trace() - Retrieve the most recently completed trace
- mlflow.get_last_active_trace_id() - Get ID of the last active trace
- mlflow.get_trace() - Retrieve a trace by ID
- mlflow.search_traces() - Search and filter traces
- mlflow.set_trace_tag() - Add a tag to a trace
- mlflow.start_span() - Manually start a new span
- mlflow.trace - Decorator to automatically trace function execution
- mlflow.traceName - Context manager to set trace name
- mlflow.traceOutputs - Context manager to set trace outputs
- mlflow.tracing - Tracing module with configuration functions
- mlflow.tracing.disable - Disable tracing globally
- mlflow.tracing.disable_notebook_display() - Disable trace display in notebooks
- mlflow.tracing.enable - Enable tracing globally
- mlflow.tracing.enable_notebook_display() - Enable trace display in notebooks
- mlflow.update_current_trace() - Update metadata for the current trace
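A minimal sketch combining automatic and manual instrumentation:

```python
import mlflow

# The decorator records inputs, outputs, latency, and exceptions.
@mlflow.trace
def answer(question: str) -> str:
    # Manually open a child span for a sub-step.
    with mlflow.start_span(name="retrieve") as span:
        docs = ["MLflow is an open source MLOps platform."]
        span.set_outputs({"docs": docs})
    return docs[0]

answer("What is MLflow?")

# Retrieve the trace that was just recorded.
trace = mlflow.get_trace(mlflow.get_last_active_trace_id())
```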
Entities
- mlflow.entities.Trace - Complete trace with all spans and metadata
- mlflow.entities.TraceData - Trace execution data
- mlflow.entities.TraceInfo - Trace metadata and summary information
- mlflow.entities.Span - Individual span within a trace
- mlflow.entities.SpanEvent - Event occurring within a span
- mlflow.entities.SpanType - Span type classification enum
- mlflow.entities.Document - Document entity for RAG applications
Assessment entities
Data structures for storing evaluation results and feedback:
- mlflow.entities.Assessment - Evaluation result container
- mlflow.entities.AssessmentError - Assessment error details
- mlflow.entities.AssessmentSource - Source of the assessment
- mlflow.entities.AssessmentSourceType - Assessment source type enum
- mlflow.entities.Expectation - Expected ground truth outcome
- mlflow.entities.Feedback - Scorer output with value and rationale
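A sketch constructing a Feedback the way a custom scorer would return it (the name, value, and rationale are illustrative):

```python
from mlflow.entities import AssessmentSource, AssessmentSourceType, Feedback

feedback = Feedback(
    name="conciseness",
    value=True,
    rationale="Response is under 50 words.",
    # Mark the assessment as produced by deterministic code, not a human or LLM judge.
    source=AssessmentSource(source_type=AssessmentSourceType.CODE),
)
```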
Tracing integrations
Auto-instrumentation for popular GenAI frameworks and libraries:
- mlflow.anthropic.autolog - Anthropic Claude integration
- mlflow.autogen.autolog - Microsoft AutoGen integration
- mlflow.bedrock.autolog - AWS Bedrock integration
- mlflow.crewai.autolog - CrewAI integration
- mlflow.dspy.autolog - DSPy integration
- mlflow.gemini.autolog - Google Gemini integration
- mlflow.groq.autolog - Groq integration
- mlflow.langchain.autolog - LangChain integration
- mlflow.litellm.autolog - LiteLLM integration
- mlflow.llama_index.autolog - LlamaIndex integration
- mlflow.mistral.autolog - Mistral AI integration
- mlflow.openai.autolog - OpenAI integration
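Each integration is enabled with a single call before the framework is used; an OpenAI sketch (assumes OPENAI_API_KEY is set in the environment):

```python
import mlflow
from openai import OpenAI

# Capture every OpenAI client call as a trace, with no application code changes.
mlflow.openai.autolog()

client = OpenAI()
client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is MLflow?"}],
)
```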
Version tracking
Track and manage GenAI application versions in production:
SDKs
- mlflow.set_active_model() - Set the active model for version tracking
- mlflow.clear_active_model() - Clear the active model context
- mlflow.get_active_model_id() - Get the current active model ID
- mlflow.create_external_model() - Register an external model deployment
- mlflow.delete_logged_model_tag() - Remove a tag from logged model
- mlflow.finalize_logged_model() - Finalize a logged model
- mlflow.get_logged_model() - Retrieve logged model by ID
- mlflow.initialize_logged_model() - Initialize a new logged model
- mlflow.last_logged_model() - Get the most recently logged model
- mlflow.search_logged_models() - Search for logged models
- mlflow.set_logged_model_tags() - Add tags to logged model
- mlflow.log_model_params() - Log parameters for a model
Entities
- mlflow.entities.LoggedModel - Logged model metadata and information
- mlflow.entities.LoggedModelStatus - Logged model status enum
- mlflow.ActiveModel - Active model context manager
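A sketch linking traces to an application version (the model name and parameters are illustrative):

```python
import mlflow

# Declare which application version produced the traces that follow.
mlflow.set_active_model(name="my-agent-v1")

@mlflow.trace
def run(question: str) -> str:
    return "MLflow is an open source MLOps platform."

run("What is MLflow?")  # this trace is linked to the active model version

# Attach metadata to the version and look it up later.
model_id = mlflow.get_active_model_id()
mlflow.log_model_params(model_id=model_id, params={"llm": "gpt-4o-mini"})
model = mlflow.get_logged_model(model_id)
```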