Workspace assets

This article provides a high-level introduction to Databricks workspace assets.

Clusters

Databricks clusters provide a unified platform for various use cases such as running production ETL pipelines, streaming analytics, ad hoc analytics, and machine learning.

For detailed information on managing and using clusters, see Clusters.
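
As a rough sketch of what defining a cluster looks like, the Python snippet below calls the Clusters REST API (`/api/2.0/clusters/create`) with the `requests` library. The workspace URL, access token, Spark version, and node type are placeholders to replace with values from your own workspace.

```python
import os
import requests

# Placeholders: set these for your own workspace.
host = os.environ["DATABRICKS_HOST"]    # e.g. "https://<your-workspace>.cloud.databricks.com"
token = os.environ["DATABRICKS_TOKEN"]  # a personal access token

# Minimal cluster specification; the Spark version and node type are examples only.
cluster_spec = {
    "cluster_name": "adhoc-analytics",
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
    "autotermination_minutes": 60,
}

resp = requests.post(
    f"{host}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {token}"},
    json=cluster_spec,
)
resp.raise_for_status()
print(resp.json()["cluster_id"])
```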

Notebooks

A notebook is a web-based interface to a document that contains a series of runnable cells (commands) that operate on files and tables, along with visualizations and narrative text. Commands can be run in sequence, referring to the output of one or more previously run commands.

Notebooks are one mechanism for running code in Databricks. The other mechanism is jobs.

For detailed information on managing and using notebooks, see Notebooks.
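
For example, a Python notebook might use two cells like the following; `spark` and `display` are provided automatically in Databricks notebooks, and the table name here is purely illustrative.

```python
# Cell 1: read a table into a DataFrame (the table name is illustrative).
df = spark.read.table("sales.transactions")

# Cell 2: aggregate the result of the previous cell and visualize it;
# display() renders tables and charts in the notebook.
daily = df.groupBy("order_date").sum("amount")
display(daily)
```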

Jobs

Jobs are one mechanism for running code in Databricks. The other mechanism is notebooks.

For detailed information on managing and using jobs, see Jobs.
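
As an illustration, the sketch below creates a single-task job that runs a notebook on an existing cluster through the Jobs REST API (`/api/2.1/jobs/create`); the notebook path and cluster ID are placeholders.

```python
import os
import requests

host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]

# A single-task job that runs a notebook on an existing cluster;
# the notebook path and cluster ID are placeholders.
job_spec = {
    "name": "nightly-etl",
    "tasks": [
        {
            "task_key": "run-etl-notebook",
            "notebook_task": {"notebook_path": "/Shared/etl/nightly"},
            "existing_cluster_id": "1234-567890-abcde123",
        }
    ],
}

resp = requests.post(
    f"{host}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {token}"},
    json=job_spec,
)
resp.raise_for_status()
print(resp.json()["job_id"])
```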

Libraries

A library makes third-party or locally built code available to notebooks and jobs running on your clusters.

For detailed information on managing and using libraries, see Libraries.
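
For instance, the following sketch attaches a PyPI package to a running cluster through the Libraries REST API (`/api/2.0/libraries/install`); the cluster ID and package name are examples only.

```python
import os
import requests

host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]

# Attach a PyPI package to a running cluster; the cluster ID and
# package name are placeholders.
payload = {
    "cluster_id": "1234-567890-abcde123",
    "libraries": [{"pypi": {"package": "beautifulsoup4"}}],
}

resp = requests.post(
    f"{host}/api/2.0/libraries/install",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
resp.raise_for_status()
```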

Data

You can import data into a distributed file system mounted into a Databricks workspace and work with it in Databricks notebooks and clusters. You can also use a wide variety of Apache Spark data sources to access data.

For detailed information on managing and using data, see Data.
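
As a simple example, a notebook cell might use the Spark CSV data source to read a file from the mounted file system and save it as a table; the path and table name below are illustrative.

```python
# Read a CSV file with the Spark CSV data source (the path is illustrative),
# then save the result as a table for use in other notebooks and jobs.
df = (
    spark.read.format("csv")
    .option("header", "true")
    .option("inferSchema", "true")
    .load("/FileStore/tables/customers.csv")
)

df.write.saveAsTable("customers")
```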

Experiments

An MLflow experiment is the primary unit of organization and access control for MLflow machine learning model training runs; all MLflow runs belong to an experiment. Each experiment lets you visualize, search, and compare runs, as well as download run artifacts or metadata for analysis in other tools.

For detailed information on managing and using experiments, see Experiments.
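
For example, the following sketch uses the MLflow tracking API to group runs under a named experiment and log a parameter and a metric; the experiment path and values are illustrative.

```python
import mlflow

# Group runs under a named experiment (the path is illustrative);
# each start_run() call creates one MLflow run within it.
mlflow.set_experiment("/Shared/experiments/churn-model")

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("max_depth", 5)
    mlflow.log_metric("accuracy", 0.87)
```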

Models

A model refers to an MLflow registered model, which lets you manage MLflow Models in production through stage transitions and versioning. A registered model has a unique name, versions, model lineage, and other metadata. Registered models are stored in the Model Registry, a centralized model store that manages the full lifecycle of MLflow Models and provides chronological model lineage, model versioning, stage transitions, and annotations and descriptions for models and model versions.

For detailed information on managing and using models, see MLflow Model Registry on Databricks.
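
As an illustration, the sketch below uses the MLflow client to register a model logged by an earlier run and transition the new version to the Staging stage; the run ID and model name are placeholders.

```python
import mlflow
from mlflow.tracking import MlflowClient

# Register a model logged by an earlier run, then promote the new
# version; the run ID and model name are placeholders.
model_uri = "runs:/<run-id>/model"
result = mlflow.register_model(model_uri, "churn-model")

client = MlflowClient()
client.transition_model_version_stage(
    name="churn-model",
    version=result.version,
    stage="Staging",
)
```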