Data discovery and collaboration in the lakehouse

Databricks enables secure, governed collaboration across data, analytics, and AI workloads on the lakehouse. Using Unity Catalog and open protocols like Delta Sharing, teams can discover, share, and analyze data at scale while maintaining governance, auditability, and privacy across use cases and collaborators.

Manage permissions at scale

Unity Catalog gives administrators a single place to assign permissions on catalogs, schemas (databases), tables, and views to groups of users. Because a metastore and the privileges defined in it are shared across workspaces, administrators can set secure permissions once against groups synced from identity providers and know that end users have access only to the proper data in any Databricks workspace they enter.
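
For example, an administrator can grant a group access with standard SQL. A minimal sketch, assuming a hypothetical `sales` catalog, `retail` schema, and `analysts` group synced from an identity provider:

```sql
-- Hypothetical catalog, schema, and group names.
-- USE privileges let the group see and reference the objects;
-- SELECT on the schema lets them query every table and view in it.
GRANT USE CATALOG ON CATALOG sales TO `analysts`;
GRANT USE SCHEMA ON SCHEMA sales.retail TO `analysts`;
GRANT SELECT ON SCHEMA sales.retail TO `analysts`;
```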

Unity Catalog also allows administrators to define storage credentials, securable objects that encapsulate the cloud credentials used to access cloud storage infrastructure. You can grant privileges on these securables so that users in the organization can define external locations against paths in cloud object storage, letting data engineers self-serve new workloads without being granted elevated permissions in cloud account consoles.
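
As a sketch, an administrator can define an external location backed by an existing storage credential and delegate access to it; the location name, storage path, credential, and group below are all hypothetical:

```sql
-- Hypothetical location name, storage path, credential, and group.
CREATE EXTERNAL LOCATION IF NOT EXISTS landing_zone
  URL 's3://my-bucket/landing'
  WITH (STORAGE CREDENTIAL my_credential);

-- Let data engineers read files and create external tables under this path
-- without touching the cloud account console.
GRANT READ FILES, CREATE EXTERNAL TABLE
  ON EXTERNAL LOCATION landing_zone TO `data-engineers`;
```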

Discover data on Databricks

Users can browse available data objects in Unity Catalog using Catalog Explorer. Catalog Explorer respects the privileges configured by Unity Catalog administrators, so users see only the catalogs, schemas, tables, and views they have permission to query. Once users find a dataset of interest, they can review field names and types, read comments on tables and individual fields, and preview a sample of the data. Users can also review the full history of a table to understand when and how its data has changed, and the lineage feature lets them trace how a dataset is derived from upstream jobs and consumed by downstream jobs.
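
Much of this metadata is also available through SQL. A minimal sketch against a hypothetical `sales.retail.orders` table:

```sql
-- Review field names, types, and comments.
DESCRIBE TABLE EXTENDED sales.retail.orders;

-- Review when and how the table's data has changed (Delta table history).
DESCRIBE HISTORY sales.retail.orders;

-- Preview a sample of the data.
SELECT * FROM sales.retail.orders LIMIT 10;
```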

Storage credentials and external locations are also displayed in Catalog Explorer, allowing each user to see the privileges they need to read and write data across available locations and resources.
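
For example, users can list the external locations visible to them and inspect the grants on one; the location name below is hypothetical:

```sql
-- List external locations the current user can see.
SHOW EXTERNAL LOCATIONS;

-- Review the privileges granted on a specific location.
SHOW GRANTS ON EXTERNAL LOCATION landing_zone;
```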

Accelerate time to production with the lakehouse

Databricks supports workloads in SQL, Python, Scala, and R, allowing users with diverse skill sets and technical backgrounds to apply their knowledge to derive analytic insights. You can use any supported language to define production jobs, and notebooks can combine languages, so queries written by SQL analysts for last-mile ETL can be promoted into production data engineering code with almost no effort, as shown in the sketch below.

Queries and workloads defined by personas across the organization run against the same datasets, so there's no need to reconcile field names or verify that dashboards are up to date before sharing code and results with other teams. You can securely share code, notebooks, queries, and dashboards, all powered by the same scalable cloud infrastructure and defined against the same curated data sources.
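
As a sketch of that promotion path, a last-mile ETL query written by an analyst can be scheduled unchanged as a job that materializes a curated table; all table and column names here are hypothetical:

```sql
-- A SQL analyst's last-mile ETL query, runnable as-is in a scheduled job.
-- Table and column names are hypothetical.
CREATE OR REPLACE TABLE sales.retail.daily_revenue AS
SELECT
  order_date,
  region,
  SUM(order_total) AS revenue
FROM sales.retail.orders
GROUP BY order_date, region;
```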