Create a Meta Ads ingestion pipeline
The Meta Ads connector is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Databricks previews.
This article describes how to create a Meta Ads ingestion pipeline using Databricks Lakeflow Connect. The following interfaces are supported:
- Databricks Asset Bundles
- Databricks APIs
- Databricks SDKs
- Databricks CLI
The Meta Ads connector doesn't currently support UI-based pipeline creation.
Before you begin
To create an ingestion pipeline, you must meet the following requirements:

- Your workspace must be enabled for Unity Catalog.
- Serverless compute must be enabled for your workspace. See Serverless compute requirements.
- If you plan to create a new connection: You must have CREATE CONNECTION privileges on the metastore. Because the Meta Ads connector doesn't support UI-based pipeline authoring, an admin must first create the connection in Catalog Explorer before you create the pipeline. See Connect to managed ingestion sources.
- If you plan to use an existing connection: You must have USE CONNECTION privileges or ALL PRIVILEGES on the connection object.
- You must have USE CATALOG privileges on the target catalog.
- You must have USE SCHEMA and CREATE TABLE privileges on an existing schema, or CREATE SCHEMA privileges on the target catalog.
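For example, a metastore admin could grant the connection, catalog, and schema privileges listed above with the Databricks CLI. This is a minimal sketch: the principal (data-engineers), connection name (meta_ads_connection), catalog (main), and schema (main.meta_ads) are hypothetical, and you can grant the same privileges with SQL GRANT statements or in Catalog Explorer instead.

# Hypothetical principal and securable names; adjust to your workspace.
# Allow use of an existing Meta Ads connection.
databricks grants update connection meta_ads_connection \
  --json '{"changes": [{"principal": "data-engineers", "add": ["USE_CONNECTION"]}]}'

# Allow use of the target catalog.
databricks grants update catalog main \
  --json '{"changes": [{"principal": "data-engineers", "add": ["USE_CATALOG"]}]}'

# Allow creating streaming tables in the target schema.
databricks grants update schema main.meta_ads \
  --json '{"changes": [{"principal": "data-engineers", "add": ["USE_SCHEMA", "CREATE_TABLE"]}]}'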
To ingest from Meta Ads, see Set up Meta Ads as a data source.
Supported objects
The Meta Ads connector supports ingesting the following objects:
- ads
- ad_sets
- campaigns
- ad_images
- ad_insights
- ad_creatives
- custom_audiences
- ad_videos
- custom_conversions
For the ad_insights object, you can configure breakdowns and action breakdowns to analyze performance data at different granularity levels (account, campaign, ad set, or ad).
Create the ingestion pipeline
Permissions required: USE CONNECTION or ALL PRIVILEGES on a connection.
This step describes how to create the ingestion pipeline. Each ingested table is written to a streaming table with the same name.
Databricks notebook

- Import the following notebook into your workspace:

  Create a Meta Ads ingestion pipeline

- Leave the default values in cell 1. Don't modify this cell.

- If you want to ingest all objects from your Meta Ads account, modify the schema spec in cell 2. If you only want to ingest specific objects, delete cell 2 and modify the table spec in cell 3 instead.

  Don't modify channel. This must be PREVIEW.

  Cell 2 values to modify:

  - name: A unique name for the pipeline.
  - connection_name: The Unity Catalog connection that stores the authentication details for Meta Ads.
  - source_schema: Your Meta Ads account ID.
  - destination_catalog: A name for the destination catalog that will contain the ingested data.
  - destination_schema: A name for the destination schema that will contain the ingested data.
  - scd_type: The SCD method to use: SCD_TYPE_1 or SCD_TYPE_2. The default is SCD type 1. For more information, see Enable history tracking (SCD type 2).

  Cell 3 values to modify:

  - name: A unique name for the pipeline.
  - connection_name: The Unity Catalog connection that stores the authentication details for Meta Ads.
  - source_schema: Your Meta Ads account ID.
  - source_table: The object name to ingest (for example, ads, campaigns, or ad_insights).
  - destination_catalog: A name for the destination catalog that will contain the ingested data.
  - destination_schema: A name for the destination schema that will contain the ingested data.
  - scd_type: The SCD method to use: SCD_TYPE_1 or SCD_TYPE_2. The default is SCD type 1. For more information, see Enable history tracking (SCD type 2).

  For ad_insights, you can configure additional parameters:

  - breakdown: Optional breakdown dimensions (for example, age, gender, country).
  - action_breakdown: Optional action breakdown dimensions (for example, action_type, action_destination).
  - granularity: Granularity level for insights: account, campaign, ad_set, or ad. Default is ad.

- Click Run all.

Databricks CLI
Run the following command:
databricks pipelines create --json "<pipeline definition or json file path>"
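For example, assuming you saved a completed pipeline definition (see the templates in the next section) to a hypothetical file named meta-ads-pipeline.json, you can pass the file to the CLI by reference. Recent CLI versions accept either an inline JSON string or an @-prefixed file path for the --json flag; check databricks pipelines create --help for the exact form your version supports.

# Create the pipeline from a saved definition file (hypothetical file name).
databricks pipelines create --json @meta-ads-pipeline.json

The response includes the new pipeline's ID, which you need for the get and delete commands later on this page.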
Pipeline definition templates
If you want to ingest all objects from your Meta Ads account, use the schema spec format for your pipeline definition. If you only want to ingest specific objects, use the table spec definition format instead. Don't modify channel. This must be PREVIEW.
Schema spec values to modify:
- name: A unique name for the pipeline.
- connection_name: The Unity Catalog connection that stores the authentication details for Meta Ads.
- source_schema: Your Meta Ads account ID.
- destination_catalog: A name for the destination catalog that will contain the ingested data.
- destination_schema: A name for the destination schema that will contain the ingested data.
- scd_type: The SCD method to use: SCD_TYPE_1 or SCD_TYPE_2. The default is SCD type 1. For more information, see Enable history tracking (SCD type 2).
Schema spec template:
{
  "name": "<YOUR_PIPELINE_NAME>",
  "ingestion_definition": {
    "connection_name": "<YOUR_CONNECTION_NAME>",
    "objects": [
      {
        "schema": {
          "source_schema": "<YOUR_META_ADS_ACCOUNT_ID>",
          "destination_catalog": "<YOUR_DATABRICKS_CATALOG>",
          "destination_schema": "<YOUR_DATABRICKS_SCHEMA>",
          "table_configuration": {
            "scd_type": "SCD_TYPE_1"
          }
        }
      }
    ]
  },
  "channel": "PREVIEW"
}
Table spec values to modify:
- name: A unique name for the pipeline.
- connection_name: The Unity Catalog connection that stores the authentication details for Meta Ads.
- source_schema: Your Meta Ads account ID.
- source_table: The object name to ingest (for example, ads, campaigns, or ad_insights).
- destination_catalog: A name for the destination catalog that will contain the ingested data.
- destination_schema: A name for the destination schema that will contain the ingested data.
- scd_type: The SCD method to use: SCD_TYPE_1 or SCD_TYPE_2. The default is SCD type 1. For more information, see Enable history tracking (SCD type 2).
Table spec template:
{
  "name": "<YOUR_PIPELINE_NAME>",
  "ingestion_definition": {
    "connection_name": "<YOUR_CONNECTION_NAME>",
    "objects": [
      {
        "table": {
          "source_schema": "<YOUR_META_ADS_ACCOUNT_ID>",
          "source_table": "<OBJECT_NAME>",
          "destination_catalog": "<YOUR_DATABRICKS_CATALOG>",
          "destination_schema": "<YOUR_DATABRICKS_SCHEMA>",
          "table_configuration": {
            "scd_type": "SCD_TYPE_1"
          }
        }
      }
    ]
  },
  "channel": "PREVIEW"
}
For ad_insights, you can add configuration options:
{
  "name": "<YOUR_PIPELINE_NAME>",
  "ingestion_definition": {
    "connection_name": "<YOUR_CONNECTION_NAME>",
    "objects": [
      {
        "table": {
          "source_schema": "<YOUR_META_ADS_ACCOUNT_ID>",
          "source_table": "ad_insights",
          "destination_catalog": "<YOUR_DATABRICKS_CATALOG>",
          "destination_schema": "<YOUR_DATABRICKS_SCHEMA>",
          "table_configuration": {
            "scd_type": "SCD_TYPE_1",
            "breakdown": ["age", "gender"],
            "action_breakdown": ["action_type"],
            "granularity": "ad"
          }
        }
      }
    ]
  },
  "channel": "PREVIEW"
}
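The same JSON definitions can also be sent directly to the Pipelines REST API, which is the interface the CLI calls for you. The following is a minimal sketch, assuming the completed definition is saved to a hypothetical file named meta-ads-pipeline.json and a personal access token is available in the DATABRICKS_TOKEN environment variable:

# Create the pipeline by POSTing the definition to the Pipelines API.
# Replace <workspace-url> with your workspace URL.
curl -X POST "https://<workspace-url>/api/2.0/pipelines" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -H "Content-Type: application/json" \
  --data @meta-ads-pipeline.json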
Additional CLI commands
To edit the pipeline:
databricks pipelines update --json "<pipeline definition or json file path>"
To get the pipeline definition:
databricks pipelines get "<pipeline-id>"
To delete the pipeline:
databricks pipelines delete "<pipeline-id>"
For more information, run:
databricks pipelines --help
databricks pipelines <create|update|get|delete|...> --help
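The get and delete commands take a pipeline ID rather than a name. If you didn't record the ID when you created the pipeline, recent CLI versions can list pipelines so you can look it up; this is a sketch, so confirm the subcommand name with databricks pipelines --help:

# List pipelines and note the pipeline_id of the pipeline you created.
databricks pipelines list-pipelines

# Then fetch its full definition by ID.
databricks pipelines get "<pipeline-id>"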
Next steps
- Start, schedule, and set alerts on your pipeline.