Google Analytics Raw Data connector limitations
Preview
The Google Analytics Raw Data connector is in Public Preview.
This page lists limitations and considerations for ingesting raw, event-level data from Google Analytics using Databricks Lakeflow Connect and Google BigQuery.
General SaaS connector limitations
The limitations in this section apply to all SaaS connectors in Lakeflow Connect.
- When you run a scheduled pipeline, alerts don't trigger immediately. Instead, they trigger when the next update runs.
- When a source table is deleted, the destination table is not automatically deleted; you must delete the destination table manually. This behavior is inconsistent with DLT behavior.
- During source maintenance periods, Databricks might not be able to access your data.
- If a source table name conflicts with an existing destination table name, the pipeline update fails.
- Multi-destination pipeline support is API-only.
- You can optionally rename a table that you ingest. Renaming a table in your pipeline makes it an API-only pipeline, and you can no longer edit the pipeline in the UI. For a configuration sketch, see the first example after this list.
- Column-level selection and deselection are API-only (see the first example after this list).
- If you select a column after a pipeline has already started, the connector does not automatically backfill data for the new column. To ingest historical data, manually run a full refresh on the table, as shown in the second example after this list.
- Managed ingestion pipelines aren't supported for the following:
- Workspaces in AWS GovCloud regions
- Workspaces in Azure Government regions
- FedRAMP-compliant workspaces
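Because multi-destination pipelines, table renames, and column selection are API-only, you configure them in the pipeline specification rather than in the UI. The following sketch, which assumes placeholder connection, catalog, schema, table, and column names, shows the general shape of a managed ingestion pipeline spec with a renamed destination table and an explicit column list. Confirm the exact payload fields for this connector against the Lakeflow Connect API reference.

```python
import os
import requests

# Hedged sketch: creates a managed ingestion pipeline with a renamed
# destination table and column-level selection. All names below
# (connection, catalog, schema, table, columns) are placeholders.
host = os.environ["DATABRICKS_HOST"]   # e.g. https://<workspace>.cloud.databricks.com
token = os.environ["DATABRICKS_TOKEN"]

payload = {
    "name": "ga4_ingest",  # placeholder pipeline name
    "ingestion_definition": {
        "connection_name": "ga4_raw_data_connection",  # placeholder connection
        "objects": [
            {
                "table": {
                    "source_catalog": "my-gcp-project",     # GCP project ID
                    "source_schema": "analytics_12345678",  # BigQuery export dataset
                    "source_table": "events",
                    "destination_catalog": "main",
                    "destination_schema": "ga4",
                    # Renaming the destination table makes the pipeline API-only.
                    "destination_table": "ga4_events",
                    "table_configuration": {
                        # Column-level selection is also API-only.
                        "include_columns": ["event_date", "event_name", "user_pseudo_id"],
                    },
                }
            }
        ],
    },
}

resp = requests.post(
    f"{host}/api/2.0/pipelines",
    json=payload,
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()
print(resp.json())
```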
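If you select a new column on an existing pipeline and need its historical data, trigger a full refresh of the affected table. A minimal sketch using the Databricks SDK for Python, assuming a placeholder pipeline ID and table name:

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # reads credentials from the environment or a config profile

# Start a pipeline update that fully refreshes only the specified table,
# backfilling history for any newly selected columns. The pipeline ID and
# table name below are placeholders.
update = w.pipelines.start_update(
    pipeline_id="1234-567890-abcdefgh",
    full_refresh_selection=["ga4_events"],  # table name as defined in the pipeline
)
print(update.update_id)
```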
Connector-specific limitations
The limitations in this section are specific to the Google Analytics Raw Data (GA4) connector.
Authentication
- The connector only supports authentication using a GCP service account.
Pipelines
- Updates and deletes in GA4 are not ingested.
- The connector only supports one GA4 property per pipeline.
- Ingestion from Universal Analytics (UA) is not supported.
Tables
- The connector can't reliably ingest BigQuery date-partitioned tables that are larger than 50 GB.
- The connector only ingests raw data that you export from GA4 to BigQuery, and it inherits GA4 limits on the amount of historical data that you can export to BigQuery.
- The initial load fetches the data for all dates that are present in your GA4/BigQuery project.
- Databricks can't guarantee retention of `events_intraday` data for a given day after the data is available in the `events` table. This is because the `events_intraday` table is only intended for interim use until the `events` table is ready for that day.
- The connector assumes that each row is unique. Databricks can't guarantee correct behavior if there are unexpected duplicates.