Google Drive connector limitations

This page lists limitations and considerations for ingestion from Google Drive using Databricks Lakeflow Connect.

General SaaS connector limitations

The limitations in this section apply to all SaaS connectors in Lakeflow Connect.

  • When you run a scheduled pipeline, alerts don't trigger immediately. Instead, they trigger when the next update runs.
  • When a source table is deleted, the destination table is not automatically deleted. You must delete the destination table manually. This behavior differs from that of Lakeflow Spark Declarative Pipelines.
  • During source maintenance periods, Databricks might not be able to access your data.
  • If a source table name conflicts with an existing destination table name, the pipeline update fails.
  • Multi-destination pipeline support is API-only.
  • You can optionally rename a table that you ingest. If you rename a table in your pipeline, it becomes an API-only pipeline, and you can no longer edit the pipeline in the UI.
  • Column-level selection and deselection are API-only.
  • If you select a column after a pipeline has already started, the connector does not automatically backfill data for the new column. To ingest historical data, manually run a full refresh on the table.
  • Databricks can't ingest two or more tables with the same name in the same pipeline, even if they come from different source schemas.
  • The connector assumes that cursor columns in the source system are monotonically increasing.
  • The connector ingests raw data without transformations. To transform the data, use downstream pipelines built with Lakeflow Spark Declarative Pipelines.
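
Because a pipeline can't ingest two tables with the same name, even from different source schemas, it can help to check a table selection before creating the pipeline. The following is a minimal sketch of such a check in plain Python. The function name and the (schema, table) input shape are hypothetical and are not part of any Lakeflow Connect API.

```python
from collections import defaultdict


def find_duplicate_table_names(tables):
    """Return {table_name: [schemas, ...]} for names selected more than once.

    `tables` is an iterable of (source_schema, source_table) pairs. This input
    shape is an assumption made for illustration only.
    """
    schemas_by_name = defaultdict(list)
    for schema, table in tables:
        schemas_by_name[table].append(schema)
    # Keep only the names that appear under more than one selection.
    return {
        name: schemas
        for name, schemas in schemas_by_name.items()
        if len(schemas) > 1
    }


# Hypothetical selection: "orders" is chosen from two different schemas.
selected = [("sales", "orders"), ("archive", "orders"), ("sales", "customers")]
duplicates = find_duplicate_table_names(selected)
```

A non-empty result means the selection would have to be split across pipelines, or one of the tables renamed through the API, before the pipeline is created.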

Connector-specific limitations

  • Unstructured (BINARYFILE) ingestion supports only SCD_TYPE_1 storage mode. Structured ingestion (CSV, JSON, XML, EXCEL, and other formats) supports only APPEND_ONLY storage mode. SCD type 2 is not supported. When configuring storage mode, set storage_mode in table_configuration. Setting the scd_type field throws an error.
  • Individual file selection is not supported. The connector ingests all files in a configured folder or drive. To narrow which files are ingested, use file_filters with a path_filter glob pattern.
  • During unstructured (BINARYFILE) ingestion, file deletions are tracked only when ingesting from a shared drive. File deletions are not tracked when ingesting from a folder. File updates are tracked in both cases.
  • The supported formats are BINARYFILE, CSV, JSON, XML, EXCEL, PARQUET, AVRO, and ORC. Files in unsupported formats (for example, Google Forms and Google Sites) are skipped during ingestion.
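
The storage mode rules above can be expressed as a small local check to run before submitting a pipeline definition. This is a sketch under stated assumptions: only the names storage_mode, scd_type, SCD_TYPE_1, and APPEND_ONLY come from this page. The helper functions and the dictionary shape are hypothetical, and treating PARQUET, AVRO, and ORC as structured formats is an assumption based on the "other formats" wording.

```python
# Assumption: every supported format other than BINARYFILE counts as structured.
STRUCTURED_FORMATS = {"CSV", "JSON", "XML", "EXCEL", "PARQUET", "AVRO", "ORC"}


def expected_storage_mode(file_format):
    """Return the only storage mode this page lists for the given format."""
    fmt = file_format.upper()
    if fmt == "BINARYFILE":
        return "SCD_TYPE_1"
    if fmt in STRUCTURED_FORMATS:
        return "APPEND_ONLY"
    raise ValueError(f"Unsupported format: {file_format}")


def check_table_configuration(file_format, table_configuration):
    """Return a list of problems found in a table_configuration dict.

    The dict shape is hypothetical; only the key names are taken from the page.
    """
    problems = []
    if "scd_type" in table_configuration:
        problems.append("use storage_mode instead of scd_type")
    expected = expected_storage_mode(file_format)
    actual = table_configuration.get("storage_mode")
    if actual is not None and actual != expected:
        problems.append(f"{file_format} supports only {expected}, got {actual}")
    return problems


ok = check_table_configuration("BINARYFILE", {"storage_mode": "SCD_TYPE_1"})
bad = check_table_configuration("CSV", {"storage_mode": "SCD_TYPE_1"})
```

Here `ok` is an empty list, while `bad` contains one message explaining that CSV ingestion supports only APPEND_ONLY.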