Troubleshoot query-based connectors

Preview

Query-based connectors are in Public Preview. Contact your Databricks account team to request access.

Invalid cursor columns

Symptom: The pipeline fails with the error INVALID_CURSOR_COLUMNS.

Cause: The cursor column isn't configured correctly. The most common cause is specifying more than one cursor column. Query-based connectors require exactly one cursor column.

Resolution:

  1. Open your pipeline configuration (bundle YAML or CLI JSON).
  2. Check the cursor_column field (foreign connection ingestion) or the cursor_columns list (foreign catalog ingestion).
  3. Confirm that exactly one column name is specified. Remove any additional entries.
  4. Confirm that the column you specified exists in the source table and has values that are monotonically increasing.
  5. Re-deploy or update the pipeline and trigger a new run.
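For step 3, a foreign catalog ingestion definition should carry a single-element cursor_columns list (foreign connection ingestion uses a single cursor_column string instead). The fragment below is a sketch only: the exact nesting and surrounding fields depend on your pipeline spec, and updated_at is an illustrative column name.

```json
{
  "table_configuration": {
    "cursor_columns": ["updated_at"]
  }
}
```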

Connection failures

Symptom: The pipeline fails to connect to the source database, or the connection test returns an error.

Resolution:

  1. Verify that the Unity Catalog connection object is valid:
    • In the Databricks workspace, go to Catalog > Connections and confirm the connection exists and the credentials are current.
    • If credentials have expired or changed, update the connection with the new values.
  2. Verify network connectivity from serverless compute to the source database:
    • Confirm that the database host is reachable from the serverless compute network.
    • Check that firewall rules, security group settings, or VPC/VNet peering allow traffic from the serverless compute IP ranges to the database port.
    • For on-premises databases, confirm that the network path (such as AWS Direct Connect or Azure ExpressRoute) is active.
  3. For foreign catalog ingestion, confirm that the Unity Catalog foreign catalog is accessible:
    • In the Databricks workspace, go to Catalog and confirm the foreign catalog is visible and queryable.
    • Try running a simple SELECT against a table in the foreign catalog to verify the Lakehouse Federation connection is working.
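The federation check in step 3 can be a one-row probe. The catalog, schema, and table names below are placeholders; substitute your own.

```sql
-- If this returns a row, the Lakehouse Federation connection is working.
SELECT * FROM my_foreign_catalog.my_schema.my_table LIMIT 1;
```

If the probe fails, the problem is with the connection or network path rather than the ingestion pipeline itself, so work through steps 1 and 2 before changing the pipeline configuration.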

Pipeline not ingesting new rows

Symptom: The pipeline runs successfully but the destination table isn't receiving new rows, even though new data exists in the source.

Possible causes and resolutions:

  • Cursor column value isn't advancing. Check whether the cursor column in the source is being updated when rows change. If the column value doesn't change when a row is modified, the connector won't detect the change. Consider using a different column, such as a last_modified timestamp that's updated on every write.
  • NULL cursor values. Rows where the cursor column is NULL are excluded from ingestion. If many rows have NULL in the cursor column, those rows are never ingested. Ensure the cursor column is populated for all rows you want to ingest.
  • Late-arriving data or clock skew. If rows are written with a future timestamp, arrive late relative to the pipeline schedule, or the source clock drifts significantly, those rows might have cursor values above the current high-water mark but not yet be visible when a run executes. In most cases this resolves itself on the next run.
  • Full refresh needed. If you changed the cursor column or reset the source data, perform a full refresh to reingest from the beginning. See Fully refresh target tables.
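The first two causes above follow from how high-water-mark ingestion selects rows. The sketch below is a simplified illustration, not the connector's actual implementation: it shows that rows with a NULL cursor value are dropped before selection, and that only rows whose cursor value is strictly above the stored high-water mark are picked up.

```python
def incremental_batch(rows, cursor_column, high_water_mark):
    """Return the rows a run would ingest, plus the new high-water mark."""
    # Rows with a NULL cursor value are excluded entirely.
    candidates = [r for r in rows if r[cursor_column] is not None]
    # Only rows strictly above the stored high-water mark are selected.
    batch = [
        r for r in candidates
        if high_water_mark is None or r[cursor_column] > high_water_mark
    ]
    # The high-water mark only advances when new rows are found.
    new_hwm = max((r[cursor_column] for r in batch), default=high_water_mark)
    return batch, new_hwm

rows = [
    {"id": 1, "last_modified": "2024-01-01"},
    {"id": 2, "last_modified": None},          # skipped: NULL cursor value
    {"id": 3, "last_modified": "2024-01-03"},
]
# With a stored high-water mark of 2024-01-02, only id 3 is ingested;
# id 1 is below the mark and id 2 has no cursor value.
batch, hwm = incremental_batch(rows, "last_modified", "2024-01-02")
```

The same selection logic explains why a cursor column that never changes on update hides modified rows: their cursor values stay at or below the high-water mark.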

Deletion tracking not working

Symptom: Rows that are soft-deleted in the source are still present in the destination table after a pipeline run.

Cause: Soft-deletion tracking requires the deletion_condition parameter in your pipeline configuration. This parameter can be configured only through the API.

Resolution:

  1. Confirm that the deletion_condition parameter is set in your pipeline configuration and that its SQL expression correctly identifies soft-deleted rows. For example:

    JSON
    "deletion_condition": "deleted_at IS NOT NULL"
  2. Verify that the expression evaluates correctly against the actual data in the source table. Run the equivalent query directly against the source to confirm it returns the rows you expect to be deleted.

  3. If you recently added the deletion_condition to an existing pipeline, trigger a pipeline run and verify the destination table reflects the deletions after the run completes.
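For step 2, running the deletion expression directly against the source shows how many rows it matches. The table name below is a placeholder, and deleted_at follows the example expression above; substitute your own schema.

```sql
-- A nonzero count confirms the expression identifies soft-deleted rows.
SELECT COUNT(*) FROM my_source_table WHERE deleted_at IS NOT NULL;
```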

For the reference syntax of deletion_condition, see Deletion condition.