SQL Server connector FAQs

This page answers frequently asked questions about the SQL Server connector in Databricks Lakeflow Connect.

General managed connector FAQs

The answers in Managed connector FAQs apply to all managed connectors in Lakeflow Connect. Keep reading for connector-specific FAQs.

If the pipeline fails, does ingestion resume without data loss?

Yes. Databricks tracks what the connector has extracted from the source and applied in the destination, so after a failure it can resume from that point, as long as the transaction logs remain on the source database. However, if the pipeline does not run again before the log retention period deletes those logs, a full refresh of the target tables is required.
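Because recovery depends on the logs surviving until the next pipeline run, it can help to check and, if needed, extend the CDC cleanup retention on the source database. A hedged T-SQL sketch (the `msdb.dbo.cdc_jobs` table and `sys.sp_cdc_change_job` procedure are standard SQL Server CDC administration objects; the 7-day value is only an illustration):

```sql
-- Check the current CDC cleanup retention (in minutes).
SELECT job_type, retention
FROM msdb.dbo.cdc_jobs
WHERE job_type = N'cleanup';

-- Extend retention to 7 days (10080 minutes) so changes survive longer
-- between ingestion runs. Adjust to your own recovery window.
EXEC sys.sp_cdc_change_job
    @job_type = N'cleanup',
    @retention = 10080;
```

Longer retention increases log storage on the source, so balance it against the expected maximum gap between pipeline runs.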

Does the connector capture timezones for date and time columns?

No. Date and time values are ingested in UTC.

Can I customize the schedule of the ingestion gateway?

No. The ingestion gateway must run in continuous mode to avoid dropping changes when the source's log retention period expires. If changes have been dropped, a full refresh is required for all tables.

How does the connector handle a table without a primary key?

The connector treats all columns except large object (LOB) columns as a combined primary key. If the source table contains duplicate rows, they are ingested as a single row in the destination table.
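Since key-less tables get this deduplicating behavior, it can be useful to find them before configuring ingestion. A sketch using standard SQL Server catalog views (run on the source database):

```sql
-- List user tables that lack a primary key.
SELECT s.name AS schema_name, t.name AS table_name
FROM sys.tables AS t
JOIN sys.schemas AS s
    ON s.schema_id = t.schema_id
WHERE NOT EXISTS (
    SELECT 1
    FROM sys.indexes AS i
    WHERE i.object_id = t.object_id
      AND i.is_primary_key = 1
)
ORDER BY s.name, t.name;
```

Tables returned by this query are candidates for adding a primary key on the source if you need duplicate rows preserved in the destination.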

How often can I schedule the ingestion pipeline to run?

There is no limit on how often the ingestion pipeline can be scheduled to run. However, Databricks recommends leaving at least 5 minutes between runs because serverless compute takes some time to start up. Databricks doesn't support running the ingestion pipeline in continuous mode.

Why am I not seeing all of the rows from my database in the initial pipeline run?

The ingestion gateway extracts historical and CDC data as soon as it starts running. The ingestion pipeline might run before all of this data has been extracted, resulting in only a partial application of data to the target tables. It can take a few runs of the ingestion pipeline before all of the data is extracted and applied to the target tables.

Can I ingest from a read replica or a secondary instance?

No. Support is limited to primary SQL Server instances. This is because change tracking and change data capture are not supported on read replicas or secondary instances.
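Because change tracking and change data capture must be enabled on the primary instance, you can verify their status there before configuring the connector. A sketch using standard SQL Server catalog views:

```sql
-- Check whether CDC is enabled at the database level.
SELECT name, is_cdc_enabled
FROM sys.databases;

-- Check which databases have change tracking enabled,
-- along with their retention settings.
SELECT DB_NAME(database_id) AS database_name,
       retention_period,
       retention_period_units_desc
FROM sys.change_tracking_databases;
```

If neither feature is enabled on the primary, enable one of them there before pointing the connector at the instance.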