Troubleshoot SQL Server connections
Preview
LakeFlow Connect is in gated Public Preview. To participate in the preview, contact your Databricks account team.
Note
The SQL Server connector supports Azure SQL Database and Amazon RDS for SQL Server.
Generic troubleshooting steps
If a pipeline fails, click the step that failed and check whether the error message or code provides sufficient information about the nature of the error.
You can also check the cluster logs from the pipeline details page. Scan the logs for errors or exceptions, or download the log files locally for more detailed inspection.
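If you prefer to review events programmatically, the sketch below uses the Databricks SDK for Python to list recent pipeline events and print only the errors; the pipeline ID is a placeholder that you can copy from the pipeline details page.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Replace with the ID shown on the pipeline details page.
pipeline_id = "<pipeline-id>"

# Iterate over recent events and surface only error-level entries.
for event in w.pipelines.list_pipeline_events(pipeline_id=pipeline_id):
    if event.level and event.level.value == "ERROR":
        print(event.timestamp, event.message)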
Check if CDC is enabled for a database or a table
To check if CDC is enabled for database `<database-name>`:
select is_cdc_enabled from sys.databases where name='<database-name>';
To check if CDC is enabled for table `<schema-name>.<table-name>`:
select t.is_tracked_by_cdc
from sys.tables t join sys.schemas s on t.schema_id = s.schema_id
where s.name='<schema-name>' and t.name='<table-name>';
Check if change tracking is enabled for a database or a table
To check if change tracking is enabled for database `<database-name>`:
select ctdb.*
from sys.change_tracking_databases ctdb join sys.databases db
on db.database_id = ctdb.database_id
where db.name = '<database-name>';
To check if change tracking is enabled for table `<schema-name>.<table-name>`:
select s.name schema_name, t.name table_name, ct.*
from sys.change_tracking_tables ct join sys.tables t
on ct.object_id = t.object_id
join sys.schemas s on t.schema_id = s.schema_id
where s.name = '<schema-name>' and t.name = '<table-name>';
Time out on waiting for table token
The ingestion pipeline might time out while waiting for information to be provided by the gateway. This might be because:
You are running an older version of the gateway.
There was an error generating the necessary information. Check the gateway driver logs for errors.
default auth: cannot configure default credentials
If you receive this error, there is an issue discovering the current user's credentials. Try replacing the following:
w = WorkspaceClient()
with:
w = WorkspaceClient(host=input('Databricks Workspace URL: '), token=input('Token: '))
See Authentication in the Databricks SDK for Python documentation.
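Alternatively, the SDK's default credential chain can pick up the workspace URL and token from environment variables; a minimal sketch, with placeholder values:
import os

from databricks.sdk import WorkspaceClient

# Placeholders: substitute your workspace URL and a personal access token.
os.environ["DATABRICKS_HOST"] = "https://<databricks-workspace-url>"
os.environ["DATABRICKS_TOKEN"] = "<personal-access-token>"

# Default auth now resolves the credentials from the environment.
w = WorkspaceClient()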
tech.replicant.common.ExtractorException: com.microsoft.sqlserver.jdbc.SQLServerException: Invalid column name 'SERIAL_NUMBER'.
You might receive this error if you’re using an older version of an internal table. Run the following on the connected database:
drop table dbo.replicate_io_audit_ddl_trigger_1;
PERMISSION_DENIED: You are not authorized to create clusters. Please contact your administrator.
Contact a Databricks account admin to grant you Unrestricted cluster creation permissions.
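Before contacting an admin, you can check whether your user already has the entitlement. A minimal sketch with the Databricks SDK for Python, assuming the entitlement is named allow-cluster-create:
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# List the entitlements attached to the current user.
me = w.current_user.me()
entitlements = [e.value for e in (me.entitlements or [])]

# allow-cluster-create corresponds to unrestricted cluster creation.
print("allow-cluster-create" in entitlements)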
org.apache.spark.sql.catalyst.ExtendedAnalysisException
Error message:
Ingestion pipeline error: “org.apache.spark.sql.catalyst.ExtendedAnalysisException: Cannot have multiple queries named `XYZ_snapshot_load` for `XYZ`. Additional queries on that table must be named. Note that unnamed queries default to the same name as the table.”
Issue: This indicates a name conflict: multiple source tables named `XYZ` in different source schemas are being ingested by the same ingestion pipeline into the same destination schema.
Solution: Create multiple gateway-ingestion pipeline pairs writing these conflicting tables to different destination schemas.
Incompatible schema changes
An incompatible schema change causes the ingestion pipeline to fail with an `INCOMPATIBLE_SCHEMA_CHANGE` error. To continue replication, trigger a full refresh of the affected tables.
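If you prefer to trigger the refresh programmatically, the following sketch uses the Databricks SDK for Python; the pipeline ID and table name are placeholders, and the table name should match how the table appears in the ingestion pipeline.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Start a pipeline update that fully refreshes only the affected tables.
# Replace the placeholders with your ingestion pipeline ID and table name(s).
w.pipelines.start_update(
    pipeline_id="<ingestion-pipeline-id>",
    full_refresh_selection=["<table-name>"],
)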
Note
Databricks cannot guarantee that, at the time the ingestion pipeline fails due to an incompatible schema change, all rows prior to the schema change have been ingested.