Enable change tracking in SQL Server
Preview
LakeFlow Connect is in gated Public Preview. To participate in the preview, contact your Databricks account team.
This article describes how to enable change tracking in SQL Server. Either change tracking or CDC is required for ingestion into Databricks. For guidance on which option to choose, see Change tracking vs. change data capture.
Enable change tracking on a database
Run the following, replacing <database-name>
with the name of the database you want to enable change tracking on. Set CHANGE_RETENTION
to the maximum time that the gateway is likely to be down. After this amount of time has passed, a full refresh is required to resume the gateway.
ALTER DATABASE <database-name> SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 3 DAYS, AUTO_CLEANUP = ON)
Enable change tracking on a table
Run the following, replacing <schema-name>.<table-name>
with the name of the schema and table you want to enable change tracking on.
ALTER TABLE <schema-name>.<table-name> ENABLE CHANGE_TRACKING
Grant VIEW CHANGE TRACKING on ingested tables
In addition to the privileges described in the source setup, the database user needs the VIEW CHANGE TRACKING
privilege on the ingested tables or on a schema that contains tables being tracked.
To grant schema-level permissions, run the following:
GRANT VIEW CHANGE TRACKING ON SCHEMA::<schema-name> TO <cdc-username>;
To grant table-level permissions, run the following:
GRANT VIEW CHANGE TRACKING ON OBJECT::<schema-name>.<table-name> TO <cdc-username>;
Set up DDL capture and schema evolution
The SQL Server connector can track the data definition language (DDL) on ingested database objects and apply relevant table schema changes to the destination tables or add new tables in case of full schema replication.
To perform DDL capture, additional database object setup is required (for example, internal tables, stored procedures, and triggers). There are two ways to do this:
Automatic: Requires additional privileges to be granted to the database user configured for ingestion into Databricks.
Manual: Choose this if you want a read-only user to perform DDL capture and schema evolution. Requires manual creation of database objects.
Option 1: Automatic setup for DDL capture
Run the following commands to grant automatic DDL capture privileges to the database user configured for ingestion into Databricks:
CREATE PROCEDURE
on the database:GRANT CREATE PROCEDURE TO <cdc-username>;
CREATE TABLE
on the database:GRANT CREATE TABLE TO <cdc-username>;
SELECT
,EXECUTE
, andINSERT
on the schema:GRANT SELECT,INSERT,EXECUTE ON SCHEMA::dbo TO <cdc-username>;
ALTER
on the database:GRANT ALTER ON DATABASE::<database-name> TO <cdc-username>;
ALTER
on the schema or on all tables to ingest:GRANT ALTER ON SCHEMA::[<schema-name>] TO [<cdc-username>];
Option 2: Manual setup for DDL capture
If you want a read-only user to perform DDL capture and schema evolution, run the following commands to manually set up DDL capture:
CREATE TABLE "dbo"."replicate_io_audit_ddl_1"(
"SERIAL_NUMBER" INT IDENTITY NOT NULL,
"CURRENT_USER" NVARCHAR(128),
"SCHEMA_NAME" NVARCHAR(128),
"TABLE_NAME" NVARCHAR(128),
"TYPE" NVARCHAR(30),
"OPERATION_TYPE" NVARCHAR(30),
"SQL_TXT" NVARCHAR(2000),
"LOGICAL_POSITION" BIGINT,
CONSTRAINT "replicate_io_audit_ddlPK" PRIMARY KEY("SERIAL_NUMBER","LOGICAL_POSITION")
)
ALTER TABLE dbo.replicate_io_audit_ddl_1 ENABLE CHANGE_TRACKING;
GRANT VIEW CHANGE TRACKING ON OBJECT::dbo.replicate_io_audit_ddl_1 TO [CdcUserName];
CREATE TRIGGER "replicate_io_audit_ddl_trigger_1"
ON DATABASE
AFTER ALTER_TABLE
AS
SET NOCOUNT ON;
DECLARE @DbName nvarchar(255), @SchemaName nvarchar(max), @TableName nvarchar(255), @data XML, @operation NVARCHAR(30), @isCTEnabledDBLevel bit, @isCTEnabledTableLevel bit;
SET @data = EVENTDATA();
SET @DbName = DB_NAME();
SET @SchemaName = @data.value(' (/EVENT_INSTANCE/SchemaName) [1] ', ' NVARCHAR(MAX) ');
SET @TableName = @data.value(' (/EVENT_INSTANCE/ObjectName) [1] ', ' NVARCHAR(255) ');
SET @operation = @data.value(' (/EVENT_INSTANCE/EventType) [1] ', ' NVARCHAR(30) ');
SET @isCTEnabledDBLevel =
(SELECT COUNT(*) FROM sys.change_tracking_databases WHERE database_id = DB_ID(@DbName));
SET @isCTEnabledTableLevel = (SELECT COUNT(*)
FROM sys.change_tracking_tables
WHERE object_id = object_id('[' + @SchemaName + '].[' + @TableName + ']'));
IF (@isCTEnabledDBLevel = 1 AND @isCTEnabledTableLevel = 1)
BEGIN
INSERT INTO dbo.replicate_io_audit_ddl_1 ("CURRENT_USER", "SCHEMA_NAME", "TABLE_NAME", "TYPE",
"OPERATION_TYPE", "SQL_TXT", "LOGICAL_POSITION")
VALUES (SUSER_NAME(),
@data.value(' (/EVENT_INSTANCE/SchemaName) [1] ', ' NVARCHAR(128) '),
@data.value(' (/EVENT_INSTANCE/EventType) [1] ', ' NVARCHAR(128) '),
@data.value(' (/EVENT_INSTANCE/ObjectType) [1] ', ' NVARCHAR(30) '),
@data.value(' (/EVENT_INSTANCE/EventType) [1] ', ' NVARCHAR(30) '),
@data.value(' (/EVENT_INSTANCE/TSQLCommand/CommandText) [1] ', ' NVARCHAR(2000) '),
CHANGE_TRACKING_CURRENT_VERSION());
END
GRANT VIEW DEFINITION ON DATABASE::[YourDatabaseName] TO [CdcUserName];