Enable change tracking in SQL Server

Preview

LakeFlow Connect is in gated Public Preview. To participate in the preview, contact your Databricks account team.

This article describes how to enable change tracking in SQL Server. Either change tracking or CDC is required for ingestion into Databricks. For guidance on which option to choose, see Change tracking vs. change data capture.

Enable change tracking on a database

Run the following, replacing <database-name> with the name of the database you want to enable change tracking on. Set CHANGE_RETENTION to the maximum time that the gateway is likely to be down. After this amount of time has passed, a full refresh is required to resume the gateway.

ALTER DATABASE <database-name> SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 3 DAYS, AUTO_CLEANUP = ON)

Enable change tracking on a table

Run the following, replacing <schema-name>.<table-name> with the name of the schema and table you want to enable change tracking on.

ALTER TABLE <schema-name>.<table-name> ENABLE CHANGE_TRACKING

Grant VIEW CHANGE TRACKING on ingested tables

In addition to the privileges described in the source setup, the database user needs the VIEW CHANGE TRACKING privilege on the ingested tables or on a schema that contains tables being tracked.

To grant schema-level permissions, run the following:

GRANT VIEW CHANGE TRACKING ON SCHEMA::<schema-name> TO <cdc-username>;

To grant table-level permissions, run the following:

GRANT VIEW CHANGE TRACKING ON OBJECT::<schema-name>.<table-name> TO <cdc-username>;

Set up DDL capture and schema evolution

The SQL Server connector can track the data definition language (DDL) on ingested database objects and apply relevant table schema changes to the destination tables or add new tables in case of full schema replication.

To perform DDL capture, additional database object setup is required (for example, internal tables, stored procedures, and triggers). There are two ways to do this:

  • Automatic: Requires additional privileges to be granted to the database user configured for ingestion into Databricks.

  • Manual: Choose this if you want a read-only user to perform DDL capture and schema evolution. Requires manual creation of database objects.

Option 1: Automatic setup for DDL capture

Run the following commands to grant automatic DDL capture privileges to the database user configured for ingestion into Databricks:

  • CREATE PROCEDURE on the database:

    GRANT CREATE PROCEDURE TO <cdc-username>;
    
  • CREATE TABLE on the database:

    GRANT CREATE TABLE TO <cdc-username>;
    
  • SELECT, EXECUTE, and INSERT on the schema:

    GRANT SELECT,INSERT,EXECUTE ON SCHEMA::dbo TO <cdc-username>;
    
  • ALTER on the database:

    GRANT ALTER ON DATABASE::<database-name> TO <cdc-username>;
    
  • ALTER on the schema or on all tables to ingest:

    GRANT ALTER ON SCHEMA::[<schema-name>] TO [<cdc-username>];
    

Option 2: Manual setup for DDL capture

If you want a read-only user to perform DDL capture and schema evolution, run the following commands to manually set up DDL capture:

CREATE TABLE "dbo"."replicate_io_audit_ddl_1"(
    "SERIAL_NUMBER" INT IDENTITY NOT NULL,
    "CURRENT_USER" NVARCHAR(128),
    "SCHEMA_NAME" NVARCHAR(128),
    "TABLE_NAME" NVARCHAR(128),
    "TYPE" NVARCHAR(30),
    "OPERATION_TYPE" NVARCHAR(30),
    "SQL_TXT" NVARCHAR(2000),
    "LOGICAL_POSITION" BIGINT,
    CONSTRAINT "replicate_io_audit_ddlPK" PRIMARY KEY("SERIAL_NUMBER","LOGICAL_POSITION")
)
ALTER TABLE dbo.replicate_io_audit_ddl_1 ENABLE CHANGE_TRACKING;
GRANT VIEW CHANGE TRACKING ON OBJECT::dbo.replicate_io_audit_ddl_1 TO [CdcUserName];
CREATE TRIGGER "replicate_io_audit_ddl_trigger_1"
            ON DATABASE
            AFTER ALTER_TABLE
            AS
            SET NOCOUNT ON;
        DECLARE @DbName nvarchar(255), @SchemaName nvarchar(max), @TableName nvarchar(255), @data XML, @operation NVARCHAR(30), @isCTEnabledDBLevel bit, @isCTEnabledTableLevel bit;
        SET @data = EVENTDATA();
        SET @DbName = DB_NAME();
        SET @SchemaName = @data.value(' (/EVENT_INSTANCE/SchemaName) [1] ', ' NVARCHAR(MAX) ');
        SET @TableName = @data.value(' (/EVENT_INSTANCE/ObjectName) [1] ', ' NVARCHAR(255) ');
        SET @operation = @data.value(' (/EVENT_INSTANCE/EventType) [1] ', ' NVARCHAR(30) ');
        SET @isCTEnabledDBLevel =
                (SELECT COUNT(*) FROM sys.change_tracking_databases WHERE database_id = DB_ID(@DbName));
        SET @isCTEnabledTableLevel = (SELECT COUNT(*)
                                      FROM sys.change_tracking_tables
                                      WHERE object_id = object_id('[' + @SchemaName + '].[' + @TableName + ']'));
        IF (@isCTEnabledDBLevel = 1 AND @isCTEnabledTableLevel = 1)
            BEGIN
                INSERT INTO dbo.replicate_io_audit_ddl_1 ("CURRENT_USER", "SCHEMA_NAME", "TABLE_NAME", "TYPE",
                                                          "OPERATION_TYPE", "SQL_TXT", "LOGICAL_POSITION")
                VALUES (SUSER_NAME(),
                        @data.value(' (/EVENT_INSTANCE/SchemaName) [1] ', ' NVARCHAR(128) '),
                        @data.value(' (/EVENT_INSTANCE/EventType) [1] ', ' NVARCHAR(128) '),
                        @data.value(' (/EVENT_INSTANCE/ObjectType) [1] ', ' NVARCHAR(30) '),
                        @data.value(' (/EVENT_INSTANCE/EventType) [1] ', ' NVARCHAR(30) '),
                        @data.value(' (/EVENT_INSTANCE/TSQLCommand/CommandText) [1] ', ' NVARCHAR(2000) '),
                        CHANGE_TRACKING_CURRENT_VERSION());
            END
GRANT VIEW DEFINITION ON DATABASE::[YourDatabaseName] TO [CdcUserName];