Enable change tracking in Microsoft SQL Server
The Microsoft SQL Server connector is in Public Preview.
This page provides instructions to enable change tracking in Microsoft SQL Server. You must enable either change tracking or change data capture (CDC) to use the SQL Server connector. Databricks recommends using change tracking for any table that has a primary key to minimize the load on the source database. If change tracking and CDC are both enabled, the connector uses change tracking. For guidance on which option to choose, see Change tracking vs. change data capture.
Enable change tracking on a database
Run the following, replacing <database-name>
with the name of the database that you want to enable change tracking on. Set CHANGE_RETENTION
to the maximum time that the gateway is likely to be down. After this amount of time has passed, a full refresh is required to resume the gateway.
ALTER DATABASE <database-name> SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 3 DAYS, AUTO_CLEANUP = ON)
Enable change tracking on a table
Run the following, replacing <schema-name>.<table-name>
with the name of the schema and table that you want to enable change tracking on.
ALTER TABLE <schema-name>.<table-name> ENABLE CHANGE_TRACKING
Grant VIEW CHANGE TRACKING on ingested tables
In addition to the privileges described in the source setup, the database user needs the VIEW CHANGE TRACKING
privilege on the ingested tables or on a schema that contains tables being tracked.
To grant schema-level permissions, run the following:
GRANT VIEW CHANGE TRACKING ON SCHEMA::<schema-name> TO <database-user>;
To grant table-level permissions, run the following:
GRANT VIEW CHANGE TRACKING ON OBJECT::<schema-name>.<table-name> TO <database-user>;
Set up DDL capture and schema evolution
The SQL Server connector can track the data definition language (DDL) on ingested database objects and apply relevant table schema changes to the destination tables or add new tables in case of full schema replication.
To perform DDL capture, additional database object setup is required (for example, internal tables, stored procedures, and triggers). The Transact-SQL (T-SQL) script provided in this article (ddl_support_objects.sql
) drops any pre-existing DDL support objects and creates the DDL support objects required to capture DDL changes happening on the database.
-
Download the ddl_support_objects.sql script.
-
Modify the script to set the
mode
value:BOTH
: Initializes both CT and CDC objects (default)CT
: Initializes CT objectsCDC
: Initializes CDC objectsNONE
: Deletes all pre-existing CT and CDC objects
-
(Recommended) Optionally, modify the script to set the
replicationUser
variable to your SQL Server database user.If
replicationUser
is defined, the script grants all privileges that are required to work with the DDL support objects to the user. Otherwise, you must grant each privilege manually. -
Run the script on each database that you want to ingest.
importantDo not run the script on the master database.
If you run the script in a third-party tool, select the entire script before you run it.
Change tracking privilege requirements
If you set the replicationUser
variable in the script, the script grants the required privileges on the DDL support objects to the database user. The required privileges are:
VIEW CHANGE TRACKING
on thelakeflowDdlAudit_1_1
objectVIEW DEFINITION
on the database that you want to ingest
If replicationUser
is unset in the script, you must manually grant the required privileges for change tracking. To do this, run the following T-SQL commands, replacing <database-user>
:
GRANT VIEW CHANGE TRACKING ON OBJECT::dbo.lakeflowDdlAudit_1_1 TO <database-user>;
GRANT VIEW DEFINITION ON DATABASE::<database-name> TO <database-user>;