SQL Server 2016: Temporal Tables. Introduction
SQL Server 2016 CTP2 finally available for the public and it brings many new features, which are worth checking. However, one of those new initiatives, I believe, can bring some discussions in DWH/BI environments. The name of it is Temporal table.
Foreword
It seems, the feature can bring another naming confusion. Currently SQL Server ships temporary tables and table variables. But, with a new release, temporal table is a new guy in a toolbox, however, the usage of the new tool is completely different:
A temporal table is a table for which a PERIOD definition exists and which contains system columns with a datatype of datetime2 into which the period of validity is recorded by the system, and which has an associated history table into which the system records all prior versions of each record with their period of validity. With a temporal table, the value of each record at any point in time can be determined, rather than just the current value of each record. A temporal table is also referred to as a system-versioned table. Read more in online documentation.
Playground preparation
First of all, a new database has to be created with a compatibility level SQL 2016 (or 130):
-- step 1: Create database, tables and seed dummy data CREATE DATABASE SQL2016Demo; GO USE SQL2016Demo; CREATE TABLE dbo.Product ( ProductID INT IDENTITY PRIMARY KEY, ProductName VARCHAR(100) NOT NULL ) CREATE TABLE dbo.Project ( ProjectID INT IDENTITY PRIMARY KEY, ProjectName VARCHAR(100) NOT NULL, ProductID INT FOREIGN KEY REFERENCES dbo.Product(ProductID) ) GO INSERT INTO dbo.Product (ProductName) VALUES ('Product A'); INSERT INTO dbo.Project (ProjectName, ProductID) VALUES ('Project A' , ( SELECT ProductID FROM Product WHERE ProductName = 'Product A' ) ); SELECT * FROM dbo.Product; SELECT * FROM dbo.Project; GO
As a result, very simple data model prepared with Just one row per entity:
Meet a new friend – “system versioning”
The most important part starts by executing the following block of the code:
-- step 2: Enable system versioning BEGIN TRAN -- Product ALTER TABLE dbo.Product ADD PERIOD FOR SYSTEM_TIME (Valid_From, Valid_Till), Valid_From datetime2 GENERATED ALWAYS AS ROW START NOT NULL DEFAULT GETUTCDATE(), Valid_Till datetime2 GENERATED ALWAYS AS ROW END NOT NULL DEFAULT CONVERT(DATETIME2, '9999-12-31 23:59:59.9999999'); ALTER TABLE dbo.Product SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.Product_History, DATA_CONSISTENCY_CHECK = ON)) -- Project ALTER TABLE dbo.Project ADD PERIOD FOR SYSTEM_TIME (Valid_From, Valid_Till), Valid_From datetime2 GENERATED ALWAYS AS ROW START NOT NULL DEFAULT GETUTCDATE(), Valid_Till datetime2 GENERATED ALWAYS AS ROW END NOT NULL DEFAULT CONVERT(DATETIME2, '9999-12-31 23:59:59.9999999'); ALTER TABLE dbo.Project SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.Project_History, DATA_CONSISTENCY_CHECK = ON)) COMMIT TRAN GO -- check table content again: SELECT * FROM dbo.Product; SELECT * FROM dbo.Project;
As you can see, two system-maintained columns were added to both tables:
Yes, that is right, since columns are system-maintained they hold non-static values and SQL Server will keep them actual on every DML operations. It can be shown in the following example:
-- step 3: update data in the tables BEGIN TRAN UPDATE dbo.Product SET ProductName = 'Product B' WHERE ProductID = 1 INSERT INTO dbo.Project (ProjectName, ProductID) VALUES ('Project B', 1 ); COMMIT TRAN SELECT * FROM dbo.Product; SELECT * FROM dbo.Project;
As a result, the name of the product changed to “Product B” and the new project created. While data in a second dataset looks valid and both operations are reflected, data of the first dataset represents the current state only, in other words, the previous version of the row is not appearing. The reason is simple – by default, only current and active members are queryable. If the previous version of the record has to be retrieved and shown, special construction can be used as a part of the FROM clause:
-- Step 4: Query historic data SELECT * FROM dbo.Product FOR SYSTEM_TIME FROM '2000-01-01' to '9999-12-31' Order by Valid_From;
This time, the result looks like expected and both versions of the row retrieved.
Making it work
Let’s put all information together and solve the simple task: perform a join of entities and retrieve historic information, so data should be valid for some particular point in time. The generic query looks a little bit overwhelmed, since sub-queries to versioned data written as derived tables. The first dataset is retrieving the state of data at the moment of initial seeding:
DECLARE @dtPointInHistory DATETIME2 = '2020-02-13 14:36:48' SELECT prj.ProjectID, prj.ProjectName, prd.ProductName -- wrapper for a project FROM ( SELECT ProjectID, ProjectName, ProductID FROM dbo.Project FOR SYSTEM_TIME AS OF @dtPointInHistory ) prj -- wrapper for a product JOIN ( SELECT ProductID, ProductName FROM dbo.Product FOR SYSTEM_TIME AS OF @dtPointInHistory ) prd ON prj.ProductID = prd.ProductID
The result looks correct, since only one row per entity had added and product still had name: “Product A”.
To retrieve the actual state of data, current timestamp assigned to the variable:
DECLARE @dtPointInHistory DATETIME2 = GETUTCDATE()
The result looks as expected again – product name changed and two projects assigned to that single product.
In the next post, I going to cover how temporal tables can deal with two other common potential pain points: model schema change and performance.