1. CodersCay »
  2. SSAS »
  3. Different Types of SCD (Slowly Changing Dimensions) with example

Published On: 6/29/2022

CodersCay Logo

Different Types of SCD (Slowly Changing Dimensions) with example

What is SCD (Slowly Changing Dimension) and types in Data Warehouse

Slowly Changing Dimensions (SCD) are actual dimensions in data warehouse database and mainly used to maintaining or tracking different level of slowly changeable data from source.
Types of SCD (Slowly Changing Dimensions)

Data warehouse is mainly focused to analyse the business statistics by reports which holds multiple subset information about the business such as products, locations, employees, customers, sales and so on. The data warehouse database consist of two objects such as Dimensions and Facts and those data actually coming from real time database (OLTP - OnLine Analytical Processing). Dimensions are hold textual description and Fact table contains measures of sales records. In order to view the historical report from data warehouse database we need to track the data changes in the dimension attributes. Say for example a customer has relocated his residential to many places, he has ordered the products from each location and now we have to analyse the sales metrics for the each location. So we need slowly changing dimension concept to identify the data changes from the source and we have six different options to track the changes.

SCD TypeDescription
Type - 0Retaining Original Data
Type - 1Overwriting on Existing Data
Type - 2Adding new records on same table with flag
Type - 3Adding new column on same table
Type - 4Adding new records on new table (History Table)
Type - 6Hybrid approach (Type 1 + Type 2 + Type 3)

Each type is having some advantage and disadvantages.

SCD Type 0: Retaining Original

This dimension attributes never change and holds the static data as it was inserted at first time and the remaining attributes may be overwritten. For example, if we are handling customer data in our database and few data would not be changed after insertion such as Date of Birth and SSN number. This type mostly preferred for Date dimension attributes because the attributes value would not be modified after insertion.

SCD Type 1: Overwrite on existing

This type overwrites the old data with new data and this will not track the historical data in the dimension.

For Instance in customer dimension, the customer may change his residential from one place to another and the address information will be tracked based on the transferential.

Old Data
Cust_IDCustomer NameAddress LineCityZip CodeState
1001BaskaranR122, East West StreetChennai600001TamilNadu
1002George13A, Church Circle RoadChennai600051TamilNadu
1xxx-------------------------------------

New Data
Cust_IDCustomer NameAddress LineCityZip CodeState
1001BaskaranR521, North South StreetChennai600002TamilNadu
1002George13A, Church Circle RoadChennai600051TamilNadu
1xxx-------------------------------------

In the above example, the customer has relocated from one place to another and the recent address has been overwritten on the existing records. But the negative thing is we cannot retrieve his previous address information from this scenario.

SCD Type 2: Adding new Records

This type is satisfying the previous minus that track the historical data by creating new entry on the same dimension table. The recent and previous data will be identified by flag or start date and end date fields on same dimension table.

Cust_IDCustomer NameAddress LineCityZip CodeStateActive
1001BaskaranR122, East West StreetChennai600001TamilNaduY
1001BaskaranR521, North South StreetChennai600002TamilNaduN
1002George13A, Church Circle RoadChennai600051TamilNaduY
1xxx-------------------------------------Y

From the above example, the customer recent address has been inserted as new record and previous address retained in the same table. Both records are identified by using active flag and the flag Y is indicating the customer current address and N is representing the previous address.

SCD Type 3: Adding new Fields

This type tracks the recent changes in separate field on same table and this type is preserving limited history of data.

Cust_IDCustomer NameCurrent AddressPrevious AddressCityZip CodeState
1001BaskaranR122, East West Street521, North South StreetChennai600001TamilNadu
1002George13A, Church Circle RoadChennai600051TamilNadu
1xxx-------------------------------------

Based on the above instance, the previous address has been moved from current address and the recent address placed on current address field. This type would track the most recent historical changes and we cannot retrieve few old records from this scenario.

SCD Type 4: Adding new Table (History Table)

This type of tracks having separate table to track the changes from source and usually this method called as history table. We can retrieve all the history changes from this table if required. This method is advantageous for high volatile or frequently used attributes in very large size dimension.

Customer Table
Cust_IDCustomer NameAddress LineCityZip CodeState
1001BaskaranR122, East West StreetChennai600001TamilNadu
1002George13A, Church Circle RoadChennai600051TamilNadu
1xxx-------------------------------------

Customer History Table
Cust_IDCustomer NameAddress LineCityZip CodeStateCreateDate
1001BaskaranR521, North South StreetChennai600001TamilNadu25-JUN-2013

Based on the above example, the historical data will be tracked on separate table which holds recent and previous data identified by date field.

SCD Type 5: (Type 4 + Type 1)

This type is combination of type 1, type 4 methods and type 1 dimension table on top of the type 4 design of base and history dimension. The Type 1 dimension table is joined directly to the base dimension without having to join directly to the fact table. This will hold current state attributes such as current address for the customer. This constrains the ETL team to update this Type 1 table as opposed to the whole base table on any changes.

SCD Type 6: Hybrid Approach (Type 1 + Type 2 + Type 3)

It an another approach to handle the data changes from source which is hybrid of type 1, type 2 and type 3 methods.

Cust_IDCustomer NameAddress LineCityZip CodeStateEffectiveFromEffectiveToIsActive
1001BaskaranR122, East West StreetChennai600001TamilNadu26-JUN-2013Y
1001BaskaranR521, North South StreetChennai600001TamilNadu10-AUG-200925-JUN-2013N
1002George13A, Church Circle RoadChennai600051TamilNadu6-SEP-2009Y
1xxx--------------------------------xx-xxx-xxxxY

As shown in above example, the history data tracked in new row which is identified by start date and end date fields. Additionally we can have isActive flag field to identify the current record from the list. We can maintain a history of all the changes whilst simultaneously updating the current value on existing records.


1 comment: