Title: Type 2 Slow Changing Dimension Tree
design pattern data structure:
Version: 1.0.0109.01
– January 9, 2018.
Objective: Architect,
design, and implementation review for type 2 Slow Changing Dimension (SCD) Tree
design pattern data structure. It has a simple Parent and Child input
SharePoint list. Uses SharePoint List OData ETL Extract, SQL ETL, and DAX
Tabular Cube in SQL DataStore Enterprise Data Warehouse (EDW).
Article Body: Very recently, I faced the business need to implement a solution collection to
support type 2 SCD Tree design pattern data structure. The requirements are as
follows.
-
Minimum to no data
stewards support to manipulate the data or report after deployment.
-
Repeatable across
different parent-child Tree structure data sets. E.g., Customer Hierarchy,
Sales Hierarchy, Region Hierarchy, Product Configuration Hierarchy, Territory
Manager Hierarchy, etc.
-
Proof of concept must
be applied to a practical business need.
-
Use SQL environment
(SSIS, SSAS, and SSRS), Tabular Cube, and Corporate SharePoint.
-
Supports
incremental builds based on the scheduled execution frequency.
In this
article, the readers should be knowledgeable about Tree Data Structure, Parent Child Adjacency List, SharePoint List, Microsoft Visual Studio (VS), SQL Server Integration Services
(SSIS), SQL Analysis Server (SSAS), SQL Tabular Modeling (SSAS), and Data Analysis eXpressions (DAX) language.
Utilized
Parent Child Adjacency SharePoint list to communicate the tree hierarchy
between end users and the EDW system. The SharePoint list is maintained by Data
Stewards. The list has limited access rights constraints controlled by SharePoint
security and the site administrator. The list uses SharePoint list row IDs to
minimize users’ impacts. Data Stewards edit and view texts (e.g., Customer Name).
The SharePoint list drop down assists to identify the Child node’s Parent for a
given row. Parent Child Adjacency SharePoint list maintenance is made easier. The
SharePoint list IDs, are hidden from the Data Stewards. The SharePoint list IDs
communicate the data structure’s details to EDW. SharePoint list configuration is out of scope
of this article.
I outlined
how to use OData
technology to ETL extract the SharePoint list data from SharePoint to EDW Data
Store in the following article. It is a generic solution that can be applied to
any tree data structure data set.
For
practical application, I am applying the type 2 SCD Tree design pattern data
structure to Customer Hierarchy. Customer Hierarchy is a tree structure
summarizing sales based on Customer Type, e.g., Domestic, International,
School, Governments. I refer to Type 2 SCD Customer Hierarchy as CHN (Customer
Hierarchy Normalized).
<Aside>
Normalized! This is a complex algorithm; a solution collection across multiple
technologies. There are no off-the-shelves solution I have found, yet. Type 2
SCD Customer Hierarchy at the end of my design is a scalable table with
incremental add-on rows. The incremental add-ons are based on frequency of
production execution schedule. The incremental add-on makes it normalized. A
Tabular Cube DAX dynamic Type 2 SCD Child-Parent resolve; built on
off-the-shelves Type 2 SCD practices will work, too. Thereafter, the data presentation
layer, in our case power BI and SSRS, must perform the heavy lifting that I perform
in SQL. However, presentation layer solution requires different business
requirements.
There
might be multiple design strategies to implement CHN. The following is my
current solution collection. Please feel free to share your inputs and
potential enhancements.
Best
practices adhere to as follows.
-
Dynamic
table creation in memory can become performance disadvantaged very fast with
large tables. Use temp tables as much as possible to minimize SQL execution
time memory demand.
-
Create
and destroy temp tables in the same work flow.
-
Have
temp tables in their own database. In this project referred to
“DataStore_Utility” database.
-
Keep
production DBA happy.
The CHN
table structure as follows.
CREATE TABLE [DataStore].[dbo].[CustomerHierarchyNormalized](
[NameKey] [int] IDENTITY(1,1) NOT NULL, -- Key
reference for Name
[ParentNameKey] [int] NULL, --
Key reference for Parent Name
[Name] [nvarchar](100) NULL, -- Name
[NameID] [int] NULL, -- Name ID assigned by
SharePoint
[ParentNameID] [int] NULL, --
Parent Name ID using SharePoint ID.
[NameCustomerAccountID] [nvarchar](20) NULL, -- A data steward and fact table reference
[Effectivity] [date] NULL, --
Start date for the valid information. Part of Type 2 SCD need.
[Expiry] [date] NULL) -- End date for the
valid information. Part of Type 2 SCD need.
ETL
Extract the SharePoint adjacency list assigned ID to staging and data store. They will be
needed during normalization, good source of debugging, and production issues
route cause analysis.
Sample data
as follows.
The
solution has three major components.
-
SharePoint
and EDW staging ETL Extract, covered in my prior article. Out of scope of this
article.
-
Data
Store ETL
-
Tabular
Cube model
Data Store ETL
VS
preparation steps.
1.
If
you do not have a project collection in VS solution, add a new Business
Intelligence, Integration, Integration Service, and name it
“<YourCompanyName>.EDW.ETL.Datastore”. In my case it is
“Adec.EDW.ETL.Datastore”.
2.
Add
a “CustomerHierarchyNormalized.dtsx” project in “Adec.EDW.ETL.Datastore”.
3.
Add
all the needed Connection Managers.
4.
Henceforth,
perform all the tasks in VS “CustomerHierarchyNormalized.dtsx” project.
The data
store ETL solution has multiple steps.
1-
Prepare
Temp Tables
2-
Track
changes and ETL Extract the Adjacency SharePoint list data
3-
Update
information for changes
4-
Get
the changes for descendants
5-
Update
the CHN table
6-
Clean
up
1-
Prepare
Temp Tables.
a.
Add
Sequence Container. Rename it “Prepare Temp Tables Sequence Container”. In the
Sequence Container add four Execute SQL Tasks. Rename them appropriately.
b. In each Execute
SQL Tasks connect to “DataStore_Utility” database. Copy and paste the following
code in the SQL Tasks, appropriately.
1st – Create a table to have initial CHM NameKey.
IF
OBJECT_ID('dbo.zTmp_CustomerHierarchyNormalized_NameKey')IS NOT NULL
BEGIN
TRUNCATE
TABLE [dbo].[zTmp_CustomerHierarchyNormalized_NameKey];
END
ELSE
BEGIN
CREATE
TABLE [dbo].[zTmp_CustomerHierarchyNormalized_NameKey](
[NameKey] INT NULL)
ON
[PRIMARY];
END
Next
- Create a table to have the added / changed CHM NameKey.
IF OBJECT_ID('dbo.zTmp_CustomerHierarchyNormalized_Added')IS
NOT NULL
BEGIN
TRUNCATE
TABLE [dbo].[zTmp_CustomerHierarchyNormalized_Added];
END
ELSE
BEGIN
CREATE
TABLE [dbo].[zTmp_CustomerHierarchyNormalized_Added](
[NameKey] INT NULL)
ON
[PRIMARY];
END
Next – Create a table to contain the added / changed CHN information.
IF
OBJECT_ID('dbo.zTmp_CustomerHierarchyNormalized_Updates')IS NOT NULL
BEGIN
TRUNCATE TABLE
[dbo].[zTmp_CustomerHierarchyNormalized_Updates];
END
ELSE
BEGIN
CREATE TABLE
[dbo].[zTmp_CustomerHierarchyNormalized_Updates](
[NameKey] [int]
NOT NULL,
[ParentNameKey]
[int] NULL,
[Name]
[nvarchar](100) NULL,
[NameID] [int]
NULL,
[ParentNameID]
[int] NULL,
[NameCustomerAccountID]
[nvarchar](20) NULL,
[Effectivity]
[date] NULL,
[Expiry] [date]
NULL)
ON [PRIMARY];
END
Next - Create a table to contain the added / changed CHN descendants.
IF OBJECT_ID('dbo.zTmp_CHN_Updates_Descendants')IS NOT NULL
BEGIN
TRUNCATE TABLE
[dbo].[zTmp_CHN_Updates_Descendants];
END
ELSE
BEGIN
CREATE TABLE
[dbo].[zTmp_CHN_Updates_Descendants](
[NameKey] [int]
NOT NULL,
[ParentNameKey]
[int] NOT NULL,
[Name]
[nvarchar](100) NULL,
[NameID] [int]
NULL,
[ParentNameID]
[int] NULL,
[NameCustomerAccountID]
[nvarchar](20) NULL,
[Effectivity] [date]
NULL,
[Expiry] [date]
NULL)
END
2-
Track
changes and ETL Extract the Adjacency SharePoint list data
a. Track the CHM present state.
After Sequence Counter add a Data Flow Task and rename it “Track Existing
CustomerHierarchyNormalized Data Flow Task”.
Hint:
The temp tables are created
during run time. Therefore, all the SSIS Data Flow interacting with temp tables
“Delay Validation” proprieties are set to True.
Add an OLE DB Source. Rename
it “Set the NameKey OLE DB Source”. With the OLE DB connection manager set to
Datastore, select SQL command, and copy / paste the following code in SQL
command.
SELECT [NameKey]
FROM
[dbo].[CustomerHierarchyNormalized]
Add an OLA DB Destination.
Rename it “Insert Destination”. Connect the “Set the NameKey OLE DB Source” to
“Insert Destination”. Save the information in the temp table
[dbo].[zTmp_CustomerHierarchyNormalized_NameKey].
b. Get the SharePoint adjacency list and persist the data steward’s changes from the previous
execution time. In the DataStore_Utility create a view, as follows.
CREATE
VIEW [dbo].[UpdateCustomerHierarchyNormalizedInit]
AS
/*** Get Parent and Child information for
CustomerHierarchyNormalized from
SharePoint List ***/
Select
cast(CustomerHierarchy.[Name]
as nvarchar(100)) AS 'Name'
, CustomerHierarchy.[ID] AS
'NameID'
, CASE WHEN parent.[Name] in
('-1', 'All') THEN NULL
ELSE
CustomerHierarchy.[ParentNameID]
END AS [ParentNameID]
,cast(CustomerHierarchy.[CustomerAccount]
AS nvarchar(20)) AS 'NameCustomerAccountID'
FROM [DataStore_Staging].[sp].[CustomerHierarchy]
CustomerHierarchy
inner join
[DataStore_Staging].[sp].[CustomerHierarchy] parent
on
parent.ID=CustomerHierarchy.ParentNameID
where CustomerHierarchy.Name not in
('All')
GO
<Aside> ‘-1’ and ‘All’ are
used in SharePoint
adjacency list to assist the data stewards visually and minimize errors.
While it is needed for legacy type 2 SCD, there is no need for them in this
design for the presentation layers. The users prefer not to see ‘-1’
and ‘All’ in the reports, Power BI, and Pivot tables.
After
“Track Existing CustomerHierarchyNormalized Data Flow Task” add a “Get
SharePoint Data Flow Task SCD”.
Add an OLE
DB Source. Rename it “CustomerHierarchyNormalized Row Count OLE DB Source”.
With the OLE DB connection manager set to DataStore_Utility, select SQL command, and
copy / paste the following code in SQL command.
SELECT [Name]
,[NameID]
,[ParentNameID]
,[NameCustomerAccountID]
FROM
[dbo].[UpdateCustomerHierarchyNormalizedInit]
After
“CustomerHierarchyNormalized Row Count OLE DB Source” add SSIS tool Slowly
Changing Dimension. Its Insert Destination is the
[DataStore].[dbo].[CustomerHierarchyNormalized].
c. Track the added / change CHN
data. After “Get SharePoint Data Flow Task SCD” add a Data Flow Task and rename
it “Track Added CustomerHierarchyNormalized Data Flow Task”.
Add an OLE
DB Source. Rename it “Set the NameKey OLE DB Source”. With the OLE DB
connection manager set to Datastore_Utility, select SQL command, and copy /
paste the following code in SQL command.
SELECT NowList.[NameKey]
FROM
[dbo].[zTmp_CustomerHierarchyNormalized_NameKey] AS InitList
INNER JOIN
[DataStore].[dbo].[CustomerHierarchyNormalized] AS NowList
ON
NowList.NameKey=InitList.[NameKey]
where InitList.[NameKey] != NowList.NameKey
Add an OLA DB Destination.
Rename it “Insert Destination”. Connect the “Set the NameKey OLE DB Source” to
“Insert Destination”. Save the information in the temp table [dbo].[zTmp_CustomerHierarchyNormalized_Added].
At this point during the
package execution the following temp table contains the NameKey of the added
changes to Customer Hierarchy Normalized table. Deleted rows are handled with
Expiry date. There are no delete rows in the SCD structure. Will address updates
at later time in this article.
[DataStore_Utility].[dbo].[zTmp_CustomerHierarchyNormalized_Added]
3-
Update
information for changes has two components.
a.
The
added rows do not have the correct ParentNameKey. In this step, update the
ParentNameKey in the following table.
[DataStore].[DBO].[CustomerHierarchyNormalized]
In the DataStore_Utility data base
create a new Stored Procedure as follows.
USE [DataStore_Utility]
GO
CREATE PROCEDURE
[dbo].[sp_CustomerHierarchyNormalized_UpdateAddedParentNameKey]
AS
UPDATE Child
set ParentNameKey
= Parent.NameKey
FROM
[DataStore].[DBO].[CustomerHierarchyNormalized]
Child
INNER
JOIN [DataStore].[DBO].[CustomerHierarchyNormalized] Parent
ON Child.ParentNameID =
Parent.NameID
INNER
JOIN [dbo].[zTmp_CustomerHierarchyNormalized_Added] Added
ON
Added.NameKey=Child.NameKey
GO
In VS after “Track
Added CustomerHierarchyNormalized Data Flow Task” add “Update Added
ParentNameKey Execute SQL Task”. Edit the SQL task and set the SQL Statement as
follows.
EXEC [dbo].[sp_CustomerHierarchyNormalized_UpdateAddedParentNameKey];
b.
Create
a new view as follows.
USE [DataStore_Utility]
GO
CREATE VIEW [dbo].[UpdateCustomerHierarchyNormalizedUpdates]
AS
SELECT
CHNAdded.[NameKey]
,[ParentNameKey]
,[Name]
,[NameID]
,[ParentNameID]
,[NameCustomerAccountID]
,[Effectivity]
,[Expiry]
FROM
[DataStore].[dbo].[CustomerHierarchyNormalized] AS CHN
INNER JOIN
[dbo].[zTmp_CustomerHierarchyNormalized_Added] AS CHNAdded on
CHNAdded.[NameKey]=CHN.NameKey
GO
After “Update Added
ParentNameKey Execute SQL Task” add “Get Added CustomerHierarchyNormalized
Details Flow Task”. Add “Set the NameKey OLE DB Source”. Select
DataStore_Utility, SQL command, and copy / paste the following code.
SELECT [NameKey]
,[ParentNameKey]
,[Name]
,[NameID]
,[ParentNameID]
,[NameCustomerAccountID]
,[Effectivity]
,[Expiry]
FROM
[dbo].[UpdateCustomerHierarchyNormalizedUpdates]
After “Set the NameKey OLE DB
Source” add “OLE DB Destination” and save the data in the following table.
[DataStore_Utility].[dbo].[zTmp_CustomerHierarchyNormalized_Updates].
4-
Get
the changes for descendants. In this section, identify the descendants for the
added changes. In the DataStore_Utility data base create a new Stored Procedure as
follows.
USE [DataStore_Utility]
GO
CREATE PROCEDURE
[dbo].[sp_CustomerHierarchyNormalized_AddedUpdatesDescendants]
AS
DECLARE @SCDDate
datetime = getdate();
WITH OrgUpdates AS
(
--Our
Changed Records
SELECT
Updates.[NameKey]
,Updates.[ParentNameKey]
,Updates.[Name]
,Updates.[NameID]
,Updates.[ParentNameID]
,Updates.[NameCustomerAccountID]
,Updates.[Effectivity]
,Updates.[Expiry]
FROM
[dbo].[zTmp_CustomerHierarchyNormalized_Updates] AS Updates
LEFT
JOIN [DataStore].[DBO].CustomerHierarchyNormalized AS CHN
on
Updates.NameID = CHN.NameID
and
@SCDDate > CHN.Effectivity
and
ISNULL(CHN.Expiry, '') = ''
UNION
ALL
--
Find Descendants
SELECT
CHN.[NameKey]
,CHN.[ParentNameKey]
,CHN.[Name]
,CHN.[NameID]
,CHN.[ParentNameID]
,CHN.[NameCustomerAccountID]
,CHN.[Effectivity]
,CHN.[Expiry]
FROM
[DataStore].[DBO].CustomerHierarchyNormalized AS CHN
INNER
JOIN OrgUpdates AS OrgUpdates
on
CHN.ParentNameID = OrgUpdates.NameID
and
@SCDDate > CHN.Effectivity
and
ISNULL(CHN.Expiry, '') = ''
)
INSERT INTO
[dbo].[zTmp_CHN_Updates_Descendants]
SELECT DISTINCT
[NameKey]
,[ParentNameKey]
,[Name]
,[NameID]
,[ParentNameID]
,[NameCustomerAccountID]
,[Effectivity]
,[Expiry]
FROM OrgUpdates
Delete CUD
from
[dbo].[zTmp_CHN_Updates_Descendants] CUD
Inner JOIN
[dbo].[zTmp_CustomerHierarchyNormalized_Updates] CHN on CHN.NameKey=CUD.NameKey
GO
In VS after “Get Added
CustomerHierarchyNormalized Details Flow Task” add “Get Added
Updates_Descendants Details Flow Task”. Edit the SQL task and set the SQL
Statement as follows.
EXEC
[dbo].[sp_CustomerHierarchyNormalized_AddedUpdatesDescendants];
5-
Update
the CHN table. There are three steps.
a.
Expire
effected data in CHN table. In the DataStore_Utility data base create a new Stored Procedure as
follows.
USE [DataStore_Utility]
GO
CREATE PROCEDURE
[dbo].[sp_CustomerHierarchyNormalized_ExireEffected]
AS
DECLARE @SCDDate
datetime = getdate();
UPDATE CHN
SET CHN.[Expiry] =
@SCDDate
FROM
[dbo].[zTmp_CHN_Updates_Descendants] AS CPD
INNER
JOIN [DataStore].[DBO].CustomerHierarchyNormalized AS CHN on CPD.NameKey =
CHN.NameKey
and
@SCDDate > CHN.Effectivity
and
ISNULL(CHN.Expiry, '') = ''
UPDATE
[dbo].[zTmp_CHN_Updates_Descendants]
SET [Effectivity]
= @SCDDate
GO
In VS after “Get Added
Updates_Descendants Details Flow Task” add “Expire Effected ROWS Execute SQL
Task”. Edit the SQL task and set the SQL Statement as follows.
EXEC
[dbo].[sp_CustomerHierarchyNormalized_ExireEffected];
b.
Insert
New CustomerHierarchyNormalized Details Flow Task. After “Expire Effected ROWS
Execute SQL Task” add Data Flow “Insert New CustomerHierarchyNormalized Details
Flow Task”. Add “Get New OLE DB Source”, with DataStore_Utility and SQL command
as follows.
SELECT [ParentNameKey]
,[Name]
,[NameID]
,[ParentNameID]
,[NameCustomerAccountID]
,[Effectivity]
,[Expiry]
FROM
[dbo].[zTmp_CHN_Updates_Descendants]
Add OLE DB Destination after Get
New OLE DB Source. Set the destination to CHN table as follows and map the
fields.
[DataStore].[dbo].[CustomerHierarchyNormalized]
The Destination Name key is not
mapped to anything. It is auto incremented primary key.
c.
Updates
Current ParentNameKey Execute SQL Task. The effected rows ParentNameKey require
updates.
Add a new Stored Procedure
as follows.
USE [DataStore_Utility]
GO
CREATE PROCEDURE
[dbo].[sp_CustomerHierarchyNormalized_UpdatesParentNameKey]
AS
DECLARE @SCDDate
datetime = getdate();
UPDATE Child
SET ParentNamekey
= Parent.NameKey
FROM
[DataStore].[dbo].[CustomerHierarchyNormalized] AS Child
INNER
JOIN [DataStore].[dbo].[CustomerHierarchyNormalized] AS Parent
ON
Child.ParentNameID=Parent.NameID
WHERE @SCDDate
> Child.Effectivity
and
ISNULL(Child.Expiry, '') = ''
and
@SCDDate > Parent.Effectivity
and
ISNULL(Parent.Expiry, '') = ''
GO
After “Insert New
CustomerHierarchyNormalized Details Flow Task” add “Updates Current
ParentNameKey Execute SQL Task”. Set the SQL Statement as follows.
EXEC
[dbo].[sp_CustomerHierarchyNormalized_UpdatesParentNameKey];
6-
Clean
up
a.
Add
Sequence Container. Rename it “Clean up Sequence Container” after “Updates
Current ParentNameKey Execute SQL Task”. In the Sequence Container add four
Execute SQL Tasks. Rename them appropriately.
b. In each Execute
SQL Tasks connect to “DataStore_Utility” database. Copy and paste the following
code in the SQL Tasks, appropriately.
First – Cleanup table for initial CHM NameKey.
IF OBJECT_ID('zTmp_CustomerHierarchyNormalized_NameKey')IS
NOT NULL
DROP TABLE [dbo].[zTmp_CustomerHierarchyNormalized_NameKey];
Next
- Cleanup table that have the added / changed CHM NameKey.
IF OBJECT_ID('zTmp_CustomerHierarchyNormalized_Added')IS NOT
NULL
DROP TABLE
[dbo].[zTmp_CustomerHierarchyNormalized_Added];
Next – Cleanup table that contains the added / changed CHN information.
IF OBJECT_ID('zTmp_CustomerHierarchyNormalized_Updates')IS
NOT NULL
DROP TABLE
[dbo].[zTmp_CustomerHierarchyNormalized_Updates];
Next - Create a table to contain the added / changed CHN descendants.
IF OBJECT_ID('
zTmp_CHN_Updates_Descendant')IS NOT NULL
DROP TABLE
[dbo].[zTmp_CHN_Updates_Descendant];
For
testing, feel free to use the following data for CHN.
The “Data
Store ETL” selection is complete. Let us focus on model, next.
Tabular Cube model
The tabular
data model must be updated to include the CHN table and its appropriate
measures. In this section, let us explore the solution together.
In the
tabular model, in my case “Adec.EDW.Datastore.Tabular.Finance” import the following
table.
[DataStore].[dbo].[CustomerHierarchyNormalized]
In the
model table view add the following columns. Copy and paste the DAX code for
each added column, respectively.
Identify
the nodes path, AKA, note heritage. Add Column “Path”.
DAX Code
as follows.
=path('Customer Hierarchy Normalized'[NameKey], 'Customer
Hierarchy Normalized'[ParentNameKey])
Calculate
the depth of node’s hierarchy. Add Column “HierarchyDepth”.
DAX Code
as follows.
=PATHLENGTH('Customer Hierarchy Normalized'[Path])
The level
1 through 8 identifies the node parent. Leaf through trunk of tree. If more
than eight levels are needed additional level need to be added to the model. Add
Column “Level1”.
DAX Code
as follows.
=LOOKUPVALUE (
'Customer
Hierarchy Normalized'[Name],
'Customer
Hierarchy Normalized'[NameKey],
PATHITEM
( 'Customer Hierarchy Normalized'[Path], 1, 1))
Add Column
“Level2”.
DAX Code
as follows.
=IF (
'Customer Hierarchy
Normalized'[HierarchyDepth] >= 2,
LOOKUPVALUE
(
'Customer
Hierarchy Normalized'[Name],
'Customer
Hierarchy Normalized'[NameKey],
PATHITEM
( 'Customer Hierarchy Normalized'[Path], 2, 1)),
'Customer Hierarchy
Normalized'[Level1]
)
Add Column
“Level3”.
DAX Code
as follows.
= IF (
'Customer Hierarchy
Normalized'[HierarchyDepth] >= 3,
LOOKUPVALUE
(
'Customer
Hierarchy Normalized'[Name],
'Customer
Hierarchy Normalized'[NameKey],
PATHITEM
( 'Customer Hierarchy Normalized'[Path], 3, 1)),
'Customer Hierarchy
Normalized'[Level2]
)
Add Column
“Level4”.
DAX Code
as follows.
=
IF (
'Customer Hierarchy
Normalized'[HierarchyDepth] >= 4,
LOOKUPVALUE
(
'Customer
Hierarchy Normalized'[Name],
'Customer
Hierarchy Normalized'[NameKey],
PATHITEM
( 'Customer Hierarchy Normalized'[Path], 4, 1)),
'Customer Hierarchy
Normalized'[Level3]
)
Add Column
“Level5”.
DAX Code
as follows.
= IF (
'Customer Hierarchy
Normalized'[HierarchyDepth] >= 5,
LOOKUPVALUE
(
'Customer
Hierarchy Normalized'[Name],
'Customer
Hierarchy Normalized'[NameKey],
PATHITEM
( 'Customer Hierarchy Normalized'[Path], 5, 1)),
'Customer Hierarchy
Normalized'[Level4]
)
Add Column
“Level6”.
DAX Code
as follows.
= IF (
'Customer Hierarchy
Normalized'[HierarchyDepth] >= 6,
LOOKUPVALUE
(
'Customer
Hierarchy Normalized'[Name],
'Customer
Hierarchy Normalized'[NameKey],
PATHITEM
( 'Customer Hierarchy Normalized'[Path], 6, 1)),
'Customer Hierarchy
Normalized'[Level5]
)
Add Column
“Level7”.
DAX Code
as follows.
= IF (
'Customer Hierarchy
Normalized'[HierarchyDepth] >= 7,
LOOKUPVALUE
(
'Customer
Hierarchy Normalized'[Name],
'Customer
Hierarchy Normalized'[NameKey],
PATHITEM
( 'Customer Hierarchy Normalized'[Path], 7, 1)),
'Customer Hierarchy
Normalized'[Level6]
)
Add Column
“Level8”.
DAX Code
as follows.
= IF (
'Customer Hierarchy
Normalized'[HierarchyDepth] >= 8,
LOOKUPVALUE
(
'Customer
Hierarchy Normalized'[Name],
'Customer Hierarchy
Normalized'[NameKey],
PATHITEM
( 'Customer Hierarchy Normalized'[Path], 8, 1)),
'Customer Hierarchy
Normalized'[Level7]
)
Add Column
“IsLeaf”.
DAX Code
as follows.
= CALCULATE (
COUNTROWS('Customer Hierarchy Normalized'),
All('Customer Hierarchy Normalized'),
'Customer Hierarchy Normalized'[ParentNameKey] =
EARLIER('Customer Hierarchy Normalized'[NameKey])
) = 0
In the
model diagram view, build the cube hierarchy for Customer Hierarchy Normalized.
Connect the Customer Hierarchy Normalized NameKey to the date dependent fact
table key for the given row.
The fact
table lookup must be date dependent. In my case, I use the following lookup
logic.
"select chn.Namekey as CustomerHierarchyKey,
chn.Name,
chn.NameCustomerAccountID,
dd.DateKey
from [dbo].[CustomerHierarchyNormalized] chn
INNER JOIN DateDimension dd ON dd.Date >=
chn.EFFECTIVITY
AND (dd.Date < chn.EXPIRY OR chn.EXPIRY IS
NULL)
AND dd.Date <= '" +
(DT_WSTR,25)(DT_DBDATE)@[User::MaxDate] +"'
AND dd.Date >= '" +
(DT_WSTR,25)(DT_DBDATE)@[User::MinDate] +"'"
The max
and min dates are resolved during runtime. Use Expression Builder to create the
date dependent look up. Thereafter, use the variable for lookup. In my case,
User:CustomerHierarchySQLCommand.
The Event
Handlers in SSIS package are very powerful and easy to use. I use the SQL
standard system variables to track and report error bread crumbs. At some later
time, the error table can be used for fault tolerant system architecture and
system error’s alerts. At this point, it can be a good source for the
production support team to perform some route cause analysis before contacting
me.
Type-2 SCD Tree Design Pattern is needed in
multiple areas. The above solution collection is a good and solid start.
Special thanks to
Tracie Wagner with her right questions and Robert Lefebvre encouragement to follow good productionability
practices.
Happy designing. Make
the computer dance!
Enjoy what you do, you
never work a day in your life.
Cheers, Toraj
<Aside>
Jan-16-2018. Applied the same pattern to the Sales Region Hierarchy Type 2 Slow Changing Dimension (SCD). The
pattern proved to be very repeatable and scalable.
References:
- Dynamically flatten Parent-Child Hierarchies in DAX and PowerBI by The BICCOUNTANT
- Dynamic Parent Child Security using DAX, by BI Monkey
- Joining to a Slowly Changing Dimension, by Darren Gosball
- Slowly-Changing Parent-Child Dimensions – Part 2: The Implementation by Simon Whiteley.