Tuesday, January 9, 2018

Title: Type 2 Slow Changing Dimension Tree design pattern data structure:

Version: 1.0.0109.01 – January 9, 2018.

Objective: Architect, design, and implementation review for type 2 Slow Changing Dimension (SCD) Tree design pattern data structure. It has a simple Parent and Child input SharePoint list. Uses SharePoint List OData ETL Extract, SQL ETL, and DAX Tabular Cube in SQL DataStore Enterprise Data Warehouse (EDW).

Article Body: Very recently, I faced the business need to implement a solution collection to support type 2 SCD Tree design pattern data structure. The requirements are as follows.
-          Minimum to no data stewards support to manipulate the data or report after deployment.
-          Repeatable across different parent-child Tree structure data sets. E.g., Customer Hierarchy, Sales Hierarchy, Region Hierarchy, Product Configuration Hierarchy, Territory Manager Hierarchy, etc.
-          Proof of concept must be applied to a practical business need.
-          Use SQL environment (SSIS, SSAS, and SSRS), Tabular Cube, and Corporate SharePoint.
-          Supports incremental builds based on the scheduled execution frequency.

In this article, the readers should be knowledgeable about Tree Data Structure, Parent Child Adjacency List, SharePoint List, Microsoft Visual Studio (VS), SQL Server Integration Services (SSIS), SQL Analysis Server (SSAS), SQL Tabular Modeling (SSAS), and Data Analysis eXpressions (DAX) language.

Utilized Parent Child Adjacency SharePoint list to communicate the tree hierarchy between end users and the EDW system.  The SharePoint list is maintained by Data Stewards. The list has limited access rights constraints controlled by SharePoint security and the site administrator. The list uses SharePoint list row IDs to minimize users’ impacts. Data Stewards edit and view texts (e.g., Customer Name). The SharePoint list drop down assists to identify the Child node’s Parent for a given row. Parent Child Adjacency SharePoint list maintenance is made easier. The SharePoint list IDs, are hidden from the Data Stewards. The SharePoint list IDs communicate the data structure’s details to EDW.  SharePoint list configuration is out of scope of this article.

I outlined how to use OData technology to ETL extract the SharePoint list data from SharePoint to EDW Data Store in the following article. It is a generic solution that can be applied to any tree data structure data set.

For practical application, I am applying the type 2 SCD Tree design pattern data structure to Customer Hierarchy. Customer Hierarchy is a tree structure summarizing sales based on Customer Type, e.g., Domestic, International, School, Governments. I refer to Type 2 SCD Customer Hierarchy as CHN (Customer Hierarchy Normalized).

<Aside> Normalized! This is a complex algorithm; a solution collection across multiple technologies. There are no off-the-shelves solution I have found, yet. Type 2 SCD Customer Hierarchy at the end of my design is a scalable table with incremental add-on rows. The incremental add-ons are based on frequency of production execution schedule. The incremental add-on makes it normalized. A Tabular Cube DAX dynamic Type 2 SCD Child-Parent resolve; built on off-the-shelves Type 2 SCD practices will work, too. Thereafter, the data presentation layer, in our case power BI and SSRS, must perform the heavy lifting that I perform in SQL. However, presentation layer solution requires different business requirements.

There might be multiple design strategies to implement CHN. The following is my current solution collection. Please feel free to share your inputs and potential enhancements.  

Best practices adhere to as follows.
-          Dynamic table creation in memory can become performance disadvantaged very fast with large tables. Use temp tables as much as possible to minimize SQL execution time memory demand.
-          Create and destroy temp tables in the same work flow.
-          Have temp tables in their own database. In this project referred to “DataStore_Utility” database.
-          Keep production DBA happy.

The CHN table structure as follows.
CREATE TABLE [DataStore].[dbo].[CustomerHierarchyNormalized](
            [NameKey] [int] IDENTITY(1,1) NOT NULL,            -- Key reference for Name
            [ParentNameKey] [int] NULL, -- Key reference for Parent Name
            [Name] [nvarchar](100) NULL,           -- Name
            [NameID] [int] NULL, -- Name ID assigned by SharePoint
            [ParentNameID] [int] NULL,   -- Parent Name ID using SharePoint ID.
            [NameCustomerAccountID] [nvarchar](20) NULL,    -- A data steward and fact table reference
            [Effectivity] [date] NULL,       -- Start date for the valid information. Part of Type 2 SCD need.
            [Expiry] [date] NULL) -- End date for the valid information. Part of Type 2 SCD need.

ETL Extract the SharePoint adjacency list assigned ID to staging and data store. They will be needed during normalization, good source of debugging, and production issues route cause analysis.

Sample data as follows.


The solution has three major components.
-          SharePoint and EDW staging ETL Extract, covered in my prior article. Out of scope of this article.
-          Data Store ETL
-          Tabular Cube model

Data Store ETL
VS preparation steps.
1.      If you do not have a project collection in VS solution, add a new Business Intelligence, Integration, Integration Service, and name it “<YourCompanyName>.EDW.ETL.Datastore”. In my case it is “Adec.EDW.ETL.Datastore”.
2.      Add a “CustomerHierarchyNormalized.dtsx” project in “Adec.EDW.ETL.Datastore”.
3.      Add all the needed Connection Managers.
4.      Henceforth, perform all the tasks in VS “CustomerHierarchyNormalized.dtsx” project.

The data store ETL solution has multiple steps.
1-      Prepare Temp Tables
2-      Track changes and ETL Extract the Adjacency SharePoint list data
3-      Update information for changes
4-      Get the changes for descendants
5-      Update the CHN table
6-      Clean up


1-      Prepare Temp Tables.
a.       Add Sequence Container. Rename it “Prepare Temp Tables Sequence Container”. In the Sequence Container add four Execute SQL Tasks. Rename them appropriately.

b.      In each Execute SQL Tasks connect to “DataStore_Utility” database. Copy and paste the following code in the SQL Tasks, appropriately.

1st – Create a table to have initial CHM NameKey.
IF OBJECT_ID('dbo.zTmp_CustomerHierarchyNormalized_NameKey')IS NOT NULL
      BEGIN
                  TRUNCATE TABLE [dbo].[zTmp_CustomerHierarchyNormalized_NameKey];
      END
ELSE
      BEGIN
                  CREATE TABLE [dbo].[zTmp_CustomerHierarchyNormalized_NameKey](
                              [NameKey]                  INT      NULL)
                  ON [PRIMARY];
      END

 Next - Create a table to have the added / changed CHM NameKey.

IF OBJECT_ID('dbo.zTmp_CustomerHierarchyNormalized_Added')IS NOT NULL
      BEGIN
                  TRUNCATE TABLE [dbo].[zTmp_CustomerHierarchyNormalized_Added];
      END
ELSE
      BEGIN
                  CREATE TABLE [dbo].[zTmp_CustomerHierarchyNormalized_Added](
                              [NameKey]                  INT      NULL)
                  ON [PRIMARY];
      END

Next – Create a table to contain the added / changed CHN information.
IF OBJECT_ID('dbo.zTmp_CustomerHierarchyNormalized_Updates')IS NOT NULL
 BEGIN
  TRUNCATE TABLE [dbo].[zTmp_CustomerHierarchyNormalized_Updates];
 END
ELSE
 BEGIN
  CREATE TABLE [dbo].[zTmp_CustomerHierarchyNormalized_Updates](
      [NameKey] [int] NOT NULL,
      [ParentNameKey] [int] NULL,
      [Name] [nvarchar](100) NULL,
      [NameID] [int] NULL,
      [ParentNameID] [int] NULL,
      [NameCustomerAccountID] [nvarchar](20) NULL,
      [Effectivity] [date] NULL,
      [Expiry] [date] NULL)
  ON [PRIMARY];
 END

Next - Create a table to contain the added / changed CHN descendants.
IF OBJECT_ID('dbo.zTmp_CHN_Updates_Descendants')IS NOT NULL
 BEGIN
  TRUNCATE TABLE [dbo].[zTmp_CHN_Updates_Descendants];
 END
ELSE
 BEGIN
  CREATE TABLE [dbo].[zTmp_CHN_Updates_Descendants](
      [NameKey] [int] NOT NULL,
      [ParentNameKey] [int] NOT NULL,
      [Name] [nvarchar](100) NULL,
      [NameID] [int] NULL,
      [ParentNameID] [int] NULL,
      [NameCustomerAccountID] [nvarchar](20) NULL,
      [Effectivity] [date] NULL,
      [Expiry] [date] NULL)
 END

2-      Track changes and ETL Extract the Adjacency SharePoint list data
a.       Track the CHM present state. After Sequence Counter add a Data Flow Task and rename it “Track Existing CustomerHierarchyNormalized Data Flow Task”.

Hint:
The temp tables are created during run time. Therefore, all the SSIS Data Flow interacting with temp tables “Delay Validation” proprieties are set to True.

Add an OLE DB Source. Rename it “Set the NameKey OLE DB Source”. With the OLE DB connection manager set to Datastore, select SQL command, and copy / paste the following code in SQL command.

SELECT [NameKey]
  FROM [dbo].[CustomerHierarchyNormalized]


Add an OLA DB Destination. Rename it “Insert Destination”. Connect the “Set the NameKey OLE DB Source” to “Insert Destination”. Save the information in the temp table [dbo].[zTmp_CustomerHierarchyNormalized_NameKey].


b.      Get the SharePoint adjacency list and persist the data steward’s changes from the previous execution time. In the DataStore_Utility create a view, as follows.

CREATE VIEW [dbo].[UpdateCustomerHierarchyNormalizedInit]
      AS
      /*** Get Parent and Child information for CustomerHierarchyNormalized  from SharePoint List ***/
      Select
                  cast(CustomerHierarchy.[Name] as nvarchar(100)) AS 'Name'
                  , CustomerHierarchy.[ID] AS 'NameID'
                  , CASE WHEN parent.[Name] in ('-1', 'All') THEN NULL
                              ELSE CustomerHierarchy.[ParentNameID]
                  END AS [ParentNameID]
                  ,cast(CustomerHierarchy.[CustomerAccount] AS nvarchar(20)) AS 'NameCustomerAccountID'
      FROM [DataStore_Staging].[sp].[CustomerHierarchy] CustomerHierarchy
      inner join [DataStore_Staging].[sp].[CustomerHierarchy] parent
                  on parent.ID=CustomerHierarchy.ParentNameID
      where CustomerHierarchy.Name not in ('All')
GO

<Aside>  ‘-1’ and ‘All’ are used in SharePoint adjacency list to assist the data stewards visually and minimize errors. While it is needed for legacy type 2 SCD, there is no need for them in this design for the presentation layers. The users prefer not to see ‘-1’ and ‘All’ in the reports, Power BI, and Pivot tables.

After “Track Existing CustomerHierarchyNormalized Data Flow Task” add a “Get SharePoint Data Flow Task SCD”.

Add an OLE DB Source. Rename it “CustomerHierarchyNormalized Row Count OLE DB Source”. With the OLE DB connection manager set to DataStore_Utility, select SQL command, and copy / paste the following code in SQL command.

SELECT [Name]
      ,[NameID]
      ,[ParentNameID]
      ,[NameCustomerAccountID]
  FROM [dbo].[UpdateCustomerHierarchyNormalizedInit]

After “CustomerHierarchyNormalized Row Count OLE DB Source” add SSIS tool Slowly Changing Dimension. Its Insert Destination is the [DataStore].[dbo].[CustomerHierarchyNormalized].


c.       Track the added / change CHN data. After “Get SharePoint Data Flow Task SCD” add a Data Flow Task and rename it “Track Added CustomerHierarchyNormalized Data Flow Task”.

Add an OLE DB Source. Rename it “Set the NameKey OLE DB Source”. With the OLE DB connection manager set to Datastore_Utility, select SQL command, and copy / paste the following code in SQL command.

SELECT NowList.[NameKey]
  FROM [dbo].[zTmp_CustomerHierarchyNormalized_NameKey] AS InitList
  INNER JOIN [DataStore].[dbo].[CustomerHierarchyNormalized] AS NowList
      ON NowList.NameKey=InitList.[NameKey]
  where  InitList.[NameKey] != NowList.NameKey

Add an OLA DB Destination. Rename it “Insert Destination”. Connect the “Set the NameKey OLE DB Source” to “Insert Destination”. Save the information in the temp table [dbo].[zTmp_CustomerHierarchyNormalized_Added].


At this point during the package execution the following temp table contains the NameKey of the added changes to Customer Hierarchy Normalized table. Deleted rows are handled with Expiry date. There are no delete rows in the SCD structure. Will address updates at later time in this article.
[DataStore_Utility].[dbo].[zTmp_CustomerHierarchyNormalized_Added]

3-      Update information for changes has two components.
a.       The added rows do not have the correct ParentNameKey. In this step, update the ParentNameKey in the following table.
[DataStore].[DBO].[CustomerHierarchyNormalized]

In the DataStore_Utility data base create a new Stored Procedure as follows.

USE [DataStore_Utility]
GO

CREATE PROCEDURE [dbo].[sp_CustomerHierarchyNormalized_UpdateAddedParentNameKey]
AS
      UPDATE Child
      set ParentNameKey = Parent.NameKey         
      FROM [DataStore].[DBO].[CustomerHierarchyNormalized]  Child 
                  INNER JOIN [DataStore].[DBO].[CustomerHierarchyNormalized] Parent
ON Child.ParentNameID = Parent.NameID   
                  INNER JOIN [dbo].[zTmp_CustomerHierarchyNormalized_Added] Added
ON Added.NameKey=Child.NameKey
GO

In VS after “Track Added CustomerHierarchyNormalized Data Flow Task” add “Update Added ParentNameKey Execute SQL Task”. Edit the SQL task and set the SQL Statement as follows.
EXEC [dbo].[sp_CustomerHierarchyNormalized_UpdateAddedParentNameKey];


b.      Create a new view as follows.
USE [DataStore_Utility]
GO

CREATE VIEW [dbo].[UpdateCustomerHierarchyNormalizedUpdates]
      AS
      SELECT CHNAdded.[NameKey]
                    ,[ParentNameKey]
                    ,[Name]
                    ,[NameID]
                    ,[ParentNameID]
                    ,[NameCustomerAccountID]
                    ,[Effectivity]
                    ,[Expiry]
        FROM [DataStore].[dbo].[CustomerHierarchyNormalized] AS CHN
        INNER JOIN [dbo].[zTmp_CustomerHierarchyNormalized_Added] AS CHNAdded on
                  CHNAdded.[NameKey]=CHN.NameKey
GO

After “Update Added ParentNameKey Execute SQL Task” add “Get Added CustomerHierarchyNormalized Details Flow Task”. Add “Set the NameKey OLE DB Source”. Select DataStore_Utility, SQL command, and copy / paste the following code.

SELECT [NameKey]
      ,[ParentNameKey]
      ,[Name]
      ,[NameID]
      ,[ParentNameID]
      ,[NameCustomerAccountID]
      ,[Effectivity]
      ,[Expiry]
  FROM [dbo].[UpdateCustomerHierarchyNormalizedUpdates]


After “Set the NameKey OLE DB Source” add “OLE DB Destination” and save the data in the following table.
[DataStore_Utility].[dbo].[zTmp_CustomerHierarchyNormalized_Updates].

4-      Get the changes for descendants. In this section, identify the descendants for the added changes. In the DataStore_Utility data base create a new Stored Procedure as follows.

USE [DataStore_Utility]
GO

CREATE PROCEDURE [dbo].[sp_CustomerHierarchyNormalized_AddedUpdatesDescendants]
AS
      DECLARE @SCDDate datetime = getdate();

      WITH OrgUpdates AS
      (
                              --Our Changed Records
                  SELECT Updates.[NameKey]
                                          ,Updates.[ParentNameKey]
                                          ,Updates.[Name]
                                          ,Updates.[NameID]
                                          ,Updates.[ParentNameID]
                                          ,Updates.[NameCustomerAccountID]
                                          ,Updates.[Effectivity]
                                          ,Updates.[Expiry]
                              FROM [dbo].[zTmp_CustomerHierarchyNormalized_Updates] AS Updates
                                          LEFT JOIN [DataStore].[DBO].CustomerHierarchyNormalized AS CHN
                                                      on Updates.NameID = CHN.NameID
                                          and @SCDDate > CHN.Effectivity
                                          and ISNULL(CHN.Expiry, '') = ''

                  UNION ALL
                              -- Find Descendants
                              SELECT CHN.[NameKey]
                                                      ,CHN.[ParentNameKey]
                                                      ,CHN.[Name]
                                                      ,CHN.[NameID]
                                                      ,CHN.[ParentNameID]
                                                      ,CHN.[NameCustomerAccountID]
                                                      ,CHN.[Effectivity]
                                                      ,CHN.[Expiry]
                              FROM [DataStore].[DBO].CustomerHierarchyNormalized AS CHN
                                          INNER JOIN OrgUpdates AS OrgUpdates
                                                      on CHN.ParentNameID = OrgUpdates.NameID
                                                      and @SCDDate > CHN.Effectivity
                                                      and ISNULL(CHN.Expiry, '') = ''
      )

      INSERT INTO [dbo].[zTmp_CHN_Updates_Descendants]
      SELECT DISTINCT [NameKey]
                              ,[ParentNameKey]
                              ,[Name]
                              ,[NameID]
                              ,[ParentNameID]
                              ,[NameCustomerAccountID]
                              ,[Effectivity]
                              ,[Expiry]
      FROM OrgUpdates

      Delete CUD
      from [dbo].[zTmp_CHN_Updates_Descendants] CUD
      Inner JOIN [dbo].[zTmp_CustomerHierarchyNormalized_Updates] CHN on CHN.NameKey=CUD.NameKey

GO

In VS after “Get Added CustomerHierarchyNormalized Details Flow Task” add “Get Added Updates_Descendants Details Flow Task”. Edit the SQL task and set the SQL Statement as follows.
EXEC [dbo].[sp_CustomerHierarchyNormalized_AddedUpdatesDescendants];


5-      Update the CHN table. There are three steps.
a.       Expire effected data in CHN table. In the DataStore_Utility data base create a new Stored Procedure as follows.

USE [DataStore_Utility]
GO

CREATE PROCEDURE [dbo].[sp_CustomerHierarchyNormalized_ExireEffected]
AS
      DECLARE @SCDDate datetime = getdate();
      UPDATE CHN
      SET CHN.[Expiry] = @SCDDate
      FROM [dbo].[zTmp_CHN_Updates_Descendants] AS CPD
                  INNER JOIN [DataStore].[DBO].CustomerHierarchyNormalized AS CHN on CPD.NameKey = CHN.NameKey
                                                      and @SCDDate > CHN.Effectivity
                                                      and ISNULL(CHN.Expiry, '') = ''
      UPDATE [dbo].[zTmp_CHN_Updates_Descendants]
      SET [Effectivity] = @SCDDate
GO

In VS after “Get Added Updates_Descendants Details Flow Task” add “Expire Effected ROWS Execute SQL Task”. Edit the SQL task and set the SQL Statement as follows.
EXEC [dbo].[sp_CustomerHierarchyNormalized_ExireEffected];


b.      Insert New CustomerHierarchyNormalized Details Flow Task. After “Expire Effected ROWS Execute SQL Task” add Data Flow “Insert New CustomerHierarchyNormalized Details Flow Task”. Add “Get New OLE DB Source”, with DataStore_Utility and SQL command as follows.

SELECT [ParentNameKey]
      ,[Name]
      ,[NameID]
      ,[ParentNameID]
      ,[NameCustomerAccountID]
      ,[Effectivity]
      ,[Expiry]
  FROM [dbo].[zTmp_CHN_Updates_Descendants]

Add OLE DB Destination after Get New OLE DB Source. Set the destination to CHN table as follows and map the fields.
[DataStore].[dbo].[CustomerHierarchyNormalized]

The Destination Name key is not mapped to anything. It is auto incremented primary key.

c.       Updates Current ParentNameKey Execute SQL Task. The effected rows ParentNameKey require updates.

Add a new Stored Procedure as follows.
USE [DataStore_Utility]
GO

CREATE PROCEDURE [dbo].[sp_CustomerHierarchyNormalized_UpdatesParentNameKey]
AS
      DECLARE @SCDDate datetime = getdate();
      UPDATE Child
      SET ParentNamekey = Parent.NameKey
      FROM [DataStore].[dbo].[CustomerHierarchyNormalized] AS Child
                  INNER JOIN [DataStore].[dbo].[CustomerHierarchyNormalized] AS Parent
                  ON Child.ParentNameID=Parent.NameID
      WHERE @SCDDate > Child.Effectivity
                  and ISNULL(Child.Expiry, '') = ''
                  and @SCDDate > Parent.Effectivity
                  and ISNULL(Parent.Expiry, '') = ''
GO

After “Insert New CustomerHierarchyNormalized Details Flow Task” add “Updates Current ParentNameKey Execute SQL Task”. Set the SQL Statement as follows.

EXEC [dbo].[sp_CustomerHierarchyNormalized_UpdatesParentNameKey];


6-      Clean up
a.       Add Sequence Container. Rename it “Clean up Sequence Container” after “Updates Current ParentNameKey Execute SQL Task”. In the Sequence Container add four Execute SQL Tasks. Rename them appropriately.
b.      In each Execute SQL Tasks connect to “DataStore_Utility” database. Copy and paste the following code in the SQL Tasks, appropriately.

First – Cleanup table for initial CHM NameKey.
IF OBJECT_ID('zTmp_CustomerHierarchyNormalized_NameKey')IS NOT NULL
DROP TABLE [dbo].[zTmp_CustomerHierarchyNormalized_NameKey];

 Next - Cleanup table that have the added / changed CHM NameKey.
IF OBJECT_ID('zTmp_CustomerHierarchyNormalized_Added')IS NOT NULL
 DROP TABLE [dbo].[zTmp_CustomerHierarchyNormalized_Added];

Next – Cleanup table that contains the added / changed CHN information.
IF OBJECT_ID('zTmp_CustomerHierarchyNormalized_Updates')IS NOT NULL
 DROP TABLE [dbo].[zTmp_CustomerHierarchyNormalized_Updates];

Next - Create a table to contain the added / changed CHN descendants.
IF OBJECT_ID(' zTmp_CHN_Updates_Descendant')IS NOT NULL
 DROP TABLE [dbo].[zTmp_CHN_Updates_Descendant];

For testing, feel free to use the following data for CHN.


The “Data Store ETL” selection is complete. Let us focus on model, next.

Tabular Cube model
The tabular data model must be updated to include the CHN table and its appropriate measures. In this section, let us explore the solution together.

In the tabular model, in my case “Adec.EDW.Datastore.Tabular.Finance” import the following table.
[DataStore].[dbo].[CustomerHierarchyNormalized]

In the model table view add the following columns. Copy and paste the DAX code for each added column, respectively.

Identify the nodes path, AKA, note heritage. Add Column “Path”.
DAX Code as follows.
=path('Customer Hierarchy Normalized'[NameKey], 'Customer Hierarchy Normalized'[ParentNameKey])

Calculate the depth of node’s hierarchy. Add Column “HierarchyDepth”.
DAX Code as follows.
=PATHLENGTH('Customer Hierarchy Normalized'[Path])

The level 1 through 8 identifies the node parent. Leaf through trunk of tree. If more than eight levels are needed additional level need to be added to the model. Add Column “Level1”.
DAX Code as follows.
=LOOKUPVALUE (
                              'Customer Hierarchy Normalized'[Name],
                              'Customer Hierarchy Normalized'[NameKey],
                              PATHITEM ( 'Customer Hierarchy Normalized'[Path], 1, 1))

Add Column “Level2”.
DAX Code as follows.
=IF (
    'Customer Hierarchy Normalized'[HierarchyDepth] >= 2,
                  LOOKUPVALUE (
                              'Customer Hierarchy Normalized'[Name],
                              'Customer Hierarchy Normalized'[NameKey],
                              PATHITEM ( 'Customer Hierarchy Normalized'[Path], 2, 1)),
    'Customer Hierarchy Normalized'[Level1]
)

Add Column “Level3”.
DAX Code as follows.
= IF (
    'Customer Hierarchy Normalized'[HierarchyDepth] >= 3,
                  LOOKUPVALUE (
                              'Customer Hierarchy Normalized'[Name],
                              'Customer Hierarchy Normalized'[NameKey],
                              PATHITEM ( 'Customer Hierarchy Normalized'[Path], 3, 1)),
    'Customer Hierarchy Normalized'[Level2]
)

Add Column “Level4”.
DAX Code as follows.
=
IF (
    'Customer Hierarchy Normalized'[HierarchyDepth] >= 4,
                  LOOKUPVALUE (
                              'Customer Hierarchy Normalized'[Name],
                              'Customer Hierarchy Normalized'[NameKey],
                              PATHITEM ( 'Customer Hierarchy Normalized'[Path], 4, 1)),
    'Customer Hierarchy Normalized'[Level3]
)


Add Column “Level5”.
DAX Code as follows.
= IF (
    'Customer Hierarchy Normalized'[HierarchyDepth] >= 5,
                  LOOKUPVALUE (
                              'Customer Hierarchy Normalized'[Name],
                              'Customer Hierarchy Normalized'[NameKey],
                              PATHITEM ( 'Customer Hierarchy Normalized'[Path], 5, 1)),
    'Customer Hierarchy Normalized'[Level4]
)

Add Column “Level6”.
DAX Code as follows.
= IF (
    'Customer Hierarchy Normalized'[HierarchyDepth] >= 6,
                  LOOKUPVALUE (
                              'Customer Hierarchy Normalized'[Name],
                              'Customer Hierarchy Normalized'[NameKey],
                              PATHITEM ( 'Customer Hierarchy Normalized'[Path], 6, 1)),
    'Customer Hierarchy Normalized'[Level5]
)

Add Column “Level7”.
DAX Code as follows.
= IF (
    'Customer Hierarchy Normalized'[HierarchyDepth] >= 7,
                  LOOKUPVALUE (
                              'Customer Hierarchy Normalized'[Name],
                              'Customer Hierarchy Normalized'[NameKey],
                              PATHITEM ( 'Customer Hierarchy Normalized'[Path], 7, 1)),
    'Customer Hierarchy Normalized'[Level6]
)

Add Column “Level8”.
DAX Code as follows.
= IF (
    'Customer Hierarchy Normalized'[HierarchyDepth] >= 8,
                  LOOKUPVALUE (
                              'Customer Hierarchy Normalized'[Name],
                              'Customer Hierarchy Normalized'[NameKey],
                              PATHITEM ( 'Customer Hierarchy Normalized'[Path], 8, 1)),
    'Customer Hierarchy Normalized'[Level7]
)

Add Column “IsLeaf”.
DAX Code as follows.
= CALCULATE (
COUNTROWS('Customer Hierarchy Normalized'),
All('Customer Hierarchy Normalized'),
'Customer Hierarchy Normalized'[ParentNameKey] =
EARLIER('Customer Hierarchy Normalized'[NameKey])
) = 0

In the model diagram view, build the cube hierarchy for Customer Hierarchy Normalized. Connect the Customer Hierarchy Normalized NameKey to the date dependent fact table key for the given row.

The fact table lookup must be date dependent. In my case, I use the following lookup logic.

"select chn.Namekey as CustomerHierarchyKey,
                  chn.Name,
                  chn.NameCustomerAccountID,
                  dd.DateKey
from [dbo].[CustomerHierarchyNormalized] chn
       INNER JOIN DateDimension dd ON dd.Date >= chn.EFFECTIVITY            
       AND (dd.Date < chn.EXPIRY OR chn.EXPIRY IS NULL)                   
       AND dd.Date <= '" + (DT_WSTR,25)(DT_DBDATE)@[User::MaxDate] +"'
       AND dd.Date >= '" + (DT_WSTR,25)(DT_DBDATE)@[User::MinDate] +"'"

The max and min dates are resolved during runtime. Use Expression Builder to create the date dependent look up. Thereafter, use the variable for lookup. In my case, User:CustomerHierarchySQLCommand.


The Event Handlers in SSIS package are very powerful and easy to use. I use the SQL standard system variables to track and report error bread crumbs. At some later time, the error table can be used for fault tolerant system architecture and system error’s alerts. At this point, it can be a good source for the production support team to perform some route cause analysis before contacting me.
 

Type-2 SCD Tree Design Pattern is needed in multiple areas. The above solution collection is a good and solid start.

Special thanks to Tracie Wagner with her right questions and Robert Lefebvre encouragement to follow good productionability practices.

Happy designing. Make the computer dance!

Enjoy what you do, you never work a day in your life.

Cheers, Toraj

<Aside> Jan-16-2018. Applied the same pattern to the Sales Region Hierarchy Type 2 Slow Changing Dimension (SCD). The pattern proved to be very repeatable and scalable.

References:

No comments:

Post a Comment