Saturday, 14 January 2017

Hash Function With Slowly Change Dimension

hash function is an algorithm that maps data of arbitrary length to data of a fixed length. Using Hash Function we can make Slowly Change Dimension to clarify the past and current situations of Employee


Let's make an example to explain:

Step 1:  Assume there was a Original Employee Table stored in QVD file:

--- Create an Employee Table in Qlikview for Original as below:


//Employee:
//LOAD 
//     Hash256(EmployeeID,Name,City) As HashKey,
//     EmployeeID, 
//     Name, 
//     City, 
//     From
//FROM
//[C:\Users\Peidu\Desktop\Project\2017-01-14\Original.txt]
//(txt, codepage is 936, embedded labels, delimiter is ',', msq);
//
//Store Employee Into Employee.QVD;

Original Table shows:


From Picture we can see Tom worked in Newyork from 2016/01/01,Now Tom shifted to Chicago From 2017/01/01,how to describe the status of Tom.

Step 2: Create a Temperate Table to begin slowly change dimension process, use Concatenate to make connection between Original and Present Tables, the Purpose of  Step 2 is to design From column:
Temp:

Load HashKey,EmployeeID,Name,City,From From Employee.QVD(qvd);

Concatenate(Temp)

Load Hash256(EmployeeID,Name,City) As HashKey,EmployeeID,Name,City,From 

From
[C:\Users\Peidu\Desktop\Project\2017-01-14\Present.txt]
(txt, codepage is 936, embedded labels, delimiter is ',', msq) Where Not Exists(HashKey,Hash256(EmployeeID,Name,City));

Store Temp Into Employee.QVD;

the table shows:



*****************NOTICE***************
WHERE NOT EXISTS Clause plays a  important role in the process, it tells Qlikview Do Not Load records with the same HashKey( the HashKey we made in Employee already) and the same Hash256(EmployeeID,Name,City)-- we made in Temperate Table.

Step 4: Load Employee again and design the column To 

Employee:

Load EmployeeID,
        Name,
        City,
        From,
       Date(If(Previous(EmployeeID)<>EmployeeID,' ',Previous(From)-1),'YYYY-MM-DD') As To 
Resident Temp Order By EmployeeID,From Desc;
     
     Drop Table Temp;


the final result shows the current and the past status of Employee, we can see Tom worked in Newyork from 2016/01/01 to 2016/12/31, and started work in Chicago from 2017/01/01.For Jerry, not changed, she still worked in Miami.

the script above designed from,to columns with different locations,here is a question:
how about Tom go back Newyork? The new record will not be loaded in the table since there is a hashkey fot (001,Tom,Newyork) already. We have to add a flag column to solve this issue.



1 comment:

  1. I feel Qlik is a very good tool and platform to produce dashboards and reports for end users.This is mostly preferred by IT professionals.

    Qlik Rest Api Connection



    ReplyDelete