Monday, August 30, 2010

Using SSIS Batch Destination component to load the SCD Dimension Table

The SSIS Batch Destination component could be used to populate the SCD Type 1 or Type 2 table very easily.
image
  • Create an ADO.NET connection manager to be used.
  • Click on the Input Columns tab to select all the input columns needed.
  • Go to the Properties tab to write the TSQL merge statements.
 image
  • Type in the table name in the Working Table textbox. You may use a temporary table (denoted as #) or a regular table to hold your input data.
  • Type in the merge statements in the SQL Command textbox. The Batch Destination component supports multiple SQL Commands in a transaction.
  • For SCD Type 1 table, you may use the following TSQL statement to do both update or insert.
MERGE INTO DW.TableName AS T
USING 
#TableName AS S
ON T.ID = S.ID
/*Update Record*/
WHEN MATCHED AND S.UpdateFlag = 1
THEN UPDATE 
SET
.......
 
T.Column1=S.Column1,
T.Column2=S.Column2,
T.Column3=S.Column3,
T.Column4=S.Column4
/*Insert record : New record or type 2 current record*/
WHEN NOT MATCHED BY 
TARGET THEN INSERT 
(
ID,
Column1,
Column2,
Column3,
Column4,
... )
VALUES(
S.ID,
S.Column1,
S.Column2,
S.Column3,
S.Column4,
 
...);
  • For SCD Type 2 table, you may use the following TSQL statements to do update, insert, expire, and update type 1 columns in historical data.
MERGE INTO DW.TableName AS T
USING 
#TableName AS S
ON T.ID = S.ID
/*Update Record*/
WHEN MATCHED AND S.UpdateFlag = 1
THEN UPDATE 
SET
.......
 
T.Column1=S.Column1,
T.Column2=S.Column2,
T.Column3=S.Column3,
T.Column4=S.Column4
/*Insert record : New record or type 2 current record*/
WHEN NOT MATCHED BY 
TARGET THEN INSERT 
(
ID,
Column1,
Column2,
Column3,
Column4,
... )
VALUES(
S.ID,
S.Column1,
S.Column2,
S.Column3,
S.Column4,
 
...);
 
/*Expire type 2 existed Record*/  
MERGE INTO  DW.TableName AS T
USING    #TableName AS S
ON     T.ID = S.D
WHEN MATCHED AND S.UpdateFlag = 0 
THEN UPDATE SET
T.EffectiveEndDate = GetDate(), T.CurrentRecord ='NO'
;
/* Update Type 1 columns in the expired records */
UPDATE  T    
SET X.Type1Column1 =  S.Type1Column1, 
X.Type1Column2 =  S.Type1Column2 
FROM DW.Tablename AS X INNER JOIN DW.TableName AS S    
ON X.NK= S.NK /*Join table with Natural keys */    
WHERE S.CurrentRecord = 'YES' 
AND S.ID >  X.ID
;
There are some performance issues you may encountered while working with large table, so you will need to change the Rows Per Batch or Max Commit Size properties accordingly. Sometime you may find that creating an index in your temporary table could improve the performance of this load.

Wednesday, August 25, 2010

How to Process SSAS Dimensions Individually and Automatically

    Sometime there is need to process the cube dimensions individually, so you could create logs to trace the dimension that has been processed, or to recover from the failed point without restart the whole process. We can create a SSIS Task and schedule it in SQL Agent Job as following steps:
    • Create a new Integration Service project.
    • Add execute SQL task with OLE DB connect to the analysis service database
      • Query list of dimensions in a SSAS database using DMV as following and save it into a SSIS object variable
      SELECT DISTINCT DIMENSION_NAME 
      FROM  $SYSTEM.MDSCHEMA_DIMENSIONS
      WHERE LEFT(CUBE_NAME,1) = '$'
      AND [DIMENSION_UNIQUE_NAME] <> '[Measures]'
      ORDER BY DIMENSION_NAME

      • Create a ForEach Loop Container and set the collection to retrieve the dimension name from the above step

      • Add the Analysis Services Processing Task with a SSAS connection to the to be used in the container


      • Edit the task by using expression to set the XMLA dynamically in the ProcessingCommands property with the dimension that we want to process.

      • Here is the sample XMLA to be used in expression editor:
                  

      • You may use SSMS to connect to the SSAS and select a Dimensions in a database to use the Script function to create the XMLA for you as template. You may change the Process Options as needed.

      • Once you have the SSIS Package, go to SQL Server Agent and add a SQL Job with Job Type as SQL Server Integration Service Package. Select the Package Location and set the schedule option as per your requirement.
      That’s it. Now all the dimensions will be processed at the scheduled intervals.
      This way, any newly added dimensions will also get processed without modifying the SSIS package.

      "<Batch xmlns=\"http://schemas.microsoft.com/analysisservices/2003/engine\"> 
      <Parallel> 
      <Process xmlns:xsd=\http://www.w3.org/2001/XMLSchema\ xmlns:xsi=\http://www.w3.org/2001/XMLSchema-instance\ xmlns:ddl2=\http://schemas.microsoft.com/analysisservices/2003/engine/2\ xmlns:ddl2_2=\http://schemas.microsoft.com/analysisservices/2003/engine/2/2\ xmlns:ddl100_100=\"http://schemas.microsoft.com/analysisservices/2008/engine/100/100\">
      <Object>
      <DatabaseID>" + @[User::_DatabaseID] + "</DatabaseID>
      <DimensionID>" + @[User::_DimensionName] + "</DimensionID>
      </Object>
      <Type>ProcessFull</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
      </Process>
      </Parallel>
      </Batch>"