Friday, November 2, 2012

SQL Saturday #174 Recap

 

What a great success SQL Saturday was in Charlotte last weekend.  I volunteered to be a proctor in the morning, so I could attend and also help out.  My presentation on “Explore to Integration Services Catalog” went well with full house of attendees.  Thanks to all the people who attend my session and everyone in the Charlotte BI group for organizing a great event.

I was asked to post my PowerPoint presentation. 

Here is the link:

Explore the Integration Services Catalog

Friday, October 5, 2012

Use Sybase SQL Anywhere as data source for PowerPivot

To use Sybase SQL Anywhere as the Power Pivot data source, you need to use ODBC. It is not intuitive to set it up.  After encountered several errors, I am able to load the data into PowerPivot to analyze it now.

Here is the list of steps:

  • Install SQL Anywhere Client. You may download it from Sybase we: http://www.sybase.com/detail?id=1087327. If you have 32 bits Office product, you need to install 32 bit client.
  • Open the Sybase Central App and create a new connection to the Sybase Host and the database you want to load the data from.
  • After you test the connection, click on the Save ODBC Data Source menu on the Tools button.

clip_image001

  • When the Save as ODBC Data Source window open, type in a name for the ODBC and click on the Save button.

clip_image001[5]

  • You could now open the ODBC data source admin to verify it been created.  Please note that if you install 32 bit SQL Anywhere on a 64 bit windows machine, then you will need to use the 32 bit ODBC to view/configure it. You could find it in C:\Windows\System32\odbcad32.exe.

clip_image001[7]

  • In Excel, open the PowerPivot, click on the Get External Data from Other Sources on the Home page.
  • Select Others(OLEDB/ODBC) from the list to specify the Connection String

image

  • You could type in the following connection string in the Table Import Wizard window:
      Provider=SAOLEDB.12;Data Source=DataSourceName; Persist Security Info=False
    clip_image001 

  • Or Click on the Build button to open the Data Link Properties window to select the ODBC just created.

imageclip_image001[1]

Now, you should be able to load the data from SQL Anywhere data source into PowerPivot workbook.

Wednesday, October 3, 2012

SQL Saturday #174

This SQL Saturday training event is focused on Microsoft Business Intelligence topics.

I am honored to have been selected to present “Explore the Integration Services Catalog” out SQL Server on October 27, 2012 in Charlotte North Carolina.

I am very excited as this will be my first time presenting in SQL Saturday.

If you in the area, there is no better opportunity to get free Microsoft BI training.  I am looking forward to meeting you there.

image

Saturday, September 22, 2012

Create SSRS Report using DAX

 

We could query BISM Tabular model using either MDX or DAX. People are very familiar with how to use MDX in SSRS.  Using DAX in SSRS is totally different from using MDX.  Here is an example to create SSRS report using DAX language:

image

List of Steps:

1. Open SQL Server Data Tools to create a SSRS report.

2. Add an existed BISM Tabular Model as the Shared data source.

3. Write the DAX in the SSMS or DAX Studio first. The DAX Studio is an Excel Add-Ins that you could download from the CodePlex.

EVALUATE
SUMMARIZE
    (    
        CALCULATETABLE( 'Internet Sales'
        ,'Product Category'[Product Category Name] = "Bikes"       
    ) ,
                  
 'Date'[Calendar Year],
 'Product Category'[Product Category Name],               
  "Total Sales", 'Internet Sales'[Internet Total Sales],               
  "Total Tax Amount", 'Internet Sales'[Internet Total Tax Amt],              
  "Total Margin", 'Internet Sales'[Internet Total Margin]           
 )

 


4. Create SSRS dataset




    • Add a new dataset and connect to the BISM Tabular Model
    • Open Query Designer
    • Click on the Command Type DMX button on the toolbar to switch to the DMX query designer.

image




    • Click on the Query/Design Mode button to switch to the Query mode.

image




    • Paste or type your DAX query into the Query window and then click on OK button.
    • Click on the Fields tab to fix/rename the field name to make it more readable. The field name generated by the query designer will not lineup with the column name you have in the DAX query.

 image




    • Now, you could add those fields to the report.


5. Add parameter



  • Go to dataset query, open the query designer.
  • Change the DAX query by replacing the filer part of Query with parameter name and click on the Parameter Button to open the Query Parameters window

image


image



  • Add Parameter and set the default value, then click OK to exit out the designer
  • On the report data window, you will find a parameter been created with default value that you typed in.

image


6. Create parameter dataset



  • There is no parameter query generated by the tool. You need to create it if you allow user to pick from the drop down list.
  • Write a DAX query with VALUES() function using the same steps before to create a dataset that will contain the list of parameter values.

EVALUATE 
values('Product Category'[Product Category Name])
ORDER By 'Product Category'[Product Category Name]




  • Associate this dataset to the report parameter

image



  • Now your SSRS report should work with single value selection.

Report Parameter Allows Multiple Values


If you allow multiple values for the report parameter, you could not use filter with equal sign (“ = ”) in the DAX query.  You could use PATHCONTAINS() function in the filter part of query to filter the result set.


PATHCONTAINS(<path>, <item>)


Since the multi select parameter string pass in will contain extra special characters that could not be used for the Path() function, you need to use Substitute() function to replace them.


Here is the complete DAX query for the report.



EVALUATE
SUMMARIZE(
    CALCULATETABLE( 'Internet Sales'
            ,PATHCONTAINS(
                    substitute( 
                        substitute( 
                            substitute( 
                                    @CategoryName
                              , "{ ", "") 
                        , " }", "") 
                    , ",", "|") 
        ,'Product Category'[Product Category Name] ) 
    
),                
 'Date'[Calendar Year],
 'Product Category'[Product Category Name],               
  "Total Sales", 'Internet Sales'[Internet Total Sales],               
  "Total Tax Amount", 'Internet Sales'[Internet Total Tax Amt],              
  "Total Margin", 'Internet Sales'[Internet Total Margin]           
 )

 

After you update your query, the SSRS report will work for multiple selections from the user now.

 

 


Reference:


http://daxstudio.codeplex.com/


http://msdn.microsoft.com/en-us/library/gg492182.aspx

Monday, September 3, 2012

DQS Exception Issues

Recently I created several SSIS packages with DQS component to do data cleansing jobs while loading data into data warehouse.  After deploying the project to the remote server, the packages failed to execute at remote server with following exception thrown:

[DQS Cleansing [2]] Error: 
Microsoft.Ssdqs.Infra.Exceptions.EntryPointException: A fatal error occurred
when trying to execute the service.at
Microsoft.Ssdqs.Proxy.Database.DBAccessClient.Exec()
at
Microsoft.Ssdqs.Proxy.EntryPoint.KnowledgebaseManagementEntryPointClient.KnowledgebaseStartBatchActivity(Int64
knowledgebaseId, String activityIdentifier)
at
Microsoft.Ssdqs.Component.DataCorrection.Logic.DataCorrectionComponent.PreExecute()
at
Microsoft.SqlServer.Dts.Pipeline.ManagedComponentHost.HostPreExecute(IDTSManagedComponentWrapper100
wrapper)




After research and investigation for several days, I figured out that the SQL server version on my PC  is different from the remote sever which has SQL 2012 CU2.  I then upgraded my client machine to SQL 2012 CU2.

After that, some DQS packages will run inside the VSDT, some will still fail with the same exception as the diagram showed:

clip_image001

This is caused by the length of Data Quality Project Name that DQS created, which is limited to 128 characters (a combination of the package name + DQS Cleansing component name + datetime stamp + a few GUIDs).  The long project name will fail to be inserted into  [DQS_MAIN].[dbo].[A_KNOWLEDGEBASE_ACTIVITY].[KNOWLEDGEBASE_NAME].


In addition, I noticed that before CU2, the project name only append one GUID at end which allows longer SSIS package name without failing.  So, if you have experience of the DQS package failing to execute after upgrading to CU2, you may have the same problem.



image



clip_image001[9]


Solution:



  • Right click the DQS Cleansing Component to open the component property page.

  • Rename the PackageName + Name of the component to be less than 30 characters.


P.S. Change the Name of the package property or SSIS package name will not change the PackageName property inside the DQS Cleansing component.



clip_image001[5]


Now, the SSIS Package is able to run without exception on either client or remote server.

clip_image001[7] 


If you encountered the following error when you open the DQS client application, it is related to the server been upgraded without updating the DQS databases. You could run DQS installer with upgrade option in Command Prompt to resolve the issue.


image 



DQSInstaller.exe -upgrade

Special thanks to Jason H. from Microsoft for looking into my question posted on DQS Forums.


Reference:


http://social.technet.microsoft.com/wiki/contents/articles/8442.upgrade-dqs-installing-cumulative-updates-or-hotfix-patches-on-data-quality-services-en-us.aspx

Saturday, July 14, 2012

Alternate Row Color for SSRS Report With Column Group

There are several ways to alternate background color for rows. The most common way is using RowNumber() function to retrieve the order of the row then set the background color using expression as:

=IIf(RowNumber(Nothing) Mod 2 = 0, "Silver", Nothing)

image

If there is a row group, using RowNumber() can lead to some issues. The another alternative is to use RunningValue() function as:

=IIf(RunningValue(Fields!Product.Value, CountDistinct, "Customer") Mod 2 > 0,"Silver", Nothing)

image

Dynamic Column Group Scenario

image

When there is a column group with dynamic data (i.e. Sales Reason) , you will not be able to use above solutions to alternate the row color.  Since the number of column (the Sales Reason in my example) may be varied depended on the row group data (the Date and Product in my example).  Here is what you may see if you use above approach only.

image

Solution 1

Use Customer Code and variable to determine the row color

  • Create a report customer code to toggle the row status for each function call
Public RowStatus as Boolean = False
 
Public Function GetRowStatus as Boolean 
 
    RowStatus  = Not RowStatus 
    Return RowStatus 
 
End Function


  • Add a variable RowColor to the detail row group that has value set to the expression as below to call the customer function:

          = Code.GetRowStatus()


image



  • Select the whole detail row and set the background color based on the variable value.image
  • Now the alternated row color shows up correctly.image

image


Solution 2


Store RunningValue() in a textbox and use it to set the row color



  • Add a new column outside the Column group at end of the report

image



  • Name the detail row textbox as RowNumber and set its value to the expression as

        =RunningValue(Fields!Product.Value, CountDistinct, "OrderDate" )



  • Select the whole detail row and set the back ground color to use the expression as:

         =IIF(ReportItems!RowNumber.Value mod 2 > 0, "LightGrey", Nothing)


image




  • Now the row color displays correctly and we can hide the last column that contains the RowNumber text box from the end user.


image


image



Reference:


http://social.technet.microsoft.com/Forums/en-US/sqlreportingservices/thread/6dbde4cf-2b3f-4aa6-b47c-5c37c47bfc82#72a83b7f-241c-4c80-b79a-7f6e4b4bc74a


http://msdn.microsoft.com/en-us/library/dd255285

Tuesday, June 5, 2012

The Simple DAX Functions as SELECT Statement


Do you even wonder how to use DAX function to query BISM to return a simple dataset or a dataset that is grouped by some attributes? 

Evaluate is the core DAX function that returns a table of data. It is similar to the SELECT statement in T-SQL.

Here is the Syntax:

EVALUATE <table>  




Example:

We want to select all the data from the Internal Sales table.







EVALUATE('Internet Sales')



By using the Evaluate function, the query returns all the rows and columns from the Internet Sales as below:


image

To sort the result,we could add ORDER BY at end. For example, we want to sort the result by Customer Id and product ID, we could do the following:


image

CalculateTable is a function that return a table that modified by the giving filters. It is similar to the SELECT * statement with WHERE Clause in T-SQL



Here is the Syntax:



CALCULATETABLE(<expression>,<filter1>,<filter2>,…)






This function takes expression for a table as the first parameter and any number of Boolean expressions as filter.



Example:




We want to return the Internal Sales records that were ordered in 2007 and the [Product Category] equals to “Bikes”.







EVALUATE
CALCULATETABLE(
 'Internet Sales',
 'Date'[Calendar Year] = "2007", 
 'Product Category'[Product Category Name] ="Bikes" 
)
ORDER BY 'Internet Sales'[Order Date]



By using the CalculateTable function, the query returns the records that match the filters provided.




image
Summarize is a DAX function that returns a table for the requested totals over a set of groups. It is similar to the SELECT statement with Group By in T-SQL.




Here is the Syntax:







SUMMARIZE(<table>, 
<groupBy_columnName>[, <groupBy_columnName>]…[, <name>, 
<expression>]…)


This function takes table of data as the first parameter, any number of columns as group by parameter, the name given to the Sum of the column, and the expression



Example:


We want to Sum up the total of [Internet Total Sales], total of [Internet Total Tax Amt], and total of [Internet Total Margin] FROM the [Internet Sales] table grouped by the [Calendar Year] and the [Product Category].




EVALUATE

SUMMARIZE('Internet Sales',
'Date'[Calendar Year],'Product Category'[Product Category Name],
"Total Sales", 'Internet Sales'[Internet Total Sales],
"Total Tax Amount", 'Internet Sales'[Internet Total Tax Amt],
"Total Margin", 'Internet Sales'[Internet Total Margin]
)ORDER BY 'Date'[Calendar Year],'Product Category'[Product Category Name]


By using the Summarize function, the query returns the sum of the measures that are grouped by the columns provided.


image



Reference:




http://msdn.microsoft.com/en-us/library/gg492156.aspx



http://msdn.microsoft.com/en-us/library/ee634760(v=sql.105).aspx

http://technet.microsoft.com/en-us/library/gg492171.aspx

Tuesday, May 29, 2012

SSIS Errors


Recently I encountered some ETL errors after deploying the SSIS packages from a single SQL server environment to a two SQL servers environment.

Error One:

Error 0x8000FFFF while preparing to load the package. Catastrophic failure.
 

Problem and Resolution:

This error message is very vague, it dese not provide enough info to figure out what's going on.  When I google it, the only clue is invalid disk address.  Since I just pushed some new packages to a new server, I suspect the client tool calling the ETL packages may not be able to locate it for some reason.
After closer examining the Parent package that hosts this package, I find this package is outside the sequence container even thought it looks like in it.  The step to replace the physical path for the package before this sequence container has been skipped and causes this error.


imageimage

 

Error Two:

The SSIS Runtime has failed to enlist the OLE DB connection in a distributed transaction with error 0x8004D025. "The partner transaction manager has disabled its support for remote/network transactions.". Error: SSIS Error Code DTS_E_OLEDBERROR.  An OLE DB error has occurred. Error code: 0x8004D025.
A similar error could be:
System.Transactions.TransactionException: The partner transaction manager has disabled its support for remote/network transactions. (Exception from HRESULT: 0x8004D025) ---> System.Runtime.InteropServices.COMException (0x8004D025)
 

Problem and Resolution:

If there is a SSIS package that has the TransactionOption=”Required”, you will need to have DTC turns on.

image
image

When the package is going to be executed on a multi SQL servers environment, you will need to enable Network DTC Access on both Servers or change the SSIS TransitionOption to not be “Required” to fix the error.
Here is the step to enable the Network DTC Access:
  • Open Component Services in Administrative Tools
image
  • Expend the Computers->Expend My Computer->  to find Local DTC
  • Right Click on the Local DTC to open the property window.
image
image
 

Reference

Enable Network DTC Access

Wednesday, May 23, 2012

Deployment Options for BISM Project

 

There are two Query Mode options you could use when you deploy the BISM project to the AS Tabular Model server.

    • In-Memory
    • DirectQuery

    image

    By default, data in tabular model is processed, compressed using the xVelocity in-memory analytics engine (VertiPaq).  This in-memory columnar storage engine has been optimized for high performance analysis and exploration of data.  It provides fast query times for aggregation queries.

    However, there are some drawbacks:

    • Data is not updated when the source data changes. Model needs to be processed to refresh the data.
    • When you turn off the computer hosting the model, the cache is saved to disk and must be reopened when you load the model.
    • The save and load operations can be time-consuming.
    • Need lots memory to hold the large fact data

    On the other hand, DirectQuery mode uses data that is stored in a SQL Server database. It lets users retrieve data directly from a SQL Server data source in  real time.  This mode also lets you create models and build reports for large data sets that cannot reside in memory.

    Here are additional benefits:

    • The data is guaranteed to be up-to-date.
    • Use the advantage of provider-side query acceleration, i.e. SQL 2012 column indexes.
    • Could use row-level security provided by the backend  database.
    • Analysis Services can perform optimization to ensure the query plan against the backend database will be as efficient as possible.

    There are some design considerations if you are planning to use DirectQuery mode:

    • During design phase, you may need to use Preview or Filter function to load subset data into your project.
    • MDX queries are not supported for a model in DirectQuery mode. You cannot use PPS or other clients that only issue MDX queries to consume it.
    • Currently this model only supports one data connection. You cannot query two or more SQL Servers from a DirectQuery enabled model.
    • Calculated columns and some DAX functions are not supported. You may need to use SQL View or other technique for that purpose.

    DirectQuery supports a hybrid deployment mode that can use either the cache or the relational source. For more information, see DirectQuery Mode (SSAS Tabular).

     

    Reference

    Formula Compatibility in DirectQuery Mode

    DirectQuery Mode (SSAS Tabular)

    White Paper: Using DirectQuery in the Tabular BI Semantic Model

    Sunday, March 18, 2012

    Sort a MDX Dataset by Multiple Criteria

    Recently I work on a SSRS report that needs a parameter to return a set of user to be selected as the filter for the report.  The list of user has to be sorted by Department Code and User’s name.

    In MDX, sorting a dataset based on one criterion is directly supported
    through the Order( ) function.  However, it doesn’t support multiple
    criteria for sorting explicitly.  In order to sort the dataset with multiple criteria, some tricks will be needed to complete the task.

    Sample Data

    image

    Here is how we could return a user list dataset that are sorted by Department Code and User Name.

    Trick One:

    Concatenate the two attributes into one and then use Order() function to sort it accordingly. 

    WITH MEMBER [Measures].[ParameterCaption] 
    AS [User].[User].CURRENTMEMBER.MEMBER_CAPTION 
     
    MEMBER [Measures].[ParameterValue] 
    AS [User].[User].CURRENTMEMBER.UNIQUENAME 
     
    MEMBER [Measures].[Department] 
    AS [User].[User].CURRENTMEMBER.Properties( "Department Code")
     
    SELECT {  [Measures].[ParameterCaption], [Measures].[ParameterValue], [Measures].[Department] } ON COLUMNS ,  
    {
        ORDER(
                    [User].[User].Children ,
                    [User].[User].CURRENTMEMBER.Properties( "Department Code") + [User].[User].CURRENTMEMBER.MEMBER_CAPTION, 
                    BASC
                    )
    } ON ROWS
    FROM [Test]
     

    As you could see, the result set returned has been sorted by Department first and then Username in alphabetical ascending order.


    image
    You may use this trick when the sorting order of both members are in the same direction either ascending or descending.


    Trick TWO


    When the sorting criteria are in different alphabetical order i.e. one in ascending order and the other in descending order, we will need to use two ORDER() functions.



    WITH MEMBER [Measures].[ParameterCaption] 
    AS [User].[User].CURRENTMEMBER.MEMBER_CAPTION 
     
    MEMBER [Measures].[ParameterValue] 
    AS [User].[User].CURRENTMEMBER.UNIQUENAME 
     
    MEMBER [Measures].[Department] 
    AS [User].[User].CURRENTMEMBER.Properties( "Department Code")
     
    SELECT {  [Measures].[ParameterCaption], [Measures].[ParameterValue], [Measures].[Department] } ON COLUMNS ,  
    {
     ORDER( 
            ORDER(
                    [User].[User].Children ,
                    [User].[User].CURRENTMEMBER.MEMBER_CAPTION, 
                    BASC
                    ),
        [User].[User].CURRENTMEMBER.Properties( "Department Code"),
        BDESC
        )
    } ON ROWS
    FROM [Test]

    As you could see, the result set returned now will be sorted by the Department Code in descending order first, then the Username in ascending order.


    image


    Reference


    http://msdn.microsoft.com/en-us/library/ms145587.aspx