SSAS Dynamic Partitioning Using AMO Objects in SSIS Script Task

1.     Introduction


A partition is a container for a portion of the measure group data. Every measure group has at least one partition; this partition is created when the measure group is defined. When you create a new partition for a measure group, the new partition is added to the set of partitions that already exist for the measure group. Partitions are a powerful and flexible means of managing cubes, especially large cubes. Processing a smaller amount of data will improve processing performance by decreasing processing time. It is also very easy to maintain the cube .By default each measuregroup have one Partition but for better processing performance and query performance we need to have multiple partitions based on criteria. The best way to design partition is based on date column.  Dynamic Partitioning is an automation of partition creation, deletion, merging and processing etc.  Without any manual intervention .it will reduce cost as well as human errors for handling the partition.   
           Analysis Management Objects (AMO) is a complete collection of management classes for Analysis Services that can be used programmatically, under the namespace of Microsoft.AnalysisServices, in a managed environment. By using AMO you are able to create, modify, and delete objects such as cubes, dimensions, mining structures, and Analysis Services databases; over all these objects, actions can be performed from your application in the .NET Framework. You can also process and update the information stored in Analysis Services databases

 

1.1          Design Scenarios


1.              Need to have one partition per day
2.              Redemption period should be configured
3.              Need to have only 2  intraday partitions (most recent) granularity ie Hourly or 15 minutes for present day
4.              N-2 (Previous Previous partition ) time based partition needs to merge into current day partition on realtime bases

 


2.    Implementation

 

 

2.1  Partition Template


                Create a cube with required measure groups and create 2 partition templates on each measure group  as shown in fig 1 . the naming convention should be like this . Partition Template is nothing but default partition where you need to add where condition that contains no data .I had used WHERE file_ts BETWEEN TO_DATE('2000-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') AND TO_DATE('2000-01-01 23:59:59', 'YYYY-MM-DD HH24:MI:SS')

  1. Partition_D_Template  : for Daily partition
  2. Partition_DH_Template  :For Hourly partition . ( you can do it for every 15 minutes also)
Fig 1



 

2.2   Processing Table


After Cube is ready we need to create a R_CubeProcessing table where we keep track of every change in Source system and Cube . see the sample data in fig 2 .

ColumnName
Description
CubeName
Name of the Cube
MeasureGroupName
MeasureGroupName
MeasureGroupIdentifier
MeasureGroupIdentifier
IsActive
to Active or not
LastPeriod
LastPeriod avaialable in Cube
LastPeriodName
LastPeriod Name in certain format
LastPeriodRowCount
How many Rows for last period in the cube
LastPeriodRowCountQuery
Query to retrieve rowcount from source system
RowModificationDate
when we modofed this processing table row
LastPeriodNameQuery
query to retrieve last period data from Source

 


Fig2



2.3   SSIS Packages for Hourly partitions


For hourly partitions we need to create 2 SSIS package (Parent-Child package)
1.       Master Refresh Cube   :   This package is a parent package  where it collects all the  measure group details of the particular cube and send its details to child package .
2.      Refresh Cube : it is a child package which deals with each measure group Details passed by parent package .
       Step 3.1   Master Refresh cube .
Define the variables as shown in below figure 3. Give the CubeName as CVMS as we configured in processing table.
Fig 3


 

2.3.1        Execute SQL task -   SQL Get Active measure groups

 

 

Get all the active measure groups from the processing table  and pass into ActiveMeasureGroups  Object type variable.

SELECT MeasureGroupName, MeasureGroupIdentifier, LastPeriodNameQuery, LastPeriodRowCountQuery
FROM [ODS_Schema].[R_CubeProcessing]
WHERE CubeName = ? AND IsActive = 'Y'

And assign to the corresponding variables defined earlier in Master table .

2.3.2       Foreach loop Container : FELC Refresh Cube (active measure groups)


        Here the child package called Refresh Cube for each  measure group details .

2.3.3        SEQC Get Last Period information from cube and Source

            
         In this step we will get the value of Last period name from cube and last period name from source and last period name Source and Last period count Source using R_CubeProcessing table  .



   There are 3 combinations

1)      @LastPeriodName_cube != @LastPeriodName_source

If last periodname in cube is different from the Source
Then Create new Partition and process it .

2)      @LastPeriodName_cube == @LastPeriodName_source

 If last period name in cube is same as in the Source
Then Reprocess previous partition.

3)      @LastPeriodName_cube == @LastPeriodName_source && @LastPeriodRowCount_cube != (DT_I4)@LastPeriodRowCount_source
              If last periodname in cube is same as in the Source and Counts are are different
                Then Reprocess current Partition.

2.4  Update latest values in Processing Table

              
                Update the R_CubeProcessing table with new values for LastPeriod and LastPeriod Rowcount in Cube .

2.5  Merge Previous Previous Partition into Current Day Partition


                Merge the Previous previous partition into current day partition . so that we will have only 2 intra day partitions for the current day .

 

2.6  SSIS Script task AMO code Sample for creating new partition

 

Script Task Code for Generation Next partition and processing it using AMO objects

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using AMO = Microsoft.AnalysisServices;

namespace ST_8792940044454b19b8c3bb5347a9b827.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion      

        public void Main()
        {
            AMO.Server amoServer;
            AMO.MeasureGroup amoMeasureGroup;
            String XMLANextPartition = String.Empty, myCubeName, myMeasureGroupName, myTemplatePartitionName_partialday, myLastPeriodName_source, myPartitionNamePrefix;
            ConnectionManager myConnectionManager;

            try
            {
                myCubeName = Dts.Variables["User::CubeName"].Value.ToString();
                myMeasureGroupName = Dts.Variables["User::MeasureGroupName"].Value.ToString();               
                myConnectionManager = Dts.Connections["OLAP"];
                myTemplatePartitionName_partialday = Dts.Variables["User::TemplatePartitionName_partialday"].Value.ToString();
                myLastPeriodName_source = Dts.Variables["User::LastPeriodName_source"].Value.ToString();
                myPartitionNamePrefix = Dts.Variables["User::PartitionNamePrefix"].Value.ToString();

                amoServer = new AMO.Server();
                amoServer.Connect(myConnectionManager.ConnectionString);
                amoMeasureGroup = amoServer.Databases.FindByName(amoServer.ConnectionInfo.Catalog.ToString()).Cubes.FindByName(myCubeName).MeasureGroups.FindByName(myMeasureGroupName);
                amoServer.CaptureXml = true;

                String myNextPartitionName = myPartitionNamePrefix + myLastPeriodName_source.Replace(":", "_").Replace("-", "_");

                //check wether next partition already exists
                if (!amoMeasureGroup.Partitions.Contains(myNextPartitionName))
                {
                    //generate new partition creation script
                    AMO.Partition amoTemplatePartition = amoMeasureGroup.Partitions.FindByName(myTemplatePartitionName_partialday);
                    AMO.Partition amoNextPartition = amoTemplatePartition.Clone();
                    amoNextPartition.ID = myNextPartitionName;
                    amoNextPartition.Name = myNextPartitionName;
                    //edit source query with relevant where clause
                    AMO.QueryBinding amoQueryBinding = (AMO.QueryBinding)amoTemplatePartition.Source.Clone();
                    amoQueryBinding.QueryDefinition = amoQueryBinding.QueryDefinition.Replace("2000-01-01 00:00:00", myLastPeriodName_source);
                    amoNextPartition.Source = amoQueryBinding;
                    amoMeasureGroup.Partitions.Add(amoNextPartition);
                    amoNextPartition.Update();
                    XMLANextPartition = amoServer.ConcatenateCaptureLog(true, false);

                    amoNextPartition.Process(AMO.ProcessType.ProcessFull);
                    XMLANextPartition = amoServer.ConcatenateCaptureLog(true, false);
                }
                else
                {
                    //generate processing script only (partition already exists)
                    AMO.Partition amoPartition = amoMeasureGroup.Partitions.FindByName(myNextPartitionName);
                    amoPartition.Process(AMO.ProcessType.ProcessFull);
                    XMLANextPartition = amoServer.ConcatenateCaptureLog(true, false);
                }

                Dts.Variables["User::XMLANextPartition"].Value = XMLANextPartition;
                amoServer.Disconnect();
                Dts.TaskResult = (int)ScriptResults.Success;
            }
            catch (Exception ex)
            {
                Dts.Events.FireError(0, Dts.Variables["System::TaskName"].Value.ToString(), ex.Message, String.Empty, 0);
            }
        }
    }
}

Code Walk Through;

Main Logic :

Step 1.
Create Next partition Name

String myNextPartitionName = myPartitionNamePrefix + myLastPeriodName_source.Replace(":", "_").Replace("-", "_")

myNextPartitionName = F_SUBSET_67109391_DH_ + 2012-11-06 15:00:00. Replace(":", "_").Replace("-", "_")

myNextPartitionName = F_SUBSET_67109391_DH_2012_11_06 15_00_00


Step 2

Check that New Partition Name is already existing measureGroup Partitions

//check wether next partition already exists
                if (!amoMeasureGroup.Partitions.Contains(myNextPartitionName))


Step 3

If partition Name is not present in measure group . Create new partition .

Clone the Template and Update the query to Modify the whereClause with lastperiod Name .

//edit source query with relevant where clause
                    AMO.QueryBinding amoQueryBinding = (AMO.QueryBinding)amoTemplatePartition.Source.Clone();
                    amoQueryBinding.QueryDefinition = amoQueryBinding.QueryDefinition.Replace("2000-01-01 00:00:00", myLastPeriodName_source);


Step 4

Generate the Processing new partition XMLA script and store it in XMLANextpartition variable which will run that script later in the package .

amoNextPartition.Process(AMO.ProcessType.ProcessFull);
                    XMLANextPartition = amoServer.ConcatenateCaptureLog(true, false);


Step 5

If parttion is already exists in measure group  then Generate the Processing new partition XMLA script and store it in XMLANextpartition variable which will run that script later in the package.

amoNextPartition.Process(AMO.ProcessType.ProcessFull);
                    XMLANextPartition = amoServer.ConcatenateCaptureLog(true, false);

2.7    SSIS Package for Daily Partition


      There are 2 Packages for Daily Partition process.

          1 Master refreshes previous day partition
          2 refresh previous day partition

The Master package gets all the details of measuregroups which are active from R_ProcessingCube table similar to hourly Partition process.

Master package Calls Refresh previous day partition for each measure group .
There are 2 Major steps in this  Child package


1.       SEQC Reprocess previous day partition  :  it reprocess the previous day partition
2.      SECQ Delete old partitions:  It deletes the old partitions based on the redemption period which is configurable.

2.8          SSIS Script task AMO code Sample for Deleting Old partition


using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using AMO = Microsoft.AnalysisServices;

namespace ST_8792940044454b19b8c3bb5347a9b827.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion      

        public void Main()
        {
            AMO.Server amoServer;
            AMO.MeasureGroup amoMeasureGroup;
            String XMLADeleteOldPartitions = String.Empty, myCubeName, myMeasureGroupName, myOldPartitionsPivotDayName, myMeasureGroupIdentifier, myPreviousDayName;
            ConnectionManager myConnectionManager;

            try
            {
                myCubeName = Dts.Variables["User::CubeName"].Value.ToString();
                myMeasureGroupName = Dts.Variables["User::MeasureGroupName"].Value.ToString();               
                myConnectionManager = Dts.Connections["OLAP"];
                myOldPartitionsPivotDayName = Dts.Variables["User::OldPartitionsPivotDateName"].Value.ToString();
                myMeasureGroupIdentifier = Dts.Variables["User::MeasureGroupIdentifier"].Value.ToString();
                myPreviousDayName = Dts.Variables["User::PreviousDayName"].Value.ToString();

                amoServer = new AMO.Server();
                amoServer.Connect(myConnectionManager.ConnectionString);
                amoMeasureGroup = amoServer.Databases.FindByName(amoServer.ConnectionInfo.Catalog.ToString()).Cubes.FindByName(myCubeName).MeasureGroups.FindByName(myMeasureGroupName);
                amoServer.CaptureXml = true;

                //remove partition related to data before pivot date
                int i=0;
                string[] partitionArray = new string[amoMeasureGroup.Partitions.Count-1];
                foreach (AMO.Partition amoPartition in amoMeasureGroup.Partitions)
                {
                    //check for daily partitions
                    if (!amoPartition.Name.Contains("TEMPLATE") && amoPartition.Name.StartsWith("F_SUBSET_" + myMeasureGroupIdentifier + "_D_"))
                    {
                        if (Int32.Parse(amoPartition.Name.Substring(amoPartition.Name.Length - 10, 10).Replace("_", "")) <= Int32.Parse(myOldPartitionsPivotDayName.Replace("-", "")))
                        {
                            partitionArray[i] = amoPartition.Name;
                            i++;
                        }
                    }
                    //check for intra-day partitions
                    if (!amoPartition.Name.Contains("TEMPLATE") && amoPartition.Name.StartsWith("F_SUBSET_" + myMeasureGroupIdentifier + "_DH_"))
                    {
                        if (Int32.Parse(amoPartition.Name.Substring(amoPartition.Name.Length - 19, 10).Replace("_", "")) <= Int32.Parse(myPreviousDayName.Replace("-", "")))
                        {
                            partitionArray[i] = amoPartition.Name;
                            i++;
                        }
                    }
                }
                for (i = 0; i < partitionArray.Length; i++)
                {
                    if (partitionArray[i] != null)
                        if (partitionArray[i].Length > 0)
                            amoMeasureGroup.Partitions.GetByName(partitionArray[i]).Drop();
                }

                XMLADeleteOldPartitions = amoServer.ConcatenateCaptureLog(true, false);

                Dts.Variables["User::XMLADeleteOldPartitions"].Value = XMLADeleteOldPartitions;
                if (XMLADeleteOldPartitions.Length > 0)
                    Dts.Variables["User::ExecuteXMLADeleteOldPartitions"].Value = true;
                else
                    Dts.Variables["User::ExecuteXMLADeleteOldPartitions"].Value = false;
                amoServer.Disconnect();
                Dts.TaskResult = (int)ScriptResults.Success;
            }
            catch (Exception ex)
            {
                Dts.Events.FireError(0, Dts.Variables["System::TaskName"].Value.ToString(), ex.Message, String.Empty, 0);
            }
        }
    }
}

3.     Conclusion


With AMO you are able to create, modify, and delete objects such as cubes, dimensions, mining structures, and Analysis Services databases; over all these objects, actions can be performed from your application in the .NET Framework. You can also process and update the information stored in Analysis Services databases.
This Paper explains the design of dynamic partitioning using AMO which is very useful for near real time cubes.

Comments

  1. On which step you create the partition for today and how to merge hourly partitions into it, on which part /step ?

    ReplyDelete
  2. If you want to partition your cube or tabular project, check out the SSAS Partition Manager project on Codeplex which will dynamically add partitions with minimal configuration on your part. See https://ssaspartitionmanager.codeplex.com/

    ReplyDelete
  3. Through this post, I know that your good knowledge in playing with all the pieces was very helpful. I notify that this is the first place where I find issues I've been searching for. You have a clever yet attractive way of writing on Msbi online training
    Msbi online training Hyderabad
    Msbi online training India
    Msbi online course

    ReplyDelete

Post a Comment

Popular posts from this blog

SSAS : A connection cannot be made to redirector. Ensure that 'SQL Browser' service is running. (Microsoft.AnalysisServices.AdomdClient). + Solution

SSIS Error : Unable to infer the XSD from the XML file. The XML contains multiple namespaces

SSRS : [Teradata Database] 3939 There is a mismatch between the number of parameters specified and the number of parameters required.