Thursday, September 27, 2007

Automating Partition Creation

By Bryan C. Smith


 

The use of multiple partitions per measure group is promoted to improve both query and processing performance as well as the overall manageability of cubes. Partitioning strategies are often time-based, so that a single partition may represent a specific year, quarter, month, or some other unit of time.

One of the central challenges of partitioning a measure group on time (or any other non-static dimension) is that new partitions may be required over the life of the cube. If it can be predicted when these partitions might be needed, the required partitions can be manually created in advance by the admin. Where values cannot be predicted, an "overflow" partition can be created to catch any values not attributed to other, more precisely defined partitions.

Still, having the ETL layer query the database to determine the partitions required and then create those partitions on your behalf is a much more attractive approach. Below is a simplistic script demonstrating the use of AMO to do just that.

The script is written for execution within an SSIS Script task and requires a reference to the AMO library. Its goal is to add any year-based partitions to the Reseller Sales measure group within the Adventure Works cube of the Adventure Works DW multidimensional database. It is tested by dropping any partition from the Reseller Sales measure group.

Here is a quick rundown of the logic.

The script starts by establishing references to the Adventure Works cube and the Reseller Sales measure group within the Adventure Works DW multidimensional database. A query is issued against the DimTime table in the AdventureWorksDW relational database to determine which years exist within the table and the key ranges of the records associated with those years. (There is a critical assumption here that records in the DimTime table are sequenced by date. This may not be true within your data warehouse.)

The routine then loops through each partition in the Reseller Sales measure group to determine is a partition for that year exists. If not, a new partition is created with the appropriate query bindings. The first aggregation design used within the measure group is assigned to the new partition. (This is not a recommended approach but simply serves to demonstrate how aggregations can be assigned to new partitions.) The partition is then associated with the measure group and the change is submitted to SSAS. (It is important to note, the new partition is unprocessed at this time.)

Imports System

Imports System.Data

Imports System.Math

Imports Microsoft.SqlServer.Dts.Runtime

Imports Microsoft.AnalysisServices


 

Public
Class ScriptMain


 


Public
Sub Main()


 


'Locate Measure Group of Interest (and a Known Partition)


'---------------------------------------------


Dim MyServer As
New Server

MyServer.Connect("localhost")


 


Dim MyDatabase As Database = _

MyServer.Databases.GetByName("Adventure Works DW")


Dim MyCube As Cube = MyDatabase.Cubes.GetByName("Adventure Works")


Dim MyMeasureGroup As MeasureGroup = _

MyCube.MeasureGroups.GetByName("Reseller Sales")


 


'Retrieve Years in Dimension


'---------------------------------------------


Dim MyConnectionString As
String

MyConnectionString = "Server=localhost;" + _


"Database=AdventureWorksDW;Integrated Security=SSPI;"


Dim MySqlConnection As
New SqlClient.SqlConnection(MyConnectionString)


Dim MySqlCommand As
New SqlClient.SqlCommand( _

MySqlQueryString, MySqlConnection)


 

MySqlConnection.Open()


Dim MySqlReader As SqlClient.SqlDataReader = MySqlCommand.ExecuteReader


 


'Add Partitions That Don't Exist


'---------------------------------------------


While MySqlReader.Read


 


Dim MyYear As
String = MySqlReader.GetString(0)


Dim MyStartKey As Int32 = MySqlReader.GetInt32(1)


Dim MyEndKey As Int32 = MySqlReader.GetInt32(2)


 


Dim MyPartitionName As
String = "Reseller_Sales_" + MyYear


Try


Dim MyPartition As Partition = _

MyMeasureGroup.Partitions.GetByName(MyPartitionName)


Catch


Dim MyPartition As
New Partition


With MyPartition

.Name = MyPartitionName


If MyMeasureGroup.AggregationDesigns.Count > 0 Then

.AggregationDesignID = _

MyMeasureGroup.AggregationDesigns.Item(0).ID


End
If

.Source = New QueryBinding(MyCube.DataSource.ID, _

MyParitionQueryString(MyStartKey, MyEndKey))


End
With

MyMeasureGroup.Partitions.Add(MyPartition)

MyPartition.Update()


End
Try


End
While

MySqlReader.Close()


 

MySqlConnection.Close()

MyServer.Disconnect()


 

Dts.TaskResult = Dts.Results.Success


End
Sub


 


Private
Function MySqlQueryString() As
String


Dim Ret As
String

Ret = "select CalendarYear," + _


"min(TimeKey),max(TimeKey) " + _


"from dbo.DimTime " + _


"Group by CalendarYear"


Return Ret


End
Function


 


Private
Function MyParitionQueryString(ByVal StartKey As Int32, _


ByVal EndKey As Int32) As
String


Dim Ret As
String

Ret = "SELECT " + _


"[dbo].[FactResellerSales].[ProductKey]," + _


"[dbo].[FactResellerSales].[OrderDateKey]," + _


"[dbo].[FactResellerSales].[DueDateKey]," + _


"[dbo].[FactResellerSales].[ShipDateKey]," + _


"[dbo].[FactResellerSales].[ResellerKey], " + _


"[dbo].[FactResellerSales].[EmployeeKey]," + _


"[dbo].[FactResellerSales].[PromotionKey]," + _


"[dbo].[FactResellerSales].[CurrencyKey]," + _


"[dbo].[FactResellerSales].[SalesTerritoryKey]," + _


"[dbo].[FactResellerSales].[SalesOrderNumber]," + _


"[dbo].[FactResellerSales].[SalesOrderLineNumber]," + _


"[dbo].[FactResellerSales].[RevisionNumber]," + _


"[dbo].[FactResellerSales].[OrderQuantity]," + _


"[dbo].[FactResellerSales].[UnitPrice]," + _


"[dbo].[FactResellerSales].[ExtendedAmount], " + _


"[dbo].[FactResellerSales].[UnitPriceDiscountPct], " + _


"[dbo].[FactResellerSales].[DiscountAmount], " + _


"[dbo].[FactResellerSales].[ProductStandardCost], " + _


"[dbo].[FactResellerSales].[TotalProductCost], " + _


"[dbo].[FactResellerSales].[SalesAmount], " + _


"[dbo].[FactResellerSales].[TaxAmt], " + _


"[dbo].[FactResellerSales].[Freight], " + _


"[dbo].[FactResellerSales].[CarrierTrackingNumber], " + _


"[dbo].[FactResellerSales].[CustomerPONumber]" + _


"FROM [dbo].[FactResellerSales] " + _


"WHERE OrderDateKey BETWEEN " + StartKey.ToString + " AND " + _

EndKey.ToString


 


Return Ret


End
Function

End
Class