Monday, November 6, 2006

More info on RS and AS [Reed]

A couple more ways the RS MDX Builder is smarter than I thought. I always wondered about some of the complexity.

- When you use a parameter, there’s no way around using StrToSet to turn the parameter into an MDX set. That leaves an opening for inserting malicious code. The auto-generated MDX adds the ,CONSTRAINED flag to StrToSet, which says that the only acceptable contents of the string are literal MDX names, so it won’t run any MDX functions (such as .Children), and it won’t run any inserted code.
- In AS2005, anytime you use a subcube subquery to filter down to a single date, there's a problem with date-relative calculations because the "current member" in the report is still the "All" member of the date hierarchy. This is a serious problem right now with Excel 2007 PivotTables and MDX. The RS MDX Builder adds a funny-looking WHERE clause that basically says: if you single-selected a date for the parameter, make it the current member so that date-relative calculations will work properly.
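The CONSTRAINED part of the generated query looks roughly like this (a sketch; the @DateCalendar parameter name is illustrative, not copied from an actual report):

```mdx
SELECT NON EMPTY { [Measures].[Sales Amount] } ON COLUMNS
FROM [Adventure Works]
-- CONSTRAINED requires the parameter string to be a literal member
-- (or set of literal members); expressions such as .Children are
-- rejected, which blocks MDX injection through the parameter.
WHERE ( STRTOSET(@DateCalendar, CONSTRAINED) )
```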


Friday, October 13, 2006

Combining Dimensions [Reed]

The way AS2005 manages attributes in a dimension changes the design dynamic a bit: Which attributes should be combined into a single dimension? I've run into this issue from two sides.

On the one hand, we had a Project dimension with a lot of changing attributes. For example, the Project Manager would change frequently. To avoid dealing with creating lots of Type II instances of the Project, we decided to pull the Project Manager off as a separate key in the fact, making it a "Rapidly Changing Dimension". In 2000, this was often a very effective strategy. It's simple and easy to implement, but in 2005, we found that query performance was very bad. Even though the PM to Project relationship is very sparse, each crosstab query had to do a cartesian product and find the empty intersections. When we moved the PM back into the Project dimension, AutoExist suddenly made it really fast to find the projects for a given manager, or the managers for a given project.
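The effect is easy to see in a query like the following (a sketch with made-up cube, dimension, and member names, not our client's actual schema):

```mdx
-- Both attributes live in the same dimension, so AutoExist trims the
-- cross join to the Manager/Project pairs that actually exist in the
-- dimension table, with no empty intersections to scan.
SELECT [Measures].[Sales Amount] ON COLUMNS,
  NON EMPTY
    { [Project].[Project Manager].&[Jane Doe] } *
    [Project].[Project].[Project].Members ON ROWS
FROM [Projects]
```

With Project Manager keyed on the fact table instead, the same question forces a cross join between two dimensions, and the engine has to test every intersection for emptiness.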

Conversely, I had a client with lots of customers and lots of products, but where the relationship between customers and products was very stable and very sparse. It seemed bizarre, but we explored putting both customers and products into a single dimension, with the root key as the Customer/SKU combination. Again, the query performance gain was substantial, especially when looking for which customers buy which products, etc. In 2000, I would never have considered putting Customer and Product into a single dimension, but given the high cardinality, stable relationship, and inherent sparsity, it seems to be a really good idea in 2005.

- Reed

ByAccount aggregation bug workaround [Joe Kasprzak]

When working with Account type dimensions and semi-additive measures across a variety of account types with different aggregation functions, you can use the ByAccount aggregation function to set the proper aggregation behavior.

There is a bug where the ByAccount option sometimes fails to set the proper aggregation rules for each account type on a finance measure. This can be fixed manually with the following process:

* Right click the solution in the Solution Explorer window and click “Edit Database” to display the current settings for the finance measures by account type.

The ByAccount aggregation function should have set each account type to either Sum (for additive accounts) or LastNonEmpty (for semi-additive accounts).

* Click the Aggregation Function in the Account Type Mapping box and change the aggregation function from Sum to LastNonEmpty for the Flow, Liability, Balance, Asset, and Statistical account types.

- Joe

Friday, September 1, 2006

Don't declare false Attribute Relationships [Reed]

It's really important to declare attribute relationships when they do exist. But what happens if you say A implies B when it isn't true? I somehow assumed that when you process the dimension, you'd get an error, but not so.

In the Adventure Works Date dimension, I created an attribute relationship from Day of Month to Day of Week. This is, of course, ridiculous. Just because it's the 14th, that doesn't mean it's Wednesday. Just for fun, I then created a User Hierarchy of the same, with Day of Week drilling down to Day of Month. I got the nice, satisfying green triangle on the hierarchy. Then I deployed it.

As I said, I expected some kind of error or warning, but the days of the month just got a random weekday assigned. (I assume it has to do with the order the records were read from the dimension table.) So Sunday got associated with 12, 15, 18, and Monday with 1, 4, 21, 24, etc. This is actually the same behavior that AS2K had--if you created Day of Week as a member property of Day of Month, it would just pick one. I just thought 2005 would complain.
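A quick way to catch this kind of false relationship (a sketch; the attribute names follow the modified Adventure Works Date dimension described above) is to cross-join the two attributes and eyeball the pairings:

```mdx
-- With no relationship declared, every Day of Week / Day of Month
-- combination that exists in the data shows up. With the false
-- relationship in place, each Day of Month lands under exactly one
-- arbitrarily assigned weekday.
SELECT {} ON COLUMNS,
  [Date].[Day of Week].[Day of Week].Members *
  [Date].[Day of Month].[Day of Month].Members ON ROWS
FROM [Adventure Works]
```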

So, the moral is that you really, really want to create attribute relationships when they are meaningful, but be careful not to create ones that are not, and don't trust silence as an indication of success.

-- Reed Jacobson

Thursday, August 31, 2006

New Style and Old Style Calculated Member commands [Reed]

Chris Webb posted an interesting observation about a new style for calculated members in scripts, where instead of the old-style command

CREATE MEMBER CURRENTCUBE.[Measures].[New Style] AS 1;

you can just say

CREATE [New Style] = 1;

He points out a "bug" with the old style syntax. I added some comments relating to some experiments he inspired me to do. My conclusion is that it's too predictable a behavior to be a real bug, but if MS wants to call it a bug, that's OK; and in any event, it might be worth being aware of, especially if you mix script assignments with calculated members. Here's the link to his July 2006 blog page:!7B84B0F2C239489A!823&d=1&wa=wsignin1.0

-- Reed Jacobson

Tuesday, August 29, 2006

Dynamic Dimension Security in Analysis Services 2005 [Reed]

Part I – The Measure Group Setup


The idea of dynamic dimension security is that with a single security role, each user gets a different list of members, based on an MDX expression that returns the permitted set. For example, User A may be allowed read access to different products than User B.


The MDX Username function returns the current user’s login information. If there is a many-to-many relationship between the User and the permitted products, you need a security map table, which is actually a “fact”, or bridging, table that contains one foreign key for the user and another for the product.


The SQL Server 2005 Adventure Works sample cube does not include a specific security map table, but it is possible to use the Internet Sales measure group as a surrogate: If a user (aka customer) purchased a product over the internet ([Internet Sales Amount]), then that user has the right to see the total sales for that product ([Sales Amount]). The structure is the same as if we used a regular security map table.


With a real security map table, you would create a measure group, and use a simple count of the rows as the measure, naming it something cool like [Security OK]. In this example, [Internet Sales Amount] is the surrogate for the [Security OK] measure.


Part II – The Username comparison


The format returned by the Username function is DOMAIN\User. In an ideal world, your dimension table would have an attribute in the exact same format. In the real world, you often have something a little different. For example, the name in the dimension may not include the domain. The closest thing to a user name in the Adventure Works Customer dimension is the Email Address attribute. Its format is anne14@adventure-works.com. This is not a bad approximation for a real-world user name attribute.


In the MDX security expression you compare the result of the Username function with the value of the [Email Address] attribute. Here’s an expression that will convert the Username to the right format for the Adventure Works [Email Address]:


Mid(Username, InStr(1, Username, "\") + 1) + "@adventure-works.com"


This expression will work even if users can be on multiple domains. If you know that all users will be on the same domain, you can make the expression evaluate much faster by replacing the (slow) InStr function with a hard-coded length. Assuming a constant domain name of DOMAIN, this formula is the simplified version of the previous one:


Mid(Username, 7 + 1) + "@adventure-works.com"


For testing purposes (in case your current domain user name doesn’t happen to be in the Adventure Works Customer dimension), you can create a calculated member as a surrogate for the Username function:


Member Measures.TestName as "DOMAIN\anne14"


Then, during early testing, you simply replace the Username function with the TestName calculated member. This makes it easy to try out different “users”. Then, during final testing, switch it back to use the Username function.


Part III: The MDX Set Function


The general concept of the set function is to find all the products where the intersection of the product, the current user, and the [Security OK] measure is not empty. (I’m going to show three methods for doing the set. If you just want the “best” answer, skip to the third one.)


** Double-Filter Method (2000 and 2005)


Here's the way I usually did this in 2000. You use an "inner filter" to find the user whose email alias matches the user logon (there should be only one), and use Item(0).Item(0) to turn it into a member. Then you combine that member with the test measure, and in an outer filter against the list of products, check whether that tuple is empty. This technique does work in 2005, but it's a bit slow. On my test box it was about 1 minute.


Filter( [Product].[Model Name].[Model Name].Members, not IsEmpty(
            ( Filter ( [Customer].[Email Address].[Email Address].Members
            , [Customer].[Email Address].CurrentMember.Name = TestNameX
            ).Item(0).Item(0) , [Measures].[Internet Sales Amount] ) ) )


Note: To test the set, you can run a simple query:

With
    Member Measures.TestName as …
    Member Measures.TestNameX as …
    Set SecurityTest as …
Select [Measures].[Sales Amount] on 0,
    SecurityTest on 1
From [Adventure Works];


** NonEmpty Method (2005 only)


One way to simplify this is to use the NonEmpty function that is new in 2005. This basically puts the outer Filter and the IsEmpty test into a single function, but the logic is the same: filter the list of products by testing it against the (single member) set of email members that match the current user. Note that you don't need the .Item(0).Item(0) for this one, because NonEmpty compares two sets, so the result of the Filter function is fine as is. This is much faster--on my box, it's about 3 seconds. Quite an improvement over 60 seconds.


NonEmpty ( [Product].[Model Name].[Model Name].Members
            , ( Filter ( [Customer].[Email Address].[Email Address].Members
            , [Customer].[Email Address].CurrentMember.Name = TestNameX
            ) , [Measures].[Internet Sales Amount] ) )


** NonEmptyCrossJoin Method (2000 and 2005)


Another way to do it—and this theoretically works in 2000 as well as 2005, but I don’t have a 2000 instance handy to test it on—is to use the NonEmptyCrossJoin function. Mosha says he hates the NECJ function (which is OK for him to say because he wrote it), and to use NonEmpty instead. But the NonEmpty function only takes two sets, and the NECJ function can take an indeterminate number. I have always had a great fondness for NECJ. The real goal is to find all the members of the Product dimension that actually exist in the Security measure group, when intersected with the specified member. This approach took < 1 second on my box, and I still like NECJ. Maybe there’s a better way in 2005 that can avoid NECJ, but I like this one. Plus, as an added bonus, it also exploits StrToMember which is another “no no” that I love to use.


NonEmptyCrossJoin (
            [Product].[Model Name].[Model Name].Members
            , StrToMember("[Customer].[Email Address].[Email Address].[" + TestNameX + "]")
            , [Measures].[Internet Sales Amount] , 1 )
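Putting the pieces together, the allowed-member-set expression for the security role would look something like this (a sketch: it substitutes the real Username conversion from Part II for the TestNameX surrogate, and assumes the @adventure-works.com suffix used above):

```mdx
-- Products that have at least one Internet Sales row for the
-- customer whose email alias matches the current Windows login.
NonEmptyCrossJoin (
    [Product].[Model Name].[Model Name].Members
    , StrToMember("[Customer].[Email Address].[Email Address].["
        + Mid(Username, InStr(1, Username, "\") + 1)
        + "@adventure-works.com]")
    , [Measures].[Internet Sales Amount] , 1 )
```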


-- Reed Jacobson

Wednesday, August 16, 2006

Designing legacy DTS packages from within SS2005 Management Studio [Michael B]

Problem solved. Thanks to all who wrote with suggestions.

Here is the full story. Sorry if it is a bit much, but there were several false starts.

Chapter 1:

I had been running SS2000 and had a bunch of DTS packages on my laptop. I installed SS2005 Developer’s Edition on the machine, which wiped out the SS2000 Enterprise Manager.

When I tried to open any of the legacy DTS packages from within Management Studio, I got an error message box telling me that I needed to install the SQL Server 2000 DTS Designer Components. Please note that I was not attempting to migrate the DTS packages to SSIS; I was merely trying to open the packages in the equivalent of the package designer used in the older Enterprise Manager.

I found the appropriate page on Microsoft's site discussing the problem, and the page from which to download the "fix". I downloaded the recommended file (SQLServer2005_DTS.msi), installed it, and rebooted.

It was no help; symptoms were unchanged.

Chapter 2:

One of our colleagues suggested that I uninstall and reinstall SS2005. I was getting close to doing this and was in the Add/Remove Programs utility when I noticed the SS2005 Backward Compatibility Components item in the currently installed programs list for removal or repair. I ran the repair.

That didn't fix it, but it changed the symptoms somewhat. Now when I tried to open a DTS package within SS2005, I got 3 separate error message boxes in succession (I believe the third was the same as the single error message I was originally getting; oddly, sometimes I got just two messages, the first and third). Also, the package tried to open but displayed only some of the icons in the design window, usually just the connecting arrows.

Chapter 3:

Two colleagues had sent a link to another Microsoft article (and one of those also suggested I install the Backward Compatibility Components, which I had already done, as mentioned above), and now, with the 3 error message boxes, my symptoms matched what was discussed on the Microsoft page.

The support page gave two workaround methods. The second one was in effect to reinstall the client components from SS2000. The first method promised to allow the correct opening of DTS packages from within Management Studio, which was my preferred route.

It said to remove the following lines of code from the Sqlwb.exe.manifest file in the \Program Files\Microsoft SQL Server\90\Tools\Binn\VSShell\Common7\IDE folder.

The way the lines actually display on the Microsoft page, line 3 (red line numbering added by me for reference) is one long line in a small, scrollable text box. In this message, it wraps.


When I opened the file for editing I saw something like the following (once again, I’ve added line numbers) with lines 2 and 3 continuing on the right beyond the window width:

[manifest XML snippet omitted]

I searched for and found the opening tag and its closing mate, and deleted them and everything in between, as the Microsoft page seemed to instruct.

What I didn’t notice was that the line went way off to the right and that there were 3 assemblyIdentity tag groups on that line. Without realizing it, I had deleted all of them. (Of course, I had made a copy of the file before altering it.)

After deleting the lines, SS2005 Management Studio wouldn't open at all, giving a memory location error. When I restored the lines to the file, Management Studio would open again, but once again, I still couldn't properly open DTS packages.

After several additional attempts to edit the file, I finally noticed that there were 3 sets of assemblyIdentity tag groups, only one of which looked exactly like line 3 in the Microsoft text box. I reformatted the lines of the file so that I could more clearly see the logical tag groups. It then appeared in the WordPad window something like the following (again, minus the line numbering and the bold font on line 7; and again, lines 5, 6, and 7 scroll off to the right, beyond the width of the editing window). (BTW, I just noticed while writing this email that lines 5 and 6 are identical, but that's how they were in the original file.)

[reformatted manifest XML omitted]

What I realized I needed to do was to delete only line 7, keeping lines 3 thru 6 and lines 8 thru 10. This indeed did the trick. I can now edit legacy DTS packages from within SS2005 Management Studio.


In summary:

1. Install or repair SS2005 Backward Compatibility Components using the Windows Add or Remove Programs utility in Control Panel.

2. Download and install the SQL Server 2000 DTS Designer Components, which is the SQLServer2005_DTS.msi file from Microsoft's download site.

3. Carefully edit the \Program Files\Microsoft SQL Server\90\Tools\Binn\VSShell\Common7\IDE\Sqlwb.exe.manifest file, removing only the tag group that looks like line 7 immediately above.

I hope this is clear and is helpful.

Michael Bergelson
Hitachi Consulting

BTW, I also noticed that under SS2005 Management Studio, there is no option to view the DTS package run logs on a right click, nor have I found that function anywhere else as yet.

Monday, August 7, 2006

RS Aggregate and Manual MDX

In a recent post, I said RS Report Designer is smarter than I thought. And it is. But it is also really, really weird when you try customizing your MDX. (Not that I ever do that.)

This might be a bug, but it seems to happen only when you customize the MDX. Here's a scenario:

1) Use the wizard to create a table report with Country, State-Region, and City levels. Use Sales Amount (additive) and Average Unit Price (non-additive) as measures, and include subtotals.
 [At this stage, the automatic MDX includes just the City level]

2) Change the Sum function for Average Unit Price (and Sales Amount too, if you feel like it), to Aggregate. [At this stage, the MDX changes to include the Country and State-Region attributes.]

3) Delete the Country group from the report. [The auto-MDX changes to remove the Country attribute, and the report still works fine.]

BUT. If, between step 2 and step 3, you make an intsy-weentsy change to the MDX, then the auto-MDX does not remove the Country attribute. IN THIS CASE, none of the lower-level Aggregate functions work. They all return Null. If you have unused higher-level attributes in your MDX, lower-level Aggregate functions return Null.

The really wacko part is that if you never touch the MDX, and just let the Report Designer fiddle with it, everything seems to work OK. But if you do use the Aggregate function and you do customize the MDX statement, make sure that you have a grouping in the report for each attribute that has higher-level totals.

Incidentally, the place I ran into this big time was when I decided to add an All level to a hierarchy in the cube. When I had the All level there, the report showed all blanks. When I removed the All level in the cube, the report worked. It was really strange. Why would a change like that in the cube definition make the report switch between showing correct values and showing nulls? Well, naturally, I was writing my own MDX, and I asked for [Hierarchy].Members, rather than explicitly cross-joining specific attributes. So having the All level in the hierarchy returned an extra level in the result set--a level that wasn't "used" in a grouping in the report, so all the Aggregate functions below that level returned Null. Removing the All level from the hierarchy made it so that all the levels in the result set were actually used in the report.
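In other words, instead of asking for every level of the user hierarchy, the hand-written MDX can cross-join just the attributes that have matching groupings in the report. A sketch against Adventure Works (adjust the attribute names to whatever groupings your report actually has):

```mdx
-- Returns only State-Province x City rows, so every level in the
-- result set has a corresponding grouping in the report, and the
-- report's Aggregate functions have nothing "unused" above them.
SELECT [Measures].[Sales Amount] ON COLUMNS,
  NON EMPTY
    [Customer].[State-Province].[State-Province].Members *
    [Customer].[City].[City].Members ON ROWS
FROM [Adventure Works]
```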

- Reed Jacobson

Friday, August 4, 2006

Happiness: RS Report Designer smarter than I thought

Ever since the earliest Yukon betas I've been frustrated with how the RS report wizard handles AS as a data source. The problem has to do with non-additive measures--for example, a distinct count. In the cube, you create a measure or calculated member that gets the right answer at every point in the cube. But when you use the RS wizard to create the MDX and the report, it only queries for the lowest level used in the report, and then uses Sum for all the higher levels. This is fine when it is a Sum, but doesn't work at all for Distinct Count.

Brian Welcker kept telling me that they added the Aggregate function and fixed the problem, but every time I tried, I still saw the same leaf-level MDX and Sum function. And I know that you can't get the right answer with only leaf-level MDX. So I would change the MDX to return the multi-level values, and then change the report to use the Aggregate function instead of Sum, and it did work. But it really didn't make a good story.

Well, earlier this week, I discovered the secret. I had always fixed the MDX first, and then changed the Sum function to Aggregate. That was obvious to me, because how would Aggregate get the higher level value if the MDX is wrong? Well, I "accidentally" changed the Sum function to Aggregate first, and when I previewed the report, it showed the right answer. How could it do that? Switching to the data pane, the MDX was still the old detail level. So I used Profiler to capture the MDX, and the MDX was different--it included the additional levels.

The bottom line is that if you use the MDX designer to create the MDX, and then modify the report to use the Aggregate function instead of the Sum function (assuming you have non-additive measures), then the report designer changes the MDX to make it work. When I think about it, this really does make sense. If all you have is Sum formulas, it's more efficient to just add them up in RS. RS can't really know whether a measure is additive or not, so it does default to use the Sum function. But all you have to do as a user is build the report (with the wizard) and then go change the function for any non-additive measures to Aggregate. The report then adjusts the MDX to make it work.

It's a bit creepy for me to have it do all that "automatically" for me. (The windows on my car open with crank handles--or zippers--and don't even talk to me about automatic door locks.) But it actually does make sense and it does work. Maybe it's even documented in the new July BOL ;-).

- Reed Jacobson

Thursday, August 3, 2006

Top 10 Plus Other Query

On Teo Lachev's forum at Prologika, there was a question from gudjonv about how to do an RS chart that showed the top 10 members plus an "all other" member (thus, 11 slices of the pie). I didn't feel like registering so I could post a response, so I'll put the answer (converted to Adventure Works) here.

This is the general method I've used to do "Top 10 Plus Other" since OLAP 7.0. I can't think of any compelling functionality in 2005 that would do the job better. (Of course, in 2005, the Aggregate function will work even with a distinct count measure, so that makes this solution more functional, but it doesn't change the solution.)

WITH
  SET [TCat] AS TopCount([Product].[Subcategory].[Subcategory].Members, 10, [Measures].[Sales Amount])
  MEMBER [Product].[Subcategory].[Other] AS
    Aggregate([Product].[Subcategory].[Subcategory].Members - [TCat])
SELECT { [Measures].[Sales Amount] } ON COLUMNS,
  [TCat] + { [Product].[Subcategory].[Other] } ON ROWS
FROM [Adventure Works]

- Reed Jacobson

Tuesday, August 1, 2006

Sadness: Excel 2007, Subcube Subqueries, and Growth Formulas

When I first saw a demo of Excel 2007 in the Beta 1 timeframe, I was told that the Slicers in Excel 2007 would still use the WHERE clause, not the new Subcube Subquery available in SQL 2005. I was very disappointed (partly because in the beta timeframe for SQL 2005, the Where clause behaved so badly). But when I got my copy of Excel 2007 and looked at the actual MDX, it turned out that they do use the subcube subquery. At first I was happy, but then I discovered a strange consequence of using the subcube subquery (in a bit of irony, by RTM the relative benefit of the subcube subquery had decreased, because they fixed the behavior of the Where clause).

Suppose you have a "Sales Growth" measure. In all likelihood, the MDX will include something like ([Measures].[Sales],[Time].[Calendar].CurrentMember.PrevMember). Now suppose you have a query with [Nov 2007] as the slicer. If the slicer goes into the WHERE clause, then the current member of Time.Calendar is [Nov 2007] and the prev member is [Oct 2007] and all is good. But if the slicer goes into a subcube subquery, then the Time.Calendar dimension gets collapsed to [All]-->[2007]-->[Q4]-->[Nov 2007], but the current member is [All], not [Nov 2007]. With the subquery, the value of [All] equals the value of [Nov 2007], so the basic measures look alright, but the prev member for [All] is Nothing, so the growth calculation screws up.
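The shape of the problem query looks something like this (a sketch; the [Sales Growth] measure and the month member name are illustrative, not copied from an actual Excel trace):

```mdx
-- Excel 2007 pushes the slicer into a subcube subquery:
SELECT [Measures].[Sales Growth] ON COLUMNS
FROM ( SELECT { [Date].[Calendar].[Month].[November 2007] } ON COLUMNS
       FROM [Adventure Works] )
-- Inside the outer query, [Date].[Calendar].CurrentMember is still
-- the All member, so CurrentMember.PrevMember is empty and the
-- growth calculation comes back null, even though the base measures
-- correctly reflect the November 2007 slice.
```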

The only way I've been able to get the correct growth to show up is by explicitly putting the Time dimension onto the Row or Column axis, so there's a meaningful CurrentMember. It works, but it looks a bit stupid.

I was hoping that the Existing operator (which I recently discovered) would provide a workaround for the problem, because the formula could then use Existing Time.Calendar.Month.Members. But, sadly, the "current context" for the calculated member does not take the subcube subquery restriction into consideration, so you get the entire set of months, not just the "current" month. So this approach did not work.

If anyone has an idea for how to write a formula that properly calculates growth, even when the filtering is done by using a subcube subquery, I'd love to hear it.

- Reed Jacobson

Friday, July 28, 2006

New Existing operator

I just found out about the Existing operator. You put it in front of a set and it filters the set by using all the current members in the current context. This simplifies using Attributes without User Hierarchies.

For example, suppose you need to do the equivalent of the LastChild aggregation function, but with the Standard Edition: Use the Existing keyword to find the last period. The issue is that without a multi-level time hierarchy, how do you know which is the "last" month in the current filtered space? So Existing Time.Month.Members is essentially equivalent to Descendants(Time.CurrentMember, Time.Month), except that it doesn't require a User Hierarchy. Thus Tail(Existing Time.Month.Members) gives the last month in the current filter set. (Note that this gives the last child, not the last non-empty member.)
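As a calculated-measure sketch (generic Time dimension and measure names, not a specific cube; note that Tail returns a one-member set, so Item(0) extracts the member):

```mdx
WITH MEMBER [Measures].[Closing Sales] AS
  -- Existing filters the Month attribute down to the current context,
  -- Tail(..., 1) keeps the last one, Item(0) turns the set into a member.
  ( Tail(Existing [Time].[Month].[Month].Members, 1).Item(0),
    [Measures].[Sales Amount] )
SELECT [Measures].[Closing Sales] ON COLUMNS
FROM [Sales]
WHERE ( [Time].[Quarter].[Q3 2006] )
```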

- Reed Jacobson

Visual Totals and Row Filters

I was troubleshooting with a client about some slow queries using Excel (2003) as a client for AS2005.

We had a large (customer) dimension on rows, and measures on columns. We added a second customer attribute (region) to rows. The pivot table returned very fast. As soon as we filtered by a region member (one or more), the report slowed to a crawl. Putting the region into the slice (filter) area and then selecting one or more items was fine, but with region on rows, it was slow. The problem is that users want to see the region next to the customer name.

Turning off Visual Totals gave fast results as well, but that allowed unwanted customers to make it through the filter, and the grand total was not what we wanted. In effect, we wanted the meaning of Visual Totals (with Region included on rows), with the performance of Region in Slicer (or no Visual Totals).

The best workaround we could come up with was to create two instances of the Region attribute, so there's a regular Region attribute (which goes on rows) and a Filter Region attribute (which goes into slicer). Filtering by the one in the Slicer area allows the report to be fast, even with Visual Totals, but still lets the user see the regions in the body of the report. We had to create similar parallel attributes for any customer attribute the user might want to show on rows--while simultaneously using as a filter.

My speculation as to the reason for this is that if the region filter is in the Slicer, it's the core cube "pyramid" engine that does the operations, and it's fast. Whether Visual Totals is specified or not is irrelevant as long as the Region is in the Slicer, because no "total" is being re-calculated by MDX. If the Region filter is in rows, then the All member of the Region attribute (which is implicitly there for every customer) has to be recalculated as a Visual Total by MDX, and that's really slow. Any time the MDX has to calculate for each member of the Customer hierarchy (whether it eventually gets filtered away by Non Empty or not), the overall report is slow. If an MDX expression only has to calculate for the rows that end up in the final report, it's really fast. The moral for me is to keep the high-cardinality operations in the Pyramid engine and restrict the Calculation engine to relatively small scope.

- Reed Jacobson


Thursday, July 27, 2006


Slow NON EMPTY with calculated measures on rows [Reed]

A few times over the past couple of months I have run into a situation where removing NON EMPTY from a slow MDX query radically improved the response time. The symptom was clear, but not predictable. Earlier this week, we were able to pin down at least one specific scenario that definitely triggered the NON EMPTY slowness.

We were using Excel (2003) as the client, and had a large set (customers) on rows. Excel's default NON EMPTY behavior was filtering away the uninvolved customers. Everything worked well until we added a second measure. The query went from 1 second to 10+ minutes (or so). The measures were both calculated members that did a direct reference to base measures, and they did have the NonEmptyBehavior property set to the base measure.

After some troubleshooting, we observed that the speed hit happened only when the measures were on the Rows axis. If we put the measures on the Columns axis (i.e., a different axis from the large set), it went back to being really fast. The reason adding a second measure slowed it down was because with one measure, Excel puts the measure into the Where clause, but with two, it defaults to put the "Data" on rows.

So the solution (aka workaround) is to simply add the multiple measures to the report before adding the large dimension. With two measures on the report, you can move the measures to columns and then add the large dimension. It's a bit of a pain, but the difference between 10 minutes and 1 second is worth it.

My speculation is that 2005 is trying to optimize NON EMPTY behavior so that it's similar to the speed benefits of NonEmptyCrossjoin. If it gets the optimization right (which happens when the measures are on a different axis), it's blazingly fast. If it gets the optimization wrong (which happens when the measures are on the same axis), then the calculated member is evaluated for each member of the large set and it's distressingly slow.

- Reed Jacobson