Thursday, March 22, 2012

dimension design

In my telco cube, I want to analyze usage. Eg how much sms's a subscriber has sent. The fact table is completely denormalized and has lots of columns with different types of usage. For example, following columns occur:

MO_VOI_PEAK

MO_VOI_OFFPEAK

MO_VOI_WEEK

MO_VOI_WEEKEND

Now the sum of these fields overlap. Evenings during the week are also offpeak... This is only an example. Different categories occur for different products, like voice, sms, mms and so on. And not all categories apply to all products. Eg, mms does not have the peak and offpeak distinction.

Is it possible to put this into one dimension, without aggregating double?

I have already normalised the fact table, meaning that for each usage type, I have one record. So the 4 types mentioned above, all occur in 4 records, instead of 4 columns.

I have also built up a table to link this usage type to its attributes. You can find an extract below.

UsageType

OrigDesc

ApplCode

PeakCode

RoamingCode

ScopeCode

WeekCode

PersCode

MO_VOI_NAT_ONNET

MO

VOI

NON-ROAM

NAT_ONNET

MO_VOI_NAT_OTHER

MO

VOI

NON-ROAM

NAT_OTHER

MO_VOI_NAT_FIXNET

MO

VOI

NON-ROAM

NAT_FIXNET

MO_VOI_INT

MO

VOI

NON-ROAM

INT

MO_VOI_ROAM

MO

VOI

ROAM

MO_VOI_OTHER_DEST

MO

VOI

NON-ROAM

OTHER_DEST

MO_VOI_FLATRATE

MO

VOI

NON-ROAM

FLATRATE

MO_VOI

MO

VOI

MT_VOI_NAT_ONNET

MT

VOI

NON-ROAM

NAT_ONNET

MT_VOI_NAT_OTHER

MT

VOI

NON-ROAM

NAT_OTHER

MT_VOI_NAT_FIXNET

MT

VOI

NON-ROAM

NAT_FIXNET

MT_VOI_INT

MT

VOI

NON-ROAM

INT

MT_VOI_ROAM

MT

VOI

ROAM

MT_VOI_OTHER_DEST

MT

VOI

NON-ROAM

OTHER_DEST

MT_VOI

MT

VOI

MO_VOI_PEAK

MO

VOI

PEAK

MO_VOI_OFFPEAK

MO

VOI

OFFPEAK

MO_VOI_WEEK

MO

VOI

WEEK

MO_VOI_WEEKEND

MO

VOI

WEEKEND

MT_VOI_PEAK

MT

VOI

PEAK

MT_VOI_OFFPEAK

MT

VOI

OFFPEAK

MT_VOI_WEEK

MT

VOI

WEEK

Is there a way to include all this information into 1 dimension, without aggregating in a wrong way, so without summing up double?

Thanks a lot

Joos

Could you clarify why there are multiple rows for some UsageTypes above, but with the same data in the columns shown? For example, these 6 rows are repeated later:

UsageType

OrigDesc

ApplCode

PeakCode

RoamingCode

ScopeCode

WeekCode

PersCode

MO_VOI_NAT_ONNET

MO

VOI

NON-ROAM

NAT_ONNET

MO_VOI_NAT_OTHER

MO

VOI

NON-ROAM

NAT_OTHER

MO_VOI_NAT_FIXNET

MO

VOI

NON-ROAM

NAT_FIXNET

MO_VOI_INT

MO

VOI

NON-ROAM

INT

MO_VOI_ROAM

MO

VOI

ROAM

MO_VOI_OTHER_DEST

MO

VOI

NON-ROAM

OTHER_DEST

|||

They are not the same...you have MO and MT, meaning originating and terminating.

Regards

Joos

|||

Do you want each UsageType to be independently aggregated, with no aggregation of usage across UsageTypes? If so, you could create a UsageType dimension, where the "IsAggregatable" property of the UsageType attribute is set to false. Otherwise, could you explain how you want usage to be aggregated in more detail, with an example?

http://msdn2.microsoft.com/en-us/library/ms174497.aspx

>>

SQL Server 2005 Books Online

Configuring the (All) Level for Attribute Hierarchies

...

The presence of an (All) level in an attribute hierarchy depends on the IsAggregatable property setting for the attribute and the presence of an (All) level in a user-defined hierarchy depends on the IsAggregatable property of the attribute at the top-most level of user-defined hierarchy. If the IsAggregatable property is set to True, an (All) level will exist. A hierarchy has no (All) level if the IsAggregatable property is set to False.

>>

|||

I do want aggregation of usage types, but I want to control aggregation...

I would like to include all attributes of the usage types as displayed in the table. With these attributes, I would like to build hierarchies, like for example origdesc - appl code - peak code, and origdesc - appl code - weekcode. However, for example for Voice, this is not the sum of all week, weekend, peak and offpeak usage types. Moreover, for example for mms, I do not have this distinction and ideally, I would only like to show the appl codes, for which the hierarchy applies, meaning that MMS should not occur in the given hierarchy. This last remark is impossible, I guess...

Most importantly, I would like to control which usage types should be taken to get eg the voice aggregate and eg the MO aggregate. If I would just sum up all voice usage types, this would not be correct...I would be counting more than double.

|||Well, you've given examples of how usage aggregates in some specific cases, but there's no systematic pattern that I can discern. One approach that comes to mind is a fact table where each row indicates (non-overlapping) usage, so that values there can always be aggregated. Then multiple UsageTypes could be related to each fact row via a bridge table/intermediate measure group, using the SSAS Many-to-Many dimension modelling feature.|||

I have chosen to include only those usage types that aggregate towards a real total. If there will be a need to include the others, I will do this in a seperate measure group.

Joos

No comments:

Post a Comment