Showing posts with label dimensional. Show all posts
Showing posts with label dimensional. Show all posts

Tuesday, March 27, 2012

Dimensional Modelling and Userbased Reporting

We are using SSAS 2005 for the cubes and reports will be created in Reporting Services as well as Proclarity Desktop Professional (all using windows integrated authentication). The reports will be displayed using a Sharepoint portal, again using the windows integrated authentication.

We have a Fact Table like FactRevenues which has revenueamount and a few other measures.

It is linked to 2 Dimension tables - DimProjects and DimCustomers through ProjectCode and CustomerCode.

DimProjects contains 2 dimension fields (apart from many others) - ProjectMgr and ProjectDir.

Similarly DimCustomers contains 3 fields - AccountMgr,AccountDir and EngagementDir.

We have an User Dimension which is the referenced dimension. DimUser contains UserID and other details of the user. We store the NT login(or at least something from which NT login can be obtained) in the UserID.

The five fields mentioned above - ProjectMgr, ProjectDir,AccountMgr, AccountDir and EngagementDir stores the UserID and are linked to the UserID in the User Dimension.

We have created hierarchies/pseudo dimensions which link like RevenueFact-ProjectMgr-User, RevenueFact-ProjectDir-User, RevenueFact-AccountMgr-User,RevenueFact-AccountDir-User and RevenueFact-EngagementDir-User. So, for the MDX expression we can use these to check against the logged in User.

The Problem:

Our requirement is, the person logged in should be able to see only the details (measure values) that are relevant to him/her. Basically, the aggregate of RevenueAmount should be done based on the following conditions:

If the logged in user is a Project Manager for some projects, then the person should only be able to see the aggregated measure for the projects for which he/she is the manager (checked from the ProjectMgr field). Similarly for ProjectDir.

If the logged in user is an Account Manager for some clients, then the person should only be able to see the aggregated measure for the clients that he/she handles (checked from the AccountMgrfield). Similarly for AccountDir and EngagementDir.

There are possibilities, though remote, that a single person can be AccountDir as well as ProjectDir. Basically, the system should support the possiblity for a person to be in any combination of the five fields. This shouldn't be an issue as this will be taken care of automatically once we set up the dynamic dimension security for each of these dimensions.

There will a different set of users - SuperUser/Admins - who will be able to view all the details without any restrictions. We HAVE NO ISSUES WITH THIS AS we created a role specifically for Admins without any restrictions and a role for others (everyone) which will be applied with the dynamic dimension security.

What we tried:

We have actually gone through the links that you have sent earlier, when we were trying to solve the issue. But it seems like we are missing some small thing.

Our MDX for dimension security

Filter( [DIMIRLINE -IR FORM REF NO - IR H CUST - ACCOUNT MGR].[DIMUSERS].[DIMUSERS] = UserName)

UserName supposedly being the function for obtaining the currently logged in user.

But this didn’t seem to restrict the users.

Your syntax for the filter function does not look correct. You need to pass it a set and then the criteria with which to filter that set. I would expect to see something more like the following:

Filter(

[DIMIRLINE -IR FORM REF NO - IR H CUST - ACCOUNT MGR].[DIMUSERS].[DIMUSERS].members

, [DIMIRLINE -IR FORM REF NO - IR H CUST - ACCOUNT MGR].[DIMUSERS].[DIMUSERS].CurrentMember.Name = UserName()

)

|||

This following is the table I created through SSAS browser having UserID,EGName and Mothwisesales Column

where USerID is the member of [Account Mgr].[UsersID] Attribute

I want to show the only row when perticalur User LogsIn(NTlogin) insted of showing the details of all User.He should see only his details(I want to restrict this in Cube level rather on SSRS/Proclarity).

I created a Hierachy i.e UserHierachy -->Role->UserID

Role attribute always contains{User,Manager}.If UserID is belongs to User Role He ll be see only his details from NTLogin and if UserID is belongs to ManagerRole He ll be see all the userdetails.

Please help me regarding this.

Thank u

with regards

Saroj

userID

Mar

chandrashekar.cs

EG11

$700.00

EG12

$495.63

EG14

$1,500.00

EG21

$15,600.83

EG24

$0.83

EG26

$8,726.37

EG34

$2,103.34

EG39

$300.00

EG3P

$197.55

EG41

$5,866.68

EG42

$16,900.00

EG51

$2,200.00

EG52

$1,400.83

EG53

$0.00

EG61

$11,757.92

EG62

$199.96

EG65

$363.40

EG72

$3,400.00

EG74

$2,400.00

madhankumar.s

$228,385.98

ramaprasath.mss

$6,402.35

sarojkumar.nishanka

$6,308.43

sethumadhavan.sb

$284,112.19

sql

Dimensional modelling advice

I am a former DBA that has moved into a totally new role and
environment, as a data architect on an enterprise data warehousing
project for the national statistics office. I've started reading the
Kimball Data Warehouse Toolkit (2nd Edition) and now have a feel for
the approach that I'd like to follow with the data modelling, ie.
using the data warehouse bus architecture approach that is advocated
by Kimball.
However, the main concern that I have with this approach is that we
have hundreds of different data sources in the form of surveys that we
conduct as well as data that we share from other govt and non-govt
agencies. So the data that we will be loading into our DW is not
sourced from our OLTP systems but from a multitude of other businesses
and households. If we were to model the data on a business process
basis there is the potential to end up with dozens/hundreds of star
schemas, as well as the ongoing maintenance of systems development for
when new surveys are created. Internally there is some support for
developing a generic dimensional model that has the flexibility to
accept any type of data (be it of an economic, social or environmental
nature). I could see that this would work as far as loading data into
the warehouse goes, but think that it would introduce a layer of
abstraction that would make it difficult for the business users to
understand and query the data (or do BI tools get around this). So
the first question is, how suitable is generic modelling for a data
warehouse implementation and if not, how big an issue is deploying a
large number of star schemas for an integrated data warehouse?
Secondly, a question around granularity. We have many situations
where different surveys request similar data but with slight
variations. eg. We could ask about a businesses export totals in a
number of different ways in several surveys - total exports of all
products for 2003, exports of fruit and veges for April, exports of
apples for Q3 2004, exports of apples for June in US$, total exports
to Europe etc. How is this handled in a dimensional model? Is it
possible to model this in a single star schema when the granularity
appears to be different. NB. Is may not be possible to derive
aggregate totals from data with a finer grain because a different
subset of businesses will have been asked a different set of
questions, so aggregate totals might not include the full subset of
data that is needed for an accurate total.
Lastly, does anybody know of an example of a dimensional model that
could be used as a starting point for survey collection and analysis
data warehouse?
TIA,
Keith.Keith -
I would suggest that you also purchase another Kimball book entitled
"Data Warehouse ETL Toolkit"; they should really be sold as a pair.
Approaching a data warehouse with a strong relational 3nf
bakcground can be difficult to understand and implement the
denormalizing structures needed for a successful dw.
I've implemented large dw systems with numerous external sources of
data, and I've found that if one really boils the inbound information
the complexity is not what it seems at first.
FWIW.
\
b.
Keith Chung wrote:

> I am a former DBA that has moved into a totally new role and
> environment, as a data architect on an enterprise data warehousing
> project for the national statistics office. I've started reading the
> Kimball Data Warehouse Toolkit (2nd Edition) and now have a feel for
> the approach that I'd like to follow with the data modelling, ie.
> using the data warehouse bus architecture approach that is advocated
> by Kimball.
> However, the main concern that I have with this approach is that we
> have hundreds of different data sources in the form of surveys that we
> conduct as well as data that we share from other govt and non-govt
> agencies. So the data that we will be loading into our DW is not
> sourced from our OLTP systems but from a multitude of other businesses
> and households. If we were to model the data on a business process
> basis there is the potential to end up with dozens/hundreds of star
> schemas, as well as the ongoing maintenance of systems development for
> when new surveys are created. Internally there is some support for
> developing a generic dimensional model that has the flexibility to
> accept any type of data (be it of an economic, social or environmental
> nature). I could see that this would work as far as loading data into
> the warehouse goes, but think that it would introduce a layer of
> abstraction that would make it difficult for the business users to
> understand and query the data (or do BI tools get around this). So
> the first question is, how suitable is generic modelling for a data
> warehouse implementation and if not, how big an issue is deploying a
> large number of star schemas for an integrated data warehouse?
> Secondly, a question around granularity. We have many situations
> where different surveys request similar data but with slight
> variations. eg. We could ask about a businesses export totals in a
> number of different ways in several surveys - total exports of all
> products for 2003, exports of fruit and veges for April, exports of
> apples for Q3 2004, exports of apples for June in US$, total exports
> to Europe etc. How is this handled in a dimensional model? Is it
> possible to model this in a single star schema when the granularity
> appears to be different. NB. Is may not be possible to derive
> aggregate totals from data with a finer grain because a different
> subset of businesses will have been asked a different set of
> questions, so aggregate totals might not include the full subset of
> data that is needed for an accurate total.
> Lastly, does anybody know of an example of a dimensional model that
> could be used as a starting point for survey collection and analysis
> data warehouse?
> TIA,
> Keith.

Dimensional modelling advice

I am a former DBA that has moved into a totally new role and
environment, as a data architect on an enterprise data warehousing
project for the national statistics office. I've started reading the
Kimball Data Warehouse Toolkit (2nd Edition) and now have a feel for
the approach that I'd like to follow with the data modelling, ie.
using the data warehouse bus architecture approach that is advocated
by Kimball.
However, the main concern that I have with this approach is that we
have hundreds of different data sources in the form of surveys that we
conduct as well as data that we share from other govt and non-govt
agencies. So the data that we will be loading into our DW is not
sourced from our OLTP systems but from a multitude of other businesses
and households. If we were to model the data on a business process
basis there is the potential to end up with dozens/hundreds of star
schemas, as well as the ongoing maintenance of systems development for
when new surveys are created. Internally there is some support for
developing a generic dimensional model that has the flexibility to
accept any type of data (be it of an economic, social or environmental
nature). I could see that this would work as far as loading data into
the warehouse goes, but think that it would introduce a layer of
abstraction that would make it difficult for the business users to
understand and query the data (or do BI tools get around this). So
the first question is, how suitable is generic modelling for a data
warehouse implementation and if not, how big an issue is deploying a
large number of star schemas for an integrated data warehouse?
Secondly, a question around granularity. We have many situations
where different surveys request similar data but with slight
variations. eg. We could ask about a businesses export totals in a
number of different ways in several surveys - total exports of all
products for 2003, exports of fruit and veges for April, exports of
apples for Q3 2004, exports of apples for June in US$, total exports
to Europe etc. How is this handled in a dimensional model? Is it
possible to model this in a single star schema when the granularity
appears to be different. NB. Is may not be possible to derive
aggregate totals from data with a finer grain because a different
subset of businesses will have been asked a different set of
questions, so aggregate totals might not include the full subset of
data that is needed for an accurate total.
Lastly, does anybody know of an example of a dimensional model that
could be used as a starting point for survey collection and analysis
data warehouse?
TIA,
Keith.
Keith -
I would suggest that you also purchase another Kimball book entitled
"Data Warehouse ETL Toolkit"; they should really be sold as a pair.
Approaching a data warehouse with a strong relational 3nf
bakcground can be difficult to understand and implement the
denormalizing structures needed for a successful dw.
I've implemented large dw systems with numerous external sources of
data, and I've found that if one really boils the inbound information
the complexity is not what it seems at first.
FWIW.
\
b.
Keith Chung wrote:

> I am a former DBA that has moved into a totally new role and
> environment, as a data architect on an enterprise data warehousing
> project for the national statistics office. I've started reading the
> Kimball Data Warehouse Toolkit (2nd Edition) and now have a feel for
> the approach that I'd like to follow with the data modelling, ie.
> using the data warehouse bus architecture approach that is advocated
> by Kimball.
> However, the main concern that I have with this approach is that we
> have hundreds of different data sources in the form of surveys that we
> conduct as well as data that we share from other govt and non-govt
> agencies. So the data that we will be loading into our DW is not
> sourced from our OLTP systems but from a multitude of other businesses
> and households. If we were to model the data on a business process
> basis there is the potential to end up with dozens/hundreds of star
> schemas, as well as the ongoing maintenance of systems development for
> when new surveys are created. Internally there is some support for
> developing a generic dimensional model that has the flexibility to
> accept any type of data (be it of an economic, social or environmental
> nature). I could see that this would work as far as loading data into
> the warehouse goes, but think that it would introduce a layer of
> abstraction that would make it difficult for the business users to
> understand and query the data (or do BI tools get around this). So
> the first question is, how suitable is generic modelling for a data
> warehouse implementation and if not, how big an issue is deploying a
> large number of star schemas for an integrated data warehouse?
> Secondly, a question around granularity. We have many situations
> where different surveys request similar data but with slight
> variations. eg. We could ask about a businesses export totals in a
> number of different ways in several surveys - total exports of all
> products for 2003, exports of fruit and veges for April, exports of
> apples for Q3 2004, exports of apples for June in US$, total exports
> to Europe etc. How is this handled in a dimensional model? Is it
> possible to model this in a single star schema when the granularity
> appears to be different. NB. Is may not be possible to derive
> aggregate totals from data with a finer grain because a different
> subset of businesses will have been asked a different set of
> questions, so aggregate totals might not include the full subset of
> data that is needed for an accurate total.
> Lastly, does anybody know of an example of a dimensional model that
> could be used as a starting point for survey collection and analysis
> data warehouse?
> TIA,
> Keith.

dimensional modeling

I need some guideline for designing diemsional model for the situation
given below. Suppose we track individual sales transactions alongwith
different categories of product, areas and employee who made sale.
Sales targets have alreay been assigned to employees for a given time
period eg. quarter. How should we design DM which allows to track not
only actual sales but also compares it targets alloted? Where should be
alloted target amounts need to be stored?
In SQL 2005 UDM, you can have a single dimension connected to two fact
tables on a different levl of granularity, which is exactly what you need
(sales transactions vs. target sales). Check the "Defining Dimension
Granularity within a Measure Group" part of the Analysis Services Tutorial
in SQL 2005 BOL (http://msdn2.microsoft.com/en-us/library/ms166573.aspx).
Dejan Sarka, SQL Server MVP
Mentor
www.SolidQualityLearning.com
"viki" <waqer@.yahoo.com> wrote in message
news:1133692623.706401.175280@.z14g2000cwz.googlegr oups.com...
>I need some guideline for designing diemsional model for the situation
> given below. Suppose we track individual sales transactions alongwith
> different categories of product, areas and employee who made sale.
> Sales targets have alreay been assigned to employees for a given time
> period eg. quarter. How should we design DM which allows to track not
> only actual sales but also compares it targets alloted? Where should be
> alloted target amounts need to be stored?
>
|||Hi Viki,
I have a beginners page on my web site www.peternolan.com. It contains
links to books I recommend. Ralphs books cover these areas in detail.
There are also sample databases for what you are asking...The Data
Warehouse Design Solutions book would also be useful to you as you will
surely not stop at merely doing sales tracking...
Best Regards
Peter
www.peternolan.com

dimensional modeling

I need some guideline for designing diemsional model for the situation
given below. Suppose we track individual sales transactions alongwith
different categories of product, areas and employee who made sale.
Sales targets have alreay been assigned to employees for a given time
period eg. quarter. How should we design DM which allows to track not
only actual sales but also compares it targets alloted? Where should be
alloted target amounts need to be stored?In SQL 2005 UDM, you can have a single dimension connected to two fact
tables on a different levl of granularity, which is exactly what you need
(sales transactions vs. target sales). Check the "Defining Dimension
Granularity within a Measure Group" part of the Analysis Services Tutorial
in SQL 2005 BOL (http://msdn2.microsoft.com/en-us/library/ms166573.aspx).
Dejan Sarka, SQL Server MVP
Mentor
www.SolidQualityLearning.com
"viki" <waqer@.yahoo.com> wrote in message
news:1133692623.706401.175280@.z14g2000cwz.googlegroups.com...
>I need some guideline for designing diemsional model for the situation
> given below. Suppose we track individual sales transactions alongwith
> different categories of product, areas and employee who made sale.
> Sales targets have alreay been assigned to employees for a given time
> period eg. quarter. How should we design DM which allows to track not
> only actual sales but also compares it targets alloted? Where should be
> alloted target amounts need to be stored?
>|||Hi Viki,
I have a beginners page on my web site www.peternolan.com. It contains
links to books I recommend. Ralphs books cover these areas in detail.
There are also sample databases for what you are asking...The Data
Warehouse Design Solutions book would also be useful to you as you will
surely not stop at merely doing sales tracking...
Best Regards
Peter
www.peternolan.com

Dimensional Model

I am in the process of designing a Booking, Billing and Backlog
datamart. My backend is going to be SQL Server 2K. Data source is
Baan ERP system. The transaction level I have to populate the
datamart is at sales order line level.
My problem is that the data model I have come up with is more
relational than dimensional. Here is what I have got so far.
Orders
Order Detail
Customer
Products
Countries
Regions
Employees
Other way of sourcing the data would be to write one big query from
ERP and then do the dimensional model but that would slow down the
data retrieval. I would be grateful if any you can share your
experience with this sort of a datamart.
Thanks
Ather
Well, you have to take the approach of dimensional modelling to make it
work. http://www.ralphkimball.com has some good articles to get you started,
also I would recommend a book "Data Wareshousing TookKit" by the same author
has some good examples that you can related to.
Pramod
"Ather Mian" <athermian@.yahoo.com> wrote in message
news:6ed9e09e.0403310515.4c0e0a79@.posting.google.c om...
> I am in the process of designing a Booking, Billing and Backlog
> datamart. My backend is going to be SQL Server 2K. Data source is
> Baan ERP system. The transaction level I have to populate the
> datamart is at sales order line level.
> My problem is that the data model I have come up with is more
> relational than dimensional. Here is what I have got so far.
> Orders
> Order Detail
> Customer
> Products
> Countries
> Regions
> Employees
> Other way of sourcing the data would be to write one big query from
> ERP and then do the dimensional model but that would slow down the
> data retrieval. I would be grateful if any you can share your
> experience with this sort of a datamart.
> Thanks
> Ather
sql

Dimensional Model

I am in the process of designing a Booking, Billing and Backlog
datamart. My backend is going to be SQL Server 2K. Data source is
Baan ERP system. The transaction level I have to populate the
datamart is at sales order line level.
My problem is that the data model I have come up with is more
relational than dimensional. Here is what I have got so far.
Orders
Order Detail
Customer
Products
Countries
Regions
Employees
Other way of sourcing the data would be to write one big query from
ERP and then do the dimensional model but that would slow down the
data retrieval. I would be grateful if any you can share your
experience with this sort of a datamart.
Thanks
AtherWell, you have to take the approach of dimensional modelling to make it
work. http://www.ralphkimball.com has some good articles to get you started,
also I would recommend a book "Data Wareshousing TookKit" by the same author
has some good examples that you can related to.
Pramod
"Ather Mian" <athermian@.yahoo.com> wrote in message
news:6ed9e09e.0403310515.4c0e0a79@.posting.google.com...
> I am in the process of designing a Booking, Billing and Backlog
> datamart. My backend is going to be SQL Server 2K. Data source is
> Baan ERP system. The transaction level I have to populate the
> datamart is at sales order line level.
> My problem is that the data model I have come up with is more
> relational than dimensional. Here is what I have got so far.
> Orders
> Order Detail
> Customer
> Products
> Countries
> Regions
> Employees
> Other way of sourcing the data would be to write one big query from
> ERP and then do the dimensional model but that would slow down the
> data retrieval. I would be grateful if any you can share your
> experience with this sort of a datamart.
> Thanks
> Ather

Dimensional Design Question

I have done some reading on dimensional modeling and am a bit confused as to
how I create a dimension in our data warehouse using our transactional sourc
e.
I have a customer table which contains the customer id, customer name,
region codes and industry codes. Naturally I also have a region table and
industry table which has the region/industry descriptions.
I have read that the data warehouse should be designed using a star schema.
If that is the case, should I be creating a customer dimension with the
customer id, customer name, region desc, and industry name? Or should I
have a customer dimension, an industry dimension, and a region dimension?It depends on the kind of your analysis.
If Region and Industry are attributes of the customer and not
independent attributes of the fact you want to measure, they should be
normalized into the Customer dimension. If you want to track historycal
changes of Region/Industry of the customer, you should use SCD Type II
(Slowly Changing Dimension).
Only in particular scenarios you should consider a snowflake schema
instead of a star schema, but this is an exception and in my experience
you can safely avoid it in most of the models you have to manage.
I suggest you to read Kimball's books
(http://www.kimballgroup.com/html/books.html) as a reference for DW
modeling.
Marco Russo
http://www.sqlbi.eu
http://www.sqljunkies.com/weblog/sqlbi
JP wrote:
> I have done some reading on dimensional modeling and am a bit confused as
to
> how I create a dimension in our data warehouse using our transactional sou
rce.
> I have a customer table which contains the customer id, customer name,
> region codes and industry codes. Naturally I also have a region table an
d
> industry table which has the region/industry descriptions.
> I have read that the data warehouse should be designed using a star schema
.
> If that is the case, should I be creating a customer dimension with the
> customer id, customer name, region desc, and industry name? Or should I
> have a customer dimension, an industry dimension, and a region dimension?|||thanks marco. i do have the msft data warehouse toolkit but it doesn't
address this issue directly.
the region and industry are attributes of the customer so sounds like i
should have them "normalized" within the customer attribute. the fact tabl
e
is based on customer but i want to be able to roll-up the data to report by
industry, region, sub-industry, etc.
one item i left out was sub-industry. each sub-industry belongs to an
industry. so my customer dimension should look like the following?
customerkey (surrogate)
customerid (business key from source system)
region
industry
subindustry
"Marco Russo" wrote:

> It depends on the kind of your analysis.
> If Region and Industry are attributes of the customer and not
> independent attributes of the fact you want to measure, they should be
> normalized into the Customer dimension. If you want to track historycal
> changes of Region/Industry of the customer, you should use SCD Type II
> (Slowly Changing Dimension).
> Only in particular scenarios you should consider a snowflake schema
> instead of a star schema, but this is an exception and in my experience
> you can safely avoid it in most of the models you have to manage.
> I suggest you to read Kimball's books
> (http://www.kimballgroup.com/html/books.html) as a reference for DW
> modeling.
> Marco Russo
> http://www.sqlbi.eu
> http://www.sqljunkies.com/weblog/sqlbi
>
> JP wrote:
>|||Yes, you're right.
Flatten all customer attributes (Region, Industry, Subindustry, ...)
into Customer dimension.
You will find it's a very good schema to query now and to maintain in
the future, when you will add some other attributes.
Just to make an example: if you would have started with several tables,
imagine how adding a subindustry attribute would have impacted on your
schema and on your ETL implementation...
Marco Russo
http://www.sqlbi.eu
http://www.sqljunkies.com/weblog/sqlbi
JP wrote:[vbcol=seagreen]
> thanks marco. i do have the msft data warehouse toolkit but it doesn't
> address this issue directly.
> the region and industry are attributes of the customer so sounds like i
> should have them "normalized" within the customer attribute. the fact ta
ble
> is based on customer but i want to be able to roll-up the data to report b
y
> industry, region, sub-industry, etc.
> one item i left out was sub-industry. each sub-industry belongs to an
> industry. so my customer dimension should look like the following?
> customerkey (surrogate)
> customerid (business key from source system)
> region
> industry
> subindustry
>
> "Marco Russo" wrote:
>

Dimensional Design Question

I have done some reading on dimensional modeling and am a bit confused as to
how I create a dimension in our data warehouse using our transactional source.
I have a customer table which contains the customer id, customer name,
region codes and industry codes. Naturally I also have a region table and
industry table which has the region/industry descriptions.
I have read that the data warehouse should be designed using a star schema.
If that is the case, should I be creating a customer dimension with the
customer id, customer name, region desc, and industry name? Or should I
have a customer dimension, an industry dimension, and a region dimension?
It depends on the kind of your analysis.
If Region and Industry are attributes of the customer and not
independent attributes of the fact you want to measure, they should be
normalized into the Customer dimension. If you want to track historycal
changes of Region/Industry of the customer, you should use SCD Type II
(Slowly Changing Dimension).
Only in particular scenarios you should consider a snowflake schema
instead of a star schema, but this is an exception and in my experience
you can safely avoid it in most of the models you have to manage.
I suggest you to read Kimball's books
(http://www.kimballgroup.com/html/books.html) as a reference for DW
modeling.
Marco Russo
http://www.sqlbi.eu
http://www.sqljunkies.com/weblog/sqlbi
JP wrote:
> I have done some reading on dimensional modeling and am a bit confused as to
> how I create a dimension in our data warehouse using our transactional source.
> I have a customer table which contains the customer id, customer name,
> region codes and industry codes. Naturally I also have a region table and
> industry table which has the region/industry descriptions.
> I have read that the data warehouse should be designed using a star schema.
> If that is the case, should I be creating a customer dimension with the
> customer id, customer name, region desc, and industry name? Or should I
> have a customer dimension, an industry dimension, and a region dimension?
|||thanks marco. i do have the msft data warehouse toolkit but it doesn't
address this issue directly.
the region and industry are attributes of the customer so sounds like i
should have them "normalized" within the customer attribute. the fact table
is based on customer but i want to be able to roll-up the data to report by
industry, region, sub-industry, etc.
one item i left out was sub-industry. each sub-industry belongs to an
industry. so my customer dimension should look like the following?
customerkey (surrogate)
customerid (business key from source system)
region
industry
subindustry
"Marco Russo" wrote:

> It depends on the kind of your analysis.
> If Region and Industry are attributes of the customer and not
> independent attributes of the fact you want to measure, they should be
> normalized into the Customer dimension. If you want to track historycal
> changes of Region/Industry of the customer, you should use SCD Type II
> (Slowly Changing Dimension).
> Only in particular scenarios you should consider a snowflake schema
> instead of a star schema, but this is an exception and in my experience
> you can safely avoid it in most of the models you have to manage.
> I suggest you to read Kimball's books
> (http://www.kimballgroup.com/html/books.html) as a reference for DW
> modeling.
> Marco Russo
> http://www.sqlbi.eu
> http://www.sqljunkies.com/weblog/sqlbi
>
> JP wrote:
>
|||Yes, you're right.
Flatten all customer attributes (Region, Industry, Subindustry, ...)
into Customer dimension.
You will find it's a very good schema to query now and to maintain in
the future, when you will add some other attributes.
Just to make an example: if you would have started with several tables,
imagine how adding a subindustry attribute would have impacted on your
schema and on your ETL implementation...
Marco Russo
http://www.sqlbi.eu
http://www.sqljunkies.com/weblog/sqlbi
JP wrote:[vbcol=seagreen]
> thanks marco. i do have the msft data warehouse toolkit but it doesn't
> address this issue directly.
> the region and industry are attributes of the customer so sounds like i
> should have them "normalized" within the customer attribute. the fact table
> is based on customer but i want to be able to roll-up the data to report by
> industry, region, sub-industry, etc.
> one item i left out was sub-industry. each sub-industry belongs to an
> industry. so my customer dimension should look like the following?
> customerkey (surrogate)
> customerid (business key from source system)
> region
> industry
> subindustry
>
> "Marco Russo" wrote:

Dimensional Arrays in MSSQL 2000

I am converting a database from Clarion to MSSQL 2000. In the Clarion files we use dimensional arrays. I've seen dimensional arrays mentioned on the forum, but can't find how to set one up in MSSQL 2000. Can anyone help me find some kind of documentation on this, or point me in the right direction?

MartyRefer to this Clarion link (http://www.clarionmag.com/cmag/v1/v1n4convertingtosql.html) for information and also refer to books online for 'sp_OAMethod' and other referenced topics.

HTH|||honea, not sure I know what dimesional arrays are with regards to Calrion. As best as I remember Clarion used flat files. If you can provide an explination I will be glad to help if I can.|||Satya,

Thank you for the response. I did look up the 'sp_OAMethod', and while it talks about referencing the arrays, I have found no place that talks about actually setting the array up. How do I put an array in the table?

Marty|||Paul,

In Clarion you would declare a field type and then dimension it. Example:

mydimlong Long Dim(2,36)

This would give you a long data type that has 72 possible places to store data. You would then retrieve or put data in by referencing it like this...

mydimlong[1,1] = 'Male'
mydimlong[2,1] = 'Female'
mydimlong[1,2] = somedata
mydimlong[2,2] = somemoredata

ETC...

Marty|||To my klnowledge there is nothing comperable in MSSQL Server. I would probably set this up as a child table and move on.|||Thank you Paul,

I will probably keep the table in clarion for now.

Marty

Sunday, March 25, 2012

Dimension or Fact?

Hi - I've got a dimensional modelling question that's got me stuck.
I have a "Student" entity and a "Student Transaction" entity.
The Student Transaction table will be a fact table -- that's easy.
And the Student will be a dimension of the Student Transaction table.
HOWEVER, the Student dimension is HUGE and has a lot of sub-dimensions
of its own -- things like State, Country, etc. It's basically what
Kimball calls a rapidly changing monster dimension!
First - isn't this snowflaking, and is that bad? Should the Student
be another fact? But then how do I relate the Student to the Student
Transaction if they're both facts.
Also, the student has attributes like test scores and GPA that seem
like measures -- but a dimension can't have a measure. So how could I
get average test scores and gpa, etc...
The other thing I was thinking of doing was treating the Student as
both a dimension AND a fact! Basically create 2 cubes:
Student Transaction cube, which has Student table as a dimension
Student cube, which actually uses the same Student table, but as a
fact.
This seems to work, but it seems awfully weird to use the same
physical table as both a fact and a dimension... even if it is in
different cubes...
IF anyone has insight, i would appreciate it very much!
thanksIn your case Stundent is a Dimension, average, etc are properties of the dim
ension not measures.
It's a better practice to design the dimension with star structure. You can
make a view to group all information an make the dimension based on that vie
w.
Also you can think several cubes that uses Student dimension and time dimens
ion, like test, etc. And then you can calcultate average and other. A way to
join these cubes is with virtual cubes.
Good Luck|||The "badness" of snowflaking is highly overrated. Conversely, the benefits
of denormalized stars are also overrated.
public @. the domain below
www.tomchester.net
"groove_sf" <dockv@.hot-NOSPAM-mail.com> wrote in message
news:cfpu70dvqh0s8c2p1etalnjdrvos9et1mf@.
4ax.com...
> Hi - I've got a dimensional modelling question that's got me stuck.
> I have a "Student" entity and a "Student Transaction" entity.
> The Student Transaction table will be a fact table -- that's easy.
> And the Student will be a dimension of the Student Transaction table.
> HOWEVER, the Student dimension is HUGE and has a lot of sub-dimensions
> of its own -- things like State, Country, etc. It's basically what
> Kimball calls a rapidly changing monster dimension!
> First - isn't this snowflaking, and is that bad? Should the Student
> be another fact? But then how do I relate the Student to the Student
> Transaction if they're both facts.
> Also, the student has attributes like test scores and GPA that seem
> like measures -- but a dimension can't have a measure. So how could I
> get average test scores and gpa, etc...
> The other thing I was thinking of doing was treating the Student as
> both a dimension AND a fact! Basically create 2 cubes:
> Student Transaction cube, which has Student table as a dimension
> Student cube, which actually uses the same Student table, but as a
> fact.
> This seems to work, but it seems awfully weird to use the same
> physical table as both a fact and a dimension... even if it is in
> different cubes...
> IF anyone has insight, i would appreciate it very much!
> thanks
>|||I disagree that the GPA and test scores should be part of the Student dimens
ion. They are semi-additive facts, and should be considered for a separate
star in your constellation (assuming you are using an architected DW with co
nformed dimensions.)
Think about it this way - if you have student transactions as one "business
process", thus is one star center, then GPA and test scores would merit its
own cetricity in a separate fact table, since GPA and test scores are not tr
ansactions, but are "Grades
", though GPA is going to need to be atomized into GPA factoids so it can be
calculated on the fly if you have other dimensions that will be used to sli
ce and dice. For example, you might want GPA by students by a particular ma
jor, or students in the cla
ss of 2005, etc.
Thus you would have Student as a conforming dimension to both a Transactions
fact table and a Grades fact table, thus the need for at least two stars in
your constellation that share the conformed Student dimension.

Dimension or Fact?

Hi - I've got a dimensional modelling question that's got me stuck.
I have a "Student" entity and a "Student Transaction" entity.
The Student Transaction table will be a fact table -- that's easy.
And the Student will be a dimension of the Student Transaction table.
HOWEVER, the Student dimension is HUGE and has a lot of sub-dimensions
of its own -- things like State, Country, etc. It's basically what
Kimball calls a rapidly changing monster dimension!
First - isn't this snowflaking, and is that bad? Should the Student
be another fact? But then how do I relate the Student to the Student
Transaction if they're both facts.
Also, the student has attributes like test scores and GPA that seem
like measures -- but a dimension can't have a measure. So how could I
get average test scores and gpa, etc...
The other thing I was thinking of doing was treating the Student as
both a dimension AND a fact! Basically create 2 cubes:
Student Transaction cube, which has Student table as a dimension
Student cube, which actually uses the same Student table, but as a
fact.
This seems to work, but it seems awfully weird to use the same
physical table as both a fact and a dimension... even if it is in
different cubes...
IF anyone has insight, i would appreciate it very much!
thanks
In your case Stundent is a Dimension, average, etc are properties of the dimension not measures.
It's a better practice to design the dimension with star structure. You can make a view to group all information an make the dimension based on that view.
Also you can think several cubes that uses Student dimension and time dimension, like test, etc. And then you can calcultate average and other. A way to join these cubes is with virtual cubes.
Good Luck
|||The "badness" of snowflaking is highly overrated. Conversely, the benefits
of denormalized stars are also overrated.
public @. the domain below
www.tomchester.net
"groove_sf" <dockv@.hot-NOSPAM-mail.com> wrote in message
news:cfpu70dvqh0s8c2p1etalnjdrvos9et1mf@.4ax.com...
> Hi - I've got a dimensional modelling question that's got me stuck.
> I have a "Student" entity and a "Student Transaction" entity.
> The Student Transaction table will be a fact table -- that's easy.
> And the Student will be a dimension of the Student Transaction table.
> HOWEVER, the Student dimension is HUGE and has a lot of sub-dimensions
> of its own -- things like State, Country, etc. It's basically what
> Kimball calls a rapidly changing monster dimension!
> First - isn't this snowflaking, and is that bad? Should the Student
> be another fact? But then how do I relate the Student to the Student
> Transaction if they're both facts.
> Also, the student has attributes like test scores and GPA that seem
> like measures -- but a dimension can't have a measure. So how could I
> get average test scores and gpa, etc...
> The other thing I was thinking of doing was treating the Student as
> both a dimension AND a fact! Basically create 2 cubes:
> Student Transaction cube, which has Student table as a dimension
> Student cube, which actually uses the same Student table, but as a
> fact.
> This seems to work, but it seems awfully weird to use the same
> physical table as both a fact and a dimension... even if it is in
> different cubes...
> IF anyone has insight, i would appreciate it very much!
> thanks
>
|||I disagree that the GPA and test scores should be part of the Student dimension. They are semi-additive facts, and should be considered for a separate star in your constellation (assuming you are using an architected DW with conformed dimensions.)
Think about it this way - if you have student transactions as one "business process", thus is one star center, then GPA and test scores would merit its own cetricity in a separate fact table, since GPA and test scores are not transactions, but are "Grades
", though GPA is going to need to be atomized into GPA factoids so it can be calculated on the fly if you have other dimensions that will be used to slice and dice. For example, you might want GPA by students by a particular major, or students in the cla
ss of 2005, etc.
Thus you would have Student as a conforming dimension to both a Transactions fact table and a Grades fact table, thus the need for at least two stars in your constellation that share the conformed Student dimension.
sql