If you have read The Data Warehouse Toolkit, you will have come across in Chapter 2 the Four-Step Dimensional Design Process. In this article, I will go through this process and how you as a developer can utilise this process to help with your report building to minimise re-work as well as ensure that you have all the data that you require to build it.
Four-Step Dimensional Design Process
When designing a dimensional model, there are some key decisions that you need to make at the start of the project. These decisions include the following:
By answering these questions, you are taking into account the needs of the business and getting a better understanding of the underlying data that is generated. Once you understand what the business process is along with the grain and identify the dimensions and facts that support that process, the design team determines the names of tables and columns, any business rules, and sample values. Business data governance reps must participate in this phase to ensure full business buy-in. The reason for this is if the data isn’t accessible or usable, it can be hard to generate a report that is required due to not having the data. There is also the issue of the data being protected against the outside world as well as ensuring that the data can’t be manipulated once entered into the database.
Business processes are the operational activities and tasks that, once completed, will accomplish an organisational goal. Such activities and tasks include taking an order, processing claims, and taking attendance in class. Business process activities generate or capture performance metrics, which turn into facts within your fact table. The majority of fact tables focus on the results of a single business process. Understanding the process is vital because it defines a specific target for designs, allows you to determine the grain of the data and work out what dimensions and facts need to be generated.
Questions you could ask to get this information could be:
By getting answers to these questions, you start to get an idea of how to establish the grain and identify the dimensions and facts.
Working out the grain of the process is an important step in a dimensional design. This establishes what a single row in the fact table represents. Once the grain is determined, you can lock that in for the design just as you would when signing a contract. It is only at this point that you can choose dimensions and facts, as these need to be consistent with the grain of the process. The consistency enforces a uniformity on all designs that is critical to Power BI’s performance and ease of use.
Atomic grain is the lowest level of data captured in a process. Focus on atomic grain data, as it is the most reliable grain when paired up against an infinite amount of user queries. If there are multiple grains you want to view, then you must create a separate fact table for each grain, as the different grains must not be mixed within the same fact table.
The ain question to ask in this section is, ‘What granularity do you wish to view the reports (i.e., daily, monthly, quarterly, yearly, etc.)?’.
Dimensions provide the ‘who, what, where, when, why, and how’ context around the event taking place within the business process. These tables supply the descriptive attributes used within Power BI for filtering, slicing, and grouping the facts. Thinking back to the grain of a fact table, you can determine all possible dimensions that you will require.
The dimension table is where all the information in relation to the attribute can be denormalised due to there being a large number of columns. However, it would then contain fewer rows of data than the fact table. The fields inside of the dimension table are used to create reports or display query results. They are also generally a one-to-many relationship with the fact table.
Facts are the measurements that give us a result from the business process event and they are generally in numeric form. A single row in a fact table has a one-to-one relationship to an event described by the grain of the fact table. Therefore, a fact is a direct relation to an event that has taken place. An example of this is when a wholesaler sells a product, the quantity and its price are facts.
They also contain foreign keys to the dimensional tables, which is how they relate each row of data in the fact table to the attributes of a dimension. There are also composite keys, which are made up of a subset of other keys to allow many-to-many relationships with other tables and entities in the dimensional model.
Going through this four-step process will allow you to capture everything that you are going to need to not only produce a report but a report that will answer questions in relation to the business process. It also will allow you to keep the model and data as small as possible to help with performance as well as ease of use.