Looking at the BI market place today, a lot of products and vendors are telling us that we don’t need data warehouses to create those flashy and easy-to-use reports and dashboards. Although from a technical point of view this statement could be valid, in the real world you are not only using the data warehouse to gather information in one location. In the real world we see that a data warehouse does much more than that.
Next to just loading the data to a central store, data warehouses also validate data during ETL. This can be done in various ways. At the basic level, data will be checked whether or not it has the correct format or whether the business keys are indeed unique. When a relation should exist, data warehouses validate this relation and verify whether all mandatory fields are filled out.
Besides basic validations, also more advanced data quality checks could be performed. Think about validations to make sure that pricing is set correctly. Combine customer or other information from several sources and make sure that this information is merged correctly and in valid way. So both checking the quality of loaded data as improving or rating the quality are features of data warehouses. These data quality checks and improvements will assist in getting coherent, correct and structured information.
Although data relations were briefly mentioned already in the previous paragraph, it is important enough to mention it again as a separate topic. Linking data correctly is the basic foundation for getting correct information and figures. If a relation is using wrong or non-existing codes, no warnings are given. Yet data will be missing in reports. For example: if you have sales records linked to a customer that is not part of the self-service solution, the sales figures will not be showing up on the reports or dashboard. As a user you will most likely not see that these records are missing, because the tools aren’t equipped to handle that kind of problems.
Having relations validated by performing lookups in the correct dimension tables will make sure that records are linked correctly. If the dimension information is missing but records are still linked, it most likely will turn out to be a dummy record. If all sales records are linked to existing dimensional information, all sales figures will be visualized in reports and dashboards.
After data quality has been checked and relational lookups are performed, we verify the consistency of our data warehouse. We log all data warehouse information to provide the principal users of the system with the necessary information on the loading and data quality. Delivering users with data quality reports enables them to take necessary actions to improve data at the source system, resulting in higher quality at the reporting side. Also, by showing quality issues to other users, they can work proactively on avoiding and fixing the problems. Giving key users a tool to work proactively on quality will give a better “feeling” to end users as opposed to when these users would need to inform the key users about problems with the data.
Next to having data available for analysis and standard reporting, a lot of other projects benefit from automated reports. Some examples that are only a small subset of the existing possibilities:
Looking at operational systems we see that some information is very fragmented. Getting that fragmented information into a model will result in a more difficult reporting model. One of the major advantages of creating a data warehouse is that you can actually convert a complicated source model into an easy to use reporting model that provides information at the correct level.
When looking back at the past, most will probably remember the excel hell where all business users were creating their own reports and giving identical names to different calculations. Next to reinventing the wheel over en over again, this could also lead to different implementations of that same wheel. Leading to different results. So, one of the main reasons for having a data warehouse is having one location where data is loaded consistently over time according to standardized processes. This gives you the same figures with the same co-notation in different reports.
As you can see, there are plenty of reasons to perform your self-service reporting on top of a data warehouse. Especially for data that is at the core of your organisation. This doesn’t mean of course that you can’t enrich data with external or your own created data. Extending the information will give added value and more insight.
© 2023 Kohera
Crafted by
© 2022 Kohera
Crafted by
Cookie | Duration | Description |
---|---|---|
ARRAffinity | session | ARRAffinity cookie is set by Azure app service, and allows the service to choose the right instance established by a user to deliver subsequent requests made by that user. |
ARRAffinitySameSite | session | This cookie is set by Windows Azure cloud, and is used for load balancing to make sure the visitor page requests are routed to the same server in any browsing session. |
cookielawinfo-checkbox-advertisement | 1 year | Set by the GDPR Cookie Consent plugin, this cookie records the user consent for the cookies in the "Advertisement" category. |
cookielawinfo-checkbox-analytics | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics". |
cookielawinfo-checkbox-functional | 11 months | The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional". |
cookielawinfo-checkbox-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-others | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other. |
cookielawinfo-checkbox-performance | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance". |
CookieLawInfoConsent | 1 year | CookieYes sets this cookie to record the default button state of the corresponding category and the status of CCPA. It works only in coordination with the primary cookie. |
elementor | never | The website's WordPress theme uses this cookie. It allows the website owner to implement or change the website's content in real-time. |
viewed_cookie_policy | 11 months | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
Cookie | Duration | Description |
---|---|---|
__cf_bm | 30 minutes | Cloudflare set the cookie to support Cloudflare Bot Management. |
pll_language | 1 year | Polylang sets this cookie to remember the language the user selects when returning to the website and get the language information when unavailable in another way. |
Cookie | Duration | Description |
---|---|---|
_ga | 1 year 1 month 4 days | Google Analytics sets this cookie to calculate visitor, session and campaign data and track site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognise unique visitors. |
_ga_* | 1 year 1 month 4 days | Google Analytics sets this cookie to store and count page views. |
_gat_gtag_UA_* | 1 minute | Google Analytics sets this cookie to store a unique user ID. |
_gid | 1 day | Google Analytics sets this cookie to store information on how visitors use a website while also creating an analytics report of the website's performance. Some of the collected data includes the number of visitors, their source, and the pages they visit anonymously. |
ai_session | 30 minutes | This is a unique anonymous session identifier cookie set by Microsoft Application Insights software to gather statistical usage and telemetry data for apps built on the Azure cloud platform. |
CONSENT | 2 years | YouTube sets this cookie via embedded YouTube videos and registers anonymous statistical data. |
vuid | 1 year 1 month 4 days | Vimeo installs this cookie to collect tracking information by setting a unique ID to embed videos on the website. |
Cookie | Duration | Description |
---|---|---|
ai_user | 1 year | Microsoft Azure sets this cookie as a unique user identifier cookie, enabling counting of the number of users accessing the application over time. |
VISITOR_INFO1_LIVE | 5 months 27 days | YouTube sets this cookie to measure bandwidth, determining whether the user gets the new or old player interface. |
YSC | session | Youtube sets this cookie to track the views of embedded videos on Youtube pages. |
yt-remote-connected-devices | never | YouTube sets this cookie to store the user's video preferences using embedded YouTube videos. |
yt-remote-device-id | never | YouTube sets this cookie to store the user's video preferences using embedded YouTube videos. |
yt.innertube::nextId | never | YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen. |
yt.innertube::requests | never | YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen. |
Cookie | Duration | Description |
---|---|---|
WFESessionId | session | No description available. |