This Design Tip reflects on the remarkable durability of the basic Extract-Transform-Load paradigm, while at the same time recognizing some profound changes that must be addressed. These changes are due to new data demands, new classes of users, and new technology opportunities.
Category: Data Warehouse
One of the most effective tools for managing data quality and data governance, as well as giving business users confidence in the data warehouse results, is the audit dimension. We often attach an audit dimension to every fact table so that business users can choose to illuminate the provenance and confidence in their queries and reports. Simply put, the audit dimension elevates metadata to the status of ordinary data and makes this metadata available at the top level of any BI tool user interface.
via Design Tip #164 Have You Built Your Audit Dimension Yet? – Kimball Group.
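To make the idea concrete for myself, here is a minimal sketch of an audit dimension hanging off a fact table, using an in-memory SQLite database. The table names, column names, and sample rows are all hypothetical and not taken from the Design Tip; the point is only that audit metadata becomes something you can join and group by like any other dimension.

```python
import sqlite3

# Hypothetical schema: one audit row per distinct load/quality condition,
# referenced from every fact row through audit_key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE audit_dim (
    audit_key     INTEGER PRIMARY KEY,
    etl_batch     TEXT,     -- which load produced the fact rows
    quality_flag  TEXT,     -- e.g. 'clean' or 'estimated'
    out_of_bounds INTEGER   -- 1 if a value failed a range check
);
CREATE TABLE sales_fact (
    date_key    INTEGER,
    product_key INTEGER,
    audit_key   INTEGER REFERENCES audit_dim(audit_key),
    amount      REAL
);
""")

# Made-up sample data for illustration only.
conn.executemany("INSERT INTO audit_dim VALUES (?, ?, ?, ?)", [
    (1, "nightly-2012-03-01", "clean", 0),
    (2, "nightly-2012-03-01", "estimated", 1),
])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", [
    (20120301, 10, 1, 100.0),
    (20120301, 11, 1, 50.0),
    (20120301, 12, 2, 25.0),
])

def sales_by_quality(connection):
    """Total sales broken out by the audit dimension's quality flag."""
    rows = connection.execute("""
        SELECT a.quality_flag, SUM(f.amount)
        FROM sales_fact AS f
        JOIN audit_dim AS a ON a.audit_key = f.audit_key
        GROUP BY a.quality_flag
        ORDER BY a.quality_flag
    """).fetchall()
    return dict(rows)

print(sales_by_quality(conn))
```

A report can then show the same total with and without the "estimated" rows simply by adding or dropping the quality flag as a grouping column.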
Ralph Kimball introduced the data warehouse/business intelligence industry to dimensional modeling in 1996 with his seminal book, The Data Warehouse Toolkit. Since then, the Kimball Group has extended the portfolio of best practices. Drawn from The Data Warehouse Toolkit, Third Edition, the “official” Kimball dimensional modeling techniques are described on the following links and attached .pdf:
Large Binocular Telescope brings the Universe into Sharper Focus
via LBT Press Release March 2012 Images and Captions.
I got the chance to help these folks with some database work. Talk about “big data”.
DBAs facing the problem of corporate data explosion have an excellent new tool to help them in the MySQL 5.0 Archive storage engine. Whether it’s a data warehousing, data archiving, or data auditing situation, MySQL Archive tables can be just what the doctor ordered when it comes to maintaining large amounts of standard or sensitive information, while keeping storage costs at a bare-bones minimum.
via MySQL :: The MySQL 5.0 Archive Storage Engine.
Should probably investigate using ARCHIVE storage for the multi-year history tables.
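A rough starting point for that investigation: a small helper that generates the DDL for an ARCHIVE table. The helper name, table name, and columns are hypothetical; the only MySQL-specific part is the `ENGINE=ARCHIVE` clause. ARCHIVE tables accept INSERT and SELECT but not UPDATE or DELETE, which fits append-only history, and the sketch deliberately defines no keys because the engine has almost no index support.

```python
def archive_table_ddl(table, columns):
    """Build a CREATE TABLE statement that uses the ARCHIVE storage engine.

    `columns` is a list of (name, sql_type) pairs. No keys or indexes are
    emitted; this is an illustrative sketch, not a complete schema tool.
    """
    column_lines = ",\n".join(
        f"    `{name}` {sql_type}" for name, sql_type in columns
    )
    return f"CREATE TABLE `{table}` (\n{column_lines}\n) ENGINE=ARCHIVE;"

# Hypothetical multi-year history table.
ddl = archive_table_ddl(
    "observation_history",
    [
        ("observation_id", "BIGINT NOT NULL"),
        ("taken_at", "DATETIME NOT NULL"),
        ("payload", "TEXT"),
    ],
)
print(ddl)
```

Whether the storage savings are worth giving up UPDATE, DELETE, and indexed lookups would need testing against the real history tables.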
Three ETL Compromises to Avoid
Whether you are developing a new dimensional data warehouse or replacing an existing environment, the ETL (extract, transform, load) implementation effort is inevitably on the critical path. Difficult data sources, unclear requirements, data quality problems, changing scope, and other unforeseen problems often conspire to put the squeeze on the ETL development team. It simply may not be possible to fully deliver on the project team’s original commitments; compromises will need to be made. In the end, these compromises, if not carefully considered, may create long-term headaches.
via IntelligentEnterprise : Kimball University: Three ETL Compromises to Avoid (printable version).
What does it take to develop a robust dimensional model? Here's how to get from requirements-gathering to final approval in a process that will ferret out the good, bad and ugly realities of your source data and help you avoid surprises, delays and cost overruns.
via IntelligentEnterprise : Kimball University: Practical Steps for Designing a Dimensional Model (printable version).
How do you deal with changing dimensions? Hybrid approaches fill gaps left by the three fundamental techniques.
via IntelligentEnterprise : Slowly Changing Dimensions Are Not Always as Easy as 1, 2, 3 (printable version).
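For reference, the three fundamental techniques are Type 1 (overwrite the attribute), Type 2 (add a new dimension row), and Type 3 (add a column for the prior value). Below is a minimal sketch of Type 2 only, not of the hybrid approaches the article covers. The `CustomerRow` class, its fields, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

@dataclass
class CustomerRow:
    surrogate_key: int
    customer_id: str            # natural (business) key
    city: str                   # the tracked attribute
    effective_from: date
    effective_to: Optional[date] = None
    is_current: bool = True

def apply_type2_change(rows: List[CustomerRow], customer_id: str,
                       new_city: str, change_date: date) -> CustomerRow:
    """Type 2: expire the current row and append a new one with a new key."""
    current = next(
        (r for r in rows if r.customer_id == customer_id and r.is_current),
        None,
    )
    if current is not None and current.city == new_city:
        return current          # nothing changed, keep the existing row
    if current is not None:
        current.effective_to = change_date
        current.is_current = False
    new_row = CustomerRow(
        surrogate_key=max((r.surrogate_key for r in rows), default=0) + 1,
        customer_id=customer_id,
        city=new_city,
        effective_from=change_date,
    )
    rows.append(new_row)
    return new_row

# Made-up history: the customer moves once, then a no-op update arrives.
dim: List[CustomerRow] = []
apply_type2_change(dim, "C1", "Tucson", date(2010, 1, 1))
apply_type2_change(dim, "C1", "Phoenix", date(2012, 3, 1))
apply_type2_change(dim, "C1", "Phoenix", date(2012, 6, 1))
for row in dim:
    print(row)
```

Old fact rows keep pointing at the expired surrogate key, so history is preserved; a Type 1 change would instead overwrite `city` in place and lose it.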
Follow the rules to ensure granular data, flexibility and a future-proofed information resource. Break the rules and you'll confuse users and run into data warehousing brick walls.
via IntelligentEnterprise : Kimball University: The 10 Essential Rules of Dimensional Modeling (printable version).
This article describes six key decisions that must be made while crafting the ETL architecture. These decisions have significant impacts on the upfront and ongoing cost and complexity of the ETL solution and, ultimately, on the success of the overall BI/DW solution. Read on for Kimball Group’s advice on making the right choices.
via IntelligentEnterprise : Kimball University: Six Key Decisions for ETL Architectures.