Establishing a structured approach to managing data integration (DI) projects can be challenging in large and growing companies. A company’s information assets can be vast, but all too often they are stored in application silos.
The goal is to integrate the software and wetware (the information and human components of the system) while maintaining and extending the organization’s existing IT investments.
Isn’t “maintaining and extending” the same as “preserve”?
Doesn’t the goal of preserving the investment blind us to the possibility that it should be abandoned?
In many ways, meta data sits in a different dimension than the other data warehouse data.
- The First M – Meta Data Must be Meaningful
- The Second M – Meta Data Should be Mature
- The Third M – Meta Data Should be Manageable
- The Fourth M – Meta Data Should be Maintainable
- The Fifth M – Meta Data Should be Migrateable
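The five qualities above can be made concrete with a small sketch of what a metadata record might carry. The field names and values here are hypothetical, chosen only to illustrate how each "M" could map onto a structure:

```python
from dataclasses import dataclass, asdict

@dataclass
class MetaDataEntry:
    """Hypothetical metadata record illustrating the five M's."""
    element: str              # what is described -- meaningful
    business_definition: str  # plain-language meaning -- meaningful
    owner: str                # named steward -- supports maturity
    source_system: str        # lineage -- supports manageability
    last_updated: str         # currency -- supports maintainability
    export_format: str = "json"  # neutral format -- supports migrateability

entry = MetaDataEntry(
    element="customer_lifetime_value",
    business_definition="Projected net revenue per customer over five years",
    owner="Finance data steward",
    source_system="CRM",
    last_updated="2005-02-25",
)
# A neutral, serializable form keeps the record migrateable across tools
print(asdict(entry)["owner"])
```

The point of the sketch is that each quality is carried by a concrete field, not left implicit in tribal knowledge.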
The extraction, transformation and loading (ETL) technologies were originally invented to load warehouses and data marts. What sort of non-business intelligence (non-BI) applications are beginning to be seen for ETL tools?
Looks like people finally caught up.
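The extract-transform-load pattern the paragraph refers to can be reduced to a minimal sketch. The source rows and the transformation rule here are invented purely for illustration:

```python
# Minimal ETL sketch: extract rows, transform them, load into a target store.

def extract():
    # Stand-in for reading from a source system
    return [{"id": 1, "amount": "19.95"}, {"id": 2, "amount": "5.00"}]

def transform(rows):
    # Normalize types so the target holds consistent data
    return [{"id": r["id"], "amount": float(r["amount"])} for r in rows]

def load(rows, target):
    # Stand-in for writing to a warehouse, mart, or any other target
    target.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(len(warehouse))  # 2
```

Nothing in this pipeline is specific to BI: swap the target from a warehouse to an operational system and the same three steps serve the non-BI uses the question asks about.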
There is absolutely nothing wrong with having more than one integration solution. The only thing that an enterprise needs to have is one integration strategy.
Meta data is all about data warehousing.
Most implementations of meta data target the data within a data warehouse. There is no reason that the implementation of meta data should be limited to a data warehouse. Depending on the scope of implementation, meta data can be maintained on organizational processes, business indicators and metrics. Enterprise-level meta data is more useful than a solution that is data warehouse centric.
Meta data should target data origination sources throughout the enterprise, such as transaction systems.
The recent “EII – Dead on Arrival” article by Andy Hayler, which appeared in DM Direct on February 25, 2005, lays out powerful arguments as to why enterprise information integration (EII) – the ability for a single business intelligence application to query inconsistent data source systems in order to answer business questions (what we’ll refer to as “querying on demand”) – will ultimately come up short as a general-purpose business intelligence strategy and replacement for data warehouses.
The computer industry loves its buzzwords, and one that has cropped up in recent years is enterprise information integration. The idea is that everyone knows that companies have their data locked up in multiple, incompatible IT systems: ERP, CRM, supply chain, etc. At present, the only way to make sense of it is to extract data from these systems, try to resolve inconsistencies and data quality issues, and then load the result into a data warehouse from which you can report on the data in a common form.
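The querying-on-demand idea can be sketched as a federated view that reconciles inconsistencies at read time rather than at load time. The two source systems and the name-normalization rule below are hypothetical:

```python
# Query-on-demand sketch: federate two inconsistent sources at read time.
erp = [{"cust": "ACME Corp", "revenue": 100}]
crm = [{"customer_name": "Acme", "open_tickets": 3}]

def normalize(name):
    # Reconcile naming inconsistencies on the fly (invented rule)
    return name.lower().replace(" corp", "").strip()

def federated_view():
    # Built fresh on every query -- no warehouse load step
    view = {}
    for r in erp:
        view.setdefault(normalize(r["cust"]), {}).update(revenue=r["revenue"])
    for r in crm:
        view.setdefault(normalize(r["customer_name"]), {}).update(tickets=r["open_tickets"])
    return view

print(federated_view()["acme"])
```

The sketch also hints at Hayler’s objection: every query pays the reconciliation cost again, and the normalization rules grow as brittle as the sources are inconsistent.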
Once integrated, these unified customer views provide the entire organization with the ability to drive meaningful business action within and across operational systems. While building and managing a unified customer view – across disparate data sources, applications and channels – has often proved to be a complex and costly exercise, a neutral, meta data driven, rules-based approach to building an open customer hub can make the process much easier.
If we “re-termed” CDI as Person Data Integration and followed a similar methodology, we might gain insight into the problems of merging student data (as well as faculty or staff data).
The same methodology could be applied to things as well – courses, programs, colleges?, sponsored projects?
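The rules-based matching at the heart of a customer (or person) hub can be sketched with a simple rule: match on an exact identifier when one exists, otherwise fall back to fuzzy name comparison. The rule, threshold, and records below are invented examples:

```python
# Sketch of a rules-based match for building a unified person view.
from difflib import SequenceMatcher

def records_match(a, b):
    # Rule 1: exact match on a strong identifier (here, email)
    if a.get("email") and a.get("email") == b.get("email"):
        return True
    # Rule 2: fall back to fuzzy name similarity (threshold is arbitrary)
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio > 0.85

student = {"name": "Jane Q. Doe", "email": "jqd@example.edu"}
payroll = {"name": "Doe, Jane", "email": "jqd@example.edu"}
print(records_match(student, payroll))  # True
```

Because the rules live outside any one application, the same function works whether the records describe customers, students, or courses – which is the point of the generalization above.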
All of these applications have one thing in common, and maintaining them requires finding that common thread: each function uses some form of hierarchy to logically organize its data. Hierarchies may include the corporate chart of accounts, product catalogs, legal organization structures and customer groupings. The challenge for the support organizations then becomes how to keep these hierarchies synchronized across organizational and application boundaries. This ensures everyone has the required views of the data while maintaining an environment that provides access to one common set of data across all of the disparate systems and functions.
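One way to keep such hierarchies synchronized is to hold a single master hierarchy and derive each application's view from it, rather than letting each system maintain its own copy. A minimal sketch, with an invented chart-of-accounts fragment:

```python
# Sketch: one master hierarchy; per-application views are derived from it.
master = {
    "Revenue": ["Product Sales", "Services"],
    "Product Sales": ["Hardware", "Software"],
}

def flatten(node, tree, depth=0):
    # Depth-first walk yielding (depth, node) pairs; any consuming
    # application renders its own view from this one source of truth.
    yield depth, node
    for child in tree.get(node, []):
        yield from flatten(child, tree, depth + 1)

for depth, name in flatten("Revenue", master):
    print("  " * depth + name)
```

A change made once in `master` then propagates to every derived view, which is exactly the synchronization guarantee the paragraph asks for.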