by David Marco

March 2003

Top 10 Questions to Ask/Mistakes to Avoid When Building a Meta Data Repository (Part I)

The key to your company’s prosperity is how well you gather, retain and disseminate knowledge. A meta data repository is the key to gathering, retaining, and disseminating knowledge.
Building a meta data repository is critical for accessing, maintaining, and controlling the vital information stored in our business intelligence and operational systems. While meta data has always been a central tenet of data warehousing, over the last couple of years it has been brought further into the spotlight because most Global 2000 companies now have some sort of business intelligence system in place, many for several years. The vast majority of these companies have had to struggle with the task of managing the exponential growth of these systems over time. Without meta data, the task of managing this growth becomes overly difficult and time-consuming. This need has driven many major software vendors like Microsoft, Computer Associates, and Oracle to enter the meta data marketplace with significant product offerings.

Meta Data Repository Implementation

At Enterprise Warehousing Solutions, we have multiple clients that have over 100 different IT systems (one of these clients has over 500 systems). Clearly there is a great deal of redundancy between these systems. More importantly, there is a tremendous cost associated with supporting these systems. The objective of these clients’ meta data repositories is to capture the data flows into and out of these systems. In addition, we will store the technical processing, types of data (customer, product, etc.), user groups, and data element systems of record that exist in each of these systems. With this meta data, these clients will be able to identify the redundant systems and then begin a migration plan to remove them. Also, we will make sure that the repository team reviews all new system proposals so that these companies do not start building new redundant systems.
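One way to picture how this kind of meta data can surface redundant systems is to compare the sets of data elements each system stores. The following is a minimal sketch, not the method used on these client projects: the system names, data elements, the `redundancy_candidates` helper, and the 0.8 overlap threshold are all hypothetical, and a real repository would also weigh data flows, user groups, and system of record.

```python
from itertools import combinations

# Hypothetical repository extract: system name -> data elements it stores.
SYSTEM_ELEMENTS = {
    "billing": {"customer_id", "customer_name", "invoice_total"},
    "crm": {"customer_id", "customer_name", "customer_email"},
    "legacy_crm": {"customer_id", "customer_name", "customer_email"},
    "inventory": {"product_id", "product_name", "stock_level"},
}


def jaccard(a, b):
    """Share of data elements two systems have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def redundancy_candidates(system_elements, threshold=0.8):
    """Return (system_a, system_b, overlap) for pairs at or above the threshold.

    The threshold is an assumed cut-off; a flagged pair is only a candidate
    for review, not proof that one system can be retired.
    """
    pairs = []
    for name_a, name_b in combinations(sorted(system_elements), 2):
        score = jaccard(system_elements[name_a], system_elements[name_b])
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    # Highest overlap first, so the strongest candidates are reviewed first.
    return sorted(pairs, key=lambda pair: -pair[2])


candidates = redundancy_candidates(SYSTEM_ELEMENTS)
for name_a, name_b, score in candidates:
    print(f"{name_a} and {name_b} share {score:.0%} of their data elements")
```

With the sample data above, only the two CRM systems are flagged, since they store identical data elements. The same check could be run against each new system proposal before it is approved.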

1. Not Defining The Tangible Business And Technical Objectives Of The Meta Data Repository
This is THE top mistake that most companies make. Quite often the meta data repository team will neglect to clearly define the specific business and technical value that their meta data repository will provide. These objectives are critical to define up front as they will guide all subsequent project activity. When selling the concept of meta data to your corporation’s senior management, there are only two things that they understand: Increasing Revenues or Decreasing Costs. If you are not talking about increasing revenues or decreasing expenses, you are the IT version of the school teacher on the old Peanuts cartoon: blah, blah, blah, blah, blah.

Figure 1: Sample Meta Data Repository Objectives
Clear business and technical objectives are definable and measurable. This activity is imperative because, once the meta data repository is completed, the management team will have to justify the cost expenditures of the initiative. Keep in mind that a meta data repository, like a data warehouse, is NOT a project; it is a process. The repository will need to grow to support the ever-expanding role of the data warehouse/data marts and operational systems that it supports. In addition, as business users become more sophisticated, their demands will substantially increase. Once a cost justification can be quantified for the initial release of the repository, the process for gaining funding for the follow-up releases is greatly simplified.

2. Examining Meta Data Tools Before Defining Requirements
It is surprising how often I receive calls from companies asking me to suggest a meta data tool for their repository project. My standard response is “what are your repository’s requirements?” Typically the reply from the other end of the line is silence. This situation is highly concerning. The meta data repository requirements must guide the tool selection process, not follow it. When a tool is selected before requirements are defined, quite often the requirements are forced to match the tool capabilities, as opposed to solving business problems.
As we discussed, clear requirements for the meta data project are critical as they provide the lighthouse for all subsequent project activities. Without this beacon, it becomes all too probable for the project’s course to go awry.

3. Selecting A Meta Data Tool Without Conducting An Evaluation
All of the major meta data vendor tools maintain and control the repository in a different manner. Finding the tool that best suits your company requires careful analysis. An educated consumer will be the most satisfied one because they understand exactly what they’re buying and what they’re not buying.
Remember that whichever tool is purchased, none of them will make meta data integration “easy,” regardless of the marketing materials or the salesperson’s hype. To be successful in your meta data project takes knowledge, discipline, talented employees, and good old-fashioned hard work, just like any other major IT endeavor. While none of the tools eliminates these needs, for some companies it is better to purchase a tool and work around its limitations, as opposed to building everything from scratch.

4. Not Creating A Meta Data Repository Team
Very often companies neglect to form a dedicated meta data repository team. Such a team should be responsible for maintaining, controlling, and providing access into the meta data repository. The typical meta data repository team at full staff will consist of 1 – 2 data modelers, 2 meta data integration developers, 2 meta data access developers, 1 – 2 business analysts, a meta data repository architect, and a project leader. Keep in mind that some of the roles can be fulfilled by the same resource, depending on the size and schedule of the effort.
It is important for the meta data repository project leader to report to the same person as the head of the business intelligence system. This creates a peer-level relationship between the meta data repository and the data warehouse team leaders. The meta data repository team and the business intelligence team must work together as their work directly impacts one another. Flawed or muddled data warehouse architecture will directly impact the quality of the meta data repository. Conversely, a poorly designed repository will greatly reduce the effectiveness of the data warehouse.

5. Having Too Many Manual Processes In The Meta Data Integration Architecture
The process for loading and maintaining the meta data repository needs to be as automated as possible. Less-than-successful meta data implementations typically contain far too many manual processes in their integration architectures. The task of manually keying in meta data becomes much too time-consuming for the meta data repository team. With careful analysis and some development effort, the vast majority of these manual processes can be removed.
Often much of the business meta data will require some sort of manual activity just to capture the information. Additional processes will most likely need to be developed to allow the business leaders and analysts to modify the business meta data. Unfortunately, some companies manually key in a great deal of their business meta data, which makes the repository non-scalable, stale and impossible to maintain over time.
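For technical meta data in particular, much of the manual keying can often be replaced by reading a source system's own catalog. The sketch below is only an illustration under stated assumptions: it uses Python's standard `sqlite3` module with in-memory databases standing in for a source system and the repository, and the `column_meta` table, the `harvest_columns` and `load_repository` helpers, and the "crm" system name are hypothetical. A production integration layer would read each platform's own catalog instead.

```python
import sqlite3


def harvest_columns(source):
    """Read one row of technical meta data per column from the source catalog."""
    rows = []
    tables = source.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        info = source.execute(f'PRAGMA table_info("{table}")')
        for _cid, column, col_type, notnull, _default, pk in info:
            rows.append((table, column, col_type, bool(notnull), bool(pk)))
    return rows


def load_repository(repo, system_name, rows):
    """Insert or refresh harvested rows in a hypothetical column_meta table."""
    repo.execute(
        """CREATE TABLE IF NOT EXISTS column_meta (
               system_name TEXT, table_name TEXT, column_name TEXT,
               data_type TEXT, is_required INTEGER, is_key INTEGER,
               PRIMARY KEY (system_name, table_name, column_name))"""
    )
    # INSERT OR REPLACE keeps reruns idempotent, so the load can be scheduled.
    repo.executemany(
        "INSERT OR REPLACE INTO column_meta VALUES (?, ?, ?, ?, ?, ?)",
        [(system_name, *row) for row in rows],
    )
    repo.commit()


# Stand-in source system and repository, both in memory for the example.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
repo = sqlite3.connect(":memory:")

harvested = harvest_columns(source)
load_repository(repo, "crm", harvested)
```

Because the load replaces existing rows rather than appending, it can run on a schedule without producing duplicates. Business meta data, as noted above, usually still needs a separate capture and maintenance process.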

To be continued next month…