Integrating Master Data Management and BI (part II)
Having discussed Master Data Management (MDM) in the first part of this article, the second part of this article looks at the question of how does MDM and business intelligence fit together.
How Do MDM and BI Fit Together?
The first thing to point out is that master data is represented as dimensions in BI Systems. Master data is not associated with facts (i.e. transactions) in BI systems. The introduction of a master data management system within the enterprise should have a positive impact on BI systems. As an example, it is typically the case in an MDM system that the attribute data names and data definitions used to describe master data entities are likely to be the standard data names and data definitions for the enterprise. These master data definitions are often referred to as a shared business vocabulary (SBV) for the enterprise. The SBV is therefore master metadata.We can take advantage of a master data SBV in a BI system to enforce the reuse of the same data definitions across all dimensional models, cubes and BI tool business views in order to drive consistency across dimensional data (Figure 8). Adopting a master data SBV in this way can only improve the understanding of the data presented in BI system reports, OLAP analyses, dashboards and scorecards. It also contributes to the demand for compliance and the perception of trusted BI.
In addition to consistent metadata, the arrival of an MDM system in the enterprise can also impact data integration in a BI system (commonly referred to as data warehousing or ETL processing). If you do not have a MDM system today then your BI system will likely be based on a classic data warehouse architecture whereby your master data is fractured across multiple data stores in different line of business operational systems. Therefore to create integrated dimensional data in a BI system, you typically make use of a data integration tool to integrate disparate master data maintained in multiple operational systems to build dimensions.In that sense it is also true to say that your BI system is probably the only place today where you consolidate and integrate master data (Figure 9).
Figure 9 – A Classic BI System Architecture
Therefore it is no wonder we get confused between the concept of a master data hub and a data warehouse when both are integrating master data. So it raises the question of why do you need an MDM system when you already have a data warehouse? For me this is the wrong question. The question should be, why are we doing master data integration in a BI system? Should the master data not already be in integrated and treated as a data source by the data integration tools used in a BI system? In effect this would be better in my opinion because master data has to be supplied to more than just a BI system. It needs to be supplied to operational and BI systems. Master data can be supplied to BI system data integration tools in at least three ways:
- By using integration and matching services of a SOA enabled data quality software to supply master data directly into a data warehouse or to ETL jobs populating data warehouses.
- By using Registry based MDM solution to create a master data virtual data source that can be accessed by a BI system data integration tool
- By using a built or bought MDM data hub as a persistent data source to a BI system data integration tool
DataFlux is an example of a vendor that can support the first of these options.Figure 10 shows the Registry MDM option. This can be supported by Purisma for example who supply a connector for Informatica PowerCenter whereby it can provision master data to the Informatica data integration tool for the creation of dimensional data. PowerCentre can apply Purisma determined Global IDs to cross-system customer data to create integrated customer data records in a BI system. Purisma also have a partnership with Business Objects as part of the Business Objects Enterprise Information Management (EIM) initiative to pass master data to Business Objects Data Integrator.Registry MDM systems could also provision master data changes to data integration tools for slowly changing dimensions processing.
To some extent the first two of the above options should result in some degree of simplification of classic ETL processing in BI systems since this work may not have to be handled in ETL jobs.Figure 11 shows how an MDM data hub can be used as a data source to BI systems. Doing this of course would significantly simplify the building of BI systems and improve the quality of dimensional data because in this case the source of the dimension data already contains high quality and integrated master data. There is a much larger impact of this approach however in that it actually means we are separating dimension data from fact (transaction) data. This could potentially mean that all you need to do is to combine master data with transaction data to generate data marts with conformed (common) dimensions from master data and transaction data sources. This is very like the approach taken by the Kalido MDM solution. In fact it could be said that MDM Systems like Kalido could potentially negate the need for the ‘one huge single database’ approach to an enterprise data warehouse if the data marts all use conformed dimensions. Let’s be clear what I mean here. In other words the enterprise warehouse becomes the collection of data marts (that could be deployed on a single server) all of which share dimension data. It is a subtle change in that the concept of enterprise warehouse is still alive and well but the creation of it is done by combining master data and transaction data to create a set of linked data marts that share dimension data.
MDM can also strengthen BI analysis and reporting. For example in the area of statutory and regulatory reporting needed for compliance reasons, consistent master data is needed to ensure reports are correct across one or more BI systems. Accurate master data also prevents reconciliation problems occurring across BI systems, cubes and reportsAlso if an MDM system supports central control of hierarchies, it means that changes to these hierarchies can be easily propagated across multiple dimensional models in multiple line of business BI and operational systems. MDM vendors like Hyperion, Stratature and Siperian all support master data hierarchy management including tracking change history to hierarchies. This means that multiple versions of hierarchies can be created. Therefore multiple versions of dimensions can be re-created for more accurate historical reporting and analysis. Also cubes can be automatically re-built with different versions of hierarchies for more accurate performance analysis.Figure 12 shows an example of Product master data. Here if a product that exists in 2005 was moved out of its 2005 product category into a different category in 2006 then changes to the Product master data hierarchy are needed. These changes need to be tracked and propagated to all necessary systems. With respect to BI, this allows us to correctly report on product sales and compare results from 2005 with results in 2006 by correctly accounting for the product category in 2005 and also in 2006 . MDM systems can therefore automate the re-creation of earlier versions of a product hierarchy for more accurate BI reporting.
In summary therefore it could be said that master data management strengthens DW/BI systems by
- Providing master metadata for use in dimensional data models and cubes
- Providing high quality master data as a trusted data source to ETL processing
- Providing federated views of master data across disparate systems for reporting
- Tracking versions of hierarchies over time
- Automate re-creation of different versions of a dimension in a cube or star schema to reflect changes in hierarchies
- Providing trusted data for reporting and analysis
MDM systems will feed master data changes to both operational and analytical systems. It is unlikely however that master data and DW dimensional data will be the same physical data instance because master data is typically normalised to support OLTP processing while dimensions are typically de-normalised to support dimensional analysis and to simplify reporting. Companies will likely move towards full enterprise master data management over a number of years and may not go the whole way to a single system of entry. One possible route if you are buying MDM solutions is to perhaps start with a registry based MDM system, then move to a data hub and finally Enterprise MDM. As for the question of building MDM systems or buying them, if your organisation has many bespoke point solutions then consider the purchase of a data hub MDM system moving towards Enterprise MDM over time. If your organisation has the majority of its core applications already using a common operational database, then it may be better to build a MDM system to reuse core business entity master data in that data store. This data store could then become a source to the data warehouse.