From Business Intelligence to Enterprise IT Architecture
A history of integration
It’s often forgotten that key drivers for data warehousing and later business intelligence (BI) were information consistency and integration. Users required consistent reports and answers to decision support questions so that different departments could give agreed answers to the CEO. IT wanted an integrated set of base data to avoid time-consuming and costly reworking when the CEO got conflicting answers. The first article describing a data warehouse architecture meeting these needs was published in 1988 in the IBM Systems Journal. It was based on internal work in IBM Europe over the previous three years.
Nearly 25 years later, more emphasis is placed on speed and flexibility of decision making, and sophisticated analytics. We see this in diverse technology trends from operational BI through data appliances to “post-relational” databases and stand-alone analytic and dashboard environments. Sometimes consistency and integration are assumed—“we have a data warehouse; that will take care of it.” As often as not, these requirements are conveniently forgotten as data marts and appliances are sourced directly from inconsistent operational systems, and of course, the resulting inconsistencies emerge later to haunt the development and maintenance teams.
Meanwhile, operational application developers are also in a transition phase. Service Oriented Architecture (SOA) is creating a new process-oriented, plug-and-play approach for operational applications initially, but for informational and collaborative applications in the longer term. As business users become used to the concept that they can (or should be able to) link together existing services into a workflow they need to do their job, the business will correctly begin to question the difference between these classes of function: Why can’t we plug an analysis step into the workflow to understand the likely impact of delaying this shipment? How do we link into the e-mail system to notify a customer automatically of an order fulfillment problem? Web / Enterprise 2.0 approaches are also dissolving the old boundaries between operational, informational and collaborative function by reframing user interactions in a looser and more user-directed social environment. And as these old (and artificial) functional boundaries break down, so too does our traditional division between operational, informational and collaborative information.
Call center applications, for example, show clearly the breadth of information required by modern business processes. Data from the data warehouse environment is needed to track the caller’s history of business with the enterprise. Complaints, warranties and other document-based information from e-mail and content management systems are also required to understand product history and customer behavior. Access to operational systems is needed to create or update orders or other records. All this information must be consistent across its different sources and integrated in a timely manner. The traditional approach has been to duplicate much or all of this information into a bespoke application optimized for the agents’ needs. However, SOA promotes a very different approach. Services built for other departments, such as Update Order from the sales department or Check Order History from finance, are brought together in a workflow along with some specific call center services to meet the agents’ needs. If the underlying multiple data sources for these various services are inconsistent, the call center will be unable to determine the actual, current situation of any caller with certainty and thus may act inappropriately.
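The risk can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the two in-memory dictionaries stand in for a warehouse copy and an operational system, and the functions update_order, check_order_history and call_center_workflow are invented names loosely echoing the services mentioned above, not part of any real product.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for two separately maintained data sources.
WAREHOUSE_ORDERS = {"A-100": "shipped"}    # informational copy, refreshed later
OPERATIONAL_ORDERS = {"A-100": "shipped"}  # operational system of record


def update_order(order_id: str, status: str) -> None:
    """Hypothetical sales service: writes only to the operational system."""
    OPERATIONAL_ORDERS[order_id] = status


def check_order_history(order_id: str) -> str:
    """Hypothetical finance service: reads only the warehouse copy."""
    return WAREHOUSE_ORDERS[order_id]


@dataclass
class CallerView:
    """What the agent sees after the workflow has run."""
    order_id: str
    history_status: str
    current_status: str

    @property
    def consistent(self) -> bool:
        return self.history_status == self.current_status


def call_center_workflow(order_id: str, new_status: str) -> CallerView:
    """Plug the two departmental services together into one workflow."""
    update_order(order_id, new_status)
    return CallerView(
        order_id=order_id,
        history_status=check_order_history(order_id),
        current_status=OPERATIONAL_ORDERS[order_id],
    )


view = call_center_workflow("A-100", "delayed")
print(view.history_status, view.current_status, view.consistent)
```

Because the two services read and write different copies of the same order, the agent sees one status in the history and another in the operational record. The workflow itself is trivial to assemble; the inconsistency comes entirely from the duplicated data underneath it.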
Today’s integration needs and a new architecture
Extrapolating to the more general SOA environment, the consequences are clear. SOA can only succeed if it is based on a fully consistent and integrated set of information upon which services act. And, unlike data warehousing, this information is not stored in a single database, but is distributed throughout the entire IT infrastructure. Furthermore, Web / Enterprise 2.0 make it clear that the nature of this integrated information is expanding from numerical and tabular data—hard information—to include a wide variety of more complex data forms such as web information, documents, audio and images—soft information. And consistency and integrity needs apply to all these information types as well.
As in the 1980s, when business demands for consistency and integrity in management information forced us to create a new data warehouse architecture for decision support, SOA and Enterprise 2.0 are compelling us to re-examine our overall enterprise IT architecture. The key question being posed is: how can we create a new base of integrated and consistent information for the entire enterprise?
The simpler the answer, the better the solution. If you want to create a consistent, integrated information resource, you must stop creating duplicates of existing information that have to be managed to consistency, and you must eliminate—or, at least, substantially reduce—existing data duplication.
The original data warehouse architecture showed the way. It proposed a logically single data store—the Business Data Warehouse—modeled at the enterprise level as the consistent and integrated source of all information for decision making. This simplicity was ultimately lost with the emergence of the layered architecture (with multiple data marts fed from an enterprise data warehouse), due to a combination of database performance and enterprise modeling issues.
Nonetheless, the approach remains valid for the current much-expanded needs for integration. First, model all the information according to an enterprise-level model and then implement as far as possible in alignment with that model. This is the approach proposed in a new architecture, Business Integrated Insight (BI2), which for the first time gathers all the information of the enterprise, hard and soft; operational, informational and collaborative, into a single component called the Business Information Resource (BIR). The BIR is fully modeled, although this will require extensions to current techniques to support the variety of types of data involved and particularly to allow dynamic modeling of soft information as it arrives into the enterprise.
Implementation requires more than general purpose relational database technology. New, higher performance columnar and in-memory databases can play a key role, provided workload management and other reliability, availability and scalability issues are addressed. Other database technologies aimed specifically at soft information are also needed; wholesale loading of such information into relational databases is not recommended. Data duplication cannot be entirely avoided—there is no single optimum technology for all processing needs. However, it can and must be closely managed and controlled in this approach. And metadata becomes part of the BIR, rather than a troublesome side issue.
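One way to picture managed duplication is a catalog that records, for each modeled item, its single master location and every permitted copy. The Python sketch below is only an illustration under assumed names: CatalogEntry, Copy, the max_age freshness rule and the store names are all hypothetical and are not taken from the BI2 architecture itself.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class Copy:
    """A registered duplicate of a modeled item (hypothetical structure)."""
    location: str
    refreshed_at: datetime


@dataclass
class CatalogEntry:
    """Metadata for one modeled item: its master and its managed copies."""
    name: str
    master: str
    max_age: timedelta  # assumed freshness rule for every copy
    copies: List[Copy] = field(default_factory=list)

    def stale_copies(self, now: datetime) -> List[str]:
        """Return the locations whose last refresh is older than max_age."""
        return [
            c.location for c in self.copies
            if now - c.refreshed_at > self.max_age
        ]


now = datetime(2012, 1, 2, 12, 0)
entry = CatalogEntry("customer", master="crm_db", max_age=timedelta(hours=24))
entry.copies.append(Copy("columnar_mart", datetime(2012, 1, 2, 6, 0)))
entry.copies.append(Copy("call_center_cache", datetime(2011, 12, 30, 6, 0)))

stale = entry.stale_copies(now)
print(stale)
```

The point of the sketch is that a copy exists only if the metadata knows about it, so the copy can be checked, refreshed or retired deliberately rather than discovered later as a source of conflicting answers.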
Modern business requires a much broader set of information than ever, and the joined-up and flexible nature of today’s business processes dictates that this information must be highly consistent and well integrated. Beginning from the first principles of data warehousing, we can see how to achieve these goals and expand our current business intelligence environment into a more comprehensive enterprise IT architecture that enables integrated insight into the entire process of running a business.