The amount of data in a given organization doubles every five years. Most organizations suffer from an overabundance of redundant
and inconsistent data that is difficult to manage, to access, and to use for decision making. Data warehousing
provides an approach for transforming such data into useful and reliable information that supports business decision
making and business intelligence. One of the most important aspects of data warehousing is metadata, which
is used for building, maintaining, managing, and using the data warehouse. Unfortunately, the proliferation of data management
and analysis tools has resulted in almost as many different representations and treatments of metadata as there are tools.
Because every data management and analysis tool requires different metadata and a different metadata model (known as a metamodel)
to address the data warehouse metadata problem, it is simply not possible to have a single metadata repository that implements
a single metamodel for all the metadata in an organization. Instead, what is needed is a standard for the interchange of warehouse
metadata.
The Common Warehouse Metamodel (CWM) is a response to these needs. It provides a framework for representing metadata about data
sources, data targets, transformations, and analysis, as well as the processes and operations that create and manage warehouse
data and provide lineage information about its use.
The CWM Metamodel consists of a number of sub-metamodels that represent common warehouse metadata in the following major areas
of interest to data warehousing and business intelligence (see Figure 3-1):
• Data Resources -- These include metamodels that represent object-oriented, relational, record, multidimensional, and XML data resources. For object-oriented data resources, CWM reuses the base Object Model.
• Data Analysis -- These include metamodels that represent data transformations, OLAP (On-Line Analytical Processing), data mining, information visualization, and business nomenclature.
• Warehouse Management -- These include metamodels that represent warehouse processes and results of warehouse operations.
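As a rough illustration, the three areas listed above can be written down as a lookup from area to sub-metamodel. The names `CWM_AREAS` and `area_of` are hypothetical and are not part of CWM; only the area and metamodel names are taken from the list above.

```python
# Hypothetical lookup table; the area and metamodel names come from the
# list above, but the dictionary itself is only an illustration.
CWM_AREAS = {
    "Data Resources": [
        "Object Model", "Relational", "Record", "Multidimensional", "XML",
    ],
    "Data Analysis": [
        "Transformation", "OLAP", "Data Mining",
        "Information Visualization", "Business Nomenclature",
    ],
    "Warehouse Management": ["Warehouse Process", "Warehouse Operation"],
}


def area_of(metamodel: str) -> str:
    """Return the area of interest that contains the given sub-metamodel."""
    for area, metamodels in CWM_AREAS.items():
        if metamodel in metamodels:
            return area
    raise KeyError(f"unknown sub-metamodel: {metamodel}")


print(area_of("OLAP"))  # Data Analysis
```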
Management | Warehouse Process | Warehouse Operation
Analysis | Transformation | OLAP | Data Mining | Information Visualization | Business Nomenclature
Resource | Object Model | Relational | Record | Multidimensional | XML
Foundation | Business Information | Data Types | Expression | Keys and Indexes | Type Mapping | Software Deployment
Object Model
Figure 3-1 CWM Metamodel
The CWM Metamodel is designed to maximize the reuse of the Object Model (a subset of UML) and the sharing of common modeling
constructs where possible. The most prominent example is that CWM depends on the Object Model to represent object-oriented data
resources. In addition, where applicable, key elements of the metamodels for other types of data resources are subclasses of
the same model elements in the Object Model, as shown in Table 3-1. (The entries listed under Software
System and Deployed Software System are examples.)
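The subclassing relationship described above can be sketched in a few lines of Python. This is a minimal illustration, not the CWM definitions: the `ModelElement` base class, the `describe` helper, and the sample names `SALES`, `ORDERS`, and `ORDER_ID` are assumptions made for the example. Only the pairing of Schema with Package, Table with Class, and Column with Attribute comes from Table 3-1.

```python
from typing import List


class ModelElement:
    """Hypothetical common base; stands in for a named metamodel element."""

    def __init__(self, name: str) -> None:
        self.name = name


# Object Model elements, used directly for object-oriented resources.
class Package(ModelElement):
    def __init__(self, name: str) -> None:
        super().__init__(name)
        self.classes: List["Class"] = []


class Class(ModelElement):
    def __init__(self, name: str) -> None:
        super().__init__(name)
        self.attributes: List["Attribute"] = []


class Attribute(ModelElement):
    pass


# Relational elements subclass the corresponding Object Model elements.
class Schema(Package):
    pass


class Table(Class):
    pass


class Column(Attribute):
    pass


def describe(package: Package) -> List[str]:
    """List qualified attribute names for any Package, whatever its resource type."""
    return [
        f"{package.name}.{cls.name}.{attr.name}"
        for cls in package.classes
        for attr in cls.attributes
    ]


# Hypothetical relational metadata, built from the relational subclasses.
sales = Schema("SALES")
orders = Table("ORDERS")
orders.attributes.append(Column("ORDER_ID"))
sales.classes.append(orders)

print(describe(sales))  # ['SALES.ORDERS.ORDER_ID']
```

Because `describe` is written against `Package`, `Class`, and `Attribute` only, the same function works unchanged for relational metadata. That kind of sharing is what the common Object Model elements are meant to allow.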
Table 3-1 CWM Data Resources
Data Resource | Software System | Deployed Software System | Package | Class | Attribute
Object Model | Java | Java installation | Package | Class | Attribute
Relational | DB2 UDB, Oracle 8i, Teradata | DB2 UDB, Oracle 8i, Teradata installations | Catalog/Schema | Table | Column