The CWM Data Mining metamodel consists of seven conceptual areas: a core Mining metamodel (upon which the other areas depend)
and metamodels representing the data mining subdomains of Clustering, Association Rules, Supervised, Classification, Approximation,
and Attribute Importance. Each area is represented by one of the metamodel packages shown in the diagram below.

[Diagram. Package DataMining (from Analysis) with its <<metamodel>> packages, including AssociationRules.]

Figure 12-1 CWM Data Mining Metamodel
Collectively, the Data Mining packages provide the abstractions needed to model generic representations of
data mining models (i.e., mathematical models produced or generated by the execution of data mining algorithms).
Included are representations of data mining tasks and models, other entities (such as the category matrix) that are
common across most data mining applications and tools, the relationships among these elements, and their mappings to
technical metadata.
The Mining Core package consists of common Data Mining abstractions that are fundamental to, and reused by, the major conceptual
areas. In particular, this package contains several basic packages that are required to implement the CWM Data Mining interfaces.
It is required that at least this package and one more Data Mining package be implemented for compliance. The packages forming
the Mining Core are shown in the next diagram.

[Diagram. <<metamodel>> package MiningCore (from DataMining).]

Figure 12-2 CWM Data Mining Metamodel: Mining Core Package
The following subsections describe the content of each component package of the MiningCore. They are followed
by subsections describing each of the major conceptual area packages.
12.2.2.1 Mining Function Settings

[Diagram. Classes: MiningFunctionSettings, MiningAlgorithmSettings, AttributeUsageSet (from MiningData), LogicalData (from MiningData). Association ends: settings, algorithmSettings, attributeUsageSet, logicalData.]
Figure 12-3 CWM Data Mining Metamodel: Mining Function Settings
This package defines the objects that contain parameters specific to mining functions. The separation of mining functions
from mining algorithms enables the user to specify the type of the desired result without being concerned with a particular
algorithm. The Mining Function Settings metamodel is illustrated above.
MiningFunctionSettings (MFS) is the superclass of all other function settings classes. An MFS instance references a set of
MiningAttributes, aggregated by a LogicalData instance. The AttributeUsageSet defines how each of the MiningAttributes will
be used by the mining algorithm.
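As a rough illustration of these relationships, the following Python sketch models an MFS that references a LogicalData instance and an AttributeUsageSet while leaving the algorithm unspecified. The class names mirror the metamodel, but the field layout, the usage labels ("active", "supplementary"), and the helper usage_for are assumptions made for this example and are not defined by CWM.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MiningAttribute:
    name: str


@dataclass
class LogicalData:
    # Aggregates the MiningAttributes that an MFS refers to.
    attributes: List[MiningAttribute] = field(default_factory=list)


@dataclass
class AttributeUsageSet:
    # Maps attribute name -> usage label (labels are illustrative only).
    usages: Dict[str, str] = field(default_factory=dict)


@dataclass
class MiningFunctionSettings:
    logical_data: LogicalData
    attribute_usage_set: AttributeUsageSet
    # Left as None when the user names a function but no particular algorithm.
    algorithm_settings: Optional[dict] = None

    def usage_for(self, attribute_name: str) -> Optional[str]:
        """Return the usage recorded for an attribute, or None if absent."""
        return self.attribute_usage_set.usages.get(attribute_name)


settings = MiningFunctionSettings(
    logical_data=LogicalData([MiningAttribute("age"), MiningAttribute("income")]),
    attribute_usage_set=AttributeUsageSet({"age": "active", "income": "supplementary"}),
)
```

Because algorithm_settings is optional in this sketch, the function settings can be created and inspected before any algorithm is chosen, which is the separation the package description emphasizes.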
12.2.2.2 Mining Model
[Diagram. Classes: MiningModel, SignatureAttribute, Class (from Core), Attribute (from Core), MiningAttribute (from MiningData), MiningFunctionSettings (from MiningFunctionSettings). Association ends: +model, +modelLocation, +modelSignature, +/owner, +/feature, +settings, +keyAttribute.]



Figure 12-4 CWM Data Mining Metamodel: Mining Model
This package defines the basic MiningModel from which all model objects inherit. A model is the result of a mining build task. The
Mining Model metamodel is illustrated above.
Each MiningModel has a signature that defines the characteristics of the data required by the model.
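One way to picture the role of the signature is as a checklist that input data must satisfy before the model can be used. The sketch below is only an illustration under stated assumptions: the data_type labels and the missing_inputs helper are hypothetical and do not appear in the metamodel.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SignatureAttribute:
    name: str
    data_type: str  # hypothetical type label, e.g. "numerical"


@dataclass
class MiningModel:
    name: str
    model_signature: List[SignatureAttribute]

    def missing_inputs(self, available: Dict[str, str]) -> List[str]:
        """List signature attributes that the data lacks or types differently."""
        return [
            attr.name
            for attr in self.model_signature
            if available.get(attr.name) != attr.data_type
        ]


churn_model = MiningModel(
    "churn",
    [SignatureAttribute("age", "numerical"), SignatureAttribute("plan", "categorical")],
)
```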
12.2.2.3 Mining Result
[Diagram. Classes: ModelElement, MiningResult.]

Figure 12-5 CWM Data Mining Metamodel: Mining Result
This package defines the basic MiningResult object from which all result objects inherit. A result is produced by a specific mining
task (other than build).
12.2.2.4 Mining Data
This package defines the objects that describe the input data, the way the input data is to be treated, and the mapping between
the input data and the internal representation that mining algorithms can understand.
PhysicalData effectively references an instance of a class or subclass (e.g., Table, file, etc.). This allows JDM to leverage
the various row/column format data representations expressible in CWM.
Mining Data metaclasses representing the concepts of physical data are illustrated in Figure 12-6. Logical data metaclasses
are illustrated in Figure 12-7. Attribute assignment and attribute usage metaclasses are illustrated in two subsequent diagrams
(Figure 12-8 and Figure 12-9, respectively). Finally, metaclasses used to model the matrix representation and taxonomy of mining
data are presented in Figure 12-10, Category Matrix, and Figure 12-11, Category Taxonomy, respectively.
[Diagram. Classes: ModelElement (from Core).]

Figure 12-6 CWM Data Mining Metamodel: Physical Data
Figure 12-6 illustrates those elements of the Mining Data metamodel used to model physical data, whereas the following diagram
shows those elements facilitating the logical modeling of data.
[Diagram. Classes: Class (from Core), Attribute (from Core), MiningAttribute, LogicalData, LogicalAttribute, NumericalAttributeProperties, CategoricalAttributeProperties, OrdinalAttributeProperties, CategoryTaxonomy. Association ends: /owner, /feature, logicalAttribute, numericalProperties, categoricalProperties, taxonomy, category. Constraint: {ordered}.]

Figure 12-7 CWM Data Mining Metamodel: Logical Data
Figure 12-7 contains objects that represent how the mining algorithm should logically interpret physical data.
A LogicalAttribute can be categorical, numerical, or both, depending on its usage. Categorical attributes that have ordered
category values are created as ordinal attributes.
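The following Python sketch shows one possible reading of this paragraph: a logical attribute carries optional numerical and categorical properties, and an ordinal attribute is a categorical one whose category list is ordered. Treating OrdinalAttributeProperties as a subclass of CategoricalAttributeProperties, the lower_bound and upper_bound fields, and the rank helper are all assumptions of this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NumericalAttributeProperties:
    lower_bound: Optional[float] = None  # hypothetical field
    upper_bound: Optional[float] = None  # hypothetical field


@dataclass
class CategoricalAttributeProperties:
    categories: List[str] = field(default_factory=list)


@dataclass
class OrdinalAttributeProperties(CategoricalAttributeProperties):
    """Categorical properties whose category list is treated as ordered."""

    def rank(self, category: str) -> int:
        # Position of the category in the ordered list (0-based).
        return self.categories.index(category)


@dataclass
class LogicalAttribute:
    name: str
    numerical_properties: Optional[NumericalAttributeProperties] = None
    categorical_properties: Optional[CategoricalAttributeProperties] = None

    @property
    def is_numerical(self) -> bool:
        return self.numerical_properties is not None

    @property
    def is_categorical(self) -> bool:
        return self.categorical_properties is not None

    @property
    def is_ordinal(self) -> bool:
        return isinstance(self.categorical_properties, OrdinalAttributeProperties)


shirt_size = LogicalAttribute(
    "shirt_size",
    categorical_properties=OrdinalAttributeProperties(["S", "M", "L"]),
)
# An attribute that is both numerical and categorical, depending on usage.
rating = LogicalAttribute(
    "rating",
    numerical_properties=NumericalAttributeProperties(1.0, 5.0),
    categorical_properties=CategoricalAttributeProperties(["1", "2", "3", "4", "5"]),
)
```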
12-8 Common Warehouse Metamodel, v1.1 March 2003
[Diagram. Classes: AttributeAssignmentSet, AttributeAssignment, MiningAttribute, DirectAttributeAssignment, PivotAttributeAssignment, ReversePivotAttributeAssignment, SetAttributeAssignment, Attribute (from Core). Association ends: set, assignment, logicalAttribute, attrAssignment, orderIdAttribute, directAttrAssignment, attribute, pivotAttrAssignment, nameAttribute, valueAttribute, setIdAttribute, setAttrAssignment, memberAttribute, reversePivotAttrAssignment, selectorAttribute. Constraint: {ordered}.]

Figure 12-8 CWM Data Mining Metamodel: Attribute Assignment
Figure 12-8 illustrates metaclasses that enable mapping physical data attributes to logical data mining attributes. The following
attribute assignments are supported:
• Direct assignment: A direct mapping between a mining attribute and a physical attribute.
• Pivot assignment: A mapping where the input data is in transactional format; each of the logical attributes occurring in a pivoted table is mapped to three physical columns, presumably the same three every time.
• Reverse pivot assignment: A mapping where the input data is in 2D format; the transformed input data contains set-valued attributes; the sets are represented by enumerating the set elements based on the selection function.
• Set assignment: A mapping between a set-valued mining attribute and a set of attributes in the physical data.
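To make the pivot case concrete, the sketch below turns transactional rows into one record per case. It assumes the three physical columns play the roles of a set identifier, an attribute name, and an attribute value, loosely following the setIdAttribute, nameAttribute, and valueAttribute labels in Figure 12-8; the function pivot_rows and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Tuple


def pivot_rows(rows: Iterable[Tuple[Any, str, Any]]) -> Dict[Any, Dict[str, Any]]:
    """Group (set_id, name, value) rows into one record per set_id."""
    records: Dict[Any, Dict[str, Any]] = defaultdict(dict)
    for set_id, name, value in rows:
        # Each logical attribute is identified by the value of the name column.
        records[set_id][name] = value
    return dict(records)


# Transactional input: every row uses the same three physical columns.
transactions = [
    (1, "age", 34),
    (1, "plan", "gold"),
    (2, "age", 51),
]
pivoted = pivot_rows(transactions)
```

A direct assignment, by contrast, would need no such grouping, since each mining attribute corresponds to exactly one physical column.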
March 2003 OMG-CWM, v1.1: Organization of the Data Mining Metamodel
[Diagram. Classes: Class (from Core), Feature (from Core), AttributeUsageSet. Association ends: /owner, /feature, attribute, usage.]




Figure 12-9 CWM Data Mining Metamodel: Attribute Usage
Figure 12-9 illustrates metaclasses that enable specification of how a mining attribute should be used, interpreted, or
preprocessed (e.g., mining value or outlier/invalid value treatment).

[Diagram. Classes: CategoryMatrix, CategoryMatrixObject, CategoryMatrixTable, CategoryMatrixEntry, Class (from Core), Attribute (from Core). Association ends: categoryMatrix, category, matrixTable, source, entry, categoryEntry, rowIndex, columnIndex, rowAttribute, columnAttribute, valueAttribute.]


Figure 12-10 CWM Data Mining Metamodel: Category Matrix
Figure 12-10 illustrates the metaclasses that generalize a complex object used to represent a cost matrix (a model build input)
or a confusion matrix (a model test result). Two representations are supported:
• Java objects (CategoryMatrixObject)
• Table based (CategoryMatrixTable)
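The two representations can carry the same information, as the following Python sketch suggests. It is an illustration only: the entry fields follow the rowIndex and columnIndex labels in Figure 12-10, while the functions matrix_from_entries and matrix_from_table, the column names, and the sample counts are assumptions of this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

Matrix = Dict[Tuple[str, str], float]


@dataclass(frozen=True)
class CategoryMatrixEntry:
    row_index: str
    column_index: str
    value: float


def matrix_from_entries(entries: Iterable[CategoryMatrixEntry]) -> Matrix:
    """Object form: each entry names its row category, column category, and value."""
    return {(e.row_index, e.column_index): e.value for e in entries}


def matrix_from_table(
    rows: List[dict], row_attribute: str, column_attribute: str, value_attribute: str
) -> Matrix:
    """Table form: three designated columns hold the row, column, and value."""
    return {(r[row_attribute], r[column_attribute]): r[value_attribute] for r in rows}


# A small confusion matrix expressed both ways (made-up counts).
object_form = [
    CategoryMatrixEntry("yes", "yes", 40.0),
    CategoryMatrixEntry("yes", "no", 10.0),
    CategoryMatrixEntry("no", "yes", 5.0),
    CategoryMatrixEntry("no", "no", 45.0),
]
table_form = [
    {"actual": "yes", "predicted": "yes", "n": 40.0},
    {"actual": "yes", "predicted": "no", "n": 10.0},
    {"actual": "no", "predicted": "yes", "n": 5.0},
    {"actual": "no", "predicted": "no", "n": 45.0},
]
```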
[Diagram. Classes: CategoryTaxonomy, CategoryMap, CategoryMapObject, CategoryMapObjectEntry, CategoryMapTable, Category, Class (from Core), Attribute (from Core). Association ends: taxonomy, categoryMap, mapObject, mapTable, table, entry, parent, child, parentAttribute, childAttribute, graphIdAttribute, rootCategory.]
Figure 12-11 CWM Data Mining Metamodel: Category Taxonomy
Figure 12-11 illustrates the metaclasses that enable representing a taxonomy as a directed acyclic graph (DAG). Two
representations are supported:
• Java objects (CategoryMapObject)
• Table-bound (CategoryMapTable)
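A taxonomy held as parent/child entries can be checked for the DAG property with a depth-first search. The sketch below assumes each entry is a (parent, child) pair of category names; the functions build_children and is_acyclic and the sample categories are hypothetical and not part of the metamodel.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_children(entries: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Index (parent, child) entries by parent category."""
    children: Dict[str, List[str]] = defaultdict(list)
    for parent, child in entries:
        children[parent].append(child)
    return dict(children)


def is_acyclic(children: Dict[str, List[str]]) -> bool:
    """Depth-first search; meeting a node that is still being visited means a cycle."""
    visiting: Set[str] = set()
    done: Set[str] = set()

    def visit(node: str) -> bool:
        if node in done:
            return True
        if node in visiting:
            return False
        visiting.add(node)
        ok = all(visit(child) for child in children.get(node, []))
        visiting.discard(node)
        done.add(node)
        return ok

    return all(visit(node) for node in list(children))


# A category may have several parents, which a tree could not express.
taxonomy = build_children([
    ("beverage", "coffee"),
    ("beverage", "tea"),
    ("hot drink", "coffee"),
    ("hot drink", "tea"),
])
```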
Mining Task
This package defines the objects that are related to mining tasks. A MiningTask object represents a specific mining operation
to be performed on a given data set (i.e., physical data).
Figure 12-12 illustrates the basic Mining Task metamodel.
[Diagram. Classes: Transformation (from Transformation), MiningTransformation, ModelElement (from Core), MiningTask, MiningModel (from MiningModel), PhysicalData (from MiningData), AttributeAssignmentSet (from MiningData). Association ends: transformation, procedure, miningTask, inputModel, inputData, modelAssignment.]
Figure 12-12 CWM Data Mining Metamodel: Mining Task
Figure 12-12 illustrates a MiningTask as referenced by a MiningTransformation. A MiningTask maps physical data to a model
signature (when applicable; for example, for lift, test, etc.) using the AttributeAssignmentSet.
[Diagram. Classes: MiningTask, MiningBuildTask, MiningModel (from MiningModel), MiningFunctionSettings (from MiningFunctionSettings), AttributeAssignmentSet (from MiningData). Association ends: buildTask, validationData, validationAssignment, resultModel, miningSettings, settingsAssignment.]
Figure 12-13 CWM Data Mining Metamodel: Mining Build Task
Model elements comprising the Mining Build Task are shown in Figure 12-13. The modeling of apply output and of the
application of a data mining model to (new) data is illustrated in Figure 12-14 and Figure 12-15, respectively.
[Diagram. Classes: MiningApplyOutput, MiningAttribute (from MiningData), ApplyOutputItem, ApplySourceItem, ApplyContentItem, ApplyProbabilityItem, ApplyScoreItem, ApplyRuleIdItem. Association ends: applyOutput, item. Constraint: {ordered}.]

Figure 12-14 CWM Data Mining Metamodel: Apply Output
Figure 12-14 illustrates metaclasses that enable defining the content of an Apply task. This includes source items (for
example, keys) and specific content of apply (data scoring using a model).
An apply output may contain multiple source and content items.
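The split between source items and content items can be sketched as a small class hierarchy. The class names come from Figure 12-14, but the inheritance shown here (score and probability items as kinds of content item), the source_column field, and the two filter methods are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ApplyOutputItem:
    name: str


@dataclass
class ApplySourceItem(ApplyOutputItem):
    source_column: str  # hypothetical: input column copied to the output, e.g. a key


@dataclass
class ApplyContentItem(ApplyOutputItem):
    """Output computed by applying the model."""


@dataclass
class ApplyScoreItem(ApplyContentItem):
    pass


@dataclass
class ApplyProbabilityItem(ApplyContentItem):
    pass


@dataclass
class MiningApplyOutput:
    # Kept as an ordered list, following the {ordered} constraint in the figure.
    items: List[ApplyOutputItem] = field(default_factory=list)

    def source_items(self) -> List[ApplySourceItem]:
        return [i for i in self.items if isinstance(i, ApplySourceItem)]

    def content_items(self) -> List[ApplyContentItem]:
        return [i for i in self.items if isinstance(i, ApplyContentItem)]


apply_output = MiningApplyOutput([
    ApplySourceItem("customer_id", source_column="CUST_ID"),
    ApplyScoreItem("score"),
    ApplyProbabilityItem("probability"),
])
```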
[Diagram. Classes: MiningTask, MiningApplyOutput, AttributeAssignmentSet (from MiningData).]
Figure 12-15 CWM Data Mining Metamodel: Mining Apply Task
Figure 12-15 illustrates metaclasses that allow specification of an apply task. The apply task requires a model, physical data,
apply output, and an attribute assignment set.
Entry Point
This package defines the top-level objects of the DataMining package, which can be used as entry points in application programming.
This is illustrated in Figure 12-16.
[Diagram. Classes: Package (from Core), Catalog, Schema, AuxiliaryObject, LogicalData (from MiningData), MiningResult (from MiningResult), CategoryMatrix (from MiningData), MiningModel (from MiningModel), MiningTask (from MiningTask), MiningFunctionSettings (from MiningFunctionSettings), CategoryTaxonomy (from MiningData), AttributeAssignmentSet (from MiningData). Association ends: catalog, schema, result, logicalData, categoryMatrix, auxObjects, auxiliaryObject, miningModel, task, miningFunctionSettings, taxonomy, attributeAssignmentSet. Multiplicities shown: 0..*.]

Figure 12-16 CWM Data Mining Metamodel: Entry Point
Clustering
This package contains the metamodel that represents clustering functions, models, and settings. The Clustering metamodel is
illustrated in Figure 12-17. It contains attribute usage and function settings subclasses that are specific to the Clustering function.
[Diagram. Classes and attributes:
AttributeUsage (from MiningData)
ClusteringAttributeUsage: attributeComparisonFunction : AttributeComparisonFunction; similarityScale : Double; / comparisonMatrix : CategoryMatrix
CategoryMatrix (from MiningData)
MiningFunctionSettings (from MiningFunctionSettings)
ClusteringFunctionSettings: maxNumberOfClusters : Integer; minClusterSize : Integer = 1; aggregationFunction : AggregationFunction
Association ends: attributeUsage, comparisonMatrix.]

Figure 12-17 CWM Data Mining Metamodel: Clustering
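The attributes listed for ClusteringFunctionSettings in Figure 12-17 can be mirrored in a small settings object, sketched below in Python. The default of 1 for the minimum cluster size comes from the figure. The validate method and its rules are assumptions of this example, and the aggregation function is left as an optional free-form label because its permitted values are not given in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClusteringFunctionSettings:
    max_number_of_clusters: int
    min_cluster_size: int = 1  # default value shown in Figure 12-17
    aggregation_function: Optional[str] = None  # enumeration values not listed here

    def validate(self) -> None:
        """Hypothetical sanity checks; CWM itself does not state these rules here."""
        if self.max_number_of_clusters < 1:
            raise ValueError("max_number_of_clusters must be at least 1")
        if self.min_cluster_size < 1:
            raise ValueError("min_cluster_size must be at least 1")


clustering_settings = ClusteringFunctionSettings(max_number_of_clusters=5)
clustering_settings.validate()
```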
Association Rules
This package contains the metamodel that represents the constructs for frequent itemset, association rules, and sequence algorithms.
The Association Rules metamodel is illustrated in Figure 12-18.
[Diagram. Classes: MiningFunctionSettings (from MiningFunctionSettings), FrequentItemSetFunctionSettings, AssociationRulesFunctionSettings, SequenceFunctionSettings, Category (from MiningData). Association ends: settings, exclusion.]

Figure 12-18 CWM Data Mining Metamodel: Association Rules
12.2.2.5 Supervised
This package contains the metamodel that represents the constructs for supervised learning algorithms. The Approximation,
Attribute Importance, and Classification packages must implement this package.
Figure 12-19 illustrates the Supervised metamodel. It contains test and lift tasks, test and lift results, and a common superclass
for supervised function settings.
[Diagram. Classes: MiningTask (from MiningTask), MiningResult (from MiningResult), MiningTestTask, MiningTestResult, LiftAnalysis, LiftAnalysisPoint, Category (from MiningData), MiningFunctionSettings (from MiningFunctionSettings), SupervisedFunctionSettings. Association ends: testTask, testResult, liftAnalysis, positiveTargetCategory, point.]

Figure 12-19 CWM Data Mining Metamodel: Supervised
Classification
This package contains the metamodel that represents classification functions, models, and settings.
[Diagram. Classes: SupervisedFunctionSettings (from Supervised), ClassificationFunctionSettings.]
Figure 12-20 CWM Data Mining Metamodel: Classification Function Settings
Figure 12-20 represents the model for Function Settings, while Figure 12-21 illustrates those model elements used to represent
Attribute Usage, which can include a prior probability specification. Figure 12-22 shows the portion of the Classification
metamodel that models Classification Test tasks, results, and apply output.
[Diagram. Classes: AttributeUsage (from MiningData), ClassificationAttributeUsage, PriorProbabilities, PriorProbabilitiesEntry, Category (from MiningData). Association ends: usage, priors, positiveCategory, targetValue, prior, priorsEntry. Multiplicities shown: 1..*.]



Figure 12-21 CWM Data Mining Metamodel: Classification Attribute Usage
[Diagram. Classes: MiningTestTask (from Supervised), MiningTestResult (from Supervised), ClassificationTestTask, ClassificationTestResult, ApplyOutputItem (from MiningTask), ApplyTargetValueItem, CategoryMatrix (from MiningData). Association ends: testTask, testResult, confusionMatrix.]
Figure 12-22 CWM Data Mining Metamodel: Classification Test and Result
Approximation
This package contains the metamodel that represents the constructs for approximation modeling (also known as regression).
The metamodel is shown in Figure 12-23.
[Diagram. Classes: MiningTestTask, MiningTestResult (from Supervised), SupervisedFunctionSettings (from Supervised), ApproximationFunctionSettings.]

Figure 12-23 CWM Data Mining Metamodel: Approximation
Attribute Importance
This package contains the metamodel that represents the constructs for attribute importance (also known as feature selection)
models. This metamodel is illustrated in Figure 12-24.
[Diagram. Classes: SupervisedFunctionSettings (from Supervised), AttributeImportanceSettings.]

Figure 12-24 CWM Data Mining Metamodel: Attribute Importance