Simple Implementations
The DAF is distinguished from many other database APIs in that it can be applied to very simple systems, which may not contain
a full-blown database management system. The intent is that people can create quick and simple implementations of the DAF to solve small-scale
integration problems, without prejudice to more elaborate implementations.
Accordingly, the DAF defines very few interfaces, and does not require implementations to manage large, dynamic populations
of CORBA objects. Most activity centers on the interface ResourceQueryService, which defines a small but sufficient set of
queries as methods. The queries defined by the DAF are simple enough to be implemented in any UMS database and many related
systems and applications.
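As a sketch of how small such an implementation can be, the class below serves DAF-style queries from a plain in-memory dictionary. ResourceQueryService is the interface named above, but the storage layout and the `get_values` signature here are illustrative assumptions, not the DAF IDL.

```python
# Minimal sketch of a ResourceQueryService implementation backed by an
# in-memory dictionary rather than a database management system.
# The storage layout and method signature are illustrative assumptions.

class ResourceQueryService:
    def __init__(self, records):
        # records: {resource_id: {attribute_uri: value}}
        self._records = records

    def get_values(self, resource_id, attribute_uris):
        # Return the requested attributes of one resource; attributes
        # the store does not hold come back as None.
        record = self._records.get(resource_id, {})
        return {uri: record.get(uri) for uri in attribute_uris}

svc = ResourceQueryService({
    "breaker-1": {"cim:Breaker.ratedCurrent": 600.0},
})
```

Even this toy service follows the pattern the DAF relies on: a small, fixed set of query methods over identified resources, with no object population to manage.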
High Performance Implementations
A UMS is a real-time system in the sense that it is used to make and execute operational decisions within strictly limited
time boundaries. The performance requirements mean that a typical UMS does not use a typical database management system, which
again leads to the need for the DAF.
To be effective in operational as well as off-line roles, the DAF must not introduce performance bottlenecks of its own. This
has influenced the design in several ways, listed below. The first two emerged from performance testing and optimization in
the prototype:
• Data Value Representation - The basic unit of data, from which query results are composed, is a union type: SimpleValue. SimpleValue exploits our knowledge of the basic data types needed and eliminates CORBA any from the highest bandwidth part of the interface. This can have a significant impact on performance when accumulated across large amounts of data.
• Support for Pre-joined Views - Join optimizations are particularly applicable to relational databases, whether real-time or general-purpose. Here views are often defined to flatten the schema, effectively pre-joining tables along the lines of anticipated queries. The DAF includes a query, get_descendent_values(), which gives implementations the opportunity to optimize this type of query.
• Query Results Granularity - While simple, DAF queries can return a large amount of data at once in the form of a ResourceDescriptionSequence. On the server side, this allows implementations to optimize data retrieval without the need for read-ahead schemes. On the client, it minimizes network latency without the need for caching schemes.
Data access patterns for UMS analysis software are well known, and are not friendly to caches. A typical analysis module reads a large amount of input data exactly once at the beginning of each analysis cycle. This input data is generally out-of-date by the next analysis cycle and therefore not amenable to caching.
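The SimpleValue idea described above, a discriminated union over a fixed set of basic types rather than a fully generic CORBA any, can be sketched as follows. The particular list of member kinds is an assumption for illustration, not the DAF IDL union.

```python
from dataclasses import dataclass

# Sketch of a SimpleValue-style discriminated union: the discriminator
# is restricted to a small, fixed set of basic types, avoiding the
# marshalling overhead of a fully generic "any". The exact list of
# kinds below is an illustrative assumption.

_KINDS = ("boolean", "long", "double", "string", "uri")

@dataclass(frozen=True)
class SimpleValue:
    kind: str      # discriminator, restricted to _KINDS
    value: object  # the payload for that discriminator

    def __post_init__(self):
        if self.kind not in _KINDS:
            raise ValueError(f"unsupported discriminator: {self.kind}")

rating = SimpleValue("double", 600.0)
```

Because the set of discriminators is closed, both ends of the interface can marshal values with a simple switch instead of general-purpose type introspection, which is where the bandwidth saving accumulates.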
Partial Schema
The EPRI CIM is a relatively large schema and a given DAF data provider (in the electric power domain) may not support all of
it. In an effort to enable partial implementations of the EPRI CIM, the EPRI CCAPI task force is in the process of defining
conformance blocks for specific application areas. A need for partial schema support is envisaged in the water and gas domains
as well. The DAF provides a method, get_resource_ids(), which clients can use to determine whether a given class or attribute
is supported by a given implementation. The same method will be used to determine whether a conformance block as a whole
is supported, once those blocks are defined.
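A client-side support check built on get_resource_ids() might look like the sketch below. The provider class is a stand-in for a real partial implementation, and the convention that unsupported URIs resolve to None is an assumption for illustration.

```python
# Sketch of capability probing: a client passes schema-element URIs to
# get_resource_ids() and treats unresolved entries as unsupported.
# The stand-in provider below models a partial schema implementation.

class PartialProvider:
    def __init__(self, supported_uris):
        self._ids = {uri: n for n, uri in enumerate(sorted(supported_uris), 1)}

    def get_resource_ids(self, uris):
        # Unsupported schema elements resolve to None rather than raising.
        return [self._ids.get(uri) for uri in uris]

def supported_elements(provider, uris):
    ids = provider.get_resource_ids(uris)
    return {uri for uri, rid in zip(uris, ids) if rid is not None}

provider = PartialProvider({"cim:Breaker", "cim:Breaker.ratedCurrent"})
```

Once conformance blocks are defined, the same probe would apply: a client resolves the URI naming the block and falls back gracefully if it is not recognized.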
• Attribute-level Decomposition - A partial schema may omit whole classes of data, but just as often it will omit some attributes
of a class and support others. This reflects the observation that different groups of attributes correspond to different functional
areas. Members of the EPRI CCAPI and OMG Utility Domain Task Forces have frequently referred to these functionally related
groups of attributes as aspects.
In a traditional object oriented design, aspects would be classes in their own right. However, the EPRI CIM defines no such
classes and it would be very difficult to keep such definitions up-to-date as new applications are discovered.
Accordingly, the DAF allows implementations to support or omit data at the level of individual attributes on individual objects.
This influences the design of the query results structures and, more fundamentally, the model of data that underlies the DAF.
• Federation of Data Providers - This specification anticipates that a number of data providers, each supporting part of an
overall schema, could be combined to create a complete system. This would permit independent developers to extend a UMS with
data providers as well as clients. However, this implies the need for system and configuration dependent logic that must direct
queries to the appropriate data providers and merge the results obtained. The DAF provides for a proxy data provider to hide
these details from both the clients and the ultimate data providers, ensuring that these components can be developed independently
of any given system.
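The proxy arrangement can be sketched as below: the proxy fans each query out to the partial providers and merges their attribute-level results, so neither clients nor the ultimate providers contain any routing logic. The provider objects and the `get_values` signature are assumptions for the sketch.

```python
# Sketch of a federating proxy data provider: each backend supports
# part of the schema, and the proxy merges their attribute-level
# answers so the client sees one complete data provider.

class PartialProvider:
    def __init__(self, data):
        self._data = data  # {resource_id: {attribute_uri: value}}

    def get_values(self, resource_id, attribute_uris):
        record = self._data.get(resource_id, {})
        return {uri: record.get(uri) for uri in attribute_uris}

class ProxyProvider:
    def __init__(self, providers):
        self._providers = providers

    def get_values(self, resource_id, attribute_uris):
        # First provider to supply a value for an attribute wins;
        # attributes no provider supports remain None.
        merged = {uri: None for uri in attribute_uris}
        for provider in self._providers:
            answers = provider.get_values(resource_id, attribute_uris)
            for uri, value in answers.items():
                if merged[uri] is None:
                    merged[uri] = value
        return merged

topology = PartialProvider({"breaker-1": {"cim:Breaker.normalOpen": False}})
ratings = PartialProvider({"breaker-1": {"cim:Breaker.ratedCurrent": 600.0}})
proxy = ProxyProvider([topology, ratings])
```

Because attribute-level decomposition already allows any provider to omit attributes, the proxy needs no special protocol: a partial answer from one backend is indistinguishable from a partial schema.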
Schema Access
A DAF implementation is not required to support access to metadata (i.e., schema information); however, it is not possible to formulate
queries without reference to at least some schema elements: classes, attributes, and relationships. Furthermore, many generic
applications not only reference schema information but also query it in detail. To take an example, an XML export facility
might determine what to export and how by querying what classes exist, what attributes belong to each, and what the attribute
types are.
• Identifying Schema Elements - This specification deals with schema at two levels. For simple clients and servers it is sufficient to identify the schema elements. The DAF employs Uniform Resource Identifier (URI) references for this purpose. A simple client is expected to know the URIs for the classes and attributes it wants to access. A mapping is provided between the EPRI CIM defined in UML and the URIs used in the DAF.
• Schema Versions - The DAF provides a way for clients and data providers to negotiate schema versions. All schema elements are designated by URI references that may include version identification. Mismatches between clients and data providers are exposed when the data provider does not recognize a given schema URI. Alternatively, a data provider can support multiple schema versions at once, an approach that is especially useful when the differences are slight.
• Schema and Meta-Model Extensions - The DAF provides a way for an existing or standard schema to be extended in a system-specific or proprietary way. In particular, because schema elements are named by URI references, additional elements can be introduced without name conflicts. Similarly, it is possible to extend the meta-model to provide richer schema information for those generic applications that need it.
• Querying Schema Information - More sophisticated clients may want to query the schema information in greater detail. This is accommodated without requiring every client and data provider to deal with the complexities. Experience with developing data providers indicates that the schema query capabilities add significant cost and complexity and may even dominate the overall implementation cost. The design rationale employed here is that implementations only pay for what they use.
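The version-negotiation bullet above can be sketched as follows. The URI layout, with the schema version embedded in the URI, is an assumption for illustration.

```python
# Sketch of version negotiation via versioned schema URIs: a provider
# that understands two schema versions maps both URIs for a class to
# the same internal id, while unrecognized URIs expose a mismatch as
# None. The URI scheme and version segments are assumptions.

SCHEMA_IDS = {
    "http://example.org/schema/cim09#Breaker": 1,
    "http://example.org/schema/cim10#Breaker": 1,  # same class, later version
    "http://example.org/schema/cim10#Switch": 2,
}

def get_resource_ids(uris):
    # A None entry tells the client that the provider does not
    # recognize that schema version (or that element at all).
    return [SCHEMA_IDS.get(uri) for uri in uris]
```

Supporting two versions of a lightly changed schema thus costs the provider only a few extra table entries, which is why the multi-version approach is attractive when the differences are slight.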
When available, schema information is provided by reflection through the same query interface that provides population data. A standard meta-model has been adopted as the basis of schema queries. As with cases of partial schema support described above, the specification allows for a proxy data provider to match clients to the ultimate data providers. The proxy may add or merge schema information in those cases where it is required by the client but is not available from the ultimate data provider.
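Schema access by reflection, where schema elements are themselves resources reachable through the ordinary query interface, might look like this sketch. The meta-model property names ("meta:type", "meta:attributes", "meta:valueType") are assumptions, not the standard meta-model.

```python
# Sketch of schema access by reflection: class and attribute
# definitions live in the same store as population data and are read
# through the same query call. The meta-model property names used
# here are illustrative assumptions.

STORE = {
    "cim:Breaker": {
        "meta:type": "Class",
        "meta:attributes": ["cim:Breaker.ratedCurrent"],
    },
    "cim:Breaker.ratedCurrent": {
        "meta:type": "Attribute",
        "meta:valueType": "double",
    },
}

def get_values(resource_uri, property_uris):
    record = STORE.get(resource_uri, {})
    return {uri: record.get(uri) for uri in property_uris}

# A generic client such as an XML exporter can walk the schema:
attrs = get_values("cim:Breaker", ["meta:attributes"])["meta:attributes"]
types = [get_values(a, ["meta:valueType"])["meta:valueType"] for a in attrs]
```

A simple data provider that skips schema support omits the schema resources from its store and pays nothing; a proxy can inject them later for clients that need them.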
Consistency with XML/CIM
The DAF shares both a model of data and a schema with the XML/CIM language. Thus the same interpretations of the CIM model
are made in both standards. Moreover, this common basis should ensure that these standards remain compatible in the future.