The limitations of the Fault Tolerant CORBA specification are given below.
Legacy ORBs
An unreplicated client hosted by a legacy ORB can invoke methods of a replicated server, supported by the Fault Tolerance
Infrastructure. The object group references generated for replicated servers can be used by legacy ORBs, although the full
benefits of fault-tolerant operation are not achieved for an unreplicated client. If a legacy ORB has been modified to understand
object group references and to retry requests at alternative destinations, the unreplicated client receives the benefits of
a higher, but still partial, level of fault tolerance. Special service contexts in the request and reply messages protect
an unreplicated client from a replicated server executing its requests multiple times when the client retries those requests
at alternative destinations.
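
The duplicate protection described above can be illustrated with a minimal sketch. It assumes that each request carries an identity that stays the same across retries; the names client_id and retention_id, and the classes Replica and ObjectGroup, are hypothetical and chosen only for illustration. The sketch does not model the actual service context encoding or any ORB API.

```python
class Replica:
    """Hypothetical server replica that logs replies by request identity."""

    def __init__(self):
        self.balance = 0
        self.reply_log = {}  # (client_id, retention_id) -> reply already sent

    def deliver(self, client_id, retention_id, amount):
        key = (client_id, retention_id)
        if key not in self.reply_log:
            # First time this request is seen: execute it and log the reply.
            self.balance += amount
            self.reply_log[key] = self.balance
        # A retried request returns the logged reply without re-executing.
        return self.reply_log[key]


class ObjectGroup:
    """Hypothetical object group that delivers each request to every member."""

    def __init__(self, size):
        self.replicas = [Replica() for _ in range(size)]

    def invoke(self, entry_point, client_id, retention_id, amount):
        replies = [r.deliver(client_id, retention_id, amount)
                   for r in self.replicas]
        # The client sees the reply of the destination it contacted.
        return replies[entry_point]


group = ObjectGroup(3)
first = group.invoke(0, "client-a", 1, 100)   # original request
retry = group.invoke(1, "client-a", 1, 100)   # retry at another destination
print(first, retry)  # both replies are 100; the deposit ran once per replica
```

Because the retry carries the same identity as the original request, every replica recognizes it and the deposit is not applied a second time.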
Common Infrastructure
All of the hosts within a fault tolerance domain must use ORBs from the same vendor and Fault Tolerance Infrastructures from
the same vendor to ensure interoperability and full fault tolerance within that domain. Consequently, the members of an object
group must be hosted by ORBs from the same vendor and Fault Tolerance Infrastructures from the same vendor. For clients and
servers in different fault tolerance domains, both using ORBs and Fault Tolerance Infrastructures from the same vendors, full
fault tolerance can be achieved. Otherwise, the specifications provide a useful improvement over no fault tolerance but substantially
less than full fault tolerance.
Deterministic Behavior
For the infrastructure-controlled Consistency Style, for both active and passive replication, deterministic behavior is required
of the application objects, and of the ORBs, to guarantee Strong Replica Consistency. The inputs to the replicas of an object
must be consistent (identical); this implies that request and reply messages must be delivered in the same order to each of
the replicas of an object. If sources of non-determinism exist, they must be filtered out: multi-threading in the application
or the ORB may have to be restricted, or transactional abort/rollback mechanisms may be used.
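
The ordering requirement can be shown with a small sketch. It assumes a toy replica whose state is a single integer and whose operations ("add" and "mul") are invented for this example. Replicas that apply the same requests in the same order reach the same state; a replica that receives them in a different order diverges.

```python
def apply_all(operations, initial=10):
    """Apply a sequence of (op, value) requests to a toy integer state."""
    state = initial
    for op, value in operations:
        if op == "add":
            state += value
        elif op == "mul":
            state *= value
        else:
            raise ValueError(f"unknown operation: {op}")
    return state


requests = [("add", 5), ("mul", 2)]

replica_a = apply_all(requests)                    # (10 + 5) * 2 = 30
replica_b = apply_all(requests)                    # same order, same state
replica_c = apply_all(list(reversed(requests)))    # 10 * 2 + 5 = 25

print(replica_a, replica_b, replica_c)  # 30 30 25
```

The third replica is deterministic too, yet it ends in a different state, which is why identical inputs alone are not enough: they must also arrive in the same order.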
Network Partitioning Faults
Network partitioning faults separate the hosts of the system into two or more sets, the hosts of each set being able to operate
and to communicate within that set but not with hosts of different sets. The current state of the art does not provide an
adequate solution to network partitioning faults. Thus, network partitioning faults are not addressed in this specification.
Commission Faults
A commission fault occurs when an object or host generates incorrect results. A Byzantine fault is a commission fault in which
an object or host generates incorrect results maliciously. Algorithms have been devised to detect and protect against a fairly
wide range of Byzantine faults, but they are complex and expensive in processing and communication. In the current state of the art,
Byzantine algorithms are seldom appropriate for fault tolerance but might be appropriate for security, to protect a system
after one or more of its hosts have been subverted by intruders. The specification provides an ACTIVE_WITH_VOTING Replication
Style. Voting itself is relatively inexpensive, but the communications infrastructure required to support voting properly
is substantially more expensive than that required to tolerate only crash faults.
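
To illustrate why the voting step itself is cheap, the following minimal sketch takes a strict majority over the replies collected from the replicas. The function name vote and the choice of a strict-majority rule are assumptions made for this example, not the mechanism defined by the specification. The sketch also leaves out the expensive part, namely delivering the same requests reliably and in the same order to every replica.

```python
from collections import Counter


def vote(replies):
    """Return the reply given by a strict majority of replicas, else None."""
    if not replies:
        raise ValueError("no replies to vote on")
    value, count = Counter(replies).most_common(1)[0]
    if count * 2 > len(replies):
        return value
    return None  # no strict majority: the fault cannot be masked


print(vote([42, 42, 7]))   # 42, the single incorrect reply is outvoted
print(vote([1, 2, 3]))     # None, no value has a majority
```

A vote of this kind masks an incorrect reply only while a majority of replicas remains correct.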
Correlated Faults
No protection is provided against design or programming faults, or other correlated faults, that cause the same errors in
all replicas of an object, in all ORBs, or in all hosts or their operating systems.