23.2.8 Transparent Reinvocation

This section defines mechanisms that provide transparent reinvocation of methods contained in request messages. The mechanisms handle failure of the primary member of a server object group that has the COLD_PASSIVE or WARM_PASSIVE Replication Styles and provide redirection of the client’s outstanding request to a backup server. In the absence of such mechanisms, the failure of the primary server could cause a client’s request to be executed two (or more) times, once by the original primary and once by a backup that became the new primary, without the client or the server being aware of the repetition, possibly producing erroneous results.

These specifications do not change the current at-most-once invocation semantics of the CORBA object model. At the level of the application, a client makes a request once only and that request is executed at most once. At the transport level, however, a fault-tolerant client ORB can transparently retransmit a request message to a fault-tolerant server, to mask faults including both object and link faults, thus providing higher reliability. Transparent reinvocation is permitted only under the completion status and system exception conditions listed in Table 23-1 on page 23-22, and provided that both the IOP profile used for the existing request and the IOP profile used for the reinvocation contain a TAG_FT_GROUP component. Both the existing request message and the reinvocation request message must contain an FT_REQUEST service context. Neither the client application nor the server application is aware of such retransmissions. The server application executes the request at most once with no special application programming to handle repeated requests, and the client application receives its reply with no special application programming to handle exceptions. (For replicated clients communicating with replicated servers, use of a multicast group communication protocol may be appropriate because such a protocol provides stronger acknowledgment and retransmission mechanisms.)

The mechanisms defined here consist of the FT_REQUEST service context, which a client may include in its request messages, and the Request Duration Policy.

23.2.8.1 FT_REQUEST Service Context

The FTRequestServiceContext is used to ensure that a request is not executed more than once under fault conditions. When encoded in a request or reply message header, the context_data component of the ServiceContext struct shall contain a CDR encapsulation of the FTRequestServiceContext struct, which is defined below.

module IOP {
    const ServiceId FT_REQUEST = 13;
};

module FT {
    struct FTRequestServiceContext { // context_id = FT_REQUEST;
        string client_id;
        long retention_id;
        TimeBase::TimeT expiration_time;
    };
};

The FT_REQUEST service context contains a unique client_id for the client, a retention_id for the request, and an expiration_time for the request. The client_id and retention_id serve as a unique identifier for the client’s request and allow the server ORB to recognize that the request is a repetition of a previous request. If the request is a repetition of a previous request that the server has already executed, the server (which may be a new primary) does not re-execute the request but rather returns the reply that was generated by the prior execution (possibly by a previous primary that failed). The expiration_time serves as a garbage collection mechanism. It provides a lower bound on the time until which the server must honor the request and, therefore, retain the request and corresponding reply (if any) in its log.
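To make the encoding concrete, the following is a minimal Python sketch of how the three fields might be laid out as a CDR encapsulation. It assumes the general CDR rules (a leading byte-order octet, alignment measured from the start of the encapsulation, strings carried as a length-prefixed, NUL-terminated octet run) and that TimeBase::TimeT is an unsigned long long. The function name encode_ft_request_context and the choice of ISO 8859-1 for the string are illustrative assumptions, not part of this specification.

```python
import struct


def encode_ft_request_context(client_id: str, retention_id: int,
                              expiration_time: int,
                              little_endian: bool = True) -> bytes:
    """Hypothetical helper: CDR encapsulation of FTRequestServiceContext."""
    order = "<" if little_endian else ">"
    buf = bytearray()
    # Octet 0 of an encapsulation is the byte-order flag (1 = little-endian).
    buf.append(1 if little_endian else 0)

    def align(boundary: int) -> None:
        # Pad with zero octets up to the next multiple of `boundary`,
        # counted from the start of the encapsulation.
        buf.extend(b"\x00" * (-len(buf) % boundary))

    # string client_id: unsigned long length (including the NUL), then octets.
    # Assumption: ISO 8859-1 is the character set in use.
    data = client_id.encode("iso-8859-1") + b"\x00"
    align(4)
    buf += struct.pack(order + "I", len(data))
    buf += data

    # long retention_id: 4 octets, signed, 4-octet aligned.
    align(4)
    buf += struct.pack(order + "i", retention_id)

    # TimeBase::TimeT expiration_time: 8 octets, unsigned, 8-octet aligned.
    align(8)
    buf += struct.pack(order + "Q", expiration_time)
    return bytes(buf)
```

The returned octets would form the context_data of a ServiceContext whose context_id is FT_REQUEST.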

const ServiceId FT_REQUEST = 13;

A constant that designates the FT_REQUEST service context.

struct FTRequestServiceContext { // context_id = FT_REQUEST;
    string client_id;
    long retention_id;
    TimeBase::TimeT expiration_time;
};

A structure that contains the client identifier, retention identifier, and the expiration time of the request. Each repetition of a request must carry the same client_id, retention_id, and expiration_time as the original request. These fields are defined as follows:

• The client_id uniquely identifies the client, so that repeated requests from the same client can be recognized. No mechanisms are defined for generating this unique identifier.

• The retention_id uniquely identifies the request within the scope of the client and the expiration_time. The client ORB can reuse the retention_id provided that it guarantees uniqueness.

• The expiration_time defines a lower bound on the time when the request will expire. Typically, the expiration_time is obtained by adding the request_duration_policy_value, defined by the Request Duration Policy, to the local clock value of the client ORB.

If a server is unable to support the expiration_time, it may throw an INVALID_POLICY exception. Otherwise, the server must retain each request and its reply until the time (at the server) defined by the expiration_time. Until that time, the server must recognize requests that are repetitions of requests that have already been executed, and must return the reply to the original request rather than reinvoking the method. After that time, the server must return either the reply to the original request or a BAD_CONTEXT exception, but all replicas of the server must make the same decision about which reply to return so that the client receives only one reply.
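The retention rule above can be illustrated with a small in-memory reply log. This is a sketch only: the names ReplyLog, handle, collect_garbage, and BadContext are hypothetical, times are plain integers in the server's clock, and the requirement that all replicas reach the same decision after expiration is not modeled.

```python
from typing import Any, Callable, Dict, Tuple


class BadContext(Exception):
    """Stand-in for the CORBA BAD_CONTEXT system exception."""


class ReplyLog:
    """Hypothetical server-side log keyed by (client_id, retention_id)."""

    def __init__(self) -> None:
        # key -> (expiration_time, logged reply)
        self._entries: Dict[Tuple[str, int], Tuple[int, Any]] = {}

    def handle(self, client_id: str, retention_id: int, expiration_time: int,
               now: int, execute: Callable[[], Any]) -> Any:
        key = (client_id, retention_id)
        if key in self._entries:
            # Repetition of an executed request: return the logged reply
            # instead of invoking the method again.
            return self._entries[key][1]
        if now > expiration_time:
            # Not in the log and already expired: the server can no longer
            # tell whether the request ran, so it must not execute it.
            raise BadContext("request expired")
        reply = execute()
        # Log the reply before it is returned to the client.
        self._entries[key] = (expiration_time, reply)
        return reply

    def collect_garbage(self, now: int) -> int:
        """Drop entries whose expiration_time has passed; return the count."""
        expired = [k for k, (exp, _) in self._entries.items() if exp < now]
        for k in expired:
            del self._entries[k]
        return len(expired)
```

In this sketch a logged reply is still returned after its expiration_time for as long as the entry has not been collected, which is one of the two outcomes the text permits.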

The client ORB that has issued the request may reissue the request to the same or a different member of the server object group, but must use the FT_REQUEST service context with the same client_id, retention_id, and expiration_time as it used in its original request.

Before the server returns the reply for a request to the client, the Fault Tolerance Infrastructure must log the request and the reply. A backup that has become the new primary must not reply to the client until its state has been updated to include replies generated by other members of the object group, using the messages in the log.

Both the establishment of connections and the retention of requests are bounded by the expiration_time, or the client ORB’s current clock value plus the request_duration_policy_value if no expiration_time has been established. If a current connection fails, a new connection may be needed so that the request can be retransmitted to an alternative member of the server object group. The establishment of the new connection must be bounded by the expiration_time determined for the prior request.
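A client-side view of these rules might look like the following sketch, in which every attempt carries the identical FT_REQUEST context and no new attempt is started once the expiration_time has been reached. The names invoke_with_reinvocation, send, now, and CommFailure are hypothetical; send stands for whatever connects to a member and transmits the request, and the single pass over the members is a simplification.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

# (client_id, retention_id, expiration_time)
FtContext = Tuple[str, int, int]


class CommFailure(Exception):
    """Stand-in for the CORBA COMM_FAILURE system exception."""


def invoke_with_reinvocation(members: Sequence[str],
                             context: FtContext,
                             send: Callable[[str, FtContext], Any],
                             now: Callable[[], int]) -> Any:
    """Try each group member in turn, reusing the same FT_REQUEST context."""
    expiration_time = context[2]
    last_error: Optional[Exception] = None
    for member in members:
        # Connection establishment and retransmission are both bounded
        # by the expiration_time fixed for the original request.
        if now() >= expiration_time:
            break
        try:
            # The context is passed unchanged on every attempt.
            return send(member, context)
        except CommFailure as exc:
            last_error = exc
    raise CommFailure("no reply before expiration_time") from last_error
```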

23.2.8.2 Request Duration Policy

The Request Duration Policy determines how long a request, and the corresponding reply, should be retained by a server to handle reinvocation of the request under fault conditions.

module FT {
    const CORBA::PolicyType REQUEST_DURATION_POLICY = 47;

    interface RequestDurationPolicy : CORBA::Policy {
        readonly attribute TimeBase::TimeT request_duration_policy_value;
    };
};

The Request Duration Policy, applied at the client, defines the time interval over which a client’s request to a server remains valid and must be retained by the server ORB to detect repeated requests.

The policy is defined by:

const CORBA::PolicyType REQUEST_DURATION_POLICY = 47;

A constant that designates the REQUEST_DURATION_POLICY.

interface RequestDurationPolicy : CORBA::Policy {
    readonly attribute TimeBase::TimeT request_duration_policy_value;
};

The request_duration_policy_value is added to the client ORB’s current clock value to obtain the expiration_time that is included in the FT_REQUEST service context for the request.
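As a worked illustration of this addition, the sketch below derives an expiration_time from a Unix clock. It assumes that TimeBase::TimeT counts 100-nanosecond units from 15 October 1582 00:00:00 UTC, as defined by the CORBA Time Service. The helper names unix_ns_to_timet and compute_expiration_time are hypothetical.

```python
import time
from typing import Optional

# Seconds from the TimeBase epoch (15 October 1582) to the Unix epoch (1970).
TIMEBASE_TO_UNIX_EPOCH_SECONDS = 12_219_292_800
# TimeT counts 100-nanosecond units, so there are 10 million per second.
TICKS_PER_SECOND = 10_000_000


def unix_ns_to_timet(unix_ns: int) -> int:
    """Convert nanoseconds since the Unix epoch to a TimeT value."""
    return unix_ns // 100 + TIMEBASE_TO_UNIX_EPOCH_SECONDS * TICKS_PER_SECOND


def compute_expiration_time(request_duration_policy_value: int,
                            now_timet: Optional[int] = None) -> int:
    """Add the policy value (in TimeT units) to the client ORB's clock."""
    if now_timet is None:
        now_timet = unix_ns_to_timet(time.time_ns())
    return now_timet + request_duration_policy_value
```

For example, a request_duration_policy_value of 50,000,000 corresponds to a five-second validity interval.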

23.2.8.3 Fault Handling for GIOP Messages

The standard semantics of GIOP messages include definitions of fault conditions for messages of different types, and provisions for handling of faults by the ORBs. Fault Tolerant CORBA does not modify those semantics in normal (fault-free) conditions. For some types of GIOP messages, an ORB may attempt to retransmit the message or transmit the message to alternative destinations or over alternative transports. Such attempts are invisible to the client and server application and are bounded in time by the request_duration_policy_value defined for the client by the Request Duration Policy. We discuss below those GIOP messages for which fault handling is modified.

LocateRequest

If a client ORB loses an IIOP connection with a server while issuing a LocateRequest, or before receiving a corresponding LocateReply, or if it does not receive a LocateReply in a timely manner, then the client ORB may attempt to retransmit the message or to transmit the message to alternative destinations or over alternative transports. If the client ORB is unable to obtain a reply within the request_duration_policy_value of the Request Duration Policy, the client ORB must return a COMM_FAILURE system exception to the client application. It may return a COMM_FAILURE system exception before the end of that duration.

Request

If a client ORB loses the connection with a server or incurs some other kind of transport fault, the ORB may attempt to retransmit the request message, or retransmit the request message to an alternative destination or using an alternative transport, up to the expiration_time.

If a client invokes a fault-tolerant server (as indicated by the presence of the TAG_FT_GROUP component in the TAG_INTERNET_IOP profiles of the server’s object group reference), the client ORB may retransmit a request if it would have otherwise returned a COMM_FAILURE, TRANSIENT, NO_RESPONSE, or OBJ_ADAPTER exception with a COMPLETED_NO or COMPLETED_MAYBE completion status to the client application. The client is protected against repeated execution by the inclusion of an FT_REQUEST service context in the request message, as described in Section 23.2.8.1, “FT_REQUEST Service Context,” on page 23-24.

If a client invokes a non-fault-tolerant server (as indicated by the absence of a TAG_FT_GROUP component in the TAG_INTERNET_IOP profiles of its reference), the client ORB may retransmit the request only if it would have otherwise returned a COMM_FAILURE, TRANSIENT, NO_RESPONSE, or OBJ_ADAPTER exception with a COMPLETED_NO completion status to the client application.
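The two cases above reduce to a simple predicate, sketched here in Python. The function name may_retransmit and the use of strings for exception names and completion statuses are illustrative assumptions. The sketch covers only the conditions stated in the two preceding paragraphs and does not check the expiration_time.

```python
RETRYABLE_EXCEPTIONS = frozenset(
    {"COMM_FAILURE", "TRANSIENT", "NO_RESPONSE", "OBJ_ADAPTER"}
)


def may_retransmit(exception_name: str, completion_status: str,
                   has_ft_group: bool) -> bool:
    """Return True if the client ORB may retransmit the request."""
    if exception_name not in RETRYABLE_EXCEPTIONS:
        return False
    if has_ft_group:
        # Fault-tolerant server: the FT_REQUEST context guards against
        # repeated execution, so COMPLETED_MAYBE is also acceptable.
        return completion_status in ("COMPLETED_NO", "COMPLETED_MAYBE")
    # Non-fault-tolerant server: retry only if the request certainly
    # did not execute.
    return completion_status == "COMPLETED_NO"
```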

LocateReply and Reply

Retransmission of a LocateReply or Reply message may occur either because the server ORB has not received a transport-level acknowledgment for a previous transmission or because the server ORB has received a repetition of a previous LocateRequest or Request message.

Fragment

Fragmented Request and Reply messages are handled like unfragmented Request and Reply messages.