System modeling using state machines

A. Building a model

Reactive systems: interactions with the environment. The environment may be other system components or external components (users).
Two aspects of system models:
- structure (often static - not changing over time): a system is composed of components - sub-components. Components have interfaces (one or several). At each interface, certain types of interactions may occur (sometimes the set of possible interactions is called the alphabet of the interface, or of the component). Possible notations are:
  - informal drawings
  - structure in terms of "blocks" and sub-blocks in SDL (see daemonGame structure in SDL; composite structure example from TAU documentation)
  - in UML: use of Class diagrams and/or composite structure diagrams (introduced in UML version 2, see examples)
- dynamic behavior: that is, in which order may the interactions occur. This part is much more difficult to describe because the sequences of interactions to be considered may be arbitrarily long, even infinite (e.g. for a telephone that never fails). Each sub-component that is not further sub-composed must be described by defining its dynamic behavior in terms of its possible interaction sequences. Possible notations are:
  - informal explanations
  - UML Sequence diagrams (or "Message Sequence Charts" (MSC))
  - UML Activity diagrams
  - State machines. There are different versions of state machines. In this course, we will see the following:
    - Labeled Transition System (LTS). An LTS has a finite number of states that are connected by transitions. Each transition has a label which represents an interaction that occurs when the transition in question is performed. The set of labels of an LTS is called the alphabet of the LTS. Different LTS may communicate (that is, coordinate their interactions) by sharing a subset of their alphabet. The shared interactions can only executed in "rendezvous", that is, a transition of the first LTS labeled with a shared interaction can only be executed jointly with a transition of the other LTS which is labeled with the same interaction (this interaction is executed in rendezvous).
    - Mealy automaton, also called "Finite State Machine" (FSM). A FSM has a finite number of state and its interactions are partitioned into input and output interactions. Each transition is normally labelled with an input interaction and an output interaction. When the machine is in a state and an input interaction arrives and there is a transition from the current state labelled with this input, then this transition will be executed (the new state will be reached) and the output of the transition will be produced. The output of one FSM normally becomes the input to another FSM. The propagation of the output of one machine to the other machine where it becomes input usually is related with some transmission delay and/or queuing. Note: Mealy used such models to describe the behavior of digital hardware circuits in the 1950ies.
    - Extended FSM (EFSM). To be more powerful, the FSM model is usually extended by allowing input and output messages with parameters and local variables that may store values related to previous input parameters or intermediate results of calculations, often intended to be used for defining the parameters of output messages. The UML State diagrams is one particular notation for EFSMs. The current version of UML supports the so-called state-oriented syntax (inherited from UML version 1) and the transition-oriented syntax (inherited from SDL). In this course, we will normally use the transition-oriented syntax.
    - Other kinds of state machines:
      - Moore machine: like FSM, but output is not produced during a transition, but the machine produces a continuous output which only depends on the current state of the machine (output of a Mealy machine is a signal/event, while output of a Moore machine is a continuous value, like the value of a variable or the current in an electronic circuit). For hardware designs, Moore machines are easily composed (by having the output of one machine become the input of another). A global clock is used to determine the time when each machines looks at its (continuous) input and triggers the corresponding transition.
      - Accepting automata (AA): They are an LTS for which, in addition, a set of "final states" is defined. They are used for defining sets of execution sequences that are accepted by the automaton. While an LTS accepts all sequences that can be executed by the LTS, an AA automaton only accepts sequences of finite length - exactly those for which it is in an accepting state after having followed the entire execution sequence. In Module 2 of this course, AAs are used to define sequences of characters or words that belong to a language.
Example: Coffee-Machine (user view: sketch of user interface; structure: sketch)
- Modeling the interactions: Let us consider a push-button interaction. How does a push-button interaction work ? - Consider a photocopier: It has a go button. If you push it twice, do you get two copies ?
  - Synchronous interaction model (relatively abstract): also called rendezvous (both parties must be in a state where they are ready for the interaction). How is the readiness of the machine indicated to the user ?
    - If we assume that the execution time of an interaction is very short, we may assume in our model that only on interaction takes place at a given time throughout the whole model. This is called interleaving semantics. How are simultaneous events modeled then ?
    - Often one considers asymmetric synchronous interactions where one side is always ready for an interaction; for instance in the case of procedure calls in sequential programming languages.
  - Asynchronous interaction model: based on message passing; this is more low-level than rendezvous - one has to deal with the set of messages in transit, which is complicated.
    - With reliability: output messages are queued until they are consumed as input messages by the destination component. One needs flow control (to avoid that too many messages accumulate before being processed).
    - With possibility of message loss (like IP packet transmission in the Internet): messages are dropped when they accumulate; no need for flow control (but message loss must be handled by the communicating partners).
    - The situation where a message arrives and the receiving component is in a state for which no transition is defined for receiving this message is called "non-specified reception". It usually indicates that there is a design error.
- LTS model of a coffee machine (informal sketch - notation of the LTSA tool) - only the user perspective is modelled (requirements)
- FSM model of a coffee machine (UML notation supported by the Telelogic TAU tool) - the model is more detailed: in addition to the user perspective, the model also identifies two internal components (the controller and the hardware that produces the drinks), and the communication between the internal components is also described.
Another example showing the usefulness of modeling with rendezvous communication (LTS formalism): an LTS model of a Door - and a FSM model of a Door
- In this example, the interface between a door and the user of the door is modeled as a single state machine. This kind of modeling is called "protocol modeling" in UML (since the dynamic rules for accessing an object over its interface are called "protocol" in UML) - in contrast with the structural information about the interface defining the methods and their parameters. It is important to note that modeling an interface with a single state diagram only makes sense when there is no message collision; if both parties take independent initiatives, such collisions may occur and one needs two separate state machine models (one for each side - see coffee machine example) in order to understand what can happen (as pointed out by me in 1977).

B. Checking properties of a given model

Two types of V&V activities:
- Validation: checking the properties of the system against the expectations of the users and owners
  - This leads to "system quality" (a term used in the book by Braek)
  - One asks the question: " Do we build the right system ? "
  - Tool support:
    - simulated execution of the functional specification (e.g. using Telelogic TAU or LTSA) and discussing obtained execution traces with the users and owners.(see SEG-2106 Lab2)
    - checking the internal consistency and correctness of the model: no design flaws such as deadlocks (the system blocks, no further progress), non-specified receptions (unexpected message arrivals), loops without progress, etc. (see SEG-2106 Chapter 10)
      - use ** feature of Telelogic TAU
      - use ** feature of LTSA
- Verification: checking that a more detailed model (e.g. detailed design) satisfies the properties specified by a more abstract model (e.g. the requirements document)
  - This leads to "process quality" (a term used in the book by Braek)
  - One asks the question: " Do we build the system right ? "
  - Tool support:
    - check whether a given Sequence Diagram can be realized by a given State Machine (e.g. using Telelogic TAU feature **) (see SEG-2106 Lab2 and Lab3)
    - check whether a given interaction sequence (or more complex testing process) can be realized by a given State Machine (e.g. using LTSA and forming a parallel composition of the specified system with the given interaction sequence or testing process) (see SEG-2106 Lab3)
    - "model checking": check that some given properties are satisfied by a given State Machine. One sometimes distinguishes safeness ("the events that occur are allowed") and liveness (progress) properties (e.g. using LTSA feature **)

C. Comments on non-determinism in models

A model is non-deterministic if it allows for several behaviors. Such non-determinism is often intended, because at the more abstract level one wants to leave several behavior options, or it is known that certain aspects of the behavior cannot be determined, for instance, due to unknown speed of execution of the different parallel processes inside the system. Here are some examples.

Non-determined sequencing defined by a sequence diagram. Diagram (a) below defines a single sequence of interactions, namely the following: send a, receive a, send c, receive c, send b, receive b, send d, receive d. However, Diagram (b) implies many different orders for the primitive send and receive operations, because the relative order of send a, and send c is not defined; similarly the order between receive a and send b is not defined. One says that the diagram shows a partial order, for instance, the order first receive b, then send d is clearly there in this diagram.

Non-determinism due to different execution speeds. The two FSM models below, when executed jointly with message passing between them, could give rise to an execution sequence corresponding to the second sequence diagram above. (Note: The transitions labelled "!xx" represent spontaneous transitions, only the output xx is indicated; the transitions labelled "?xx" are trigged by the input xx - they have no output.) However, if the speed of transmission of the message c would be faster, or the execution speed of machine A slower, the message c may arrive when the machine A is in state 2. This would be an unspecified reception (the reception of this message is not defined in this state; this must be considered a design error). We conclude that the design of the machines A and B shown below is bad.

In order to improve the design of the two machines A and B that should be able to produce the execution sequence as shown in Sequence Diagram (b) above, one therefore has to add more transitions for receiving certain messages that may arrive in different states. Here is a proposed solution (without the dashed transition from State 2):
Please note that these two machines can also produce execution sequences that do not conform to Sequence Diagram (b), for instance message c may be received before message b is sent by machine A. This is a general observation: Often, a given sequence diagram implies other sequence diagrams, that is, it is not possible to find two machines that realize the given sequence diagram alone, without also producing possibly other sequences that correspond to other sequence diagrams (so-called implied scenarios).
The main point here is that two machines, as above, can give rise to a large number of different execution sequences depending on the relative speed of their execution and the speed of message transmission. This means that in practice, one cannot predict which particular sequence will be realized in a particular run of the system. The system - even its implementation - is therefore non-deterministic. This non-determinism may actually lead to two quite different outcomes: assume that the transition on input c in State 2 of the left machine is replaced by the dashed transition which leads to State X and assume that from this state, the machine would have some quite different behavior (e.g. making an explosion); then the design of these two machines would imply that there is the possibility of an explosion, but it cannot be predicted (because it would occur if message c just arrives after message a is sent and before message b is sent, which cannot be predicted by this model.

Non-deterministic machine behavior.
- Non-deterministic output: The following FSM (on the left) shows that two different outputs may be produced in response to an input a. This specification reflects a level of abstraction of the model where we make abstraction from the parameters of the input a; in fact, at a more detailed level, it may be specified what condition the input parameters must satisfy in order to obtain the output b. At the abstract level, the specification is non-deterministic for the output: b or c may happen. - The machine on the right, which may represent the user of the left machine, observes the output obtained and continues accordingly in order to remain in sync with the machine on the left.
- Non-deterministic LTS: The three LTS below represent a sender, a buffer and a receiver. The put operation enters some data into the buffer and the get operation takes this data out of the buffer. The buffer has also a local (not coupled) interaction loss which means that the data is lost. The receiver does an ack interaction in rendezvous with the sender after having taken some data out of the buffer.
  - The left diagram below shows the behavior of the composition of the buffer with the receiver. The state name "1, 2", for instance, means that the buffer is in state 1 and the receiver is in state 2. Composition of buffer and receiver:
  - If we consider this composition as a black box, we should not see the interactions get and loss. The diagram in the middle shows this by replacing these labels by tau (which means invisible, spontaneous transition). This LTS is said to be (state) non-deterministic, because after some given visible execution sequence, for instance <put, ack, put>, one does not know in which state the LTS is, since it may already have done one the spontaneous transitions. So, any LTS with a spontaneous transition is non-deterministic.
  - One may also consider the LTS on the right (above this text). It is equivalent to the one in the middle - it has the same set of possible (visible) interaction sequences. It has no tau transition, but it is also non-deterministic, because there are two transitions labelled put leading to different states, and one does not know which one would be executed.

Created: December 17, 2008; last update: January 20, 2008