17.3.1 Node Failure

Node failure is defined as the node process exiting, or the machine that the process runs on dying (for example, due to a power failure). The following steps are performed after a node failure:

• The cluster detects that the node has not been active and votes for new cluster membership.

• The new membership is decided and the new primary component is formed.

• Any FT activities being processed on that node are failed over to another node.

Figure 17.2 shows the cluster configuration after the failure of node 1. The primary component has a membership of nodes 2 and 3.

[Figure 17.2: Node one has failed. Machine A: Rhino node 1; Machine B: Rhino node 2; Machine C: Rhino node 3; Cluster [2,3].]
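The failure-handling steps in this section can be illustrated with a small simulation. This is only a sketch and does not use Rhino's actual code or API: the `Cluster` class, the `HEARTBEAT_TIMEOUT` value, and the activity names are all hypothetical. The membership vote is reduced to "the surviving nodes form the new membership", which is a simplifying assumption.

```python
from dataclasses import dataclass, field

# Hypothetical inactivity threshold in seconds (not a Rhino default).
HEARTBEAT_TIMEOUT = 5.0


@dataclass
class Cluster:
    last_seen: dict            # node id -> time of last sign of activity
    activities: dict           # FT activity name -> id of the owning node
    members: list = field(default_factory=list)

    def detect_and_recover(self, now):
        # Step 1: detect nodes that have not been active recently.
        alive = sorted(
            node for node, seen in self.last_seen.items()
            if now - seen <= HEARTBEAT_TIMEOUT
        )
        # Step 2: the survivors become the new membership. A real
        # cluster votes here; this sketch skips the voting protocol.
        self.members = alive
        # Step 3: fail over FT activities owned by departed nodes.
        for name, owner in list(self.activities.items()):
            if owner not in alive and alive:
                self.activities[name] = alive[0]
        return self.members


cluster = Cluster(
    last_seen={1: 0.0, 2: 9.0, 3: 9.5},
    activities={"activity-a": 1, "activity-b": 3},
)
print(cluster.detect_and_recover(now=10.0))  # [2, 3]
print(cluster.activities)  # {'activity-a': 2, 'activity-b': 3}
```

With node 1 silent for ten seconds, the sketch ends up with the membership [2, 3] described for Figure 17.2, and the activity that node 1 owned moves to a surviving node.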

17.3.2 Node Restart

A node booting and joining a cluster is the same as a failed node restarting. Figure 17.3 shows the two-stage process for a node restarting. Initially, the node boots and forms a non-primary cluster component whose only member is itself.

The node then becomes a member of the primary component and synchronizes working memory with the primary component. The node can only perform work once it is a member of the primary component and has synchronized state with the rest of the cluster.
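The two-stage restart can be pictured as a small state machine. The sketch below is illustrative only: the state names and the `RestartingNode` class are hypothetical and are not taken from Rhino. It encodes the one rule stated above, that a node may perform work only after it has joined the primary component and finished synchronizing.

```python
from enum import Enum, auto


class NodeState(Enum):
    BOOTING = auto()
    NON_PRIMARY = auto()    # component containing only this node
    SYNCHRONIZING = auto()  # joined the primary component, copying state
    ACTIVE = auto()         # synchronized and allowed to perform work


class RestartingNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.state = NodeState.BOOTING
        self.membership = []

    def boot(self):
        # Stage 1: form a non-primary component with only this node.
        self.state = NodeState.NON_PRIMARY
        self.membership = [self.node_id]

    def join_primary(self, primary_members):
        # Stage 2: merge into the primary component and start syncing.
        if self.state is not NodeState.NON_PRIMARY:
            raise RuntimeError("node must boot before joining")
        self.membership = sorted(set(primary_members) | {self.node_id})
        self.state = NodeState.SYNCHRONIZING

    def finish_sync(self):
        if self.state is not NodeState.SYNCHRONIZING:
            raise RuntimeError("node is not synchronizing")
        self.state = NodeState.ACTIVE

    def can_perform_work(self):
        return self.state is NodeState.ACTIVE


node = RestartingNode(1)
node.boot()
node.join_primary([2, 3])
print(node.membership, node.can_perform_work())  # [1, 2, 3] False
node.finish_sync()
print(node.can_perform_work())  # True
```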

17.3.3 Network Failure

In a distributed system, the connection between different computers can fail. Examples include the physical network cable being cut or the network interface card failing. Figure 17.4 shows the two stages that occur in a network failure.

The first stage is shown in the top portion of the diagram: two different components are formed. One is a non-primary component that has a membership of node 1; the other is the primary component that has a membership of nodes 2 and 3.

Once a node has transitioned from the primary component to a non-primary component, it logs an error message, ensures that outstanding transactions will not commit, and finally terminates.
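The shutdown sequence for a node that drops out of the primary component can be sketched as follows. This is a simplified model, not Rhino's implementation: the `Node` and `Transaction` classes and the `rollback_only` flag are hypothetical. Whether the new component is primary is passed in by the caller, because the rule the cluster uses to decide that is not described here. Setting `running = False` stands in for the process terminating.

```python
import logging

log = logging.getLogger("cluster-sketch")


class Transaction:
    def __init__(self, name):
        self.name = name
        self.rollback_only = False

    def commit(self):
        if self.rollback_only:
            raise RuntimeError(f"{self.name} is not allowed to commit")
        return True


class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.in_primary = True
        self.running = True
        self.outstanding = []  # transactions still in progress

    def on_component_change(self, members, is_primary):
        was_primary = self.in_primary
        self.in_primary = is_primary
        if was_primary and not is_primary:
            # 1. Log an error message.
            log.error("node %s left the primary component, members=%s",
                      self.node_id, members)
            # 2. Ensure outstanding transactions will not commit.
            for tx in self.outstanding:
                tx.rollback_only = True
            # 3. Terminate (modelled here as a flag).
            self.running = False


node = Node(1)
tx = Transaction("tx-1")
node.outstanding.append(tx)
node.on_component_change(members=[1], is_primary=False)
print(node.running, tx.rollback_only)  # False True
```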

17.4 Configuration Parameters

Rhino SLEE clustering software can be configured for various purposes. The default configuration should be suitable for the majority of environments.

How the cluster detects that a node has aborted is controlled by properties defined in the file

$RHINO_NODE_HOME/config/savanna/settings-cluster.xml
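When scripting against a node installation, the location of this file can be resolved from the `RHINO_NODE_HOME` environment variable. The helper below is a hypothetical convenience function, not part of Rhino, and the example directory is made up. It only builds the path and does not assume anything about the contents of the file.

```python
import os
from pathlib import Path


def cluster_settings_path(env=None):
    """Return the path of settings-cluster.xml for a node directory."""
    env = os.environ if env is None else env
    home = env.get("RHINO_NODE_HOME")
    if not home:
        raise RuntimeError("RHINO_NODE_HOME is not set")
    return Path(home) / "config" / "savanna" / "settings-cluster.xml"


# Example with an explicit (made-up) node directory:
path = cluster_settings_path({"RHINO_NODE_HOME": "/opt/rhino/node-101"})
print(path.as_posix())
```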

Open Cloud Rhino 1.4.3 Administration Manual v1.1 102
