Hi,

We have a 2-node Novell cluster running OES/Linux SP2 (SLES9 SP3).

Suddenly, only one node in the cluster can be active at a time. If I try
to make the other node (node_B) join the cluster, I get the following
message:

There was an error while joining the cluster
Error: "join/ already in progress"

The command "rcnovell-ncs status" on node_B shows the status as "running".
Also, "cluster status" returns an empty reply, with no output at all.

I looked at /var/log/messages on node_B and see the following excerpt:

++++++++++++++++++++++++++++++++++++++++++++++++
Sep 6 22:42:33 node_B kernel: Hangcheck: Stopped hangcheck timer.
Sep 6 22:42:33 node_B kernel: Hangcheck: starting hangcheck timer 0.9.0 (tick is 1 seconds, margin is 8 seconds).
Sep 6 22:42:33 node_B kernel: Hangcheck: Using monotonic_clock().
Sep 6 22:42:40 node_B kernel: Hangcheck: Stopped hangcheck timer.
+++++++++++++++++++++++++++++++++++++++++++++++++

If I issue the "cluster view" command on node_B, I get the following:
==>This node is not a member of a cluster
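
In case it helps anyone compare notes, the checks above could be gathered
in one pass with something like the following. This is only a rough sketch:
the helper names hangcheck_lines and collect_ncs_state are made up for
illustration, the log path defaults to the /var/log/messages mentioned
above, and it runs only the commands already quoted in this post.

```shell
#!/bin/sh
# Rough sketch: gather the diagnostics quoted in this post in one pass.
# hangcheck_lines and collect_ncs_state are hypothetical helper names.

# Print only the kernel Hangcheck lines from a syslog-style file ($1).
hangcheck_lines() {
    grep 'kernel: Hangcheck:' "$1"
}

# Run each cluster command quoted in this post, if it is installed,
# then list the Hangcheck lines from the log file ($1, optional).
collect_ncs_state() {
    for cmd in "rcnovell-ncs status" "cluster status" "cluster view"; do
        echo "== $cmd"
        # ${cmd%% *} is the program name (first word of the command).
        if command -v "${cmd%% *}" >/dev/null 2>&1; then
            $cmd 2>&1
        else
            echo "(command not found on this host)"
        fi
    done
    echo "== Hangcheck lines in ${1:-/var/log/messages}"
    hangcheck_lines "${1:-/var/log/messages}" \
        || echo "(no Hangcheck lines found or log not readable)"
}
```

After sourcing the file, running "collect_ncs_state" as root on each node
gives one block of output per node that can be compared side by side.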

This behaviour flip-flops: if I reboot node_A, node_B takes over, but
when node_A comes back up it exhibits the same problem and cannot join
the cluster.

I have rebooted both nodes a few times to no avail; the results are
still the same.


Any pointers or help would be greatly appreciated.

Thanks,
Dharmesh.