I have a 4-node NetWare 6.5 SP7 cluster (soon to be SP8). It has been working quite well for 4 years, but in the last few months some nodes have failed to join the cluster after an abend or a reboot. Generally the node joins the cluster successfully after a second or third reboot, with no other changes.

I'm copying the part of the logger output where I see the problem starting. Obviously, if the first module (CLSTRLIB.NLM) fails to load, everything else fails too. Any ideas about what might be happening here, and why it fails only some of the time?

NDS names have been changed for security reasons; otherwise everything is exactly as it appeared on the logger screen.

Loading module CLSTRLIB.NLM
Novell Cluster Configuration Library Build Number = 1.8.4-367
Version 1.80.04 September 12, 2007
Copyright (C) 1999-2004 Novell, Inc. All Rights Reserved.
CLUSTER-<INFO>-<29>: ncslibResolveName: Now connected to .NODE3.OU2.OU1.O.TREENAME.
CLUSTER-<INFO>-<29>: ncslibResolveName: Now connected to .CN=NODE3.OU=OU2
CLUSTER-<INFO>-<30>: ncslibResolveName: Was connected to .CN=NODE3.OU=OU2
Now connected to .CN=NDS1.OU=OU2.OU=OU1.O=O.T=TREENAME.
CLUSTER-<FATAL>-<65>: ncslibLoadClusterConfig: API called = , error = 4
ncslibLoadCluserConfig: could not initialize the cluster configuration
SERVER-5.70-1553: Module initialization failed.
Module CLSTRLIB.NLM NOT loaded
Module CLSTRLIB.NLM load status INIT FAIL