Random problem in a 12-node cluster.
We have 5 clusters:
.- DNS/DHCP Cluster (2 nodes)
.- BorderManager Cluster (3 nodes)
.- Zenworks 7 suite Cluster (3 nodes)
.- GroupWise Cluster (8 nodes)
.- File & Print Cluster (12 nodes)
All nodes have 10 GB of RAM, 4 Intel Xeon MP processors @ 3.00 GHz, dual QLogic HBAs and dual NICs (load balancing and fault tolerance).
All nodes run NetWare 6.5 SP6 with all updates and Novell Cluster Services 1.8.0.
All clusters are attached to an HP EVA 4000 SAN.
Occasionally, a random node of the File & Print cluster reports a comatose state without giving any further details.
The rest of the clusters work perfectly.
I suspect the size of the cluster is the cause, but I cannot find any specific information about how to tune large clusters.
These are its current values:
Membership (# nodes): 12
Timeout (secs): 60
Protocol settings:
Heartbeat (secs): 2
Tolerance (secs): 32
Master Watchdog (secs): 2
Slave Watchdog (secs): 32
Max Retransmits: 30
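To make these numbers easier to discuss, here is a rough sketch of the timing they imply. It assumes that Tolerance is how long the master waits for a slave's heartbeat before dropping it, and that Slave Watchdog is how long the slaves wait for the master's signal; that reading is my assumption, not something I have confirmed for this version. The variable and function names are only illustrative.

```python
# Protocol settings copied from the list above (all in seconds).
settings = {
    "heartbeat": 2,
    "tolerance": 32,
    "master_watchdog": 2,
    "slave_watchdog": 32,
}


def intervals_in_window(interval_secs, window_secs):
    """Return how many whole send intervals fit into the wait window."""
    return window_secs // interval_secs


# Assumption: a slave sends one heartbeat every `heartbeat` seconds and the
# master gives up on it after `tolerance` seconds of silence.
slave_beats = intervals_in_window(settings["heartbeat"], settings["tolerance"])

# Assumption: the master signals every `master_watchdog` seconds and the
# slaves give up on it after `slave_watchdog` seconds of silence.
master_beats = intervals_in_window(
    settings["master_watchdog"], settings["slave_watchdog"]
)

print("Slave heartbeats that can be lost before removal:", slave_beats)
print("Master signals that can be lost before slaves react:", master_beats)
```

Under that reading, a node would have to stay silent for 16 consecutive intervals (32 seconds) before being removed, so a single lost packet should not be enough to trigger the problem.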
Any recommendation or suggestion will be welcome.