OK guys, I've got a NetWare cluster that's been running for years, but over the last 6 months its stability seems to have degraded, and that's got certain members of my department looking to replace it. I had an incident last night that I find pretty disconcerting: 3 of the 4 nodes went down.

So here's the timeline as I've been able to pin it down.

6:01 Server S halts with the following condition: "Abend 1 on P00: Server-5.70.08-0: This node in the Minority partition and the node in Majority partition is Alive. For more information, consult technical information document 10053882 in the knowledgebase on"

6:45 Server A halts with the following condition: "Abend 1 on P00: Server-5.70.08: Page Fault Processor Exception (Error code 00000011)" "Additional Information: The CPU encountered a problem executing code in SERVER.NLM. The problem may be in that module or in data passed to that module by a process owned by FSBACK.NLM."

FSBACK.NLM is the Commvault NLM, and frankly it's always been a little flaky.

11:43 Server R halts with the following condition: "Abend 1 on P01: Server-5.70.08: Page Fault Processor Exception (Error code 00000002)" "The CPU encountered a problem executing code in TCP.NLM. The problem may be in that module or in data passed to that module by a process owned by SERVER.NLM."

Now for added fun, two of the servers abended when they were rebooted this morning.


7:37:03.892 Server A halts during the load process with this condition: "Abend 1 on P00: Server-5.70.08: Page Fault Processor Exception (Error code 00000000)" "The CPU encountered a problem executing code in DS.NLM. The problem may be in that module or in data passed to that module by a process owned by SERVER.NLM."

7:37:27.646 Server S halts during the load process with this condition: "Abend 1 on P00: Server-5.70.08: Page Fault Processor Exception (Error code 00000000)" "The CPU encountered a problem executing code in DS.NLM. The problem may be in that module or in data passed to that module by a process owned by SERVER.NLM."
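
For what it's worth, here is a rough sketch of how I've been reading the page fault error codes above. It assumes the "Error code" in the abend message is the raw x86 page-fault error code printed in hex; I haven't confirmed that NetWare passes it through unmodified, so treat the output as a hint only. The function name decode_page_fault is just something I made up for this post.

```python
def decode_page_fault(code_hex):
    """Decode an x86 page-fault error code given as a hex string.

    Assumption: the abend log prints the CPU's page-fault error code
    unmodified, in hexadecimal.
    """
    code = int(code_hex, 16)
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x01 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x02 else "read",
        # bit 2: 0 = supervisor mode, 1 = user mode
        "mode": "user" if code & 0x04 else "supervisor",
        # bit 3: a reserved bit was set in a paging structure
        "reserved_bit_set": bool(code & 0x08),
        # bit 4: the fault happened on an instruction fetch
        "instruction_fetch": bool(code & 0x10),
    }


# Error codes copied from the abend messages in the timeline.
for label, code in [
    ("Server A, 6:45", "00000011"),
    ("Server R, 11:43", "00000002"),
    ("Servers A and S on load", "00000000"),
]:
    print(label, decode_page_fault(code))
```

If that reading is right, both DS.NLM abends on load are supervisor-mode reads of a page that isn't present, while the 6:45 abend on Server A is flagged as an instruction fetch.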

NetWare version 6.5 SP8, eDir version 8.8 SP4.

Any one of these incidents taken individually wouldn't cause me undue alarm, but taken as a whole they're a cause for concern, especially the DS.NLM abends on load. DSREPAIR doesn't show any significant errors.

So anyone have any ideas on what my problem might be? A few more mornings like this and I'm going to retire to a nice quiet construction job.