I've had an unusual problem occur twice now.

Apparently a single node (of my 6-node NetWare cluster) gets cast out of the cluster by the remaining 5 nodes. I assume the other nodes think the node has gone down, because the resources it was running all start migrating to other nodes. The problem is, the original node never gets a poison pill, so it keeps on running. That makes it impossible for the resources to come online somewhere else, so they go comatose instead.

I'm a little confused as to why this node doesn't abend. This has happened twice, on two different nodes.

I suspect it could be related to the Broadcom NICs in these Dell PowerEdge 2650s. I've read that they have caused communication problems in clustering. Could this be causing the problem I've described? If so, does anybody know which driver version is currently working best? Also, I would like to set up a separate heartbeat LAN. Does anybody have any tips on doing this?

I appreciate any feedback.