I hope you can help me get some SRs back on track.
I opened an SR (10631916691) on 23Jun10 regarding instability in oes2sp2 cluster nodes. I had waited for a couple of occurrences of the issue before opening this SR. Because the issues came from different nodes in different clusters (multiple sites, though identical configuration), I was asked to open TWO more SRs...

So I now have THREE SRs which, I feel, all seem to concern the same issue...

These are:
10631916691 - oes2sp2 cluster node unstable
10632197011 - ndsd core and dead
10632262101 - oes2sp2 cluster node hang

I have since received various responses requesting different configuration changes, apparently with no co-ordination between them (despite my cross-referencing). I have sent in numerous supportconfigs, set machines up for kdump, and also sent in some statistics from a script I developed (and run from cron) to try to identify a pattern.

I would be grateful if we could get these re-consolidated so that we can form a consistent troubleshooting approach, i.e. agree on a configuration that I can deploy to all nodes and that will help us ascertain what is causing this instability.

One of the main issues here is that some of these failures result in cluster resources becoming unavailable (either reported as running but inaccessible, or sent comatose, despite monitor scripts with no local retries and an action to migrate...), i.e. the clusters are doing the very opposite of what they are intended to do: keep services available 24x7 (which is what the customer requires... hospitals...)

I really feel the issues are related, and probably something to do with the ndsd/ncp/nss stack...

I have again responded to all 3 SRs today, again requesting co-ordination and co-operation...
I would be grateful if management could step in and get this issue assigned to one engineer/team with whom I can work to develop a viable and implementable plan to mitigate, and hopefully resolve, these instability issues.

Regards and thanks

David Brightman