I've got a customer who has a problem with CIFS on an OES2 cluster. We opened an SR and sent Tech Support several configuration files along with the load.out and unload.out log files. This is the answer we got:

"With regards to Service Request 10697418951 (CIFS on cluster resource comatose)

The file shows that the add secondary IP address for was done. Then a ping to that address was done and there was no reply. Then there was an ARP for that address that also failed. So it appears that the IP binding of that address failed. That would cause the ncpcon bind of the resource name to the IP address to also fail which would explain why the resource went comatose. So we have to try to figure out why the IP binding failed that day. If it ever works, then it was likely that this particular time there was a problem with the driver loading or the NIC not being available, etc. I hope this is what you were looking for.

Best Regards,"

Of course I'm not going to show any names. You have got to be kidding me! The Tech Support individual obviously has no idea how a cluster works. The resource has to ping the address and ARP it before it can bind it, to make sure nothing else is already using it. That just makes sense... Now, this is the answer that I helped the customer write:

"Thank you for your prompt reply.

The ping that follows the add secondary IP address step is standard procedure when the cluster adds the secondary IP address. It pings the address first to make sure that it is available. If the ping returns an answer (i.e. it is successful), the cluster resource goes comatose, because the address is already assigned. As far as I know this is standard operating procedure for all cluster resources. My other cluster resources do the same, get no answer, and therefore do not go comatose. This cannot be our problem. The ARP check falls under the same category and cannot be our problem either.

As was stated in the original call, when we load a cluster resource without the CIFS protocol enabled it has no problems, and we can then perform the novcifs --add command manually and it is successful. When we activate the CIFS protocol within the cluster resource (iManager) and take the resource offline and then online, it goes comatose. The resource was originally created with OES2 SP1, therefore we used the cifsPool.py scripts. Both of them give us the INVALID_DN_SYNTAX error. We have a test cluster that was installed with SP3, and when we activate CIFS for an existing cluster resource there we have a similar problem. When we activate CIFS directly while creating a resource, it works. This is a known problem with cluster resources that were created before OES2 SP2. As stated, I believe that either the cifsPool.py scripts still have a problem or SP3 has a problem (INVALID_DN_SYNTAX error). Please note what I said about creating an SP3 cluster resource without CIFS activated and then activating it afterwards.

Thank you for your assistance. If you need more information or files, please contact me asap."
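
To make the point about the ping and ARP concrete, here is a rough Python sketch of the check as I described it in the reply. It is not the actual cluster code: the function names and the state strings are made up for illustration, and it only models the decision (an answer means the address is taken, no answer means it is free).

```python
def address_is_free(ping_replied: bool, arp_replied: bool) -> bool:
    """The address counts as free only if nothing answered the ping or the ARP."""
    return not (ping_replied or arp_replied)


def resource_state_after_ip_check(ping_replied: bool, arp_replied: bool) -> str:
    """Hypothetical state names; mirrors the behaviour described in the reply."""
    if address_is_free(ping_replied, arp_replied):
        # No answer: nobody owns the address, so binding it can go ahead.
        return "continue"
    # Any answer means the address is already assigned somewhere else.
    return "comatose"


if __name__ == "__main__":
    # The case Tech Support pointed at: no ping reply and no ARP reply.
    print(resource_state_after_ip_check(ping_replied=False, arp_replied=False))
```

In other words, the "failed" ping and ARP in the log are the good case, so they cannot be what sent the resource comatose.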


Now, I do not want to be rude but WTF? The customer is already REALLY upset with Novell because we implemented OES2 SP3 in order to set up CIFS and it caused his cluster to die regularly (SR 10687567801). The fix for that only became available 2 months later: eDirectory 8.8 SP6 Patch 2 from April 22, 2011. The customer is not very happy....

I've worked professionally with Novell products since 1989. What is going on? And what is going on with the open SR relating to CIFS? Is it possible to get help from someone who actually knows the products involved? No offense intended...

Thank you for any assistance you are able to provide. IMO moving your Support was a mistake.