I've got a strange problem, and have seen similar posts here in the forum, but not quite the same... so I thought I'd post to see what y'all think.

Last Friday, I had to down one of the nodes in our 2-node cluster. The 2nd node (I'll call it server2) picked up the master IP, as expected. When I brought the 1st node (server1) back up, it was no longer able to join the cluster. It simply hangs on "Joining....". When I look at the log file, it says "Join retry, some other node acquired the cluster lock".

In trying to troubleshoot this, I brought down server2 and rebooted both servers. I let server1 become the Master, and server2 joined without any issues. I then had server1 leave the cluster, and server2 became the Master. When I tried to re-join the cluster with server1, it would hang again. So basically, if server1 is the Master, both servers can join. But if server2 is the Master, server1 is unable to join.

I have verified that there are no communication problems between the two servers, and that they can both see the SBD partition. My heartbeat settings are at the installation defaults. I have downed both servers completely, and it always comes back to the same behavior described in the paragraph above.
I have also tried reinstalling NCS on both servers (joining an existing node), and it always fails on the "Join Node" section.

I've seen references to some TIDs that are supposed to have some solutions, but am unable to find the TIDs. I've also seen references to re-creating the SBD partition, but can't find any good instructions on how to do that without hosing everything. I've read references to changing the panning ID, but also can't find instructions on how to do that.

Can someone please give me some advice? If re-creating the SBD partition is what I should try next, can you explain how to do it? Thank you!