Some back story first:

We've got three servers in a cluster: #1, #2, and #4. All three servers
are running NetWare 6.5 SP2, clustered with 5 pools. We got the chance
to bring the server that was supposed to be server #3 online and in the
cluster last week. I installed NW 6.5 SP4a on it with the intent of
doing a rolling update to SP4a on the others in the cluster this week.

On Monday, we brought #3 into the cluster with no problems. Then
everything went to (#*@&. We tried to migrate from #4 to #3 with the
intent of upgrading #4 - basically doing a rolling update. Server #4
wouldn't let go of the process. It took powering down the server for it
to free its resources over to #3. When #4 came back online I got an
error about the master node lock (didn't write it down). After some
searching on Novell's site, the basic answer was to reboot all the servers
in the cluster at the same time... couldn't do that till this morning, so
on with the show. We commented out the cluster services from
startup.ncf, and the machine boots just fine (minus the cluster stuff).

From bad to worse: we figured we'd do the same on #2... same problem... same
solution. So now I've got two servers up and running (#1 on 6.5 SP2 and
#3 on 6.5 SP4a) in the cluster and two servers out of the cluster. While
they're out of the cluster I might as well do the rolling update on them
to bring them to SP4a. That's when they started to yo-yo. After much
re-learning of how to start a server w/o autoexec.ncf (-na), we got the
servers up and running, commented out everything that's not needed, and
got them online.

I started to load the autoexec.ncf manually, one load line at a time.
Where it chokes is "load java.nlm -server".

At this point, they're useless - so I rolled back the SP to 2 hoping to
undo any damage. They rolled back just fine, but after a reboot they won't
load server.exe. It acts like it's a corrupt DOS partition. If we boot to a
floppy, then change directories to c:\nwserver and run server.exe, it's fine.

So, any ideas on the upgrade? The corrupt partition? Other ideas?


Trent Le Clair
Systems Admin, Southern Oregon University