Recovering from a crashed primary server
Long story short, our primary ZCM server had to be rebuilt. I backed up the CA and the conf files before doing the rebuild. We tried to follow Section 36.2 (page 431) of Zen_system_admin.pdf, “Replacing an existing primary server with a new primary server”, but ran into the issues outlined below. We opened a case with Novell, SR# 10827855671. We were able to rename the original primary server object in ZCC and then delete it, but the following problems remain:
1. We can't make the new primary the CA. When issuing zman cai /root/CAbl we get:
Error:Could not find the object "43bf65e91d6221362d6bf7c946a86202".
That is the old GUID of the now-deleted server.
We have temporarily imported the CA and made our secondary primary server the CA for the zone, but would ultimately like to get the new primary (which has the same name as our old primary) acting as the CA for the zone.
2. We could not import the backed-up server conf files via the zman zenserver-restore confbufile.bak command on the new primary. When attempting to do so we get the following error:
Error:Failed to restore the configuration files. Ensure that the backed-up file and the passphrase specified are valid and correct. Also, ensure that you have the write permission to /etc/opt/novell/zenworks directory on Linux and <Installation directory>:\NovellZENworks\conf on Windows.
Error in restoring the backup: com.novell.zenworks.zman.exceptions.ZManException: Unknown file in backed up files
Why are we getting this error, is there a way to fix it, and what exactly are we missing if we can't get the backup to import?
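The error text names two possible causes: an invalid backup file or passphrase, and missing write permission on the conf directory. The second one can be ruled out quickly with something like the Python sketch below. The function name is made up for illustration, the Linux path is the one quoted in the error message, and it should be run as the same user that runs zman. It does not look at the backup file at all.

```python
import os

# Linux conf directory named in the zman error message.
ZENWORKS_CONF_DIR = "/etc/opt/novell/zenworks"


def can_write_dir(path):
    """Return True if path is an existing directory the current user can write to."""
    # W_OK: may create/remove entries; X_OK: may traverse into the directory.
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


if __name__ == "__main__":
    if can_write_dir(ZENWORKS_CONF_DIR):
        print(f"{ZENWORKS_CONF_DIR}: writable for this user")
    else:
        print(f"{ZENWORKS_CONF_DIR}: missing or not writable for this user")
```

If the directory turns out to be writable, that would suggest the "Unknown file in backed up files" exception is about the contents of the backup rather than permissions.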
3. Closest server rules are not working as designed, i.e. a workstation that should be "talking" to zcm is "talking" to zcm2 instead. Why?
We can fix this problem by issuing the zac retr command on the workstation, but we have to supply a ZCC username and password for the command to process. Once the command completes, the workstation seems to follow the closest server rule again. The problem is that we have thousands of devices, and running that command and supplying a username and password on every machine is not a viable solution.
4. The new ZCM server is not taking the 11.2.3 update properly – it never applies.
Has anyone else out there run into any of this craziness, or have any ideas on how to fix the problems outlined above? We’re hoping Novell Engineering can supply us with an answer, but right now we’re waiting to hear back from them.