After a server or hard disk crash, the normal recovery procedure involves setting up a new server OS, eDir and, in our case, IM software versions, patching these to the level present before the crash, and afterwards restoring the eDir content from a backup obtained with, e.g., dsbk.

We were wondering whether VM snapshots can simplify the first step. However, a snapshot taken at the current patch level will also contain a running eDir instance with outdated content, which will cause inconsistencies in a multi-server eDir environment. We have therefore done some tests in which we reverted one server of a two-server replica ring to a VM snapshot taken earlier, while blocking communication with the other replica server. We removed everything eDir-related from the "crashed" server and installed a new temporary tree, while keeping the OS and IM software versions and patch levels. We then restored the eDir content from a full backup taken earlier with dsbk. We deliberately did not include roll-forward logs, because for the time being we cannot store them on a separate storage device and hence could not provide all the roll-forward logs needed after a real crash.

Simply following section 15.7, "Recovering the Database if Restore Verification Fails", of the eDir 9 admin guide yielded a "locked Directory Services Database" error when changing the replica information on the failed server into external references using ndsrepair -R -Ad -xk2. ndstrace reports "DSAgentOpenLocal failed, ds locked" (error -663). Our test box is an eDir 9.0.4/SLES 12.2/IM 4.5 installation with the partitions IDMDriverset and root.

I'd therefore like to ask the experts whether anybody has an idea what makes the restore procedure described in sections 15.6 and 15.7 of the eDir 9 administration guide fail.

Moreover, we wonder whether a restore to get hold of the RST files is necessary at all, or whether outdated partitions brought back by VM snapshots could be safely removed from the rest of the replica ring before allowing any sync between servers. After re-establishing communication, our idea would be to re-add the partitions from replica servers not affected by the crash.

Thanks in advance

Axel