I am experiencing ongoing stability issues when using NSS data pool snapshots. On Novell's recommendation, we moved to NetWare 6.5 SP3 with the post-SP3 NSS update and an updated SERVER.NLM.

This created a conflict with our backup software (BakBone NetVault), and BakBone advised us to move to SP4. SP4 fixed our issues with BakBone, but I have since noticed some odd behavior with snapshots.

I had a script to create daily snapshots and rotate them. Example below:

# Delete Oldest Snapshot
mm snap delete SNAP5
# Rotate Snapshots
mm snap rename SNAP4 SNAP5
mm snap rename SNAP3 SNAP4
mm snap rename SNAP2 SNAP3
mm snap rename SNAP1 SNAP2
# Create New Snapshot

This script was launched at 11:58 PM daily by cron.nlm:
58 23 * * * snapshot.ncf
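
To make the intended order of the rotation script above explicit, here is a minimal Python sketch that prints the commands one rotation pass should issue. It is only an illustration and is not something that runs on the server. The function name rotation_commands and its parameters are made up for this post; only the mm snap delete and mm snap rename command strings come from my script. The create step is left out because it is not shown in the script above.

```python
def rotation_commands(keep=5, prefix="SNAP"):
    """Return the console commands for one rotation pass.

    Hypothetical helper for illustration only: 'keep' is the number of
    snapshots retained and 'prefix' is the snapshot name prefix.
    """
    # Drop the oldest snapshot first so its name becomes free.
    commands = [f"mm snap delete {prefix}{keep}"]
    # Rename from the oldest surviving snapshot down to the newest, so
    # each target name is unused at the moment it is assigned.
    for n in range(keep - 1, 0, -1):
        commands.append(f"mm snap rename {prefix}{n} {prefix}{n + 1}")
    return commands


if __name__ == "__main__":
    for line in rotation_commands():
        print(line)
```

Each snapshot name appears as a rename source exactly once, and SNAP1 is left free for the new snapshot at the end of the pass.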

It appears that the mm snap rename command is broken. Snapshots soon failed: mm snap list displayed zero snapshots, as if they had simply vanished. Every time a snapshot was renamed, the space used on our snapshot storage pool was reset to zero. Very bizarre.

I stopped using the script and instead ran mm snap create DATAPOOL SNAPSTORE SNAP(x) by hand once a day (where x is the next snapshot number).

Without the mm snap rename command, the space used on the snapstore pool increased consistently with each snapshot (as it should). However, one day the NSS pool DATAPOOL deactivated itself, citing pool corruption, 4 hours after the last snapshot was created. The console showed that the Short Term Cache Memory Allocator was out of available space and that attempts to get more memory failed in MM.NLM.

nss /pools showed DATAPOOL in an unknown state, but also a new pool, DATAPOOL_1, in an active state with my DATA volume attached. The DATA volume contained all of the data from DATAPOOL. It appears that DATAPOOL_1 was an active snapshot.

mm snap deactivate all didn't deactivate any snapshots. Which snapshot created DATAPOOL_1? When I attempt to activate DATAPOOL, I receive nss /errorcode=20828. mm snap list shows that the most recent snapshot does not exist. When I attempt to mount yesterday's snapshot, I receive nss /errorcode=20004, but the snapshot pool is then listed in nss /pools as deactive.

I restarted the server, and now the only visible pool is DATAPOOL_1. DATAPOOL no longer exists. NSSMU shows all partitions previously associated with DATAPOOL as being associated with DATAPOOL_1. I still cannot activate the day-old snapshot. I am also worried that creating further snapshots will worsen the issue.

The snapshots occupied approximately 60 GB, and DATAPOOL has about 2.3 TB in use.

Anyone have any ideas, suggestions, sympathies, or the like?