We have a 3-node NCS cluster running SLES 10 SP3 + OES 2 SP3 (64-bit).
All servers are virtualized (VMware).

We had to remove the servers from the vRanger backup, because creating the snapshots for the backups regularly kills our non-master nodes. They then sit at 100% CPU load and stop responding. The times at which we received cluster notifications about failed nodes matched exactly the times at which VMware was creating snapshots for the automatic backups.
Removing the VMs from the backups solved the problem.

Now my question is:
Is it a good idea to increase the tolerance for the heartbeat and watchdogs to something like 16 or 32 seconds?

Does anyone have experience with this?