Hi All,
We have an issue with a group of NCS servers. In simple terms, we added
a new device type to the network that does not accept a non-classful
network address (we use a supernet of four Class C networks, and the
device will take either a Class B or a Class C, but nothing in
between). It is receiving the broadcast heartbeat packets from the
cluster and, in another wonderful example of correct design, it
responds to them with an ICMP redirect (type 5, code 1, host redirect,
with the gateway address the same as the broadcast address).
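
To make the addressing mismatch concrete, here is a rough sketch of the arithmetic. The addresses are made up for illustration (192.168.4.0/22 stands in for our real supernet, and 192.168.5.20 for a handset), and the helper name classful_view is hypothetical. It only shows that a device forced onto a Class C or Class B mask computes a different broadcast address than the cluster uses; it does not prove that this is why the device sends the redirect.

```python
import ipaddress

# Hypothetical addressing: four contiguous Class C networks treated as one /22.
supernet = ipaddress.ip_network("192.168.4.0/22")
cluster_broadcast = supernet.broadcast_address


def classful_view(host: str, prefix: int) -> ipaddress.IPv4Network:
    """Network a device would assume if forced onto the given prefix length."""
    return ipaddress.ip_interface(f"{host}/{prefix}").network


handset = "192.168.5.20"  # hypothetical handset address inside the supernet
as_class_c = classful_view(handset, 24)  # device configured with a Class C mask
as_class_b = classful_view(handset, 16)  # device configured with a Class B mask

print("cluster broadcast:   ", cluster_broadcast)
print("Class C view:        ", as_class_c, as_class_c.broadcast_address)
print("Class B view:        ", as_class_b, as_class_b.broadcast_address)
# With a Class C mask the cluster broadcast address is not even on the
# device's own network; with a Class B mask it looks like an ordinary host.
print("on-link with Class C?", cluster_broadcast in as_class_c)
print("broadcast in Class B?", cluster_broadcast == as_class_b.broadcast_address)
```

Either way, the handset and the cluster disagree about which address means "everyone on this segment".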

Strangely, as soon as the NCS servers receive these ICMP packets,
they lose the ability to send broadcast packets (I am assuming here
that a new route is added redirecting broadcast to broadcast, and the
resolver gets stuck in a loop). Then the inevitable happens: the node
fails over, the broadcasting nodes poison-pill due to a split brain,
and we are left with a single (struggling) host instead of an entire
cluster.
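
For anyone who wants to spot these packets in a capture, below is a minimal sketch of how the redirect described above could be recognised from the raw ICMP bytes. The layout (type, code, checksum, gateway address, then the start of the original datagram) follows RFC 792. The function names and the 192.168.7.255 broadcast address are my own placeholders, and the test packet is built by hand with a zero checksum, so treat this as an illustration rather than a capture tool.

```python
import socket
import struct

ICMP_REDIRECT = 5        # ICMP type: redirect
CODE_REDIRECT_HOST = 1   # ICMP code: redirect datagrams for the host


def build_redirect(gateway: str, original: bytes) -> bytes:
    """Build a host-redirect ICMP message for testing (checksum left at 0)."""
    header = struct.pack(
        "!BBH4s", ICMP_REDIRECT, CODE_REDIRECT_HOST, 0, socket.inet_aton(gateway)
    )
    return header + original


def is_broadcast_redirect(icmp: bytes, broadcast: str) -> bool:
    """True if icmp is a host redirect whose gateway is the broadcast address."""
    if len(icmp) < 8:
        return False
    icmp_type, code, _checksum, gateway = struct.unpack("!BBH4s", icmp[:8])
    return (
        icmp_type == ICMP_REDIRECT
        and code == CODE_REDIRECT_HOST
        and socket.inet_ntoa(gateway) == broadcast
    )


# Hypothetical broadcast address; 28 zero bytes stand in for the quoted
# IP header plus the first 8 bytes of the original heartbeat datagram.
sample = build_redirect("192.168.7.255", b"\x00" * 28)
print(is_broadcast_redirect(sample, "192.168.7.255"))
```

A check like this only confirms what the handsets are sending; whether the servers actually install a route from it is still my assumption.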

This is a blade server, so it's hard to do clever networking (we might
be able to VLAN something or restrict a broadcast domain, but adding
additional network cards is sadly a non-starter). Sticking a router
between the cluster and the network is also an option, but it adds
another point of failure and confusion, not to mention considerable
expense. We can't isolate the new devices from the network: they are
VoIP handsets, but they are 100% dependent on software running on the
users' PCs, and under normal usage they piggy-back on the PC's drop
cable. While some clever VLANning might be the answer, it would be
hard to exclude the workstations from the cluster broadcast domain
without creating an equal number of new drops so the handsets can be
provisioned separately (and that many extra switchports would be
expensive too).

So - any ideas? :)