To summarize: my cluster nodes seem to be picking and choosing which resources they'll actually serve. Here's the story:

One of my two cluster nodes froze at 3 a.m. and required a restart. Afterward, I migrated some NSS cluster resources back to that server (node1). Later, when users logged in, they got 8884 errors for volumes on one of those pools (call it pool1), even though nssmu showed the volumes mounted and cluster status looked fine. I migrated the pool to node2 and could access those volumes again. I tested with other pools, and all of them could be accessed through node2 but not node1.
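In case it helps, here's roughly the kind of check I've been running on each node to tell "nssmu says mounted" apart from "users can actually read it." This is a minimal sketch assuming OES Linux, where clustered NSS volumes mount under /media/nss/<VOLNAME>; the volume names are placeholders for the ones on the pool being tested:

```python
#!/usr/bin/env python3
"""Per-node sanity check: is each NSS volume actually readable,
not just listed as mounted? Assumes OES Linux with Novell Cluster
Services; volume names below are placeholders."""
import os
import subprocess

VOLUMES = ["VOL1", "VOL2"]  # placeholder names for the volumes on the pool

def volume_readable(name: str) -> bool:
    path = os.path.join("/media/nss", name)
    if not os.path.ismount(path):
        return False
    try:
        os.listdir(path)  # a mount point can exist yet still fail on access
        return True
    except OSError:
        return False

if __name__ == "__main__":
    # Show what the cluster itself believes about resource placement.
    try:
        subprocess.run(["cluster", "status"], check=False)
    except FileNotFoundError:
        print("cluster CLI not found; skipping cluster status")
    for vol in VOLUMES:
        state = "readable" if volume_readable(vol) else "NOT readable"
        print(f"{vol}: {state}")
```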

Now for the weirdest part: node2 then started to flake out, and I couldn't access the pool1 volumes anymore. I migrated pool1 to node1 and could access them again. But when I migrated another pool to node1, I couldn't access it, so I moved it back to node2 and it was fine.

Sorry for the long-winded explanation, but I've never seen anything like this.