This is an issue which has plagued me for months.

Firstly, IMO our environment is not as friendly to OES as it should be:
we have about 60 servers (which is far too many for the ~6,000 users
total we have), of which about half are spread out all over the state of
NJ via WAN links. We're a state agency, so there really is no choice
in the matter.

Usually in the morning, I can run ndsrepair -T on our 3 main master root
nds servers (which are also masters of the various child partitions),
and get 0 errors... or maybe just 1.
Same is true at the end of the day.

But in the bulk of daytime hours, I can get up to 10 errors or more.
Novell usually says this appears to be a routing issue or similar, but
here's the thing:
Unless I'm mistaken, timesync uses NCP, port 524, to communicate. Or is
it UDP? That might help explain things.

Whenever I have a server that returns a -625 or -626 error, I find that I
can ping that server with no issue whatsoever.
I can telnet between them, SSH, whatever. There are no detectable network
communication issues, and yet it won't communicate.
I can also run "nmap <servername> -sS -p 524" and get a clear "port
open" response.
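
For what it's worth, nmap -sS is only a half-open SYN scan, so a full TCP
connect to port 524 is a slightly stronger check, and timing it during the
day might show whether the connect slows down or fails when the errors
appear. Below is a rough sketch of that idea, not a proper NCP test: it
assumes Python 3 on an admin workstation, and the function name
tcp_connect_time is just something made up for illustration.

```python
import socket
import time


def tcp_connect_time(host, port=524, timeout=3.0):
    """Return the seconds needed to complete a TCP connect, or None on failure.

    This only proves the three-way handshake completes; it does not
    speak NCP, so an "open" result here says nothing about eDirectory itself.
    """
    start = time.monotonic()
    try:
        # create_connection performs a full connect (SYN, SYN/ACK, ACK)
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures
        return None


# Example use (the host name is a placeholder):
# elapsed = tcp_connect_time("remote-server-01")
# print("failed" if elapsed is None else "connected in %.3f s" % elapsed)
```

Run from cron every few minutes against the problem servers, the logged
times could then be lined up against the hours when ndsrepair -T complains.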

Often, even running ndsrepair -N and repairing the network address
returns an error, which usually clears up maybe half an hour later.

I've noticed it's usually caused by the same group of servers at remote
locations... usually.
The only thing I can think of is that our WAN is just over-saturated during
business hours, but that doesn't explain why at least one problem server
is, in fact, local (a webaccess server).

Now, I ran this several times this week, and for the first time ever, I
saw Mon-Wed return 0 errors every time on all three servers.
Last night however, our master returned 10 errors running ndsrepair -T.

Even more curious, two other servers that hold read/write replicas of
root, and which are in the very same subnet/VLAN, returned 0 errors,
which pretty much rules out a network issue.

These three servers are all SLES10sp4/OES2sp3. The other two servers
in the root partition are one NetWare 6.5 sp7 server and a
SLES10sp3/OES2sp2, but as they aren't masters of any other partitions
(as the other three are) I don't bother with them too often.

Lastly, our servers all point to our NTP server, which is an external
appliance server. So even if the servers can't confirm they're in time,
in actuality, they are.
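
To back up that last claim with numbers, the offset between a server's
clock and the NTP appliance can be measured directly with a plain SNTP
query. This is only a minimal sketch assuming Python 3 and an appliance
that answers standard NTP on UDP port 123; the function names are my own,
and it is not what ndsrepair does internally.

```python
import socket
import struct
import time

NTP_EPOCH_DELTA = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def ntp_to_unix(seconds, fraction):
    """Convert a 32.32 fixed-point NTP timestamp to Unix seconds."""
    return seconds - NTP_EPOCH_DELTA + fraction / 2**32


def clock_offset(reply, t1, t4):
    """Estimate the local clock offset from a 48-byte SNTP reply.

    t1 is the local send time and t4 the local receive time (Unix seconds).
    A positive result means the local clock is behind the NTP server.
    """
    if len(reply) < 48:
        raise ValueError("short NTP reply")
    # Bytes 32-39: server receive timestamp; bytes 40-47: server transmit timestamp
    rx_s, rx_f, tx_s, tx_f = struct.unpack("!4I", reply[32:48])
    t2 = ntp_to_unix(rx_s, rx_f)
    t3 = ntp_to_unix(tx_s, tx_f)
    return ((t2 - t1) + (t3 - t4)) / 2


def query_offset(server, timeout=3.0):
    """Send one SNTP client request to the server and return the estimated offset."""
    request = b"\x1b" + 47 * b"\0"  # LI=0, VN=3, Mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(request, (server, 123))
        reply, _ = sock.recvfrom(512)
        t4 = time.time()
    return clock_offset(reply, t1, t4)
```

Calling query_offset with the appliance's address from each server that
reports errors should return a value close to zero if the clocks really
are in sync.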
