Hi, I've got a 6.5 SP3 server (running on a Dell PowerEdge 2500) that
has been very reliable for ages, but it fell over at the weekend and
spent yesterday bouncing itself at seemingly random intervals (shortest
10 minutes, once, longest 2.75 hrs, median 1.5 hrs).

I haven't changed anything in the setup lately, and I can't spot a
condition common to the Abends, aside from all but the first being
page fault exceptions thrown by the processor. The start of the
sequence on Sunday was different though: a CPU Hog detected by the timer.

To see if it was processor overheating or the like, I
dusted out the case with an air jet, reseated the memory, made sure
the fans were working, and increased the free space around it. Still fell
over though :-( I left it running overnight but not connected to the
network and it ran all night, 10.75 hours without a glitch, so I'm less
inclined to think it's hardware falling to bits than I was, though I
haven't ruled it out. There's nothing glaringly naughty in the health
trend graphs, with plenty of available memory and disk space on all volumes.

Any suggestions as to what next? I'm currently downloading SP5 with a
view to putting that on.

The Abend log for them all is 9000 lines long... I can post it all if
that would help, or can edit out bits that would be pointless bandwidth
wasters if someone says what they are.

Ta, Pete.
Peter Clinch                    Medical Physics IT Officer
Tel 44 1382 660111 ext. 33637   Univ. of Dundee, Ninewells Hospital
Fax 44 1382 640177              Dundee DD1 9SY Scotland UK
net p.j.clinch@dundee.ac.uk     http://www.dundee.ac.uk/~pjclinch/