I have a client with an HP ML370g3 server that had been running NW 6.0
for quite a long time without any real problems. Around the
Thanksgiving holiday of '06 we upgraded the server to NW 6.5sp5, and
upgraded the hardware to a ML370g4. The new server is faster and has
more memory, and we were hoping for a slight speed boost. Shortly after
the upgrade, once the users returned from the holiday break, the server
started to abend weekly. The error is a Page Fault in Process Server
xxxx in module COMN.NSS. When the error occurs the server loses the
ability to write to the drives, so there is no data in the abend log;
only the IML contains a record of the abend. Of note, when the server
abends it doesn't recover. It sits there with a black screen, and the
cursor blinking in the upper left corner. Ctrl-Alt-Del or power cycling
are the only ways to recover.

I applied the N65NSS5A patch, but the server abended several days
later. I applied the N65NSS5B patch soon after, but the problems
continued. I even tried the comnxlocal_beta patch to no avail. I
decided to forgo my usual wait time on a new service pack and updated
the server to sp6. The abends continued even more frequently.

The old server was a ML370g3 with dual 3.2ghz cpus, 4gb ram, and a SCSI
raid 5 array. The new server is a ML370g4 with dual 3.4ghz cpus, 6gb
ram, a raid controller with more memory, but the same drives. I went to
HP's site looking for clues, and the few things they had listed failed
to resolve the problem.

Since the software and hardware upgrades occurred at the same time, I
wasn't really 100% sure which was the cause. So I moved the drives back
to the ML370g3 during the New Year's break. While waiting to see if the
abend occurred, I performed numerous tests on the ML370g4, which it
passed with flying colors.

The server has since abended with the same error on the older hardware,
which leads me to believe NW 6.5 is the problem. I have seen
no newer posts on the topic in the Novell Support site, nor any new
beta patches. What makes this really strange is that I have an almost
identical hardware & software combination running at another client
that has only had this same error once in the last year.

While trying to diagnose the problem, I did stumble across a memory
leak in the older APC PWRCHUTE.NLM that was slowly eating up available
memory. This problem didn't occur under NW6.0, so it must be a
compatibility problem with NW6.5. I've since updated the app to the
newer Java based version. The page fault did occur -after- I updated
the app, but thinking along the same lines of a memory issue, I wonder
if the extra 2gb of memory in the newer hardware exacerbated the abend,
causing it to occur more frequently.

Either way, I've run out of options and wanted to bounce this past you
guys. Any takers?