Dear fellow veteran GroupWise admins,

We have been having client "Not Responding" issues for the past few months, and they seem to be growing more frequent since we upgraded the server to GroupWise 2018 a few weeks back.

Environment Reference:
The average client is Windows 7 SP1, fully patched, with 12 GB of RAM or more, running Sophos Anti-Virus and either the latest GroupWise 14 or 18 client, on 1 GbE. Clients are TCP only and do not use caching mode. The issue happens on both large power-user accounts and small, lightly used accounts. The server now runs GroupWise 18 on SUSE Linux Enterprise 12 SP3; the hardware is a multicore Intel Xeon at 3+ GHz with 94 GB RAM, SSD RAID 10, and 10 GbE. 55 users.

Specific Problem:
We get numerous help desk calls along the lines of "slow GroupWise" or "GroupWise Not Responding". We have found that the GroupWise client does eventually respond, but it is not clear to us why the issue occurs at all.

What we have tried:
Check_MK/Nagios logs and performance monitors do not show the server being taxed. Client logs and events are all over the place, with seemingly unrelated entries; the only constant is that GroupWise.exe (both 14 and 18) can be caught using very high CPU. We were seeing some high server CPU utilization from btrfs maintenance running during the work day, so we touched cron.weekly and cron.daily to make them run far earlier in the day, when few people are working here, and that part of the problem seems solved. Unfortunately, the btrfs maintenance tweak did not completely stop the "Not Responding" / slow mailbox issues. We also have a connected Mobility server, but there are only 12 users on it, and we are not re-initializing users when this issue occurs on the desktops. Mail volumes are not abnormal.
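
In case it helps line up help desk call times with server activity: if the monitoring poll interval is coarse, short stalls could be averaged away. The following is only a rough sketch of a finer-grained sampler that reads /proc/stat and /proc/loadavg directly. The function names (cpu_counters, sample_server_load) and the log path in the usage note are made up for illustration and are not part of GroupWise or Check_MK.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical helpers, not an existing tool.

# Print "total idle iowait" jiffies from the aggregate "cpu" line of /proc/stat.
cpu_counters() {
    # Field order: cpu user nice system idle iowait irq softirq steal ...
    read -r label user nice system idle iowait irq softirq steal rest < /proc/stat
    echo "$((user + nice + system + idle + iowait + irq + softirq + steal)) $idle $iowait"
}

# Print one timestamped line with the 1-minute load average plus the
# busy and iowait percentages measured over INTERVAL seconds (default 1).
sample_server_load() {
    interval=${1:-1}
    read -r t1 i1 w1 <<EOF
$(cpu_counters)
EOF
    sleep "$interval"
    read -r t2 i2 w2 <<EOF
$(cpu_counters)
EOF
    dt=$((t2 - t1))
    [ "$dt" -gt 0 ] || dt=1            # guard against division by zero
    busy=$((100 * (dt - (i2 - i1) - (w2 - w1)) / dt))
    iowait=$((100 * (w2 - w1) / dt))
    printf '%s load=%s busy=%d%% iowait=%d%%\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" \
        "$(cut -d' ' -f1 /proc/loadavg)" "$busy" "$iowait"
}
```

Something like `while :; do sample_server_load 5 >> /tmp/gw_load.log; done` left running during the work day would give a timeline to compare against the times users report a hang. It only covers the server side; it says nothing about the GroupWise.exe CPU spikes on the clients.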

Server I/O test:

Write:
server:/grpwise # sync;time -p dd if=/dev/zero of=/grpwise/testfile bs=1024k count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB, 9.8 GiB) copied, 9.14517 s, 1.1 GB/s
real 9.14
user 0.00
sys 4.80

Read:
server:/grpwise # sync;time -p dd if=/grpwise/testfile of=/dev/null
20480000+0 records in
20480000+0 records out
10485760000 bytes (10 GB, 9.8 GiB) copied, 18.3752 s, 571 MB/s
real 18.37
user 5.19
sys 13.16
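
A caveat on reading these numbers, offered tentatively: the write test uses no sync flag, so with this much RAM a good part of the 10 GB may have gone to the page cache rather than to disk. The read test ran with dd's default 512-byte block size (hence the 20480000 records) and may have been served at least partly from cache. Below is a minimal sketch of a variant that tries to reduce both effects. The function names dd_write_test and dd_read_test are invented for illustration; conv=fdatasync and iflag=direct are GNU dd options, and the fallback assumes some filesystems reject direct I/O.

```shell
#!/bin/sh
# Sketch only: dd_write_test and dd_read_test are hypothetical helper names.

# dd_write_test FILE SIZE_MB
# Write SIZE_MB MiB of zeros; conv=fdatasync makes dd flush the data to
# disk before it reports the elapsed time and throughput.
dd_write_test() {
    dd if=/dev/zero of="$1" bs=1M count="$2" conv=fdatasync
}

# dd_read_test FILE
# Read FILE back with an explicit 1 MiB block size. iflag=direct asks for
# O_DIRECT reads that bypass the page cache; if the filesystem refuses
# that, fall back to an ordinary buffered read.
dd_read_test() {
    if out=$(dd if="$1" of=/dev/null bs=1M iflag=direct 2>&1); then
        printf '%s\n' "$out"
    else
        dd if="$1" of=/dev/null bs=1M 2>&1
    fi
}
```

To mirror the tests above, one would run `dd_write_test /grpwise/testfile 10000`, then `dd_read_test /grpwise/testfile`, and remove the test file afterwards. Sequential throughput of this kind still says little about the small random I/O a mail store generates, so it is at best a sanity check.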

Question:
So my question is: where should we be looking next? Or should we just switch everyone to caching mode? Is that solid these days?