Old 30-Sep-2009, 09:32 AM
Junior Member
 
Join Date: Sep 2009
Posts: 1
olauer 0 reputation points
Odd NFS Hangs

Hi there,

We run two boxes with SLED 10 SP2 and two new boxes with SLED 11 against an NFS server running AIX 5.3.

The SLED 10 boxes have been running fine for more than a year, but on SLED 11 we get complete stalls at odd times. Users cannot do anything any more, probably because of NFS paths in their environment. root can still log in on the console and start programs like top, but a command like df hangs after listing the local filesystems, at the point where it should list the first NFS-mounted one.
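For anyone who wants to check the NFS mounts without getting stuck the way df does, here is a minimal sketch. It runs statvfs on each NFS mount point in a child process and gives up after a timeout. The function names and the 10-second timeout are made up for illustration. It also assumes the stalled call can still be killed with SIGKILL, which the rpc_wait_bit_killable frame in the trace in this post suggests but does not guarantee.

```python
import re
import subprocess
import sys

NFS_TYPES = {"nfs", "nfs4"}


def parse_nfs_mounts(mounts_text):
    """Return the mount points of NFS filesystems from /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in NFS_TYPES:
            # /proc/mounts escapes spaces and similar characters as octal (\040).
            points.append(re.sub(r"\\([0-7]{3})",
                                 lambda m: chr(int(m.group(1), 8)),
                                 fields[1]))
    return points


def probe(path, timeout=10.0):
    """Run statvfs(path) in a child process; return False on timeout or error."""
    cmd = [sys.executable, "-c",
           "import os, sys; os.statvfs(sys.argv[1])", path]
    try:
        result = subprocess.run(cmd, timeout=timeout,
                                stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        # subprocess.run has already sent SIGKILL to the child at this point.
        return False
    return result.returncode == 0


def check_nfs_mounts(mounts_file="/proc/mounts", timeout=10.0):
    """Map each NFS mount point to True (answered) or False (stalled/failed)."""
    with open(mounts_file) as fh:
        return {p: probe(p, timeout) for p in parse_nfs_mounts(fh.read())}
```

Calling check_nfs_mounts() as root during a stall should show which mount points no longer answer, without blocking the shell it is run from.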

Sep 30 15:05:54 lx4 kernel: SysRq : Show Blocked State
Sep 30 15:05:54 lx4 kernel: task PC stack pid father
Sep 30 15:05:54 lx4 kernel: df D 0000000000000007 0 4260 4221
Sep 30 15:05:54 lx4 kernel: ffff88063cc63b28 0000000000000082 ffff88033cdd6180 ffff88063cc63ab8
Sep 30 15:05:54 lx4 kernel: ffffffff80a29000 ffffffff80a33680 ffffffff80a30470 ffffffff80a33680
Sep 30 15:05:54 lx4 kernel: ffffffff80a29000 ffffffff80a33680 ffffffff80a33680 ffffffff80a33680
Sep 30 15:05:54 lx4 kernel: Call Trace:
Sep 30 15:05:54 lx4 kernel: [<ffffffffa01d47e9>] rpc_wait_bit_killable+0x2d/0x31 [sunrpc]
Sep 30 15:05:54 lx4 kernel: [<ffffffff8049c44f>] __wait_on_bit+0x41/0x70
Sep 30 15:05:54 lx4 kernel: [<ffffffff8049c4e9>] out_of_line_wait_on_bit+0x6b/0x77
Sep 30 15:05:54 lx4 kernel: [<ffffffffa01d4f1c>] __rpc_execute+0xe1/0x22d [sunrpc]
Sep 30 15:05:54 lx4 kernel: [<ffffffffa01cf482>] rpc_run_task+0x4f/0x57 [sunrpc]
Sep 30 15:05:54 lx4 kernel: [<ffffffffa01cf571>] rpc_call_sync+0x3d/0x5a [sunrpc]
Sep 30 15:05:54 lx4 kernel: [<ffffffffa0273792>] nfs3_rpc_wrapper+0x19/0x50 [nfs]
Sep 30 15:05:54 lx4 kernel: [<ffffffffa0273941>] nfs3_proc_statfs+0x63/0x87 [nfs]
Sep 30 15:05:54 lx4 kernel: [<ffffffffa0268498>] nfs_statfs+0x61/0x137 [nfs]
Sep 30 15:05:54 lx4 kernel: [<ffffffff802b0e9d>] vfs_statfs+0x5b/0x76
Sep 30 15:05:54 lx4 kernel: [<ffffffff802b10ab>] sys_statfs+0x3e/0x93
Sep 30 15:05:54 lx4 kernel: [<ffffffff8020bfbb>] system_call_fastpath+0x16/0x1b
Sep 30 15:05:54 lx4 kernel: [<00007f87a7c22627>] 0x7f87a7c22627
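To compare traces from several hangs, the frames can be pulled out of the syslog text. The following is a rough sketch that assumes the "[<address>] symbol+0xoff/0xlen [module]" layout seen above; the function name is made up for illustration.

```python
import re

# Matches e.g. "[<ffffffffa0273941>] nfs3_proc_statfs+0x63/0x87 [nfs]".
# The module part in brackets is missing for core kernel symbols.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\][ \t]+(?P<sym>[A-Za-z_][\w.]*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:[ \t]+\[(?P<mod>\w+)\])?"
)


def parse_frames(log_text):
    """Return (symbol, module) pairs, innermost frame first.

    module is None for symbols in the core kernel. Lines without a symbol,
    such as the final user-space address, are skipped.
    """
    return [(m.group("sym"), m.group("mod"))
            for m in FRAME_RE.finditer(log_text)]
```

Applied to the trace above, this gives the chain from sys_statfs through the nfs module down to rpc_wait_bit_killable in sunrpc, that is, df is waiting for the reply to an NFSv3 statfs (FSSTAT) call.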

Usually top shows at least one running process taking 100% CPU time, and the load increases over time.
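On Linux, processes in uninterruptible sleep (state D, like df in the trace above) count towards the load average, so a growing load can simply mean more and more processes are piling up behind the stalled mount. A small sketch for listing them from /proc follows; the function names are made up for illustration.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left])
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc="/proc"):
    """List (pid, comm) of processes currently in state D."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process went away while scanning
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


def load_average(path="/proc/loadavg"):
    """Return the 1, 5 and 15 minute load averages as floats."""
    with open(path) as fh:
        return tuple(float(x) for x in fh.read().split()[:3])
```

Comparing blocked_tasks() with load_average() a few minutes apart should show whether the load is driven by the blocked processes or by the one process burning CPU.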

The only thing that helps is pressing the power button or using SysRq B.

We already tried switching off TCP segmentation offload, NIS, automount etc., but the problem remains the same.

A Google search for 'rpc_wait_bit_killable' turns up some results in other forums, but from quickly scanning them I didn't find anything usable.

I am seriously thinking about installing SLED 10 SP2 on the new boxes if I find no solution.

Any ideas appreciated.

Oliver
Tags
nfs rpc_wait_bit_killable
