Hi,

My newly installed server seems to be hitting the ldap_initconn problem
that has been mentioned here a few times already. From what I can see in
/var/log/messages and from testing with netstat and lsof, namcd leaks a
file descriptor every few minutes and eventually appears to lock up the
server. I say "appears" because I could still log in via SSH with
public-key authentication as root and run "namconfig cache_refresh",
after which everything went back to normal.

Now in /var/log/messages there was this:

May 28 21:56:03 oesi1 kernel: open files rlimit 1024 reached for uid 0 pid 16828
May 28 21:56:03 oesi1 /usr/sbin/namcd[16821]: pam_ldap_init(): ldapssl_add_trusted_cert() failed
May 28 21:56:03 oesi1 /usr/sbin/namcd[16821]: namGetLDAPHandle failed to get LDAP handle, error 1.
May 28 21:56:03 oesi1 /usr/sbin/namcd[16821]: nss_ldap_init: Unable to get LDAP handle.
May 28 21:56:03 oesi1 /usr/sbin/namcd[16821]: ldap_initconn: Error in LDAP init for preferred server, rc = 2.
May 28 21:56:03 oesi1 /usr/sbin/namcd[16821]: ldap_initconn: LDAP bind failed, trying to connect to alternative LDAP server
May 28 21:56:03 oesi1 /usr/sbin/namcd[16821]: ldap_initconn: Unable to bind to alternative LDAP servers either.
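
For anyone who wants to watch the leak before the limit is reached, here
is a minimal sketch. The function names fd_count and seconds_until_limit
are made up for illustration; the sketch assumes a Linux /proc filesystem
and a steady leak rate, which may not hold on every system.

```shell
# Count the open file descriptors of a process (Linux only).
# Each entry in /proc/PID/fd is one open descriptor; reading another
# user's process usually requires root.
fd_count() {
    ls "/proc/$1/fd" | wc -l
}

# Rough estimate of the seconds left until the open-files limit is hit.
# $1 = current descriptor count, $2 = limit, $3 = seconds per leaked fd.
# Assumes the leak rate stays constant.
seconds_until_limit() {
    echo $(( ($2 - $1) * $3 ))
}
```

Used as seconds_until_limit "$(fd_count PID)" 1024 300, with PID being
the namcd process ID, this gives a rough idea of how long namcd can run
before the rlimit of 1024 from the kernel message is reached again.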


lsof and netstat showed this:

oesi1:~ # date && lsof -n -p 15859|tail -2
Tue May 30 11:49:51 CEST 2006
namcd 15859 root 174u IPv4 4843372 TCP 10.0.0.23:17092->10.0.0.23:ldaps (ESTABLISHED)
namcd 15859 root 175u IPv4 4846238 TCP 10.0.0.23:17327->10.0.0.23:ldaps (ESTABLISHED)

oesi1:~ # date && lsof -n -p 15859|tail -2
Tue May 30 11:53:34 CEST 2006
namcd 15859 root 175u IPv4 4846238 TCP 10.0.0.23:17327->10.0.0.23:ldaps (ESTABLISHED)
namcd 15859 root 176u IPv4 4849289 TCP 10.0.0.23:17632->10.0.0.23:ldaps (ESTABLISHED)

oesi1:~ # netstat -topn|egrep "(17632|17327|17092)"
tcp 0 0 10.0.0.23:17632 10.0.0.23:636 ESTABLISHED 15859/namcd off (0.00/0/0)
tcp 0 0 10.0.0.23:17092 10.0.0.23:636 ESTABLISHED 15859/namcd off (0.00/0/0)
tcp 0 0 10.0.0.23:17327 10.0.0.23:636 ESTABLISHED 15859/namcd off (0.00/0/0)
tcp 0 0 10.0.0.23:636 10.0.0.23:17327 ESTABLISHED 10112/ndsd keepalive (6286.86/0/0)
tcp 0 0 10.0.0.23:636 10.0.0.23:17092 ESTABLISHED 10112/ndsd keepalive (5986.34/0/0)
tcp 0 0 10.0.0.23:636 10.0.0.23:17632 ESTABLISHED 10112/ndsd keepalive (6587.36/0/0)

As the keepalive timers in the netstat output show, a new connection is
opened roughly every 300 seconds. How can I find out what triggers these
connections (strace doesn't seem to work as expected)? They occur even
when the whole network is shut down and only one other Linux-only server
is active.
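
The 300-second figure comes from comparing the keepalive timers on the
ndsd side of the connections. A small sketch of that arithmetic follows;
keepalive_gaps is a made-up helper name, and it assumes netstat -topn
prints one socket per line with the timer in the last two columns and
that the connections have been idle since they were opened.

```shell
# Print the gaps, in seconds, between successive keepalive timers.
# Reads `netstat -topn`-style lines on stdin, one socket per line:
#   proto recv-q send-q local foreign state pid/program timer (a/b/c)
keepalive_gaps() {
    awk '$8 == "keepalive" {
             split($9, parts, "/")        # "(6286.86", "0", "0)"
             print substr(parts[1], 2)    # drop the leading "("
         }' |
    LC_ALL=C sort -n |
    awk 'NR > 1 { printf "%.2f\n", $1 - prev } { prev = $1 }'
}
```

Fed with the six netstat lines above (for example via
netstat -topn | grep ':636 ' | keepalive_gaps), it prints 300.52 and
300.50, which suggests one new connection about every five minutes.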

bye,
Franz.