A few days ago, I posted a query (appended at end) requesting help
determining which program was generating huge numbers of packets on
the network software loopback interface (lo0). I got one response,
from Knut.Hellebo_at_nho.hydro.com, who thought he had seen something
similar in the archives. I checked the alpha-osf-managers archives,
but found nothing relevant.
Meanwhile, I was able to find out more about my problem. The output
of "netstat -an" showed a very large number of packets queued for
UDP service ports 1028, 1029, and 1030. These ports are not defined
in /etc/services. At the same time, "top" showed me that "rpc.lockd" was
using essentially all available cycles - up to 80% of the CPU.
"rpcinfo -p" then showed me that the "nlockmgr" rpc service was
bound to ports 1028, 1029, and 1030 (among others; it offers several
versions in both UDP and TCP). So now I suspected rpc.lockd was
responsible for the huge number of packets on the loopback interface.
I confirmed that rpc.lockd was associated with the UDP ports
1028, 1029, and 1030, using "lsof -i @pangea:1028-1030" (pangea is
the name of my system).
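The port-to-service lookup described above can be sketched in Python. This is only an illustration, assuming "rpcinfo -p" prints rows of the form "program vers proto port service"; the function name and the sample text are made up for the example and are not captured output from pangea.

```python
def parse_rpcinfo(text):
    """Map (proto, port) -> set of service names from `rpcinfo -p` style text."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: program vers proto port [service].
        # The header row and blank lines are skipped.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        proto, port = fields[2], int(fields[3])
        # Fall back to the program number if no service name is printed.
        service = fields[4] if len(fields) > 4 else fields[0]
        table.setdefault((proto, port), set()).add(service)
    return table


# Hypothetical sample text, for illustration only.
SAMPLE = """\
   program vers proto   port  service
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100021    1   udp   1028  nlockmgr
    100021    3   udp   1029  nlockmgr
    100021    4   udp   1030  nlockmgr
"""

table = parse_rpcinfo(SAMPLE)
for port in (1028, 1029, 1030):
    print(port, sorted(table.get(("udp", port), [])))
```

On a live system the text would come from running "rpcinfo -p" rather than from a literal string.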
Next, I saw that a user had started a compute-bound program that was
effectively competing with rpc.lockd for cycles. I took advantage of
this to manipulate the execution priority of rpc.lockd with "renice",
sending it from 0 (normal) to 10 or 20. If nothing else were running,
this wouldn't have any effect. But in this case, it forced the CPU
utilization of rpc.lockd down from about 25% to 10% or even 5%. Using
"monitor", I saw that the number of packets on the loopback interface
varied in direct proportion to the CPU utilization of rpc.lockd.
This was very repeatable.
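One rough way to check a "direct proportion" claim like this is to see whether packets per second divided by CPU percent stays roughly constant across readings. The sketch below assumes paired readings taken by hand from "monitor" and "top"; the function names, the 20% tolerance, and the numbers are invented for illustration and are not measurements from my system.

```python
def ratios(samples):
    """Return packets-per-CPU-percent for each (cpu_percent, packets_per_sec) pair."""
    return [pkts / cpu for cpu, pkts in samples if cpu > 0]


def looks_proportional(samples, tolerance=0.2):
    """True if every ratio lies within `tolerance` (as a fraction) of the mean ratio."""
    r = ratios(samples)
    if len(r) < 2:
        return False  # not enough data to judge
    mean = sum(r) / len(r)
    return all(abs(x - mean) <= tolerance * mean for x in r)


# Made-up readings for illustration only; NOT measurements from the post.
readings = [(25, 500), (10, 210), (5, 95)]
print(looks_proportional(readings))
```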
So, it seems my real problem is rpc.lockd stuck in a loop. I couldn't
get any further on determining what it was doing or why, and ended up
rebooting the system. rpc.lockd is now acting normally.
Meanwhile, I found a consolidated patch kit for Digital Unix 3.2d-1
on Digital's anonymous ftp site ftp.service.digital.com. It includes
a patched version of rpc.lockd. The specific problems that are
mentioned do not sound like the one I had (they involve rpc.lockd
crashing, rather than getting stuck in a loop), but I will try
installing this patch anyway.
-Phil Farrell, Computer Systems Manager
Stanford University School of Earth Sciences
farrell_at_pangea.stanford.edu
----------original query
From: farrell Wed Dec 3 18:14:00 1997
To: alpha-osf-managers_at_ornl.gov
Subject: How do I determine who is using loopback interface?
Hi Managers,
I could use some help on this one. Running Digital UNIX 3.2d-1 on
AlphaServer 1000.
This afternoon, I noticed (using "monitor" utility) that my system
is sending huge numbers of network packets to itself on the software
network loopback interface (lo0) - from 500 to 1200 packets per second.
I have never seen more than a few packets per second before on
this interface. Of course, the numbers of packets transmitted and received
on this interface are identical. The normal Ethernet interface
(tu0) is showing about 100 to 300 packets per second input and output.
Can anyone give me some ideas to find out which process is creating
all these packets on the loopback interface? This is a server
system - it does not have a graphics console running X, so
they shouldn't be packets to the "unix:0" X display. But I
suppose a misbehaving program could be trying that.
I tried to see what I could find out with "netstat". The "-a"
option shows all open connections. I did not see any from
the machine to itself. I looked at a complete process list
from "ps" but did not see any funny looking process names
(there are about 500 processes running on this system from > 100 users).
This high usage of the loopback interface has been going on for
over an hour that I am aware of.
Thanks for any ideas.
-Phil Farrell, Computer Systems Manager
Stanford University School of Earth Sciences
farrell_at_pangea.stanford.edu
Received on Fri Dec 05 1997 - 23:17:43 NZDT