netisr kernel thread using 100% CPU, killing system?

From: Bütow, Michael <michael.buetow_at_comsoft.de>
Date: Thu, 27 Jul 2006 15:57:44 +0200

Dear managers,

I am observing a strange problem where the netisr kernel thread goes from using almost no CPU to using nearly 100%.
At roughly the same time, the number of wired pages on the system increases, first slowly (+100 pages/sec), then faster and faster (maybe 400-500 pages/sec).

Looking with vmstat shows that only the malloc pages are increasing - the rest of the wired pages remain stable.
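
For illustration only (this is not something we ran on the affected system): a minimal Python sketch of how the growth rate could be quantified from periodic samples of the wired-page count. The function name growth_rates is hypothetical, the sample values are made up, and how the samples are collected (for example by reading them off vmstat output by hand) is left open.

```python
def growth_rates(samples):
    """Return the pages/sec rate between consecutive samples.

    samples: list of (seconds, wired_pages) tuples in time order.
    """
    rates = []
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        rates.append((p1 - p0) / dt)
    return rates


# Made-up samples, NOT measured data: (seconds since start, wired pages)
samples = [(0, 50000), (10, 51000), (20, 53000), (30, 57500)]

rates = growth_rates(samples)

# True if every interval grew faster than the one before it
accelerating = all(later > earlier for earlier, later in zip(rates, rates[1:]))

print(rates, accelerating)
```

A steadily rising rate rather than a constant one would match the "first slowly, then faster and faster" pattern described above.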

The collect tool showed the network interface at 5% bandwidth utilisation (it is a 10 Mbit/s half-duplex link on a tu card), so the network load does not appear to be high.

The CPU was fully utilised, at first roughly 60% user and 40% system, but the system share slowly increased to 100% (netisr).

Does anybody have an explanation for the behaviour of the netisr thread and its apparent correlation with the increase in malloc pages?

I would also appreciate any hints for further diagnosis of the problem. So far we have used ps to identify the kernel thread.
We also forced a crash twice and analysed the kernel core file. In both cases we got:

(dbx) pd vm_page_free_count
0
(dbx) pd vm_perfsum.vpf_freepages
0

We tried the VM tuning recommended in the attachment v40d-tune.html of http://groups.google.de/group/fa.alpha-osf-managers/msg/6fc7d8d1ac927a1d . However, this did not fix the netisr problem.

I can provide /etc/sysconfigtab if it helps. Among other changes, we have increased the TCP and UDP send and receive space settings.

Looking forward to any suggestions,
Michael Bütow
Received on Thu Jul 27 2006 - 13:59:35 NZST
