Hi,
Here is the original question:
I'm running Digital Unix v4.0F, although I don't believe this problem is
specific to that particular version of DU or TU.
We have a daemon called erdmaster, which attaches to port 5530 and acts as a
license manager. Due to problems of people getting a license allocated and
then hogging that license by keeping the application open, I started
stopping the daemon at 4 o'clock in the morning and restarting around 5
o'clock. The reason for the delay was that trying to start the daemon
straight after stopping it produced an error to the effect that the daemon
could not access port 5530 because it was already allocated to another
application. Running netstat -na, you could still see entries for port
5530, and I guess that from the system's point of view the port was still
in use. After some delay, on the order of 5-10 minutes, the system does
release the port and the license daemon can be restarted.
I was wondering if there was a way of
(a) hurrying up the system so that it would release the port sooner, or
(b) forcing the system to release the port.
Conceptually I can see that the system may not like this, as there may still
be traffic coming in on that port from clients that are trying to get
some sort of license token. However, once the daemon has stopped, clients
get the message that the license manager is uncontactable. So, from my
point of view, it seems a waste of time for the system to keep treating the
port as allocated; it may as well be released immediately.
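The symptom described above can also be checked from a script rather than
by reading netstat output. The following is only a rough sketch in Python,
not something taken from the replies: the helper names port_is_bindable and
wait_until_bindable are made up for illustration, and the probe assumes the
daemon binds the port without SO_REUSEADDR, so the probe does the same.

```python
import errno
import socket
import time


def port_is_bindable(port, host=""):
    """Return True if a plain TCP bind to (host, port) succeeds right now."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Deliberately no SO_REUSEADDR: mimic a daemon that does not set it.
        probe.bind((host, port))
        return True
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False
        raise  # any other error (e.g. permissions) is not "port busy"
    finally:
        probe.close()


def wait_until_bindable(port, timeout=900.0, interval=5.0):
    """Poll until the port can be bound; give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        if port_is_bindable(port):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A restart job could call wait_until_bindable(5530) after stopping the
daemon and start it again as soon as the call returns True, instead of
waiting a fixed hour. There is a small race between the probe and the real
start, so the daemon's own bind error still has to be handled.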
I had 4 replies from
(a) Gavin Kreuiter [gavin_at_transactive.usko.com]
You could reduce the tcp_keepidle, although this is a system-wide
parameter; ours is set to 10 minutes (1200 half-seconds):
inet:
tcp_keepidle = 1200
I think the best technique, if you have access to the source, is to
set the SO_REUSEADDR option on the socket.
(b) Derk Tegeler [tegel002_at_hvxbe01.unix.telecom.ptt.nl]
I'd use lsof to track down the process using a particular port.
You can find lsof at:
ftp://vic.cc.purdue.edu/pub/tools/unix/lsof
In my opinion, you should send SIGTERM to the process using the
port and SIGKILL if it doesn't die. The process should die almost
immediately after the SIGKILL, release the port, and you should be
able to restart it very quickly.
(c) Dan Harrington [dan.harrington_at_av.com]
I believe you will want your license manager daemon to issue a
setsockopt call with SO_REUSEADDR before doing its bind...this would
let the system know that it's OK for another process (e.g. the same
daemon restarting) to come along and grab that port immediately upon
its becoming available.
You might want to double check in the Network Programming guide,
where I'm sure they'll explain it better than I have.
At the system level, I'm not aware of any utilities/tools that will
make the port immediately available...but you might be able to tune
the TCP or IP subsystems (e.g. via sysconfig) to reduce the amount of
time it waits.
(d) Mark Menkhus [athome11_at_qwest.net]
It could be 2 things:
1) clients are attempting to connect to the server and there is a
small window when that attempted port binding causes the server to
get "can't bind to port, address already in use"
2) or the programmer just needs to set the socket option
SO_REUSEADDR in the program.
and:
You might get the ports to clear faster if you enable the TCP/IP
keep-alive timer (sysconfig inet tcp_keepalive_default=1). This is in
the system tuning manual someplace...
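For anyone who does have the source of their daemon, the SO_REUSEADDR
suggestion made in replies (a), (c) and (d) can be sketched roughly as
follows. This is a minimal illustration in Python, not the erdmaster code;
the function name open_listener and the backlog value are assumptions. In C
the corresponding call is setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on,
sizeof(on)), issued before bind().

```python
import socket

LICENSE_PORT = 5530  # port number quoted in the question


def open_listener(port=LICENSE_PORT, host="", backlog=5):
    """Create a listening TCP socket with SO_REUSEADDR set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Must be set before bind(); setting it afterwards has no effect
        # on the bind that has already happened.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(backlog)
    except OSError:
        sock.close()
        raise
    return sock
```

With the option set, a restarted server can normally bind the port again
even while old connections on it are still being wound down by the system.
It does not let two live listeners share the port.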
Well, the only thing I tried was to reduce tcp_keepidle to 10 secs, i.e. 20
half-seconds. Since the license manager comes from ERDAS, the suppliers of
Imagine, I can't apply any of the other suggestions, especially the
SO_REUSEADDR one, which needs access to the source.
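Since the tcp_keepidle values quoted in this thread are in half-seconds,
the conversion is easy to get wrong by a factor of two. A tiny sketch of
the arithmetic, with helper names that are purely illustrative:

```python
def seconds_to_half_seconds(seconds):
    """Convert seconds to the half-second units used for tcp_keepidle."""
    return seconds * 2


def half_seconds_to_seconds(ticks):
    """Convert a half-second tick count back to seconds."""
    return ticks / 2
```

So 10 seconds corresponds to a setting of 20, and the 1200 quoted in reply
(a) corresponds to 600 seconds, i.e. 10 minutes.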
Dennis
######################################
Dennis Macdonell
Systems Administrator
AUSLIG
mail: PO Box 2, Belconnen, ACT 2617
email: mcdonell_at_auslig.gov.au
ph: 61 2 6201 4326
fax: 61 2 6201 4377
######################################
Received on Thu Aug 30 2001 - 01:22:19 NZST