Thanks to Dr. Tom Blinn for the following, which got me thinking in the right
direction:
============================================================
The classic reason for an error in gethostbyaddr() (which has a man
page, it's a standard C library function) is that the host address
cannot be found in the "hosts" database. It's a bit messier than
that, but that's the usual cause. The man page doesn't tell you that
the error codes are defined in /usr/include/netdb.h and if you go
look in that file you'll see that an error code of 0 is defined as
NETDB_SUCCESS with the comment "/* no problem */", so there may be
a bug in the h22agent program's handling of the errors, or there may
be a real error occurring and the reporting logic is wrong (see the
man page to get a sense of how messy this is). Of the two, I'd bet
that there's a bug in the h22agent code, but I have no idea where to
find that code (that software isn't part of the Tru64 UNIX code base
so it's not in any source pool repository to which I have access).
In any case, I'm betting that something is trying to connect to the
h22agent program and passing it an IP address it's supposed to use
for some purpose, and the h22agent is attempting to back-translate
the address to a host name, and the gethostbyaddr() lookup is not
returning a name, and instead of just dealing with this, the h22agent
thinks it needs to log a message. If you can figure out what host
it is trying to back-translate, you might be able to fix this problem
by getting that host name entered into your BIND server. I'd bet it
is the NT machine, especially if that system is supported by DHCP and
the pool of IP addresses parcelled out by the DHCP server are not in
your BIND service with names. If you know the NT system's IP address
you can just try a good old UNIX command line "nslookup" giving it
the IP address and see if it returns a name. "nslookup" uses good
old gethostbyaddr() when you pass it an IP address, after all.
============================================================
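For anyone who wants to try the reverse lookup through the same library call
rather than through nslookup (which, as far as I know, queries the name server
directly), here is a minimal Python sketch. It is only an illustration, not the
h22agent code: reverse_lookup() and the H_ERRNO_NAMES table are made-up names,
and the numeric codes are the traditional netdb.h values, so check them against
your own /usr/include/netdb.h. It also shows the "just deal with it" behaviour
Tom mentions: when no name comes back, fall back to the address instead of
treating it as a socket error.

```python
import socket

# Resolver error codes as conventionally defined in <netdb.h>.
# ASSUMPTION: these are the traditional BSD values; verify on your system.
H_ERRNO_NAMES = {
    0: "NETDB_SUCCESS",
    1: "HOST_NOT_FOUND",
    2: "TRY_AGAIN",
    3: "NO_RECOVERY",
    4: "NO_DATA",
}


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (name, None) on success, or (ip, reason) when no name is found.

    `resolver` is injectable (hypothetical parameter) so the function can be
    exercised without touching the real hosts database or DNS.
    """
    try:
        name, _aliases, _addresses = resolver(ip)
        return name, None
    except socket.herror as exc:
        # socket.herror carries the resolver's h_errno code, not errno.
        code = exc.args[0]
        return ip, H_ERRNO_NAMES.get(code, "unknown resolver error %r" % (code,))
    except OSError as exc:
        # Covers socket.gaierror and any other lookup failure.
        return ip, "lookup failed: %s" % (exc,)
```

Run on the server with the PC's IP address, reverse_lookup() should hand back
the host name once a reverse entry exists, and the bare address plus a reason
such as HOST_NOT_FOUND until then.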
I believe it's been solved now. I'm not sure exactly what fixed it, but
basically it was this: the PC running SWCC was old and had never been changed
over to DHCP (which we went to about a year ago). The name of the PC was
pc-name with a domain of company.com. It wasn't listed in any DNS table, i.e.
I couldn't 'ping pc-name'. We reconfigured the PC to use DHCP, changed the
domain to pc.company.com (which is what we use now for all PCs), and rebooted,
and I haven't gotten any of those annoying messages since (oh yes, and I
changed the entry for the PC name in the agent and restarted it via
swcc_config). NOTE: Sorry if I haven't explained the solution accurately --
somebody from the network group fixed it and I may not have gotten all the
particulars correct. Basically, I think the UNIX server tried to look up
the name of the PC in DNS, couldn't find an entry for it, and so produced
the message.
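To make that a bit more concrete: a reverse lookup over DNS asks for a PTR
record under in-addr.arpa, so that is the record that has to exist for the
PC's address, alongside the usual forward entry. Below is a rough Python
sketch for working out the PTR name and checking the forward side. The
function names and the 192.0.2.10 address are placeholders I made up, not our
real setup.

```python
import ipaddress
import socket


def ptr_record_name(ip):
    """Return the DNS name that a reverse (PTR) lookup for `ip` queries."""
    return ipaddress.ip_address(ip).reverse_pointer


def forward_resolves(name, forward=socket.gethostbyname):
    """Return True if `name` resolves to an address, False otherwise.

    `forward` is injectable (hypothetical parameter) so this can be tried
    without a live name service.
    """
    try:
        forward(name)
        return True
    except OSError:
        # socket.gaierror and socket.herror are both OSError subclasses.
        return False
```

For example, ptr_record_name("192.0.2.10") gives "10.2.0.192.in-addr.arpa",
which is the name to ask the network group about if the agent keeps
complaining.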
Peter Gergen also sent this:
============================================================
This happens when the password from the SWCC is given to the unix server.
============================================================
I was going to look into this next but I don't think I need to now.
Thanks again everybody,
Andy
ORIGINAL QUESTION
=================
Hi,
I've got SWCC running on an NT machine attempting to manage an RA3000 unit
on a DS20E running 4.0F. I keep getting these messages in my
/var/adm/syslog.dated/current/daemon.log:
Mar 25 19:02:50 odin h22agent[6340]: WARNING: Socket error - gethostbyaddr(): Error 0 occurred. (SP_SOCKET: socketAccept)
Mar 25 19:32:53 odin h22agent[6340]: WARNING: Socket error - gethostbyaddr(): Error 0 occurred. (SP_SOCKET: socketAccept)
every 30 minutes.
I saw in the archives that somebody posted this question a couple of years
ago but there was no reply. Anybody have an idea how to fix this?
Thanks a lot!
Andy
Received on Thu Mar 27 2003 - 18:10:04 NZST