-- Bob Sloane, University of Kansas Computer Center, Lawrence, KS, 66045
------------------------------------------------------------------------
Graham-

I simply changed all of the joind's to dhcpd in the one included with
the os (and I added a restart section which basically kills and invokes
dhcpd). The only real difference between yours and the system one is
the rcmgr calls and the ones to see if a process is already running ...
this may provide enough slop in the timing to get it up and running.
Also, the OS based one is S56dhcp.

Also, I know that some people have found that a nohup helps them
survive the init.d script (might help with the altavista startup).

Not sure if this helps or not ...

S
-----------------------------------------------------------------------
Sean O'Connell
Institute of Statistics and Decision Sciences
------------------------------------------------------------------------
From: "Naccarato, Robert"

Did you check dhcpd's logs? I think you have to have the packetfilter
set up for it to work.
------------------------------------------------------------------------
You will probably want to put a "nohup" in front of your command like:

nohup /usr/sbin/dhcpd &

which will prevent it from hanging up.

Thanks,
Dave Niska
US Bank
St. Paul, MN
------------------------------------------------------------------------
> Does anyone else experience erratic behaviour with system startup
> scripts in /sbin/rc3.d? On some of my systems, running 4.0F pk2, some
> daemons *claim* to be started (the message goes by on the console) but
> they don't continue to run. It doesn't seem to happen with system-provided
> scripts, but I don't see anything wrong with the scripts which fail. For
> example the script we have to start the ISC dhcp 2.0 server is:

We've been seeing similar things since at least 4.0B if not before, and
still see it at 4.0D (PK5 I believe). I'm not sure I'd call it
non-deterministic, though.
It seems to happen every time for startup scripts that try to setuid to
a different user, either via an 'su <username> -c blah blah blah'
command in the script itself (such as our license-manager scripts do)
or by doing it internally via a system call (as our AltaVista software
appears to do). The same scripts seem always to succeed when we run
them manually as root once the system is up.

We've finally given up and just inserted commands into the startup
scripts that email us to remind us to start them manually (sigh...).

Sorry I can't offer any solutions, but if you hear of any way to fix
it, we'd love to hear about it.

-- Bob Jones
Sr. Systems Manager
------------------------------------------------------------------------
As I've posted in response to similar queries before (see the list
archives), the way init runs the scripts in the rc3.d directory is that
the child scripts are sent HUP signals (two, if I recall correctly).
HUP is traditionally used to tell Unix daemons to reread their config
files, but of course a programmer could use it for anything. Since the
default response to HUP is to terminate when the programmer did not
provide a handler, this is what you see. The programs work when run
manually since you (manually) don't send a HUP after starting them. :-)

Solutions:

a) rewrite the applications to handle HUPs gracefully.

or

b) use a nohup at the part of the start script where you actually
start the program.

Note that some public code does handle this stuff right; e.g., Samba
handles HUP fine. It is bizarre that the Digital search engine doesn't.

Oisin McGuinness
Sumitomo Bank Capital Markets
------------------------------------------------------------------------
YES! I've been fighting this with my backup verification script for
months. The problem showed up after installing pk3. I've given the
Compaq Services UNIX Expert Team a heads up that it's a problem, but
haven't logged an "official" call until I can determine where the
problem is.
It's difficult to diagnose since it requires a reboot, and, as you say,
there's not much debugging at that level.

Alan Davis
------------------------------------------------------------------------
> case "$1" in
> 'start')
>         if /usr/sbin/dhcpd; then
>                 echo "Started dhcpd"
>         else
>                 echo "Couldn't start dhcpd"
>         fi
>         ;;
>

While the script will run without error, one assumes that the missing
line is a typo... all it does is test for the existence of a file and
then print out the words "Started dhcpd" - without starting the daemon.

Off hand I don't believe that we have modified the AV startup script,
but I don't know. AV can have problems re-starting if it was not
"shut-down, normally" but rather crashed-down (i.e., cpu panic, or
other hardware crash.)

This is how we drive the startup script as we have multiple instances
of AV running.

=======================<cut here>==========================================
#!/bin/sh
# ---------------------------------------------------------------------*
# Make links in /sbin/rc3.d and rc2.d
#
# ln -s /usr/local/sbin/init.d/altavista /sbin/rc3.d/S96altavista
# ln -s /usr/local/sbin/init.d/altavista /sbin/rc2.d/K96altavista
# ln -s /usr/local/sbin/init.d/altavista /sbin/rc0.d/K96altavista
#
# When you install any new AV index, an artifact of the AV install
# is to copy a new startup script /sbin/init.d/altavista but this script
# only starts the new index being built and wipes out any existing startups
# Copy this local altavista into /sbin/init.d/altavista
#
# cp /usr/local/sbin/init.d/altavista /sbin/init.d/altavista
# ---------------------------------------------------------------------*
## startup for Enterprise AltaVista Beta2

MODE=s          ## default mode is startup
if [ $# -gt 0 ]; then
        case "$1" in
        start)  MODE=s;;
        stop)   MODE=k;;
        *)      echo "$0: unknown option: $1"
                exit 1;;
        esac
fi

( cd /usr/local/altavista/pennweb ; ./avsetup -$MODE )
( cd /usr/local/altavista/oncolink ; ./avsetup -$MODE )
( cd /usr/local/altavista/pennweb2 ; ./avsetup -$MODE )
( cd /usr/local/altavista/computing ; ./avsetup -$MODE )
( cd /usr/local/altavista/special ; ./avsetup -$MODE )
=======================<cut here>==========================================

--
===<Tru64 UNIX-SIG Chair>===
www.tru64unix.org
T.T.F.N.
William H. Magill
Senior Systems Administrator
Information Services and Computing (ISC)
University of Pennsylvania
------------------------------------------------------------------------
To which I replied

> No, this does work, really. Rather than checking for existence of a
> file, it runs the daemon and checks the exit code.
------------------------------------------------------------------------
2nd followup:
------------------------------------------------------------------------
If you are getting an exit code, that implies that the daemon has
terminated. If the daemon continues to run, you won't get an exit code.
(And the next script in the series won't run.)

And in all probability, you will always get a "successful execution"
exit code from the "daemon" anyway. They rarely set an exit code that
says oops something screwed up -- unless the program terminates
abnormally.

I don't know which dhcp daemon you are running, but the classic problem
with a Sun oriented daemon is that the Tru64 environment is different.

I'm not a programmer, so I may not be explaining this correctly, but...
a Tru64 daemon has no parent, and as I recall, you therefore can't
simply fork a process and expect it to run after the "parent" init
script terminates. (i.e., the parent is NOT "1" /sbin/init, but the rc
script.)

I've seen this problem before with people who are used to writing
daemons that run in the Solaris or SunOS environment. They don't work
under OSF/1. There is something you have to do differently to convince
a process that it is going to run as a daemon.

--
===<Tru64 UNIX-SIG Chair>===
www.tru64unix.org
T.T.F.N.
William H. Magill
Senior Systems Administrator
Information Services and Computing (ISC)
University of Pennsylvania

Received on Wed Feb 23 2000 - 21:58:06 NZDT
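
The HUP explanation and the "nohup" suggestion from the replies above
can be tried out without a reboot. What follows is only a minimal
sketch, assuming a POSIX sh with the standard sleep, kill and nohup
utilities. The two sleep processes are hypothetical stand-ins for a
daemon such as dhcpd, and the kill -HUP merely imitates the hangup that
the thread attributes to init; it does not reproduce what init really
does on Tru64.

```shell
#!/bin/sh
# Stand-in "daemons" (assumption: sleep plays the role of dhcpd here).
# One is started plainly, the other under nohup.
sleep 10 &
plain_pid=$!

# nohup sets SIGHUP to "ignore" and then execs the command, so $! is
# the PID of the protected sleep itself.
nohup sleep 10 >/dev/null 2>&1 &
guarded_pid=$!

# Give nohup a moment to install the ignore before signalling.
sleep 1

# Imitate the hangup that the rc scripts' children are said to receive.
kill -HUP "$plain_pid" "$guarded_pid" 2>/dev/null || true
sleep 1

# kill -0 delivers nothing; it only tests whether the PID still exists.
if kill -0 "$guarded_pid" 2>/dev/null; then
        guarded_alive=yes
else
        guarded_alive=no
fi

# Collect the plain process's exit status. A status above 128 means
# the process was terminated by a signal.
plain_status=0
wait "$plain_pid" || plain_status=$?
if [ "$plain_status" -gt 128 ]; then
        plain_killed=yes
else
        plain_killed=no
fi

echo "plain sleep killed by a signal: $plain_killed (status $plain_status)"
echo "nohup sleep still running:      $guarded_alive"

# Clean up the survivor.
kill "$guarded_pid" 2>/dev/null || true
wait "$guarded_pid" 2>/dev/null || true
```

If the shell running this sketch was itself started with SIGHUP ignored
(for example under nohup), the plain sleep inherits that setting and
survives as well, so the first line of output depends on the
environment. The second line is the point of interest: a command
started through nohup keeps running after a HUP, which is why several
replies suggest putting nohup in front of the daemon in the start
script.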