SUMMARY(sort of): Adding a new tape drive to a cluster.

From: Jim Fitzmaurice <jpfitz_at_fnal.gov>
Date: Wed, 05 Sep 2001 11:40:59 -0500

Managers,

One response from Amber Wilen, Project Leader of the SysMan Station team.

Her response didn't address the proper way to add a tape drive to the
system, but she did have a workaround to fix the SysMan Station daemon
(smsd).

----------------------------------------------------------------------------
If you could try deleting the SysMan Station serialization files and
restarting the daemon, this might be a good workaround to your
problem. To successfully restart the daemon (log in as root):
(1) >> /sbin/init.d/smsd stop
(2) >> rm -rf /var/cluster/sms/*
(3) >> /sbin/init.d/smsd start
----------------------------------------------------------------------------
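
The three steps above could be wrapped in a small script, sketched here
under some assumptions: the paths are the ones given in the response, while
the DRY_RUN switch and the function names run and reset_smsd are made up for
illustration and are not part of any SysMan tooling. By default it only
prints what it would do.

```shell
#!/bin/sh
# Sketch of the quoted workaround; meant to be run as root on a cluster member.
# DRY_RUN is a hypothetical switch: 1 (the default) prints the commands,
# 0 actually executes them.
DRY_RUN=${DRY_RUN:-1}

# Print or execute a single command, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop smsd, delete the serialization files, then start smsd again.
# Stops early if the daemon cannot be stopped or the files cannot be removed.
reset_smsd() {
    run /sbin/init.d/smsd stop || return 1
    run rm -rf /var/cluster/sms/* || return 1
    run /sbin/init.d/smsd start
}
```

Calling reset_smsd with DRY_RUN left at its default lists the three commands;
setting DRY_RUN=0 in the environment first makes it run them for real.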

I'd already killed smsd on two members of the cluster, so I just killed it
on the remaining member before removing the serialization files. After I
restarted the daemon on all three members, there was a flurry of about 120
events (new serialization files written, etc.), and since then nothing more
than a timestamp.
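
One rough way to check whether an event storm like this has really stopped
is to tally the "SysMan Station event" lines in a saved copy of the event
log, counting a "[N times]" line as N events. This is only a sketch: it
assumes the log text looks like the excerpts in the original question below,
and count_sms_events is a made-up helper name.

```shell
#!/bin/sh
# Read log text on stdin and print the total number of
# "SysMan Station event" entries it represents.
count_sms_events() {
    awk '
        /SysMan Station event/ {
            n = 1
            # A "[N times]" marker means the line stands for N events.
            if (match($0, /\[[0-9]+ times\]/)) {
                # Skip the "[" and drop the trailing " times]" (7 chars).
                n = substr($0, RSTART + 1, RLENGTH - 8) + 0
            }
            total += n
        }
        END { print total + 0 }
    '
}
```

For example, count_sms_events < events.txt (events.txt being a hypothetical
saved log) prints a single total; filtering the input by date first with
grep gives a per-day figure.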

I'd still like to know the proper way to add a tape drive to the cluster if
anybody knows.

My original question is below:

----- Original Question -----

> Managers,
>
> Last week we put a new SCSI card on member2 of our cluster and attached
> a new DLT tape drive. We booted the machine and all's well with member2;
> he sees the new tape drive just fine. Unfortunately, member1 and member3
> aren't quite as happy. The new tape drive is available to the cluster. I
> can run a backup on member1 and send the data to the tape drive on
> member2, but SysMan is acting weird. Some of the messages we were seeing
> are:
>
> 30-Aug-2001 11:53:01 Device base name changed from unknown to tape1
> (HWID=425)
>
> 30-Aug-2001 11:56:34 SysMan Station event
> 30-Aug-2001 11:56:45 DRD: Server member2 selected for device 425
>
> 30-Aug-2001 11:57:15 SysMan Station event
> 30-Aug-2001 11:57:18 DRD: Server member2 selected for device 425
> 30-Aug-2001 11:57:18 DRD: Added (mapped) DRD server member2
> 30-Aug-2001 11:57:20 DRD: Added (mapped) DRD server member2
> 30-Aug-2001 11:57:20 DRD: Server member2 selected for device 425
>
> These appear to be okay, but on the other two members we started seeing
> more and more of the following messages:
>
> 30-Aug-2001 22:20:53 SysMan Station event
> 30-Aug-2001 22:20:53 SysMan Station event
> 30-Aug-2001 22:20:53 SysMan Station event
> 30-Aug-2001 22:20:53 SysMan Station event
> 30-Aug-2001 22:20:53 SysMan Station event
> 30-Aug-2001 22:20:54 SysMan Station event
> 30-Aug-2001 22:20:54 SysMan Station event
> 30-Aug-2001 22:20:54 SysMan Station event
> 30-Aug-2001 22:20:54 [2199 times] SysMan Station event
> 30-Aug-2001 22:20:54 SysMan Station event
> 30-Aug-2001 22:20:55 SysMan Station event
> 30-Aug-2001 22:20:55 SysMan Station event
> 30-Aug-2001 22:20:55 SysMan Station: daemon on host member3 has written
> new serialization files. These files allow the daemons to communicate
> state and topology changes.
> 30-Aug-2001 22:20:55 SysMan Station: daemon on host member2 has written
> new serialization files. These files allow the daemons to communicate
> state and topology changes.
> 30-Aug-2001 22:20:55 SysMan Station event
> 30-Aug-2001 22:20:55 SysMan Station: daemon on host member3 has written
> new serialization files. These files allow the daemons to communicate
> state and topology changes.
>
> And they continued the next day, increasing in frequency...
>
> 31-Aug-2001 13:15:17 SysMan Station event
> 31-Aug-2001 13:15:18 [7238 times] SysMan Station event
> 31-Aug-2001 13:15:18 SysMan Station event
> 31-Aug-2001 13:15:31 SysMan Station event
> 31-Aug-2001 13:15:32 SysMan Station event
> 31-Aug-2001 13:15:36 SysMan Station event
> 31-Aug-2001 13:15:37 [3212 times] SysMan Station event
> 31-Aug-2001 13:17:28 SysMan Station event
> 31-Aug-2001 13:17:44 SysMan Station event
> 31-Aug-2001 13:18:00 SysMan Station event
> 31-Aug-2001 13:18:17 [794 times] SysMan Station event
> 31-Aug-2001 13:25:11 SysMan Station event
> 31-Aug-2001 13:25:12 SysMan Station event
> 31-Aug-2001 13:25:14 SysMan Station event
> 31-Aug-2001 13:25:15 [3837 times] SysMan Station event
> 31-Aug-2001 13:25:29 SysMan Station event
> 31-Aug-2001 13:25:29 SysMan Station event
> 31-Aug-2001 13:25:29 SysMan Station event
> 31-Aug-2001 13:25:31 SysMan Station event
> 31-Aug-2001 13:25:31 SysMan Station event
> 31-Aug-2001 13:25:32 SysMan Station event
> 31-Aug-2001 13:25:32 SysMan Station event
> 31-Aug-2001 13:25:33 SysMan Station event
> 31-Aug-2001 13:25:34 [3832 times] SysMan Station event
> 31-Aug-2001 13:25:34 [3832 times] SysMan Station event
> 31-Aug-2001 13:25:35 SysMan Station event
> 31-Aug-2001 13:25:36 [3065 times] SysMan Station event
> 31-Aug-2001 13:38:19 EVM daemon: High event activity - exceeds 500 in 5
> minutes
>
> This continued until we killed smsd on those two machines, as the
> processes were taking up all the time of one full processor. The
> documentation shows how to add a disk drive and make it available to the
> cluster, but that didn't work for us. We couldn't find any information on
> what to do when you add a tape drive and want to make it available to the
> cluster, so that SysMan doesn't complain. Short of rebooting the other two
> members, is there any way to do this?
>
> Jim
> jpfitz_at_fnal.gov
> "Take your WinXP! Use it. Strike at LINUX with all of your hatred & anger
> and your journey towards the dark side will be complete!" - Bill Gates
>
>
>
Received on Wed Sep 05 2001 - 16:40:07 NZST
