Dear managers,
I have now successfully configured DECsafe/ASE. It is working fine and I am
going on vacation for a few days.
Thanks to the people who responded quickly:
Benoit Maillard <maillard_at_atyisa.enet.dec.com>
Peter <flack_at_rtp4me.enet.dec.com>
Lucien Hercaud <Lucien_HERCAUD_at_paribas.com>
Mike Cross <crossmd_at_heu538.ha.uk.sbphrd.com>
Javier Aida <jaida_at_gmd.com.pe>
Michel Jouvin <jouvin_at_lal.in2p3.fr>
Dave Cherkus <cherkus_at_unimaster.com>
The answers are appended to this mail.
My original message was:
----------------------- Beginning of my original message -----------------------
Dear managers,
I have solved my DECsafe/ASE hardware configuration problem and sent a summary
("SUMMARY: DECsafe ASE: problems"). Now I have some problems configuring NFS
services, so I need help from experienced ASE users.
I have a master/standby configuration and have defined two NFS services. Both
NFS services run on the master and serve AdvFS filesets. The first NFS service
handles 3 AdvFS filesets and the second service handles only one fileset.
The system is working, but not as expected. I have crash problems: when I shut down
the master system, the second system crashes instead of automatically NFS serving
the disks defined in the NFS services.
The second problem is the following: when I manually relocate an NFS
service from the master to the secondary system and then relocate it from the
secondary to the master, everything works fine. But if I then retry a relocation
from the master to the secondary, the master crashes with the following message:
ADVFS EXCEPTION
Module = 2, Line = 1657
panic (cpu0)
...
Any idea will be appreciated.
Thanks in advance.
----------------------- End of my original message -----------------------
Here is how I solved the problems:
1) by using SCSI IDs 6 and 7 for the two PZKSA SCSI controllers
---------------------------------------------------------------
I was using 0 and 7 as the SCSI IDs of the controllers. Using 0 for
one controller and 7 for the other was causing priority
problems on the shared SCSI bus. Now, using 6 and 7, everything is OK.
Here is how to change the SCSI ID of the second controller:
>>> sh pkb*
pkb0_host_id 7
>>> set pkb0_host_id 6
>>> init
>>> sh dev =====> OK !!!
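The priority problem can be illustrated with a small sketch. It assumes a narrow (8-ID) SCSI bus with fixed priority, where the highest ID wins arbitration, as described in the replies appended below. The function name and the disk IDs in the example are hypothetical and are not taken from my configuration.

```python
def arbitration_winner(requesting_ids):
    """Return the SCSI ID that wins arbitration on a narrow (8-ID) bus.

    Assumption: fixed priority, with ID 7 highest and ID 0 lowest.
    """
    ids = list(requesting_ids)
    if not ids:
        raise ValueError("no device is requesting the bus")
    for scsi_id in ids:
        if not 0 <= scsi_id <= 7:
            raise ValueError(f"invalid SCSI ID: {scsi_id}")
    return max(ids)


# Hypothetical example: disks at IDs 1-3 compete with the second host adapter.
disks = [1, 2, 3]
print(arbitration_winner(disks + [0]))  # adapter at ID 0 loses to every disk
print(arbitration_winner(disks + [6]))  # adapter at ID 6 wins over the disks
```

With the adapter at ID 0, any busy disk wins the bus ahead of it, which is consistent with the advice to keep both host adapters at the two highest IDs.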
2) by using the correct path name in my automounter maps
--------------------------------------------------------
For the NFS clients, I was using a remote path name beginning with
/usr/var/ase/mnt. For example, I was using
/usr/var/ase/mnt/homephy/export/home1osb instead of /export/home1osb.
I used /usr/var/ase/mnt/homephy/export/home1osb because the mount
command shows it as the mount point of the fileset.
I did not understand that the ASE NFS service just needs /export/home1osb,
which is the path I defined in the asemgr NFS service configuration. ASE
cannot relocate the NFS service when clients use a path like
/usr/var/ase/mnt/homephy/export/home1osb: it says "device busy" when
trying to relocate the service.
Now, in the automounter maps, I have entries like
/mount_point nfs_service:/export/<a_path>
instead of
/mount_point nfs_service:/usr/var/ase/mnt/<nfs_service>/export/<a_path>
This works fine.
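For anyone with many map entries to correct, the rewrite can be sketched as follows. This is only an illustration, assuming the locations have the simple host:path form shown above; the function name and the regular expression are hypothetical helpers, not part of ASE or the automounter.

```python
import re

# Assumption: ASE mounts the filesets of a service under
# /usr/var/ase/mnt/<nfs_service>, as the mount command showed on my systems.
ASE_MOUNT_PREFIX = re.compile(r"^/usr/var/ase/mnt/[^/]+(?=/)")


def fix_map_location(location):
    """Rewrite 'host:/usr/var/ase/mnt/<service>/export/x' to 'host:/export/x'.

    Locations that do not use the ASE mount prefix are returned unchanged.
    """
    host, sep, path = location.partition(":")
    if not sep:
        return location  # not a host:path location
    return host + sep + ASE_MOUNT_PREFIX.sub("", path, count=1)


print(fix_map_location("homephy:/usr/var/ase/mnt/homephy/export/home1osb"))
print(fix_map_location("homephy:/export/home1osb"))
```

Both calls print homephy:/export/home1osb, so entries that are already correct are left alone.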
***
Christophe DIARRA
Institut de Physique Nucleaire
I.P.N ORSAY
91406 FRANCE
Tel: (33 16) 69 41 65 60
E-mail: diarra_at_ipnsun5.in2p3.fr
***
The mails I received are the following:
> From maillard_at_atyisa.enet.dec.com Wed Mar 6 20:22:43 1996
> To: "diarra_at_ipnsun5.in2p3.fr"_at_vbormc.vbo.dec.com
> Cc: maillard_at_atyisa.enet.dec.com
> Subject: RE: DECsafe ASE: NFS service not working
> Content-Length: 922
> X-Lines: 20
>
> Christophe,
> Do the two NFS services really correspond to two different AdvFS domains?
> This is essential.
> Can you send the asemgr output (service definitions/config)?
>
> Also, are the two NFS services associated with separate disks?
> (ASE must take ownership of whole disks, using the SCSI allocate
> and release commands.)
> In any case, manual or forced relocation should not cause any problem.
> Can you repeat the relocation test (manual, which is simpler) with
> only one of the NFS services online (via asemgr)?
> If this test works, put the other service back online and repeat the
> relocation test. In practice, an ASE configuration is often set up
> progressively, one service after another, with full failover tests
> completed before moving on to the next step.
> Keep me posted.
> Good luck.
>
> Benoit Maillard
> maillard_at_fgt.dec.com
>
> From flack_at_rtp4me.enet.dec.com Wed Mar 6 21:28:11 1996
> To: diarra_at_ipnsun5.in2p3.fr
> Subject: RE: SUMMARY: DECsafe ASE: problems
> Content-Length: 1067
> X-Lines: 23
>
> Christophe,
>
> You should set the other host to SCSI ID 6 not 0. This is due to SCSI bus
> priority - the higher numbers have higher priority - it is also a requirement
> of ASE as it uses the SCSI buses to communicate between systems (as well as
> the network).
>
> The other thing to check VERY, VERY carefully is SCSI bus termination - this
> is the cause of most ASE problems. All connections on the shared SCSI bus
> need to be done with Y-cables or tri-link connectors. Terminators go on
> one connection of two of the Y-cables (or tri-links) - the other connector
> gets connected to the BA-356 from each system. The
> PZKSA and BA-356 differential converter must have their internal termination
> resistors removed. If the shared differential bus is not set up like this,
> you will sooner or later have problems with ASE...also make sure you have the
> correct revisions of firmware on the SCSA adapters and the disk drives - this
> is also important for correct ASE operation due to the way ASE uses the SCSI
> buses.
>
> Hope this helps...
>
> Peter Falck
> Digital Systems Integration
>
> From Lucien_HERCAUD_at_paribas.com Wed Mar 6 22:40:10 1996
> To: diarra_at_ipnsun5.in2p3.fr
> Subject: Re: DECsafe ASE: problems
> Content-Length: 3013
> X-Lines: 83
>
>
> Sir,
>
> I have had this problem before. It is surely a cabling problem (terminators or
> not, faulty cable, ...). Try asking DEC to send "support" on site if they
> cannot offer a quick solution. The most competent person at DEC is
> DIDIER BOURDIN.
>
> Regards,
> Lucien Hercaud
> ex-DEC
>
> ____________________________ Reply Separator ________________________________
> Subject: DECsafe ASE: problems
> Author: alpha-osf-managers-relay_at_sws1.ctd.ornl.gov_at_INTERNET CCROUTER
> Date: 05/03/1996 14:57
>
>
> Dear managers,
>
> I have a very urgent problem to solve and need help from people who have
> experience with DECsafe/ASE.
>
> We are trying to share disks between an AS 2100-4/200 and an AS 2000-4/233.
> The disks are located in a BA356. Each AlphaServer has a PZKSA differential
> SCSI controller. Our firmware is all up to date.
>
> The BA356 and one of the AlphaServers terminate the differential SCSI bus.
>
> When we power on only one of the two AlphaServers, it sees all the disks on
> SCSI bus 0 and SCSI bus 1 (the shared differential bus).
> We use '>>> sh dev' to get the list of devices.
>
> When we power on both AlphaServers, neither of them sees the disks in the
> BA356. We see only the disks on bus 0 (the private bus).
>
>
> We left all the terminators on the differential to single-ended converter in
> the BA356.
>
>
> I have been calling DEC people since this morning but I am still waiting for
> their answer.
>
> Does anyone have an idea of the origin of our problem? We really need help!
>
>
> Thanks in advance.
>
>
> ***
> Christophe DIARRA
> Institut de Physique Nucleaire
> I.P.N ORSAY
> 91406 FRANCE
> Tel: (33 16) 69 41 65 60
> E-mail: diarra_at_ipnsun5.in2p3.fr
> ***
>
> From flack_at_rtp4me.enet.dec.com Thu Mar 7 02:58:49 1996
> To: diarra_at_ipnsun5.in2p3.fr
> Subject: RE: DECsafe ASE: NFS service not working
> Content-Length: 730
> X-Lines: 17
>
>
> Greetings,
>
> I think in order to help, we would need more information about your
> configuration. Just from the brief information you provided, it almost
> sounds like you have at least one disk in your configuration that is used
> by more than one service - if you do, this is probably the cause of your
> problems. ASE will move over the disk(s) associated with the service being
> moved, which takes away the disk from another AdvFS domain (with the other
> service). Can you send me info on what appears in your daemon.log files, this
> is where ASE logs all its activities (and problems it encounters).
>
> It sounds like you have a configuration problem preventing services from
> moving cleanly.
>
> Peter
> Digital Systems Integration
>
> From crossmd_at_heu538.ha.uk.sbphrd.com Thu Mar 7 08:41:36 1996
> To: diarra_at_ipnsun5.in2p3.fr (Christophe Diarra)
> Cc: crossmd_at_heu538.ha.uk.sbphrd.com
> Subject: Re: DECsafe ASE: NFS service not working
> X-Mts: smtp
> Content-Length: 2588
> X-Lines: 76
>
> Christophe,
>
> We are running a very similar environment here, but we only offer one NFS
> service containing 5 file sets.
>
> Early on we experienced systems crashing at relocation. After a little
> investigation we found that the crash was being forced by ASE because it could
> not relocate the service. Basically, the standby system was using the mount
> point that ASE wanted to use; at relocation ASE failed to mount the filesets
> and crashed the system to clear the blockage. Having found this in the log files
> (/var/adm/syslog.dated/<date/time>/*.log), we removed the problem and now we can
> fail over without any problems.
>
>
> The second problem can be caused by both systems trying to access the file
> domain at the same time. We encountered this when our shutdown scripts on the
> master did not release the disks quickly enough. This was caused by our
> user-defined Stop action script not waiting long enough for all the jobs to
> release their hold on the disks. Once we fixed this, the disks could be
> relocated every time.
>
> Hope this helps
>
> Mike
>
>
>
>
>
> Mike Cross
> Senior Analyst
> UNIX Systems
> SmithKline Beecham Phone: +44 (0)1279-644858
> New Frontiers Science Park (South) Fax: +44 (0)1279-644969
> Third Avenue, Harlow,
> CM19 5AW email: crossmd_at_sbphrd.com
>
> From cherkus_at_fastball.unimaster.com Thu Mar 7 15:03:08 1996
> Subject: Re: SUMMARY: DECsafe ASE: problems
> To: diarra_at_ipnsun5.in2p3.fr
> Mime-Version: 1.0
> Content-Type: text/plain; charset=US-ASCII
> Content-Transfer-Encoding: 7bit
> Content-Length: 1482
> X-Lines: 40
>
> |> SUMMARY:
> |> --------
> |>
> |> The solution to my problem was to use different SCSI numbers for my PZKSA
> |> controllers.
> |>
> |> By default, each controller has SCSI number 7. This causes a conflict on the
> |> SCSI bus.
> |>
> |> I left SCSI number 7 on one of the AlphaServers and changed the SCSI number
> |> to 0 on the other AlphaServer as follows.
> |>
> |> >>> sh pkb*
> |> pkb0_host_id 7
> |> >>> set pkb0_host_id 0
> |> >>> init
> |> >>> sh dev =====> OK !!!
>
> Setting the ID of a controller to 0 is a BAD idea.
>
> SCSI bus is fixed priority, with 0 as lowest priority, so you just made
> your controller the lowest priority, lower than all other devices on
> the bus.
>
> ASE relies on host to host SCSI transfers. These are used to determine
> the liveliness of the hosts. If you have lots of traffic and your
> controller is low priority the host to host transfers will not get
> through. Thus, ASE may decide your second host is dead and relocate
> services off of it. But once the load decreases, the second host
> appears to be live, so the services move again. This cycle continues
> forever.
>
> I recommend configuring your storage such that ID 6 is available,
> and then use ID 6 for your second controller.
>
> --
> Dave Cherkus ----- UniMaster, Inc. ----- Contract Software Development
> Specialties: UNIX TCP/IP X OSF/1 AlphaAXP AIX RS/6000 Performance ISDN
> Email: cherkus_at_UniMaster.COM Tel: (603) 888-8308 Fax: (603) 888-4598
> Live Free or Die - New Hampshire has 3 seasons: ice, mud and black fly
>
Received on Fri Mar 08 1996 - 01:03:15 NZDT