AlphaServer 1200 and ports showing up as No_Sync

From: Scott Brewster <scott_at_ses-sb.its.dias.qut.edu.au>
Date: Wed, 12 Sep 2001 15:33:45 +1000

Hi,

-----
Summary of the problem: An AlphaServer 1200 using dual KGPSAs is connected
to two 8-port Compaq EL switches. These, in turn, are connected to a pair
of HSG80 controllers in `multibus failover' mode. Neither Tru64 UNIX 4.0G
nor 5.1 can be installed on the AS1200.
-----

We have an AlphaServer 1200, a pair of 8-port switches, a pair of
AlphaServer DS10s and a pair of HSG80 controllers in the following
configuration:

   switch1 +---------------------------+ switch2
    +---+ | | +---+
    | 7 | | +---------------------------+ | 7 |
    | | | | | | | |
    | 6 ----+ +----|-----+----|-----+ | +--- 6 |
    | | | 0 | 1 | hsg01a | | | |
    | 5 ----------+ +----------+----------+ +------ 5 |
    | | | | 0 | 1 | hsg01b | | |
    | 4 | | +----|-----+----|-----+ | | 4 |
    | | | | | | | |
  +-- 3 | +----------+ +-------------------+ | 3 ----+
  | | | | | |
  | | 2 ----------+ +------------+ +------------+ +------- 2 | |
  | | | | | | | | | | | |
  | | 1 | +----- === -----+ | 1 | |
  | | | | (DS10) | | (DS10) | | | |
  | | 0 | +------------+ +------------+ | 0 | |
  | +---+ +---+ |
  | +------------+ |
  | | | |
  +--------------------- -----------------------------------+
                       | (AS1200) |
                       +------------+

The HSG80 controllers are in `multibus failover' mode.

The switches are at firmware 2.1.9k; the HSG80s are at software version
8.6F-1, hardware E12; the AlphaServers are at SRM 5.9 with HBA firmware 3.81A4.

We used `wwidmgr' on the AS1200 to configure the HBAs into `Fabric' mode,
and to add a volume we previously created on the HSG80s.
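
One way to confirm which KGPSA sits on which switch port is to compare WWNs. The Tru64 emx driver prints them dash-separated (for example 1000-0000-c923-a338 in the attached boot logs), while the switch prints them colon-separated. The following is only a minimal sketch of that comparison; the function names and the dictionary keys are made up for illustration, and the WWN values are copied from the attached logs.

```python
import re

def normalize_wwn(wwn):
    """Reduce a WWN to 16 lowercase hex digits, dropping ':' and '-' separators."""
    digits = re.sub(r"[^0-9a-fA-F]", "", wwn).lower()
    if len(digits) != 16:
        raise ValueError(f"expected 16 hex digits, got {wwn!r}")
    return digits

def same_wwn(a, b):
    """True if two WWN strings name the same port, whatever their separators."""
    return normalize_wwn(a) == normalize_wwn(b)

# WWNs as printed by the emx driver for the two KGPSAs (keys are illustrative).
hba_wwns = {
    "pci0 slot 3": "1000-0000-c923-a338",
    "pci0 slot 4": "1000-0000-c923-a25d",
}

# WWN the switch reports on port 3 while the installation CD is booting.
switch_port_3 = "10:00:00:00:c9:23:a3:38"

matches = [slot for slot, wwn in hba_wwns.items() if same_wwn(wwn, switch_port_3)]
print(matches)
```

With the values above, only the adapter in pci0 slot 3 matches port 3 of this switch; the other adapter would be expected on the second switch, whose output is not included here.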

When we try to boot the OS installation CD (either `Compaq Tru64 UNIX 4.0G
Operating System Volume 1', or `Compaq Tru64 UNIX 5.1 Operating System
Volume 1') we have problems.

Neither OS Vol. 1 CD boots far enough for the installation program
to start installing software.
Full logs are attached.

The V4.0G OS Vol. 1 CD exits to a # prompt with the following error message:

   INIT: SINGLE-USER MODE
   
   
   Initializing system for Digital UNIX installation. Please wait...
   
   
   *** Performing CDROM Installation
   
   Loading installation process and scanning system hardware.
   
          Welcome to the DIGITAL UNIX Installation Procedure
   
   This procedure installs DIGITAL UNIX onto your system. You will
   be asked a series of system configuration questions. Until you
   answer all questions, your system is not changed in any way.
   
   During the question and answer session, you can go back to any
   previous question and change your answer by entering: history
   You can get more information about a question by entering: help
   
   /isl/Install [ccii_initialize], FATAL ERROR:
       "finder -q" failure file "/tmp/finder.abort" exists
   
   
   
   
   
   
   The installation procedure failed or was intentionally exited. To restart
   the installation, halt and reboot the system or enter: restart
   
   #

The V5.1 OS Vol. 1 CD crashes with the following error message:

    INIT: SINGLE-USER MODE
    
    
    Initializing system for Tru64 UNIX installation. Please wait...
    
    Assigning a cluster device number to root
    
    trap: invalid memory read access from kernel mode
    
        faulting virtual address: 0x0000000000000193
        pc of faulting instruction: 0xffffffff004342ac
        ra contents at time of fault: 0xffffffff004341b0
        sp contents at time of fault: 0xfffffe0422f8ece0
    
    panic (cpu 0): kernel memory fault
    syncing disks... done
    
    DUMP: Will attempt to compress 38903808 bytes of dump
        : into 470917088 bytes of memory.
    DUMP: Dump to 0x200005: ..: End 0x200005
    succeeded
    CP - SAVE_TERM routine to be called
    CP - SAVE_TERM exited with hlt_req = 1, r0 = 00000000.00000000
    
    halted CPU 0
    
    halt code = 5
    HALT instruction executed
    PC = ffffffff0026e980
    P00>>>

We have noticed that sometimes the switch ports that the AS1200 is connected
to are in `No_Sync' mode. If they are in this mode, whenever we boot the
V4.0G OS Vol. 1 CD or the V5.1 OS Vol. 1 CD the ports change to
`Online' mode part way through the boot process. After the 4.0G CD returns
to the # prompt, the ports remain in `Online' mode, but after the 5.1 CD
crashes, the ports return to `No_Sync' mode.
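
The port state changes described above can be picked out mechanically by comparing two `switchShow' captures. The sketch below is a rough illustration only: it assumes the port lines look exactly like those in the attached switch output, and the helper names (parse_switchshow, changed_ports) are hypothetical, not part of any switch tooling.

```python
import re

# Matches lines such as "port 3: sw Online F-Port 10:00:00:00:c9:23:a3:38"
# and "port 7: -- No_Module", as they appear in the attached output.
PORT_RE = re.compile(r"^port\s+(\d+):\s+\S+\s+(\S+)(?:\s+(\S+)\s+(\S+))?\s*$")

def parse_switchshow(text):
    """Return {port_number: (state, wwn_or_None)} from switchShow output."""
    ports = {}
    for line in text.splitlines():
        m = PORT_RE.match(line.strip())
        if m:
            ports[int(m.group(1))] = (m.group(2), m.group(4))
    return ports

def changed_ports(before, after):
    """Return {port: (old_state, new_state)} for ports whose state differs."""
    return {
        port: (before[port][0], after[port][0])
        for port in sorted(before)
        if port in after and before[port][0] != after[port][0]
    }

# Shortened excerpts of the captures taken before and during the boot.
before = """\
port 2: sw Online F-Port 10:00:00:00:c9:23:a2:63
port 3: sw No_Sync
port 7: -- No_Module
"""
during = """\
port 2: sw Online F-Port 10:00:00:00:c9:23:a2:63
port 3: sw Online F-Port 10:00:00:00:c9:23:a3:38
port 7: -- No_Module
"""

print(changed_ports(parse_switchshow(before), parse_switchshow(during)))
```

Run against the full captures, the same comparison should flag only the port the AS1200 is attached to.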

We have also noticed that after issuing the `boot DKA400' command,
the AS1200 responds with the following message:

    AlphaServer 1200 Console V5.9-5, 8-JAN-2001 10:05:24
    
    CPU 0 booting
    
    could not find controller for device non-existent
    device non-existent is invalid
    could not find controller for device device
    device device is invalid
    (boot dka400.4.0.1.1)

At this point, the boot process continues.

We wish to install Compaq Tru64 UNIX V5.1 onto a volume on the SAN
(i.e. a volume on the HSG80 controller pair) that the AS1200 can
boot from. Is this configuration supported? Can anyone see an obvious
mistake we have made?

Scott

Boot of Tru64 UNIX V5.1:

>>> boot dka400
11:48:46 Initializing...
11:48:46 ø
11:48:47 SROM V3.0 on cpu0
11:48:47 XSROM V5.9 on cpu0
11:48:48 BCache testing complete on cpu0
11:48:51 mem_pair0 - 256 MB
11:48:51 mem_pair1 - 256 MB
11:48:51 20..21..23..
11:48:52 please wait 18 seconds for T24 to complete
11:48:52 24..
11:49:07 Memory testing complete on cpu0
11:49:07 starting console on CPU 0
11:49:09 sizing memory
11:49:09 0 256 MB DIMM
11:49:09 1 256 MB DIMM
11:49:09 probing IOD1 hose 1
11:49:13 bus 0 slot 1 - NCR 53C810
11:49:13 bus 0 slot 2 - QLogic ISP10X0
11:49:14 bus 0 slot 3 - NCR 53C810
11:49:14 probing IOD0 hose 0
11:49:14 bus 0 slot 1 - PCEB
11:49:14 probing EISA Bridge, bus 1
11:49:14 bus 0 slot 2 - DECchip 21140-AA
11:49:14 bus 0 slot 3 - KGPSA-C
11:49:14 bus 0 slot 4 - KGPSA-C
11:49:14 configuring I/O adapters...
11:49:14 ncr0, hose 1, bus 0, slot 1
11:49:14 isp0, hose 1, bus 0, slot 2
11:49:16 ncr1, hose 1, bus 0, slot 3
11:49:16 floppy0, hose 0, bus 1, slot 0
11:49:18 tulip0, hose 0, bus 0, slot 2
11:49:18 kgpsa0, hose 0, bus 0, slot 3
11:49:19 kgpsa1, hose 0, bus 0, slot 4
11:49:20 System temperature is 20 degrees C
11:49:20 AlphaServer 1200 Console V5.9-5, 8-JAN-2001 10:05:24
11:49:21
11:49:21 CPU 0 booting
11:49:21
11:49:21 (boot dka400.4.0.1.1)
11:49:21 block 0 of dka400.4.0.1.1 is a valid boot block
11:49:31 reading 14 blocks from dka400.4.0.1.1
11:49:31 bootstrap code read in
11:49:32 Building FRU table
11:49:32 base = 200000, image_start = 0, image_bytes = 1c00
11:49:32 initializing HWRPB at 2000
11:49:32 initializing page table at 1f2000
11:49:32 initializing machine state
11:49:32 setting affinity to the primary CPU
11:49:32 jumping to bootstrap code
11:49:32
11:49:44 UNIX boot - Thursday August 24, 2000
11:49:44
11:49:44 Loading vmunix ...
11:49:55 Loading at 0xffffffff00000000
11:49:55
11:49:55 Sizes:
11:49:55 text = 7105408
11:49:55 data = 1932608
11:50:00 bss = 2314608
11:50:01 Starting at 0xffffffff0000f780
11:50:01
11:50:01 Alpha boot: available memory from 0x1400000 to 0x2fffc000
11:50:16 Compaq Tru64 UNIX V5.1 (Rev. 732); Thu Aug 24 23:20:42 EDT 2000
11:50:16 physical memory = 512.00 megabytes.
11:50:16 available memory = 488.03 megabytes.
11:50:16 using 636 buffers containing 4.96 megabytes of memory
11:50:16 Firmware revision: 5.9
11:50:20 PALcode: UNIX version 1.23
11:50:20 AlphaServer 1200 5/533 4MB
11:50:20 pci1 (primary bus:1) at mcbus0 slot 5
11:50:20 Loading SIOP: script c0000000, reg 7feee00, data c000a000
11:50:20 scsi0 at psiop0 slot 0 rad 0
11:50:23 isp0 at pci1 slot 2
11:50:23 isp0: QLOGIC ISP1040B/V2 - Differential Mode
11:50:23 isp0: Firmware revision 5.57 (loaded by console)
11:50:23 isp0: Fast RAM timing enabled.
11:50:23 scsi1 at isp0 slot 0 rad 0
11:50:28 Loading SIOP: script c004e000, reg 7feef00, data c0012000
11:50:28 scsi2 at psiop1 slot 0 rad 0
11:50:35 pci0 (primary bus:0) at mcbus0 slot 4
11:50:35 eisa0 at pci0
11:50:35 ace0 at eisa0
11:50:35 ace1 at eisa0
11:50:35 lp0 at eisa0
11:50:35 gpc0 at eisa0
11:50:35 fdi0 at eisa0
11:50:35 fd0 at fdi0 unit 0
11:50:35 tu0: DECchip 21140: Revision: 2.2
11:50:35 tu0: auto negotiation capable device
11:50:35 tu0 at pci0 slot 2
11:50:35 tu0: DEC TULIP (10/100) Ethernet Interface, hardware address: 00-00-F8-10-11-5A
11:50:35 tu0: auto negotiation off: selecting 100BaseTX (UTP) port: full duplex
11:50:35 emx0 at pci0 slot 3
11:50:35 KGPSA-CA : Driver Rev 1.25a : F/W Rev 3.81A4(2.01A0) : wwn 1000-0000-c923-a338
11:50:35 emx0: Using console topology setting of : Fabric
11:50:35 scsi3 at emx0 slot 0 rad 0
11:50:36 emx2 at pci0 slot 4
11:50:36 KGPSA-CA : Driver Rev 1.25a : F/W Rev 3.81A4(2.01A0) : wwn 1000-0000-c923-a25d
11:50:36 emx2: Using console topology setting of : Fabric
11:50:36 scsi4 at emx2 slot 0 rad 0
11:50:36 Created FRU table binary error log packet
11:50:36 kernel console: ace0
11:50:36 NetRAIN configured.
11:50:36 i2c: Server Management Hardware Present
11:50:36 vm_swap_init: warning UNSPECIFIED swap device not found
11:50:43 vm_swap_init: swap is re-set to lazy (over commitment) mode
11:50:43
11:50:45 INIT: SINGLE-USER MODE
11:50:45
11:50:52
11:50:52 Initializing system for Tru64 UNIX installation. Please wait...
11:50:52
11:50:52 Assigning a cluster device number to root
11:51:11
11:51:13 trap: invalid memory read access from kernel mode
11:51:13
11:51:13 faulting virtual address: 0x0000000000000193
11:51:13 pc of faulting instruction: 0xffffffff004342ac
11:51:13 ra contents at time of fault: 0xffffffff004341b0
11:51:13 sp contents at time of fault: 0xfffffe0422f8ece0
11:51:13
11:51:13 panic (cpu 0): kernel memory fault
11:51:13 syncing disks... done
11:51:13
11:51:13 DUMP: Will attempt to compress 38903808 bytes of dump
11:51:13 : into 470917088 bytes of memory.
11:51:14 DUMP: Dump to 0x200005: ..: End 0x200005
11:51:16 succeeded
11:51:16 CP - SAVE_TERM routine to be called
11:51:16 CP - SAVE_TERM exited with hlt_req = 1, r0 = 00000000.00000000
11:51:16
11:51:16 halted CPU 0
11:51:16
11:51:16 halt code = 5
11:51:16 HALT instruction executed
11:51:16 PC = ffffffff0026e980
11:51:16 P00>>>

Boot of Tru64 UNIX V4.0G:

P00>>>boot dka400
11:57:23 Initializing...
11:57:23 þ
11:57:25 SROM V3.0 on cpu0
11:57:25 XSROM V5.9 on cpu0
11:57:25 BCache testing complete on cpu0
11:57:28 mem_pair0 - 256 MB
11:57:28 mem_pair1 - 256 MB
11:57:28 20..21..23..
11:57:29 please wait 18 seconds for T24 to complete
11:57:29 24..
11:57:44 Memory testing complete on cpu0
11:57:44 starting console on CPU 0
11:57:46 sizing memory
11:57:46 0 256 MB DIMM
11:57:46 1 256 MB DIMM
11:57:46 probing IOD1 hose 1
11:57:51 bus 0 slot 1 - NCR 53C810
11:57:51 bus 0 slot 2 - QLogic ISP10X0
11:57:51 bus 0 slot 3 - NCR 53C810
11:57:51 probing IOD0 hose 0
11:57:51 bus 0 slot 1 - PCEB
11:57:51 probing EISA Bridge, bus 1
11:57:51 bus 0 slot 2 - DECchip 21140-AA
11:57:51 bus 0 slot 3 - KGPSA-C
11:57:51 bus 0 slot 4 - KGPSA-C
11:57:51 configuring I/O adapters...
11:57:51 ncr0, hose 1, bus 0, slot 1
11:57:51 isp0, hose 1, bus 0, slot 2
11:57:53 ncr1, hose 1, bus 0, slot 3
11:57:53 floppy0, hose 0, bus 1, slot 0
11:57:55 tulip0, hose 0, bus 0, slot 2
11:57:55 kgpsa0, hose 0, bus 0, slot 3
11:57:56 kgpsa1, hose 0, bus 0, slot 4
11:57:57 System temperature is 20 degrees C
11:57:57 AlphaServer 1200 Console V5.9-5, 8-JAN-2001 10:05:24
11:57:58
11:57:58 CPU 0 booting
11:57:58
11:57:58 could not find controller for device non-existent
11:57:58 device non-existent is invalid
11:57:58 could not find controller for device device
11:57:58 device device is invalid
11:57:58 (boot dka400.4.0.1.1)
11:57:58 block 0 of dka400.4.0.1.1 is a valid boot block
11:58:08 reading 13 blocks from dka400.4.0.1.1
11:58:08 bootstrap code read in
11:58:09 Building FRU table
11:58:09 base = 200000, image_start = 0, image_bytes = 1a00
11:58:09 initializing HWRPB at 2000
11:58:09 initializing page table at 1f2000
11:58:09 initializing machine state
11:58:09 setting affinity to the primary CPU
11:58:09 jumping to bootstrap code
11:58:09
11:58:21 UNIX boot - Sun May 14 05:34:40 EDT 2000
11:58:21
11:58:21 Loading vmunix ...
11:58:33 Loading at 0xffffffff00000000
11:58:33 Mapping Image Address Space
11:58:33 Mapping complete
11:58:33
11:58:33 Sizes:
11:58:33 text = 6170672
11:58:33 data = 1956288
11:58:37 bss = 3487584
11:58:38 Starting at 0xffffffff00224740
11:58:38
11:58:38 Alpha boot: available memory from 0x1200000 to 0x2fffc000
11:58:54 Digital UNIX V4.0G (Rev. 1530); Sun May 14 06:24:24 EDT 2000
11:58:54 physical memory = 512.00 megabytes.
11:58:54 available memory = 490.82 megabytes.
11:58:54 using 652 buffers containing 5.09 megabytes of memory
11:58:55 emx: dynamic addressing enabled
11:58:58 Firmware revision: 5.9
11:59:00 PALcode: UNIX version 1.23
11:59:00 AlphaServer 1200 5/533 4MB
11:59:00 pci1 at mcbus0 slot 5
11:59:00 psiop0 at pci1 slot 1
11:59:00 Loading SIOP: script c0001e00, reg 7feee00, data c000bd68
11:59:00 scsi0 at psiop0 slot 0
11:59:02 rz4 at scsi0 target 4 lun 0 (LID=0) (DEC RRD46 (C) DEC 1337)
11:59:02 isp0 at pci1 slot 2
11:59:02 isp0: QLOGIC ISP1040B/V2 - Differential Mode
11:59:03 isp0: Firmware revision 5.57 (loaded by console)
11:59:03 isp0: Fast RAM timing enabled.
11:59:03 scsi1 at isp0 slot 0
11:59:06 psiop1 at pci1 slot 3
11:59:06 Loading SIOP: script c001de00, reg 7feef00, data c0026168
11:59:06 scsi2 at psiop1 slot 0
11:59:08 gpc0 at eisa0
11:59:08 pci0 at mcbus0 slot 4
11:59:08 eisa0 at pci0
11:59:08 ace0 at eisa0
11:59:09 lp0 at eisa0
11:59:09 fdi0 at eisa0
11:59:09 fd0 at fdi0 unit 0
11:59:09 tu0: DECchip 21140: Revision: 2.2
11:59:10 tu0: auto negotiation capable device
11:59:11 tu0 at pci0 slot 2
11:59:13 tu0: DEC TULIP (10/100) Ethernet Interface, hardware address: 00-00-F8-10-11-5A
11:59:13 tu0: auto negotiation off: selecting 100BaseTX (UTP) port: full duplex
11:59:13 emx0 at pci0 slot 3
11:59:13 KGPSA-CA : Driver Rev 1.21 : F/W Rev 3.81A4(2.01A0) : wwn 1000-0000-c923-a338
11:59:14 emx0: Using console topology setting of : Fabric
11:59:14 scsi3 at emx0 slot 0
11:59:14 Type 0xc at scsi3 target 0 lun 0 (LID=1) (DEC HSG80CCL V86F) (Wide16)
11:59:14 Type 0xc at scsi3 target 1 lun 0 (LID=9) (DEC HSG80CCL V86F) (Wide16)
11:59:14 emx1 at pci0 slot 4
11:59:14 KGPSA-CA : Driver Rev 1.21 : F/W Rev 3.81A4(2.01A0) : wwn 1000-0000-c923-a25d
11:59:14 emx1: Using console topology setting of : Fabric
11:59:14 scsi4 at emx1 slot 0
11:59:15 Type 0xc at scsi4 target 0 lun 0 (LID=17) (DEC HSG80CCL V86F) (Wide16)
11:59:15 Type 0xc at scsi4 target 1 lun 0 (LID=25) (DEC HSG80CCL V86F) (Wide16)
11:59:15 Created FRU table binary error log packet
11:59:15 kernel console: ace0
11:59:15 i2c: Server Management Hardware Present
11:59:15 vm_swap_init: warning /sbin/swapdefault swap device not found
11:59:17 vm_swap_init: swap is set to lazy (over commitment) mode
11:59:17
11:59:19 INIT: SINGLE-USER MODE
11:59:19
11:59:24
11:59:24 Initializing system for Digital UNIX installation. Please wait...
11:59:24
11:59:24
11:59:35 *** Performing CDROM Installation
11:59:35
11:59:35 Loading installation process and scanning system hardware.
11:59:42
12:00:03 Welcome to the DIGITAL UNIX Installation Procedure
12:00:03
12:00:03 This procedure installs DIGITAL UNIX onto your system. You will
12:00:03 be asked a series of system configuration questions. Until you
12:00:03 answer all questions, your system is not changed in any way.
12:00:03
12:00:03 During the question and answer session, you can go back to any
12:00:03 previous question and change your answer by entering: history
12:00:03 You can get more information about a question by entering: help
12:00:03
12:00:03 /isl/Install [ccii_initialize], FATAL ERROR:
12:00:03 "finder -q" failure file "/tmp/finder.abort" exists
12:00:03
12:00:03
12:00:03
12:00:04
12:00:04
12:00:04
12:00:04 The installation procedure failed or was intentionally exited. To restart
12:00:04 the installation, halt and reboot the system or enter: restart
12:00:04
12:00:04 #

Output from `switchShow' before booting the Tru64 UNIX V5.1 OS Vol. 1 CD:

kgfcs51:admin> switchshow
switchName: kgfcs51
switchType: 4.1
switchState: Online
switchRole: Principal
switchDomain: 51
switchId: fffc33
switchWwn: 10:00:00:60:69:30:1e:7e
port 0: sw No_Light
port 1: sw No_Light
port 2: sw Online F-Port 10:00:00:00:c9:23:a2:63
port 3: sw No_Sync
port 4: sw No_Light
port 5: sw Online F-Port 50:00:1f:e1:00:0c:1b:c1
port 6: sw Online F-Port 50:00:1f:e1:00:0c:1b:c4
port 7: -- No_Module

Output from `switchShow' during the boot process:

kgfcs51:admin> switchshow
switchName: kgfcs51
switchType: 4.1
switchState: Online
switchRole: Principal
switchDomain: 51
switchId: fffc33
switchWwn: 10:00:00:60:69:30:1e:7e
port 0: sw No_Light
port 1: sw No_Light
port 2: sw Online F-Port 10:00:00:00:c9:23:a2:63
port 3: sw Online F-Port 10:00:00:00:c9:23:a3:38
port 4: sw No_Light
port 5: sw Online F-Port 50:00:1f:e1:00:0c:1b:c1
port 6: sw Online F-Port 50:00:1f:e1:00:0c:1b:c4
port 7: -- No_Module

Output from `switchShow' after the boot process crashes:

kgfcs51:admin> switchshow
switchName: kgfcs51
switchType: 4.1
switchState: Online
switchRole: Principal
switchDomain: 51
switchId: fffc33
switchWwn: 10:00:00:60:69:30:1e:7e
port 0: sw No_Light
port 1: sw No_Light
port 2: sw Online F-Port 10:00:00:00:c9:23:a2:63
port 3: sw No_Sync
port 4: sw No_Light
port 5: sw Online F-Port 50:00:1f:e1:00:0c:1b:c1
port 6: sw Online F-Port 50:00:1f:e1:00:0c:1b:c4
port 7: -- No_Module
Received on Wed Sep 12 2001 - 05:35:24 NZST
