SCSI CAM strange errors

From: Alex Gorbachev <gorbachev_alex_at_bah.com>
Date: Fri, 16 Jan 1998 18:06:09 -0500

Hi,

Our KZPSA/dual HSZ50 system has been logging errors like the one below
several times each day, sometimes followed by a bus reset. The HSZ50s
(running HSOF 5.0) have also logged error messages, which are shown in
the HSZ Data area below.

If anyone has any insight into this, I would be very grateful.

Thank you,
Alex Gorbachev
gorbachev_alex_at_bah.com




******************************** ENTRY 575 ********************************


Logging OS 2. Digital UNIX
System Architecture 2. Alpha
Event sequence number 26.
Timestamp of occurrence 17-NOV-1997 11:34:42
Host name minnie

System type register x00000016 AlphaServer 4000 Series
Number of CPUs (mpnum) x00000001
CPU logging event (mperr) x00000000

Event validity 1. O/S claims event is valid
Event severity 5. Low Priority
Entry type 199. CAM SCSI Event Type


------- Unit Info -------
Bus Number 2.
Unit Number x00A3 Target = 4.
                                     LUN = 3.
------- CAM Data -------
Class x00 Disk
Subsystem x00 Disk
Number of Packets 10.

------ Packet Type ------ 258. Module Name String

Routine Name cam_disk_unit_atten

------ Packet Type ------ 256. Generic String

                                     Event - Unit Attention

------ Packet Type ------ 262. Info Error String

Error Type Information Message Detected (recovered)

------ Packet Type ------ 257. Device Name String

Device Name DEC HSZ50-AX HSZ50-AX

------ Packet Type ------ 256. Generic String

                                     Active CCB at time of error

------ Packet Type ------ 256. Generic String

                                     CCB request completed with an error

------ Packet Type ------ 1. SCSI I/O Request CCB(CCB_SCSIIO)
Packet Revision 76.

CCB Address xFFFFFC0051800800
CCB Length x00C0
XPT Function Code x01 Execute requested SCSI I/O
Cam Status x84 CCB Request Completed WITH Error
                                     Autosense Data Valid for Target
Path ID 2.
Target ID 4.
Target LUN 3.
Cam Flags x000004C0 Data Direction (11: no data)
                                     Disable the SIM Queue Frozen State
*pdrv_ptr xFFFFFC00518004A8
*next_ccb x0000000000000000
*req_map x0000000000000000
void (*cam_cbfcnp)() xFFFFFC00004D3730
*data_ptr x0000000000000000
Data Transfer Length 0.
*sense_ptr xFFFFFC00518004D0
Auotsense Byte Length 160.
CDB Length 6.
Scatter/Gather Entry Cnt 0.
SCSI Status x02 Check Condition
Autosense Residue Length x00
Transfer Residue Length x00000000
(CDB) Command & Data Buf

          15--<-12 11--<-08 07--<-04 03--<-00 :Byte Order
 0000: 00000000 00000000 00000000 * ............*

Timeout Value x00000014
*msg_ptr x0000000000000000
Message Length 0.
Vendor Unique Flags x0000
Tag Queue Actions x00

------ Packet Type ------ 256. Generic String

                                     Error, exception, or abnormal condition

------ Packet Type ------ 256. Generic String

                                     UNIT ATTENTION - Medium changed or target
                                     reset

------ Packet Type ------ 768. SCSI Sense Data
Packet Revision 0.

------- HSZ Data -------
Instance Code x0102030A An unrecoverable firmware inconsistency
                                     was detected or an intentional restart or
                                     shutdown of controller operation was
                                     requested.
                                          
                                     Component ID = Executive Services.
                                     Event Number = x00000002
                                     Repair Action = x00000003
                                     NR Threshold = x0000000A
Template Type x01 Last Failure Event.
Template Flags x00 HCE = 0, Event did not occur during Host
                                             Command Execution.
Ctrl Serial # ZG71727363
Ctrl Software Revision V50Z
RAIDSET State x00 NORMAL. All members present and
                                     reconstructed, IF LUN is configured as a
                                     RAIDSET.

Error Code x70 Current Error
Sense Key x06 Unit Attention
ASC & ASCQ xA000 ASC = x00A0
                                     ASCQ = x0000
                                     Last failure event report.

Last Failure Code x010D0110 Component ID = Executive Services.
                                     Event Number = x0000000D
                                     Repair Action = x00000001
                                     Flag = 0, Firmware Detected
                                              Inconsistency.
                                     Restart Code = No restart.
                                     Parameter Count = 0.
                                          
                                     The System Information structure within
                                     the System Information Page has been reset
                                     to default settings. The only known cause
                                     for this event is an I960 processor hang
                                     caused by a reference to a memory region
                                     that is not implemented. When such a hang
                                     occurs, controller modules equipped with
                                     inactivity watchdog timer circuitry will
                                     spontaneously reboot after the watchdog
                                     timer expires (within seconds of the
                                     hang). Controller modules not so equipped
                                     will just hang as indicated by the green
                                     LED on the OCP remaining in a steady
                                     state.
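
For anyone comparing several of these entries, the packed hex fields in
the entry above can be unpacked by hand or with a few lines of Python.
This is only a rough sketch. The helper names are made up for
illustration, and the Unit Number and Instance Code / Last Failure Code
layouts are assumptions inferred from the decoded values the log itself
prints (x00A3 shown as bus 2, target 4, LUN 3; x0102030A shown as
component 01, event 02, repair action 03, NR threshold 0A). The Cam
Status bits follow the usual CAM definitions (low six bits are the
status code, x40 is SIM queue frozen, x80 is autosense valid). The low
byte of the Last Failure Code (x10 here) packs the flag, restart code
and parameter count, and is left undecoded.

```python
# Sketch: unpack the packed hex fields from the error-log entry above.
# Helper names are hypothetical. The unit-number and 4-byte code layouts
# are assumptions inferred from what the log prints next to each value.

CAM_STATUS_MASK = 0x3F    # low six bits: completion status code
CAM_SIM_QFRZN = 0x40      # SIM queue frozen
CAM_AUTOSNS_VALID = 0x80  # autosense data valid for target
CAM_REQ_CMP_ERR = 0x04    # "CCB Request Completed WITH Error"


def decode_unit_number(unit):
    """Return (bus, target, lun), assuming bus<<6 | target<<3 | lun."""
    return unit >> 6, (unit >> 3) & 0x7, unit & 0x7


def decode_cam_status(status):
    """Return (status code, autosense valid?, SIM queue frozen?)."""
    return (status & CAM_STATUS_MASK,
            bool(status & CAM_AUTOSNS_VALID),
            bool(status & CAM_SIM_QFRZN))


def split_bytes(code):
    """Split a 32-bit code into four bytes, most significant first."""
    return tuple((code >> shift) & 0xFF for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    print(decode_unit_number(0x00A3))                 # Unit Number
    print(decode_cam_status(0x84))                    # Cam Status
    print([hex(b) for b in split_bytes(0x0102030A)])  # Instance Code
    print([hex(b) for b in split_bytes(0x010D0110)])  # Last Failure Code
```

With the values from this entry, the unit number comes out as bus 2,
target 4, LUN 3, and the Cam Status as code x04 with autosense valid,
both matching what the log reports.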
Received on Fri Jan 16 1998 - 23:59:50 NZDT
