SCSI CAM errors ??

From: Gary Menna <G.Menna_at_isu.usyd.edu.au>
Date: Thu, 13 May 1999 09:06:10 +1000 (EST)

Alpha 4100A, Digital UNIX Ver 4.0D patch #3, HSZ70, RZ1DF-CB

Hi All,
        We have been getting SCSI CAM errors that are basically
        resets. They occur on different 'devices' on different
        RAID 5 sets. These RAID sets have been partitioned, and
        the resulting logical disks placed under LSM for use as raw
        (Sybase) database devices. The engineer is returning
        to replace the controller, cable, etc.
        My question is: could these resets cause database
        corruption? I am getting NO errors in LSM or anywhere
        else. Our database has been corrupted twice in the last week.
        Sybase says yes, it's a hardware problem and this must
        be it. I don't see how.


See below for the dia listing:


Logging OS 2. Digital UNIX
System Architecture 2. Alpha
Event sequence number 8.
Timestamp of occurrence 11-MAY-1999 15:07:11
Host name whykickamoocow

System type register x00000016 Alpha 4000/1200 Series
Number of CPUs (mpnum) x00000002
CPU logging event (mperr) x00000001

Event validity 1. O/S claims event is valid
Event severity 5. Low Priority
Entry type 199. CAM SCSI Event Type

------- Unit Info -------
Bus Number 2.
Unit Number x0096 Target = 2.
                                     LUN = 6.
------- CAM Data -------
Class x00 Disk
Subsystem x00 Disk
Number of Packets 10.

------ Packet Type ------ 258. Module Name String

Routine Name cam_disk_unit_atten

------ Packet Type ------ 256. Generic String

                                     Event - Unit Attention
------ Packet Type ------ 262. Info Error String

Error Type Information Message Detected
(recovered)

------ Packet Type ------ 257. Device Name String

Device Name DEC HSZ70 V70Z

------ Packet Type ------ 256. Generic String

                                     Active CCB at time of error

------ Packet Type ------ 256. Generic String

                                     CCB request completed with an error

------ Packet Type ------ 1. SCSI I/O Request CCB(CCB_SCSIIO)
Packet Revision 76.
CCB Address xFFFFFC003FE17E80
CCB Length x00C0
XPT Function Code x01 Execute requested SCSI I/O
CAM Status x84 CCB Request Completed WITH Error
                                     Autosense Data Valid for Target
Path ID 2.
Target ID 2.
Target LUN 6.
CAM Flags x000044C0 Data Direction (11: no data)
                                     Disable the SIM Queue Frozen State
                                     Attempt Sync Data Xfer - SDTR
*pdrv_ptr xFFFFFC003FE17B28
*next_ccb x0000000000000000
*req_map x0000000000000000
void (*cam_cbfcnp)() xFFFFFC00004F0CC0
*data_ptr x0000000000000000
Data Transfer Length 0.
*sense_ptr xFFFFFC003FE17B50
Auotsense Byte Length 160.
CDB Length 6.
Scatter/Gather Entry Cnt 0.
SCSI Status x02 Check Condition
Autosense Residue Length x00
Transfer Residue Length x00000000
(CDB) Command & Data Buf
         15--<-12 11--<-08 07--<-04 03--<-00 :Byte Order
 0000: 00000000 00000000 00000000 * ............*

Timeout Value x00000014
*msg_ptr x0000000000000000
Message Length 0.
Vendor Unique Flags x0000
Tag Queue Actions x00

------ Packet Type ------ 256. Generic String

                                     Error, exception, or abnormal condition

------ Packet Type ------ 256. Generic String

                                     UNIT ATTENTION - Medium changed or target
                                     reset

------ Packet Type ------ 768. SCSI Sense Data
Packet Revision 0.

------- HSx Data -------
Instance Code x03F40064 Device services had to reset the port to
                                     clear a bad condition. Note that in this
                                     instance the Associated Target, Associated
                                     ASC, and Associated ASCQ fields are
                                     undefined.
                                          
                                     Component ID = Device Services.
                                     Event Number = x000000F4
                                     Repair Action = x00000000
                                     NR Threshold = x00000064
Template Type x41 Device Services Non-Transfer Error.
Template Flags x00 HCE = 0, Event did not occur during Host
                                             Command Execution.
Ctrl Serial # ZG83217501
Ctrl Software Revision V70Z
RAIDSET State x00 NORMAL. All members present and
                                     reconstructed, IF LUN is configured as a
                                     RAIDSET.

Error Code x70 Current Error
Sense Key x06 Unit Attention
ASC & ASCQ xD203 ASC = x00D2
                                     ASCQ = x0003
                                     Device services had to reset the bus.
Associated Port x03
Associated Target x01
Associated ASC x00
Associated ASCQ x00
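
For anyone cross-checking the listing above, a few of the packed hex fields can be split by hand. The following is only a minimal Python sketch under assumptions inferred from this single entry, not from HSZ70 or CAM documentation: that the Instance Code holds component ID, event number, repair action and NR threshold as one byte each (most significant first); that the Unit Number is bus * 64 + target * 8 + LUN; and that CAM Status x84 is an "autosense valid" flag (x80) combined with a status code in the low six bits. All function and constant names are hypothetical.

```python
# Assumed bit layout for the CAM status byte (inferred from the entry above).
AUTOSENSE_VALID = 0x80   # flag: autosense data valid for target
STATUS_MASK = 0x3F       # low six bits: the status code itself


def decode_instance_code(code):
    """Split a 32-bit HSx instance code into four byte-wide fields."""
    return {
        "component_id": (code >> 24) & 0xFF,
        "event_number": (code >> 16) & 0xFF,
        "repair_action": (code >> 8) & 0xFF,
        "nr_threshold": code & 0xFF,
    }


def decode_unit(unit):
    """Return (bus, target, lun), assuming unit = bus*64 + target*8 + lun."""
    return unit >> 6, (unit >> 3) & 0x7, unit & 0x7


def decode_cam_status(status):
    """Return (status_code, autosense_valid) from a CAM status byte."""
    return status & STATUS_MASK, bool(status & AUTOSENSE_VALID)


# Values copied from the error-log entry above.
instance = decode_instance_code(0x03F40064)
unit = decode_unit(0x0096)
cam = decode_cam_status(0x84)

for name, value in instance.items():
    print(f"{name:14s} x{value:02X}")
print("bus, target, lun:", unit)
print("status code, autosense valid:", cam)
```

With the values from this entry, the split agrees with what dia prints: component x03 (Device Services), event xF4, repair action x00, NR threshold x64; bus 2, target 2, LUN 6; status code 4 with autosense data valid.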



Thanking you,
        

Gary Menna E-Mail g.menna_at_isu.usyd.edu.au
Information Technology Services Phone +61 2 9351-6360
University of Sydney (G05) Fax +61 2 9351-7711


        *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
        * *
        * I do not want anyone to want for me *
        * I want to want for myself . *
        * - Yevgeny Zamyatin *
        * *
        *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
                Bring on the Liquor Pops
Received on Wed May 12 1999 - 23:08:39 NZST
