Opened 9 years ago

Last modified 9 years ago

#227 accepted defect

Failure on c2t5000CCA369C508E5d0

Reported by: fernando Owned by: fernando
Priority: major Milestone:
Component: TracMeteo Keywords: sas disco error
Cc: antonio

Description (last modified by fernando)

  • zpool status: c2t5000CCA369C508E5d0 FAULTED 0 204 0 too many errors
  • iostat -exn: 0.7 0.8 19.8 29.3 0.0 0.0 0.0 0.8 0 0 0 13 29 42 c2t5000CCA369C508E5d0
    root@seal:~# iostat -Ex sd63
                     extended device statistics
    device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
    sd63      0.7    0.8   19.8   29.3  0.0  0.0    0.8   0   0
    sd63      Soft Errors: 0 Hard Errors: 13 Transport Errors: 29
    Vendor: ATA      Product: Hitachi HDS72302 Revision: A580 Serial No: MN3220F30B2ATE
    Size: 2000.40GB <2000398934016 bytes>
    Media Error: 4 Device Not Ready: 0 No Device: 9 Recoverable: 0
    Illegal Request: 0 Predictive Failure Analysis: 0
    
    It cannot be reconfigured either:
    {{{
    root@seal:~# cfgadm -c configure c9::w5000cca369c508e5,0
    cfgadm: Hardware specific failure: failed to configure SCSI device: I/O error
    
    }}}
    
    

Change History (5)

comment:1 Changed 9 years ago by fernando

  • Description modified (diff)

The disk does not respond to the sg3_utils commands, and iostat reflects this as a growing transport error count:

root@seal:~# iostat -En|grep c2t5000CCA369C508E5d0
c2t5000CCA369C508E5d0 Soft Errors: 0 Hard Errors: 13 Transport Errors: 27

root@seal:~# sg_inq  -u /dev/rdsk/c2t5000CCA369C508E5d0
inquiry: pass through os error: I/O error
    inquiry: failed, res=-1

root@seal:~# sg_inq  /dev/rdsk/c2t5000CCA369C508E5d0
SCSI INQUIRY failed on /dev/rdsk/c2t5000CCA369C508E5d0, res=-1

root@seal:~# iostat -En|grep c2t5000CCA369C508E5d0
c2t5000CCA369C508E5d0 Soft Errors: 0 Hard Errors: 13 Transport Errors: 29
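The two iostat readings above can be compared mechanically. A minimal sketch in Python, assuming the one-line `iostat -En | grep` summary format pasted above; the names `parse_error_counters` and `counter_delta` are illustrative and not part of any tool:

```python
import re

# Matches the per-device summary line of `iostat -En`, as pasted in this ticket.
COUNTER_RE = re.compile(
    r"^(?P<device>\S+)\s+Soft Errors:\s*(?P<soft>\d+)\s+"
    r"Hard Errors:\s*(?P<hard>\d+)\s+Transport Errors:\s*(?P<transport>\d+)"
)

def parse_error_counters(line):
    """Return (device, {counter_name: value}) for one summary line."""
    m = COUNTER_RE.match(line.strip())
    if m is None:
        raise ValueError("not an iostat -En summary line: %r" % line)
    counters = {k: int(m.group(k)) for k in ("soft", "hard", "transport")}
    return m.group("device"), counters

def counter_delta(before_line, after_line):
    """Return how much each counter grew between two readings of one device."""
    dev_before, before = parse_error_counters(before_line)
    dev_after, after = parse_error_counters(after_line)
    if dev_before != dev_after:
        raise ValueError("readings belong to different devices")
    return {k: after[k] - before[k] for k in before}

# The two readings taken before and after the sg_inq attempts.
before = "c2t5000CCA369C508E5d0 Soft Errors: 0 Hard Errors: 13 Transport Errors: 27"
after = "c2t5000CCA369C508E5d0 Soft Errors: 0 Hard Errors: 13 Transport Errors: 29"
delta = counter_delta(before, after)
print(delta)
```

For these two readings only the transport counter changes (27 to 29), which is consistent with the two failed sg_inq calls made in between.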

The smartctl command does not work on this device either:

root@seal:~# smartctl -a -d sat /dev/rdsk/c2t5000CCA369C508E5d0
smartctl 6.0 2012-10-10 r3643 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org

Read Device Identity failed: I/O error

ZFS has left it in the "unconfigured" state:

root@seal:~#  cfgadm -av c9
Ap_Id                          Receptacle   Occupant     Condition  Information
When         Type         Busy     Phys_Id
c9                             connected    configured   unknown
unavailable  scsi-sas     n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi
c9::es/ses2                    connected    configured   unknown    LSI CORP SAS2X36
unavailable  ESI          n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::es/ses2
c9::smp/expd2                  connected    configured   unknown    LSI CORP SAS2X36
unavailable  smp          n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::smp/expd2
c9::w5e83a972b7f39c50,0        connected    configured   unknown    Client Device: /dev/dsk/c2t5E83A972B7F39C50d0s0(sd58)
unavailable  disk-path    n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::w5e83a972b7f39c50,0
c9::w5000cca369c4e90b,0        connected    configured   unknown    Client Device: /dev/dsk/c2t5000CCA369C4E90Bd0s0(sd73)
unavailable  disk-path    n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::w5000cca369c4e90b,0
c9::w5000cca369c4f888,0        connected    configured   unknown    Client Device: /dev/dsk/c2t5000CCA369C4F888d0s0(sd66)
unavailable  disk-path    n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::w5000cca369c4f888,0
c9::w5000cca369c50f1f,0        connected    configured   unknown    Client Device: /dev/dsk/c2t5000CCA369C50F1Fd0s0(sd79)
unavailable  disk-path    n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::w5000cca369c50f1f,0
c9::w5000cca369c504d1,0        connected    configured   unknown    Client Device: /dev/dsk/c2t5000CCA369C504D1d0s0(sd61)
unavailable  disk-path    n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::w5000cca369c504d1,0
c9::w5000cca369c505d5,0        connected    configured   unknown    Client Device: /dev/dsk/c2t5000CCA369C505D5d0s0(sd64)
unavailable  disk-path    n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::w5000cca369c505d5,0
c9::w5000cca369c506af,0        connected    configured   unknown    Client Device: /dev/dsk/c2t5000CCA369C506AFd0s0(sd80)
unavailable  disk-path    n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::w5000cca369c506af,0
c9::w5000cca369c506bb,0        connected    configured   unknown    Client Device: /dev/dsk/c2t5000CCA369C506BBd0s0(sd72)
unavailable  disk-path    n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::w5000cca369c506bb,0
c9::w5000cca369c508c9,0        connected    configured   unknown    Client Device: /dev/dsk/c2t5000CCA369C508C9d0s0(sd67)
unavailable  disk-path    n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::w5000cca369c508c9,0
c9::w5000cca369c508e0,0        connected    configured   unknown    Client Device: /dev/dsk/c2t5000CCA369C508E0d0s0(sd35)
unavailable  disk-path    n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::w5000cca369c508e0,0
c9::w5000cca369c508e5,0        connected    unconfigured unknown    Client Device: /dev/dsk/c2t5000CCA369C508E5d0s0(sd63)
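Picking the unconfigured entry out of a long listing like the one above can be automated. A rough sketch in Python, assuming the two-line-per-entry layout of `cfgadm -av` shown here; `unconfigured_ap_ids` is a hypothetical helper name, and the sample text is a shortened copy of the listing:

```python
# Receptacle states used by cfgadm; they identify the first line of each entry.
RECEPTACLE_STATES = {"empty", "disconnected", "connected"}

def unconfigured_ap_ids(cfgadm_output):
    """Return the Ap_Ids whose occupant state is 'unconfigured'."""
    result = []
    for line in cfgadm_output.splitlines():
        fields = line.split()
        # First line of an entry: Ap_Id, Receptacle, Occupant, Condition, ...
        # The header and the second line of each entry do not match this test.
        if len(fields) >= 4 and fields[1] in RECEPTACLE_STATES:
            ap_id, _receptacle, occupant = fields[:3]
            if occupant == "unconfigured":
                result.append(ap_id)
    return result

sample = """\
Ap_Id                          Receptacle   Occupant     Condition  Information
When         Type         Busy     Phys_Id
c9                             connected    configured   unknown
unavailable  scsi-sas     n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi
c9::w5000cca369c508e0,0        connected    configured   unknown    Client Device: /dev/dsk/c2t5000CCA369C508E0d0s0(sd35)
unavailable  disk-path    n        /devices/pci@7a,0/pci8086,340e@7/pci1000,3080@0/iport@f0:scsi::w5000cca369c508e0,0
c9::w5000cca369c508e5,0        connected    unconfigured unknown    Client Device: /dev/dsk/c2t5000CCA369C508E5d0s0(sd63)
"""
faulty = unconfigured_ap_ids(sample)
print(faulty)
```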

The failure occurred on the 28th:

root@seal:~# fmdump -m -u 9cd1792d-ff03-ce0c-aad0-be392c7f2c9a
SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Thu Mar 28 11:08:16 CET 2013
PLATFORM: X8DTH-i-6-iF-6F, CSN: 1234567890, HOSTNAME: seal
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: 9cd1792d-ff03-ce0c-aad0-be392c7f2c9a
DESC: The number of I/O errors associated with a ZFS device exceeded
             acceptable levels.  Refer to http://illumos.org/msg/ZFS-8000-FD for more information.
AUTO-RESPONSE: The device has been offlined and marked as faulted.  An attempt
             will be made to activate a hot spare if available.
IMPACT: Fault tolerance of the pool may be compromised.
REC-ACTION: Run 'zpool status -x' and replace the bad device.

comment:2 follow-up: Changed 9 years ago by fernando

  • Status changed from new to accepted

It does not allow the device to be reconfigured either:

root@seal:~# cfgadm -c configure c9::w5000cca369c508e5,0
cfgadm: Hardware specific failure: failed to configure SCSI device: I/O error
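The attachment point passed to cfgadm above follows the pattern visible in the `cfgadm -av` listing: the WWN embedded in the disk name, lowercased and prefixed with `w`, followed by a comma and a number. A small sketch in Python of that mapping, assuming the number after the comma mirrors the `d0` part of the disk name; `ap_id_for_disk` is a hypothetical helper, and the controller Ap_Id (`c9` here) has to be supplied by the caller because it differs from the `c2` in the disk name:

```python
import re

# Disk names in this ticket look like c2t5000CCA369C508E5d0, optionally with
# a slice suffix such as s0.
DISK_RE = re.compile(
    r"^c\d+t(?P<wwn>[0-9A-Fa-f]{16})d(?P<lun>\d+)(?:s\d+)?$"
)

def ap_id_for_disk(controller_ap_id, disk_name):
    """Build an attachment point such as c9::w5000cca369c508e5,0."""
    m = DISK_RE.match(disk_name)
    if m is None:
        raise ValueError("unexpected disk name: %r" % disk_name)
    return "%s::w%s,%s" % (controller_ap_id, m.group("wwn").lower(), m.group("lun"))

ap_id = ap_id_for_disk("c9", "c2t5000CCA369C508E5d0")
print(ap_id)
```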

comment:3 in reply to: ↑ 2 Changed 9 years ago by antonio

But you can still access the slot and mark it as failed.

The problem is that we do not know which backplane and slot that disk was attached to. Or do we?

It would be good to check how the backplane sees that slot, and whether it detects a device there, even one in a failed state.

A

comment:4 Changed 9 years ago by fernando

With the serial number MN3220F30B2ATE reported by iostat, we can look the disk up in the tables: https://www.meteo.unican.es/trac/meteo/wiki/Jbods

From that we know the disk is in Jbod1, front expander, Slot 16. I still need to add to the tables which /dev/es/ses* device each one corresponds to.
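The serial-number lookup described here could be scripted along these lines. This is only a sketch in Python: the `LOCATIONS` table is a hypothetical stand-in for the wiki tables and contains just the one entry stated in this ticket, and the `DiskLocation`, `serial_from_iostat` and `locate` names are made up for illustration:

```python
import re
from typing import NamedTuple, Optional

class DiskLocation(NamedTuple):
    jbod: str
    expander: str
    slot: int
    ses_device: Optional[str]  # /dev/es/ses* path, not yet recorded in the tables

# Hypothetical stand-in for the Jbods wiki tables; only this entry is known here.
LOCATIONS = {
    "MN3220F30B2ATE": DiskLocation("Jbod1", "front", 16, None),
}

def serial_from_iostat(text):
    """Extract the 'Serial No:' field from `iostat -E` output, or None."""
    m = re.search(r"Serial No:\s*(\S+)", text)
    return m.group(1) if m else None

def locate(serial):
    """Return the recorded location for a serial number, or None if unknown."""
    return LOCATIONS.get(serial.strip().upper())

iostat_line = ("Vendor: ATA      Product: Hitachi HDS72302 "
               "Revision: A580 Serial No: MN3220F30B2ATE")
serial = serial_from_iostat(iostat_line)
location = locate(serial)
print(serial, location)
```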

I understand that what you are saying does work is sg_ses.
sg_ses tells us the SAS addresses of the disks and would let us mark them.

comment:5 Changed 9 years ago by antonio

That's right.

The catch is that sg_ses may not report the SAS address of the device connected to that slot, because the device is offline.

However, since this is an error and the system still sees the disk, the SAS expander may still have the device associated with the slot, just with no way to access it.

Try sg_ses and see what it does.

Antonio
