Opened 11 years ago

Closed 10 years ago

Last modified 10 years ago

#109 closed defect (fixed)

Replace the SEAL disk and start the RMA

Reported by: antonio Owned by: fernando
Priority: critical Milestone:
Component: Cluster Keywords:
Cc:

Description

Replace the SEAL disk and start the RMA

c10t5000CCA369C5081Ed0

admin@seal.macc.unican.es:~$ pfexec /etc/smartmon-ux -O /dev/rdsk/c10t5000CCA369C5081Ed0
SMARTMon-UX [Release 1.60, Build  3-SEP-2011] - Copyright 2001-2011 SANtools(R), Inc. http://www.SANtools.com
Discovered ATA Hitachi HDS72302 S/N "MN3220F30B24BE" on /dev/rdsk/c10t5000CCA369C5081Ed0 (SMART Enabled)(1907729 MB)
  Cumulative errors recorded by disk:               2814 (Last 5 entries only)

    Error #(2814) Contents of registers when command register was written:
     Device state field byte and description:       03 (Active or idle)
     Timestamp (lifetime powered-up hours):         4054
     ERROR Register:                                40
     STATUS Register:                               51 (Error: DRDY, DSC, ERR)
     SECTOR Register:                               29
     LBA LOW Register:                              58
     LBA MIDDLE Register:                           F3
     LBA HIGH Register:                             B0
     DEVICE Register:                               03
     Extended error bytes:                          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     Block Number:                                  b0f358h (11596632)
     Listing of previous 5 commands executed before error (reverse-sequential):
       Time(secs) Command  Feature   Sector  LBA Low  LBA Mid LBA High   Device  DevCtrl  Command Description
       ---------  -------  -------   ------  -------  ------- --------   ------  -------  ------------------------
    203653.869       60       01       40       E2       BC       6C       40       00    READ FPDMA QUEUED
    203653.869       60       01       38       AE       3D       AA       40       00    READ FPDMA QUEUED
    203653.869       60       80       30       80       3E       33       40       00    READ FPDMA QUEUED
    203653.869       60       14       28       09       2D       AC       40       00    READ FPDMA QUEUED
    203653.869       60       22       20       10       57       E9       40       00    READ FPDMA QUEUED

    Error #(2813) Contents of registers when command register was written:
     Device state field byte and description:       03 (Active or idle)
     Timestamp (lifetime powered-up hours):         4054
     ERROR Register:                                40
     STATUS Register:                               51 (Error: DRDY, DSC, ERR)
     SECTOR Register:                               29
     LBA LOW Register:                              58
     LBA MIDDLE Register:                           F3
     LBA HIGH Register:                             B0
     DEVICE Register:                               03
     Extended error bytes:                          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     Block Number:                                  b0f358h (11596632)
     Listing of previous 5 commands executed before error (reverse-sequential):
       Time(secs) Command  Feature   Sector  LBA Low  LBA Mid LBA High   Device  DevCtrl  Command Description
       ---------  -------  -------   ------  -------  ------- --------   ------  -------  ------------------------
    203650.954       60       08       38       8B       D9       34       40       00    READ FPDMA QUEUED
    203650.954       60       1A       30       C1       99       25       40       00    READ FPDMA QUEUED
    203650.954       60       01       28       7E       5F       3F       40       00    READ FPDMA QUEUED
    203650.954       60       22       20       10       57       E9       40       00    READ FPDMA QUEUED
    203650.953       60       14       18       09       2D       AC       40       00    READ FPDMA QUEUED

    Error #(2812) Contents of registers when command register was written:
     Device state field byte and description:       03 (Active or idle)
     Timestamp (lifetime powered-up hours):         4054
     ERROR Register:                                40
     STATUS Register:                               51 (Error: DRDY, DSC, ERR)
     SECTOR Register:                               29
     LBA LOW Register:                              58
     LBA MIDDLE Register:                           F3
     LBA HIGH Register:                             B0
     DEVICE Register:                               03
     Extended error bytes:                          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     Block Number:                                  b0f358h (11596632)
     Listing of previous 5 commands executed before error (reverse-sequential):
       Time(secs) Command  Feature   Sector  LBA Low  LBA Mid LBA High   Device  DevCtrl  Command Description
       ---------  -------  -------   ------  -------  ------- --------   ------  -------  ------------------------
    203648.062       60       01       10       E2       BC       6C       40       00    READ FPDMA QUEUED
    203648.062       60       01       08       E2       50       C5       40       00    READ FPDMA QUEUED
    203648.062       60       38       00       49       F3       B0       40       00    READ FPDMA QUEUED
    203648.042       60       80       00       80       57       E9       40       00    READ FPDMA QUEUED
    203647.990       EF       10       02       00       00       00       00       00    SET FEATURES [Reserved for Serial ATA]

    Error #(2811) Contents of registers when command register was written:
     Device state field byte and description:       03 (Active or idle)
     Timestamp (lifetime powered-up hours):         4054
     ERROR Register:                                40
     STATUS Register:                               51 (Error: DRDY, DSC, ERR)
     SECTOR Register:                               29
     LBA LOW Register:                              58
     LBA MIDDLE Register:                           F3
     LBA HIGH Register:                             B0
     DEVICE Register:                               03
     Extended error bytes:                          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     Block Number:                                  b0f358h (11596632)
     Listing of previous 5 commands executed before error (reverse-sequential):
       Time(secs) Command  Feature   Sector  LBA Low  LBA Mid LBA High   Device  DevCtrl  Command Description
       ---------  -------  -------   ------  -------  ------- --------   ------  -------  ------------------------
    203641.436       60       80       00       80       57       E9       40       00    READ FPDMA QUEUED
    203641.419       60       38       08       49       F3       B0       40       00    READ FPDMA QUEUED
    203641.419       60       08       00       5C       DC       9A       40       00    READ FPDMA QUEUED
    203641.419       2F       00       01       10       00       00       00       00    READ LOG EXT
    203638.613       60       38       00       49       F3       B0       40       00    READ FPDMA QUEUED

    Error #(2810) Contents of registers when command register was written:
     Device state field byte and description:       03 (Active or idle)
     Timestamp (lifetime powered-up hours):         4054
     ERROR Register:                                40
     STATUS Register:                               51 (Error: DRDY, DSC, ERR)
     SECTOR Register:                               29
     LBA LOW Register:                              58
     LBA MIDDLE Register:                           F3
     LBA HIGH Register:                             B0
     DEVICE Register:                               03
     Extended error bytes:                          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     Block Number:                                  b0f358h (11596632)
     Listing of previous 5 commands executed before error (reverse-sequential):
       Time(secs) Command  Feature   Sector  LBA Low  LBA Mid LBA High   Device  DevCtrl  Command Description
       ---------  -------  -------   ------  -------  ------- --------   ------  -------  ------------------------
    203638.613       60       38       00       49       F3       B0       40       00    READ FPDMA QUEUED
    203638.612       2F       00       01       10       00       00       00       00    READ LOG EXT
    203635.814       60       22       18       10       57       E9       40       00    READ FPDMA QUEUED
    203635.801       60       14       20       09       2D       AC       40       00    READ FPDMA QUEUED
    203635.797       60       08       00       5C       DC       9A       40       00    READ FPDMA QUEUED
  Note:  All ATA registers represented by single HEX byte. The timestamp represents the elapsed
    time in seconds since previous power=on. This wraps back to zero approximately every 50 days because that
    represents 2 ^ 32 milliseconds. Only the last 5 errors are retained by the disk drive per ANSI specification.
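
As a cross-check on the listing above, the Block Number appears to be the three LBA register bytes concatenated (HIGH, MIDDLE, LOW), and the roughly 50-day wrap follows from a 32-bit millisecond counter. A minimal Python sketch of both calculations; the function name `block_number` is hypothetical, and only the 24-bit case shown in this output is handled:

```python
def block_number(lba_high: int, lba_mid: int, lba_low: int) -> int:
    """Combine the three 8-bit LBA registers into one block address."""
    return (lba_high << 16) | (lba_mid << 8) | lba_low

# Registers from the error entries above: HIGH=B0, MIDDLE=F3, LOW=58
block = block_number(0xB0, 0xF3, 0x58)
print(f"{block:x}h ({block})")  # b0f358h (11596632)

# A 32-bit millisecond counter wraps after 2**32 ms
wrap_days = 2**32 / 1000 / 86400
print(f"wraps every {wrap_days:.1f} days")  # wraps every 49.7 days
```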

Change History (5)

comment:1 in reply to: ↑ description Changed 11 years ago by fernando

The disk is on controller c10 (ses4, front side, bay 12)
Serial: MN3220F30B24BE
I physically replaced it with a new disk, serial: MN3220F30B0WRE

admin@seal.macc.unican.es:~$ pfexec /etc/smartmon-ux -Health  /dev/rdsk/c10t5000CCA369C50370d0
SMARTMon-UX [Release 1.60, Build  3-SEP-2011] - Copyright 2001-2011 SANtools(R), Inc. http://www.SANtools.com
 - DISK and TAPE health assessment report (short) format
Discovered ATA Hitachi HDS72302 S/N "MN3220F30B0WRE" on /dev/rdsk/c10t5000CCA369C50370d0 (SMART Enabled)(1907729 MB)
 Inquiry Text Page Data - ANSI defined fields
   Device Type:                         disk
   Removable Device:                    NO
   Vendor Identification:               ATA
   Product Identification:              Hitachi HDS72302
   Firmware Revision:                   A580
   Total Capacity (In Bytes):           2000398934016
   Total Primary (factory) defects:     (unsupported)
   IEEE Unique ID:                      50-00-CC-A3-69-C5-03-70
   NAA IEEE ID:                         50-03-04-80-00-D7-14-58

 Inquiry Text Page Data - ATA defined fields
   Device Type:                         Fixed Disk
   Model Number:                        Hitachi HDS723020BLA642
   Serial Number:                       MN3220F30B0WRE
   Interface:                           ATA-8/ACS rev 4
   Firmware Revision:                   MN6OA580
   Usable addressable sectors LBA mode: 268435455
   Total Capacity (In Bytes):           2000398934016
   Download microcode supported:        YES
   Read/write DMA queue code supported: NO
   CFA feature set supported:           NO
   Advanced power management supported: YES
   Max LBA in 48-bit address mode:      3907029168
   Total bytes in 48-bit address mode:  2000398934016
   Logical sector size (bytes):         512
   Form factor:                         3.5 inch
   Rotation speed:                      7200
 S.M.A.R.T. Attributes and Thresholds (Note - Alert made if Current BELOW threshold):
  Attribute# and Description                     Flags  Current Worst Threshold     Value  [Notes]
  (1) Total number of read errors:               0x000b   100    100      16            0
  (2) Throughput Performance:                    0x0005   100    100      54            0
  (3) Spin Up Time:                              0x0007   100    100      24          354
  (4) Start/Stop Count:                          0x0012   100    100       0            6
  (5) Reallocated Sector Count:                  0x0033   100    100       5            0
  (7) Seek Error Rate:                           0x000b   100    100      67            0
  (8) Seek Time Performance:                     0x0005   100    100      20            0
  (9) Power On Hours Count:                      0x0012   100    100       0            1  [0 days 01 hrs]
 (10) Spin Retry Count:                          0x0013   100    100      60            0
 (12) Device Power Cycle Count:                  0x0032   100    100       0            6
(192) Power-off Retract Count:                   0x0032   100    100       0            6
(193) Load Cycle Count:                          0x0012   100    100       0            6
(194) HDD Temperature (Degrees C):               0x0002   214    214       0    128850526236
(196) Reallocated Event Count:                   0x0032   100    100       0            0
(197) Current Pending Sector Count:              0x0022   100    100       0            0
(198) Off-Line Scan Uncorrectable Count:         0x0008   100    100       0            0
(199) Ultra ATA CRC Error Rate:                  0x000a   200    200       0            0

 The current device temperature is:  28C (82F) degrees

 Statistical log pages dump below [# of bytes reserved for value in device]:

/dev/rdsk/c10t5000CCA369C50370d0 polled at Fri Aug 26 13:49:04 2011 Status:Passed
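
The raw value reported for attribute 194 above (128850526236) looks odd next to the 28C reading. One plausible explanation, assumed here rather than confirmed by the tool's output, is that the drive packs three 16-bit words into the raw field: the current temperature in the lowest word, followed by the lifetime minimum and maximum. A small Python sketch under that assumption (`decode_temperature` is a hypothetical helper):

```python
from typing import Tuple

def decode_temperature(raw: int) -> Tuple[int, int, int]:
    """Split a 48-bit raw value into (current, minimum, maximum) degrees C.

    Assumption: the low 16-bit word is the current temperature and the
    next two words hold the lifetime extremes, in either order.
    """
    current = raw & 0xFFFF
    word1 = (raw >> 16) & 0xFFFF
    word2 = (raw >> 32) & 0xFFFF
    return current, min(word1, word2), max(word1, word2)

print(decode_temperature(128850526236))  # (28, 23, 30)
```

The decoded current value of 28 agrees with the "current device temperature" line in the report, which supports the assumed layout without proving it.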


comment:2 Changed 11 years ago by antonio

When adding it to the pool, zpool complains that the disk is part of another active pool.
We must have used the disk for some test, so the command has to be forced:

root@seal.macc.unican.es:~# pfexec zpool add oceano spare c10t5000CCA369C50370d0
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c10t5000CCA369C50370d0s0 is part of exported or potentially active ZFS pool tank. Please see zpool(1M).
root@seal.macc.unican.es:~# pfexec zpool add -f oceano spare c10t5000CCA369C50370d0

comment:3 Changed 10 years ago by antonio

  • Resolution set to fixed
  • Status changed from new to closed

comment:4 Changed 10 years ago by fernando

Replacing the disk on c5 (ses3, rear expander) (Bay 2)

  • /dev/rdsk/c5t5000CCA369C52EE4d0
  • S/N "MN3220F30BDGKE"
 iostat -Mexn 60
0.1    0.0    0.0    0.0  0.0  0.1   23.4  553.8   0   1   0   7  12  19 c5t5000CCA369C52EE4d0

root@seal.macc.unican.es:~# pfexec /etc/smartmon-ux -O  /dev/rdsk/c5t5000CCA369C52EE4d0
SMARTMon-UX [Release 1.63, Build  6-OCT-2011] - Copyright 2001-2011 SANtools(R), Inc. http://www.SANtools.com
Discovered ATA Hitachi HDS72302 S/N "MN3220F30BDGKE" on /dev/rdsk/c5t5000CCA369C52EE4d0 (SMART Enabled)(1907729 MB)
  Cumulative errors recorded by disk:               146 (Last 5 entries only)

    Error #(146) Contents of registers when command register was written:
     Device state field byte and description:       03 (Active or idle)
     Timestamp (lifetime powered-up hours):         10667
     ERROR Register:                                40
     STATUS Register:                               51 (Error: DRDY, DSC, ERR)
     SECTOR Register:                               77
     LBA LOW Register:                              09
     LBA MIDDLE Register:                           57
     LBA HIGH Register:                             85
     DEVICE Register:                               01
     Extended error bytes:                          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     Block Number:                                  855709h (8738569)
     Listing of previous 5 commands executed before error (reverse-sequential):
       Time(secs) Command  Feature   Sector  LBA Low  LBA Mid LBA High   Device  DevCtrl  Command Description
       ---------  -------  -------   ------  -------  ------- --------   ------  -------  ------------------------
     76463.410       60       10       00       54       D0       B8       40       00    READ FPDMA QUEUED
     76463.401       60       01       48       D8       CF       B8       40       00    READ FPDMA QUEUED
     76463.394       60       02       30       1B       31       B8       40       00    READ FPDMA QUEUED
     76463.392       60       01       28       8D       83       B7       40       00    READ FPDMA QUEUED
     76463.383       60       01       00       47       9F       B5       40       00    READ FPDMA QUEUED



root@seal.macc.unican.es:~# pfexec /etc/smartmon-ux -Health  /dev/rdsk/c5t5000CCA369C52EE4d0
SMARTMon-UX [Release 1.63, Build  6-OCT-2011] - Copyright 2001-2011 SANtools(R), Inc. http://www.SANtools.com
 - DISK and TAPE health assessment report (short) format
Discovered ATA Hitachi HDS72302 S/N "MN3220F30BDGKE" on /dev/rdsk/c5t5000CCA369C52EE4d0 (SMART Enabled)(1907729 MB)
 Inquiry Text Page Data - ANSI defined fields
   Device Type:                         disk
   Removable Device:                    NO
   Vendor Identification:               ATA
   Product Identification:              Hitachi HDS72302
   Firmware Revision:                   A580
   Total Capacity (In Bytes):           2000398934016
   Total Primary (factory) defects:     (unsupported)
   IEEE Unique ID:                      50-00-CC-A3-69-C5-2E-E4
   NAA IEEE ID:                         50-03-04-80-01-10-1D-4A

 Inquiry Text Page Data - ATA defined fields
   Device Type:                         Fixed Disk
   Model Number:                        Hitachi HDS723020BLA642
   Serial Number:                       MN3220F30BDGKE
   Interface:                           ATA-8/ACS rev 4
   Firmware Revision:                   MN6OA580
   Usable addressable sectors LBA mode: 268435455
   Total Capacity (In Bytes):           2000398934016
   Max LBA in 48-bit address mode:      3907029168
   Total bytes in 48-bit address mode:  2000398934016
   Logical sector size (bytes):         512
   Form factor:                         3.5 inch
   Rotation speed:                      7200
 S.M.A.R.T. Attributes and Thresholds (Note - Alert made if Current BELOW threshold):
  Attribute# and Description                     Flags  Current Worst Threshold     Value  [Notes]
  (1) Raw Read Error Rate:                       0x000b    32     32      16    390915870
  (2) Throughput Performance:                    0x0005   136    136      54           83
  (3) Spin Up Time:                              0x0007   131    131      24          438  [Average 438]
  (4) Start/Stop Count:                          0x0012   100    100       0           21
  (5) Reallocated Sector Count:                  0x0033     1      1       5         1929
  (7) Seek Error Rate:                           0x000b   100    100      67            0
  (8) Seek Time Performance:                     0x0005   130    130      20           28
  (9) Power On Hours Count:                      0x0012    99     99       0        10682  [445 days 02 hrs]
 (10) Spin Retry Count:                          0x0013   100    100      60            0
 (12) Device Power Cycle Count:                  0x0032   100    100       0           21
(192) Power-off Retract Count:                   0x0032   100    100       0           69
(193) Load Cycle Count:                          0x0012   100    100       0           69
(194) HDD Temperature (Degrees C)+:              0x0002   193    193       0           31  [Lifetime Min=25C, Max=33C]
(196) Reallocated Event Count:                   0x0032     1      1       0         2462
(197) Current Pending Sector Count:              0x0022   100    100       0           45
(198) Off-Line Scan Uncorrectable Count:         0x0008   100    100       0            1
(199) Ultra ATA CRC Error Rate:                  0x000a   200    200       0           59

 The current device temperature is:  31C (87F) degrees

 Statistical log pages dump below [# of bytes reserved for value in device]:

/dev/rdsk/c5t5000CCA369C52EE4d0 polled at Tue May 29 11:06:53 2012 Status:FAILED - Failure imminent (Hardware impending failure general hard drive failure)
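
The report header states that an alert is raised when Current is below Threshold. A short Python sketch of that check, using a few attribute rows copied from the report above; the regular expression and the name `failing_attributes` are illustrative assumptions, not part of smartmon-ux:

```python
import re

# (id) name: flags current worst threshold raw-value
ROW = re.compile(
    r"^\s*\((\d+)\)\s+(.*?):\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
)

REPORT = """\
  (1) Raw Read Error Rate:                       0x000b    32     32      16    390915870
  (5) Reallocated Sector Count:                  0x0033     1      1       5         1929
(194) HDD Temperature (Degrees C)+:              0x0002   193    193       0           31  [Lifetime Min=25C, Max=33C]
(196) Reallocated Event Count:                   0x0032     1      1       0         2462
(197) Current Pending Sector Count:              0x0022   100    100       0           45
"""

def failing_attributes(report):
    """Return (id, name) for every row whose Current is below Threshold."""
    failing = []
    for line in report.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # skip anything that is not an attribute row
        attr_id, name, current, _worst, threshold, _raw = match.groups()
        if int(current) < int(threshold):
            failing.append((int(attr_id), name))
    return failing

print(failing_attributes(REPORT))  # [(5, 'Reallocated Sector Count')]
```

On this disk only attribute 5 falls below its threshold (1 < 5), which is consistent with the FAILED status. Attributes whose threshold is 0 can never trip under this rule, however large their raw value.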

comment:5 Changed 10 years ago by fernando

On pulling it out while in the offline state:

May 29 12:51:53 seal.macc.unican.es genunix: [ID 408114 kern.info] /pci@0,0/pci8086,340c@5/pci1000,3080@0/iport@f0/disk@w5000cca369c52ee4,0 (sd103) offline

On inserting the new disk:

May 29 12:58:37 seal.macc.unican.es scsi: [ID 583861 kern.info] sd156 at mpt_sas3: unit-address w5000cca369d347ce,0: w5000cca369d347ce,0
May 29 12:58:37 seal.macc.unican.es genunix: [ID 936769 kern.info] sd156 is /pci@0,0/pci8086,340c@5/pci1000,3080@0/iport@f0/disk@w5000cca369d347ce,0
May 29 12:58:37 seal.macc.unican.es genunix: [ID 408114 kern.info] /pci@0,0/pci8086,340c@5/pci1000,3080@0/iport@f0/disk@w5000cca369d347ce,0 (sd156) online

The old device still showed up as offline, so I started over, unconfiguring it manually:

cfgadm -c unconfigure c5::dsk/c5t5000CCA369C52EE4d0

Since the failed disk does not go away even in the offline state:

zpool detach oceano c5t5000CCA369C52EE4d0

This started the resilvering again. To add the new disk:

 zpool add oceano spare  c5t5000CCA369D347CEd0
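
The kernel log lines and the zpool commands above suggest how the device name relates to the logged unit-address: the WWN is upper-cased and wrapped as c<controller>t<WWN>d<LUN>. A hedged Python sketch of that mapping follows. The pattern is inferred only from the names in this ticket, `device_name` is a hypothetical helper, and the controller number has to be supplied by hand because the log lines do not state it.

```python
import re

# Matches the trailing "disk@w<16 hex digits>,<lun>" of a device path
UNIT_ADDRESS = re.compile(r"disk@w([0-9a-f]{16}),([0-9a-f]+)")

def device_name(log_line, controller):
    """Build a cXtWWNdN name from a kernel log line (naming is assumed)."""
    match = UNIT_ADDRESS.search(log_line)
    if match is None:
        raise ValueError("no disk unit-address found")
    wwn, lun = match.groups()
    # Assumption: the LUN after the comma is hexadecimal
    return f"c{controller}t{wwn.upper()}d{int(lun, 16)}"

line = ("May 29 12:58:37 seal.macc.unican.es genunix: [ID 408114 kern.info] "
        "/pci@0,0/pci8086,340c@5/pci1000,3080@0/iport@f0/"
        "disk@w5000cca369d347ce,0 (sd156) online")
print(device_name(line, 5))  # c5t5000CCA369D347CEd0
```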
Last edited 10 years ago by fernando