Version 5 (modified by fernando, 8 years ago)
Await and DT01ACA200
On the twin nodes with TOSHIBA DT01ACA200 disks, which use software RAID, write throughput drops to around 10Mb/s with no apparent pattern. When this happens:
- Only one of the disks that make up the md RAID device stalls. This can be seen with "iostat -xd 2" by watching the await column:
Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s    wsec/s avgrq-sz avgqu-sz    await  svctm  %util
sdb        0.00  854.50  0.00  35.50    0.00   7120.00   200.56     0.29     8.11   0.41   1.45
sda        0.00  882.50  0.00  31.00    0.00  29688.00   957.68   129.44  3470.82  32.26 100.00
- S.M.A.R.T. parameters that deviate from their usual values:
- Raw_Read_Error_Rate: 0 while the disk behaves well; afterwards it shows values > 0 that have no meaningful decimal interpretation
- Throughput_Performance and Seek_Time_Performance rise above their usual values
- The "smartctl -t long" and "badblocks -sv" tests report no errors on the stalled disk.
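The iostat check described above can be automated. The following is a minimal sketch, assuming the older sysstat layout shown in this page (a header line starting with "Device:" and a single await column); newer sysstat versions split this column into r_await and w_await, which this sketch does not handle. The function names and the 1000 ms threshold are hypothetical choices for illustration, not values taken from the cluster.

```python
def parse_iostat_await(text):
    """Return {device: await_ms} from one `iostat -xd` report (old sysstat layout)."""
    awaits = {}
    await_idx = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].rstrip(":") == "Device":
            # Find the await column from the header instead of hard-coding its position.
            await_idx = fields.index("await") if "await" in fields else None
            continue
        if await_idx is not None and len(fields) > await_idx:
            try:
                awaits[fields[0]] = float(fields[await_idx])
            except ValueError:
                pass  # not a device line
    return awaits


def stalled_devices(awaits, threshold_ms=1000.0):
    """List devices whose await exceeds the (hypothetical) threshold, in ms."""
    return sorted(dev for dev, ms in awaits.items() if ms > threshold_ms)


# Sample taken from the iostat report on this page.
SAMPLE = """\
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdb 0.00 854.50 0.00 35.50 0.00 7120.00 200.56 0.29 8.11 0.41 1.45
sda 0.00 882.50 0.00 31.00 0.00 29688.00 957.68 129.44 3470.82 32.26 100.00
"""

if __name__ == "__main__":
    print(stalled_devices(parse_iostat_await(SAMPLE)))
```

With the sample report, only sda is flagged: its await is several hundred times higher than that of sdb, the other disk in the same report.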
Looking for Solutions
For now, the workaround is to put the disk into standby (this affects neither the system nor the RAID, and takes only a few seconds). After that, the disk returns to its usual rates.
[root@wn013 sbin]# hdparm -C /dev/sda; hdparm -y /dev/sda ;hdparm -C /dev/sda
/dev/sda:
 drive state is: active/idle
/dev/sda:
 issuing standby command
/dev/sda:
 drive state is: standby
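The hdparm sequence above (check state, issue standby, check state again) could be wrapped in a small helper. This is only a sketch: the function names and the dry_run parameter are hypothetical, and by default nothing is executed, since running hdparm requires root and changes the drive's power state.

```python
import subprocess


def standby_commands(device):
    """Build the three hdparm calls used on this page: check, standby, check."""
    return [
        ["hdparm", "-C", device],  # report the current power state
        ["hdparm", "-y", device],  # issue the standby command
        ["hdparm", "-C", device],  # confirm the new power state
    ]


def standby(device, dry_run=True):
    """Return the hdparm sequence; execute it only when dry_run is False."""
    commands = standby_commands(device)
    if not dry_run:
        for argv in commands:
            subprocess.run(argv, check=True)  # needs root privileges
    return commands


if __name__ == "__main__":
    for argv in standby("/dev/sda"):
        print(" ".join(argv))
```

Keeping the dry run as the default makes it possible to review the exact commands before running them against a disk that belongs to an active RAID.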
This operation increases the SMART counters Start_Stop_Count, Power-Off_Retract_Count and Load_Cycle_Count:
cexec macc2:1,3,5,7,9,11,13,15 "smartctl -a /dev/sda |grep -e Start -e Power_C -e Power-Off -e Load ; smartctl -a /dev/sdb |g
************************* macc2 *************************
--------- wn011---------
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 12
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 11
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 21
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 21
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 13
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 12
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 21
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 21
--------- wn013---------
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 17
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 15
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 25
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 25
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 16
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 16
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 27
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 27
--------- wn015---------
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 45
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 44
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 51
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 51
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 39
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 37
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 45
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 45
--------- wn017---------
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 22
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 22
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 34
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 34
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 23
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 22
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 33
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 33
--------- wn019---------
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 12
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 12
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 21
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 21
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 13
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 12
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 20
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 20
--------- wn021---------
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 4
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 4
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 42
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 42
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 4
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 4
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 42
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 42
--------- wn023---------
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 3
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 3
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 41
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 41
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 3
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 3
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 41
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 41
--------- wn025---------
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 14
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 14
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 26
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 26
  4 Start_Stop_Count        0x0012 100 100 000 Old_age Always - 16
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age Always - 14
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 24
193 Load_Cycle_Count        0x0012 100 100 000 Old_age Always - 24
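To track how much each standby cycle adds to these counters, the raw values can be extracted from the smartctl attribute lines and compared between two snapshots. The sketch below assumes the ten-column attribute layout seen in the output above (ID, name, flag, value, worst, threshold, type, updated, when failed, raw value); the function names are hypothetical.

```python
def parse_smart_counters(text):
    """Return {attribute_name: raw_value} from smartctl attribute lines."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute lines start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            counters[fields[1]] = int(fields[9])
        except ValueError:
            pass  # raw value is not a plain integer
    return counters


def counter_deltas(before, after):
    """Return {name: after - before} for the counters whose raw value changed."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


# First four lines reported for wn011 in the output on this page.
SAMPLE = """\
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 12
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 11
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 21
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 21
"""

if __name__ == "__main__":
    print(parse_smart_counters(SAMPLE))
```

Taking one snapshot before and one after a standby cycle and passing both to counter_deltas shows which of the counters moved and by how much.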