Notes on trying to recover a bad sector (Offline_Uncorrectable)
1. S.M.A.R.T. is reporting a bad sector
# smartctl -A /dev/sdk
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.11.10-200.fc19.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       3
  3 Spin_Up_Time            0x0027   166   166   021    Pre-fail  Always       -       6700
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       111
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   062   062   000    Old_age   Always       -       27878
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       109
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       85
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       4251894
194 Temperature_Celsius     0x0022   120   103   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       5
Note: Offline_Uncorrectable has a raw value of 1 (Current_Pending_Sector is also 1).
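To pull a single raw value out of that table without reading it by eye, something like the following might work. It is only a sketch: `smart_raw_value` is a hypothetical helper name (not part of smartmontools), and it assumes the ten-column attribute layout that `smartctl -A` printed here, with RAW_VALUE in the tenth column.

```shell
# Print the RAW_VALUE column for one attribute ID, reading `smartctl -A`
# output from stdin. smart_raw_value is a made-up helper name, and the
# 10-column layout is an assumption based on the output shown in this memo.
smart_raw_value() {
    awk -v id="$1" '$1 == id && NF >= 10 { print $10 }'
}

# Usage (needs root and a real device):
#   smartctl -A /dev/sdk | smart_raw_value 198
```

On the drive in this memo, asking for attribute 198 or 197 would print 1.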
2. Run a self-test on the HDD to find where the bad sector is
# smartctl -t short /dev/sdk
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.11.10-200.fc19.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Wed Dec 18 02:18:19 2013

Use smartctl -X to abort test.
3. Check the result
# smartctl -l selftest /dev/sdk
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.11.10-200.fc19.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       10%     27877         47974528
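The LBA in the last column is what the next step needs, so it can be handy to extract it rather than copy it by hand. A rough sketch, assuming the log layout shown above; `first_error_lba` is a hypothetical helper name, and it returns the newest entry that records an LBA, which is not necessarily the newest test.

```shell
# Print the LBA_of_first_error from the newest self-test log entry that has
# one, reading `smartctl -l selftest` output from stdin.
# first_error_lba is a made-up helper name. Entries without an error show "-"
# in the last column and are skipped.
first_error_lba() {
    awk '/^# *[0-9]+ / && $NF ~ /^[0-9]+$/ { print $NF; exit }'
}

# Usage (needs root and a real device):
#   smartctl -l selftest /dev/sdk | first_error_lba
```

It prints nothing when no entry in the log records a failing LBA.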
4. Try writing zeros to the bad sector with hdparm
# hdparm --write-sector 47974528 --yes-i-know-what-i-am-doing /dev/sdk

/dev/sdk:
re-writing sector 47974528: succeeded
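Note: what a lovely option name. hdparm insists on it because --write-sector overwrites the sector with zeros, so whatever data it held is lost.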
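Since the write is destructive, one cautious variation is to try reading the sector first and only overwrite it when the read fails. The sketch below is not what was run in this memo. `rewrite_if_unreadable` and the `HDPARM` override variable are hypothetical names, and it assumes `hdparm --read-sector` exits non-zero when the sector cannot be read, which is worth confirming on your hdparm version before relying on it.

```shell
# Overwrite a sector with zeros only if it cannot be read.
# rewrite_if_unreadable and the HDPARM variable are made-up names.
# ASSUMPTION: `hdparm --read-sector` returns a non-zero exit status on a
# read failure. HDPARM can point at a stub command for a dry run.
rewrite_if_unreadable() {
    dev=$1
    lba=$2
    # Refuse anything that is not a plain decimal sector number.
    case $lba in
        ''|*[!0-9]*) echo "invalid LBA: $lba" >&2; return 2 ;;
    esac
    if ${HDPARM:-hdparm} --read-sector "$lba" "$dev" >/dev/null 2>&1; then
        echo "sector $lba on $dev is readable; not overwriting"
        return 0
    fi
    ${HDPARM:-hdparm} --write-sector "$lba" --yes-i-know-what-i-am-doing "$dev"
}

# Usage (needs root; destroys the contents of that sector):
#   rewrite_if_unreadable /dev/sdk 47974528
```

A sector that reads back fine is left alone, which guards against a mistyped LBA wiping healthy data.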
5. Run the test again
# smartctl -t short /dev/sdk
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.11.10-200.fc19.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Wed Dec 18 02:44:20 2013

Use smartctl -X to abort test.
6. Check the retest result
# smartctl -l selftest /dev/sdk
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.11.10-200.fc19.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     27878         -
# 2  Short offline       Completed: read failure       10%     27877         47974528
Success! The error is gone.
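The same check can be scripted by looking only at entry # 1, the most recent test. This is a sketch under the assumption that the log keeps the layout shown above; `latest_selftest_ok` is a hypothetical helper name.

```shell
# Succeed (exit 0) only if the newest self-test entry (# 1) reads
# "Completed without error". Reads `smartctl -l selftest` output from stdin.
# latest_selftest_ok is a made-up helper name.
latest_selftest_ok() {
    awk '/^# *1 / { found = 1; ok = /Completed without error/; exit }
         END { exit !(found && ok) }'
}

# Usage (needs root and a real device):
#   smartctl -l selftest /dev/sdk | latest_selftest_ok && echo "latest test passed"
```

Older failing entries such as # 2 stay in the log, so the check deliberately ignores everything except the newest one.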
Running -t long reportedly also clears the entry shown in the S.M.A.R.T. attributes, but it takes far too long to run, so I decided not to.