Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Solution Type: Troubleshooting Sure
Solution 1008190.1: Troubleshooting Sun StorEdge[TM] 33x0/351x Disk Failures
Previously Published As: 211249

Applies to:
Sun Storage 3320 SCSI Array
Sun Storage 3310 Array
Sun Storage 3510 FC Array
Sun Storage 3511 SATA Array
All Platforms

Purpose
To discuss this information further with Oracle experts and industry peers, we encourage you to review, join, or start a discussion in the My Oracle Support Community - Storage Disk 3000 Series RAID Arrays.

Symptoms:
Purpose/scope: This document addresses troubleshooting disk devices in a StorEdge[TM] 33x0/351x array.

Last Review Date
May 13, 2011

Instructions for the Reader
A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details

Steps to Follow

NOTE: This is a subset of Doc ID 1011431.1: "Troubleshooting Sun StorEdge[TM] 33x0/351x Hardware."

Before replacing a failed drive, save the configuration settings to NVRAM as described in Section 6.2, Saving the NVRAM Configuration Settings, in the Sun StorEdge 3000 Family FRU Installation Guide, or refer to: Sun StorEdge[TM] 3000: Saving and Restoring NVRAM and Logical Drive Configuration (Doc ID 1012254.1)

Note:
This document references steps for an array at the 4.x firmware revision. If the array has 3.2x controller firmware, refer to the appropriate revision of the Firmware User's Guide.

Troubleshooting Steps

Step 1 - Verify that the physical disk status is ONLINE by issuing the sccli> show disks command, or through the firmware interface steps described in: Viewing the Status of a Physical Drive in the Sun StorEdge[TM] 3000 Family RAID Firmware 4.2x User's Guide. Refer to the Physical Drive Status Table for a list of all possible drive statuses.

Step 2 - Verify that the logical drive status is GOOD, INITING, or REBUILDING by issuing the sccli> show logical-drives command, or through the firmware interface steps described in: Viewing the Logical Drive Status Table in the Sun StorEdge[TM] 3000 Family RAID Firmware 4.2x User's Guide, which also includes a list of possible logical drive states.

Step 3 - View the event log or persistent event log to confirm the show disks and show logical-drives output by running sccli> show events or sccli> show persistent-events, and identify any failed disk drives.

Step 4 - If more than one drive is in a MISSING or BAD state, or a logical drive is in a FATAL FAIL state:
- Determine that this is not a Redundant Loop Failure, as described in: Troubleshooting StorEdge[TM] 351x Redundant Loop Failures (Doc ID 1006856.1)
- For 3510 arrays only, issue the sccli> show disks command to determine if the drives are Fujitsu drives and meet all the following criteria:

Step 5 - If more than one drive is in a MISSING or BAD state, or the logical drive is in a FATAL FAIL state, follow the steps described in: Recovering From Fatal Drive Failure in the Sun StorEdge 3000 Family Installation, Operation, and Service Manual.

Step 6 - For multiple drives with an amber LED status, or if a logical drive is in an INCOMPLETE or DRV ABSENT state where more than 2 drives are missing or failed, power cycle the array following the procedures in: Section 2.2.5.2 Checking and Performing the Correct Power-up Sequence in the Sun StorEdge 3000 Family FRU Installation Guide.

Step 7 - If the logical drive is in a degraded (DRV FAIL) state and you have one failed drive (BAD or ABSENT):
- For a RAID array (not JBOD): identify the failed drive by following the steps in Identifying the Defective Disk Drive in a RAID Array in the Sun StorEdge 3000 Family FRU Installation Guide.
- For a JBOD: identify the failed drive by following the steps in Identifying the Defective Disk Drive in a JBOD Array in the Sun StorEdge 3000 Family FRU Installation Guide.
Replace the identified failed drive, following the instructions in: Removing a Defective Disk Drive in a RAID or JBOD Array, followed by: Installing a New Disk Drive in a RAID or JBOD Array.

Step 8 - Verify that the state of the new disk is FRMT, NEW, USED, or GOOD by following Step 1 above.

Step 9 - Verify that the logical drive status is either GOOD or REBUILDING by following Step 2 above.
- If the target logical drive status is GOOD, the spare disk successfully protected the logical drive and is now integrated into it, and the replacement disk drive is available to be assigned as a global spare. See: Assigning a Disk Drive as a Spare.
- If the target logical drive status is DEGRADED, follow the steps in: Assigning a Disk Drive as a Spare, and then initiate a REBUILD operation.

Step 10 - Run the command sccli> show events to determine whether there are "Drive Recovered Error Reported" messages.
If so, refer to: Sun StorEdge[TM] 351x FC: How to Handle "Drive Recovered Error Reported" and Other Disk Drive Messages (Doc ID 1008255.1)

Step 11 - If disk problems still persist, refer back to Troubleshooting Sun StorEdge[TM] 33x0/351x Hardware (Doc ID 1011431.1)

Internal Comments
This document contains normalized content and is managed by the Domain Lead(s) of the respective domains. To notify content owners of a knowledge gap contained in this document, and/or prior to updating this document, please contact the domain engineers that are managing this document via the "Document Feedback" alias(es) listed below:
storage-os-disk-low-domain@sun.com

3510, 3511, Drive Failure, disk failure, double drive failure, normalized, audited

Previously Published As 89045

Change History
Date: 2010-01-14
User Name: sue.copeland@sun.com
Action: Currency & Update

Attachments
This solution has no attachment
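
The branching in Steps 4 through 7 above can be summarized as a small decision function. The following is only an illustrative sketch: the function name applicable_steps and its parameters are hypothetical and are not part of sccli or the array firmware. It assumes the logical drive status and the counts of failed drives and amber LEDs have already been read manually from the show disks and show logical-drives output. The status strings are the ones named in this document.

```python
from typing import List


def applicable_steps(logical_status: str, failed_drives: int, amber_leds: int = 0) -> List[int]:
    """Return the conditional step numbers (4 to 7) whose conditions match.

    Hypothetical helper: the name, parameters, and return shape are
    illustrative assumptions, not part of sccli or the array firmware.
    """
    # Normalize case and whitespace so "drv  fail" matches "DRV FAIL".
    status = " ".join(logical_status.upper().split())
    steps: List[int] = []

    # Steps 4 and 5: more than one MISSING/BAD drive, or a FATAL FAIL logical drive.
    if failed_drives > 1 or status == "FATAL FAIL":
        steps.extend([4, 5])

    # Step 6: multiple amber LEDs, or INCOMPLETE / DRV ABSENT
    # with more than 2 drives missing or failed.
    if amber_leds > 1 or (status in {"INCOMPLETE", "DRV ABSENT"} and failed_drives > 2):
        steps.append(6)

    # Step 7: degraded (DRV FAIL) logical drive with exactly one failed drive.
    if status == "DRV FAIL" and failed_drives == 1:
        steps.append(7)

    return steps


if __name__ == "__main__":
    print(applicable_steps("DRV FAIL", 1))    # [7]
    print(applicable_steps("INCOMPLETE", 3))  # [4, 5, 6]
```

An empty list means none of the conditional branches apply, which leaves the event log check in Step 10. The function returns every matching step rather than only the first, because the conditions in Steps 4 through 6 can overlap.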