Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Sun Alert Sure

Solution 1001052.1: Logical Drives May be Lost if "auto-global-spare" is Enabled

Previously Published As: 201379

Product:
  Sun StorageTek 3310 SCSI Array
  Sun StorageTek 3510 FC Array
  Sun StorageTek 3320 SCSI Array
  Sun StorageTek 3511 SATA Array

Bug Id: <SUNBUG: 6423521>
Date of Workaround Release: 09-NOV-2006
Date of Resolved Release: 31-Mar-2008

On a Sun StorEdge 3310/3320/3510/3511 array, logical drives may be lost ... see below:

1. Impact

On a Sun StorEdge 3310/3320/3510/3511 array, logical drives may be lost if "auto-global-spare" is enabled when the drives are in a "NONE BAD" state and the array is rebooted.

Drives in existing logical drives that are marked as "NONE BAD" by the controller become "Global Spares" upon an array reboot. If the value "auto-global-spare" is enabled, any existing logical drives may be lost.

2. Contributing Factors

This issue can occur on the following platforms:

This issue only occurs if "auto-global-spare" is enabled. To determine whether "auto-global-spare" is enabled, use the following command:

  % sccli> show drive-parameters
   spin-up: disabled
   disk-access-delay: 15s
   scsi-io-timeout: 30s
   queue-depth: 32
   polling-interval: disabled
   enclosure-polling-interval: 30s
   auto-detect-swap-interval: disabled
   smart: disabled
   auto-global-spare: disabled

Note: "auto-global-spare" is disabled by default.

3. Symptoms

If the described issue occurs, the drives that the controller flags as "failed" will have their status changed to "NONE BAD" when the array is rebooted. After the reboot, the controller no longer treats these drives as members of any logical drive, because they had been failed. If "auto-global-spare" is enabled, drives that are not used in a logical drive are flagged as "Global Spares".

In the following example, the drives are in a "NONE BAD" state, and the logical drives are dead:

  * disks
  Ch  Id  Size     Speed  LD    Status  IDs
  -----------------------------------------------------------------------
   3   0  0MB      200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601 S/N 000507Q0JTVW
   3   1  0MB      200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601 S/N 000507Q0JTU6
   3   2  0MB      200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601 S/N 000507Q0JTWP
   3   3  0MB      200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601 S/N 000507Q0JTSL
   3   4  0MB      200MB  ld0   BAD     <vendor> MAP3735F SUN72G 1601 S/N 000507Q0JTU7
   3   5  0MB      200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601 S/N 000507Q0JTU5
   3   6  0MB      200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601 S/N 000507Q0JTYB
   3   7  0MB      200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601 S/N 000507Q0JTVE
   3   8  68.37GB  200MB  NONE  USED    <vendor> MAP3735F SUN72G 1601 S/N 000507Q0JTL9
   3   9  68.37GB  200MB  NONE  USED    <vendor> MAP3735F SUN72G 1601 S/N 000507Q0JTLC
   3  10  68.37GB  200MB  NONE  USED    <vendor> MAP3735F SUN72G 1601 S/N 000507Q0JTU1
   3  11  68.37GB  200MB  NONE  USED    <vendor> MAP3735F SUN72G 1601 S/N 000507Q0JTJM

  * logical-drives
  LD   LD-ID     Size      Assigned  Type   Disks  Spare  Failed  Status
  -------------------------------------------------------------------------
  ld0  1AC556BB  272.47GB  Primary   RAID5  3      0      1       Dead
  ld1  7F9563E1  272.47GB  Primary   RAID5  3      0      1       Dead

The drives that were marked "BAD" in logical drives 0 and 1 were moved to "Global Spares" when the array was rebooted. If "auto-global-spare" is enabled, this rules out any logical drive recovery options.

4. Workaround

To work around the described issue, under disk parameters, set the value for "Auto-Global Spare" to disabled.

From the main menu, choose "view and edit Configuration Parameters -> Drive-side Parameters -> Auto-Assign Global Spare Drive" and set the value for "Auto-Global Spare" to "disable".

This is recommended to keep any available drives from automatically becoming Global Spares when they are scanned in.

5. Resolution

There are no further updates planned for this Sun Alert document.

This Sun Alert notification is being provided to you on an "AS IS" basis. This Sun Alert notification may contain information provided by third parties. The issues described in this Sun Alert notification may or may not impact your system(s). Sun makes no representations, warranties, or guarantees as to the information contained herein. ANY AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This Sun Alert notification contains Sun proprietary and confidential information. It is being provided to you pursuant to the provisions of your agreement to purchase services from Sun, or, if you do not have such an agreement, the Sun.com Terms of Use.
This Sun Alert notification may only be used for the purposes contemplated by these agreements.

Copyright 2000-2008 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.

Modification History
31-Mar-2008: no further updates. Resolved.

Previously Published As: 102452

Internal Comments

Important Note: An escalation needs to be raised upon customer questions, and then the Sun Alert can be updated as the CR is updated.

Email the submitters, Sue.Copeland@Sun.COM and Steven.Kent@Sun.COM, if you have questions.

~~~~

After a backend loop failure, described below, when drives get scanned in, the controller will set these newly scanned drives as "Global Spares" if "auto-global-spare" is enabled. Any existing logical drives are then lost, and the only way to recover them is to restore from backup.

The backend loop failure on the array was found to be caused by two bugs. Existing information shows repeated charging cycles, a clear sign of SES bugID 6315439 <http://monaco.sfbay/detail.jsp?cr=6315439>. This bug puts LIPs on the Fibre Channel loop, which triggered Fujitsu drive bugID 6319828 <http://monaco.sfbay/detail.jsp?cr=6319828>. With both bugs in effect, multiple drives are removed from the FC loop. When the array was rebooted, "auto-global-spare", which had been changed to enabled, brought all the removed drives back into the loop as "Global Spares".

Internal Contributor/submitter: Sue.Copeland@Sun.COM
Internal Eng Business Unit Group: NWS (Network Storage)
Internal Eng Responsible Engineer: keqiu.chen@sun.com
Internal Services Knowledge Engineer: jeff.folla@sun.com, karen.edwards@sun.com
Internal Escalation ID: 1-16830721

Internal Sun Alert & FAB Admin Info
Critical Category: Data Loss
Significant Change Date: 2006-11-09
Avoidance: Workaround
Responsible Manager: tejinder.singh@sun.com
Original Admin Info:
[WF 09-Nov-2006, karen: BU did not give Jeff approval to release.
Per the following email I am releasing.

From: Steven Kent
To: Karen.Edwards@Sun.COM
CC: Sue Copeland

Fine by me.
Rgds,
Steve

Karen Edwards wrote:
> Then we must proceed with a Sun Alert.
>
> We can release the Sun Alert with what we know and
> include in the Internal section that an escalation
> needs to be raised upon customer questions and
> then the Sun Alert can be updated as the CR is updated.
>
> thanks,
> karen]

[Resubmitted 05-Nov-2006, dave m: metrics ran out of control on this, setting resubmit date to comply...]

[WF 25-Jul-2006, Jeff Folla: This Sun Alert is being put on-hold per Steve Kent's email above in the desc. section.]

Internal Sun Alert Kasp Legacy ID: 102452

Product_uuid:
  3db30178-43d7-4d85-8bbe-551c33040f0d|Sun StorageTek 3310 SCSI Array
  58553d0e-11f4-11d7-9b05-ad24fcfd42fa|Sun StorageTek 3510 FC Array
  95288bce-56d3-11d8-9e3a-080020a9ed93|Sun StorageTek 3320 SCSI Array
  9fdbb196-73a6-11d8-9e3a-080020a9ed93|Sun StorageTek 3511 SATA Array

Attachments: This solution has no attachment
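
Supplementary example: the check described in section 2 (Contributing Factors) could be scripted against saved "show drive-parameters" output. The following is only a minimal sketch, not part of the original Sun Alert. The function names and SAMPLE_OUTPUT are hypothetical, the sample text is copied from section 2, and the script does not invoke sccli itself. Because this document only shows the value "disabled", the sketch assumes that any other value means the setting may be enabled.

```python
# Minimal sketch: inspect captured "show drive-parameters" text for the
# auto-global-spare setting. All names here are illustrative assumptions.

SAMPLE_OUTPUT = """\
% sccli> show drive-parameters
 spin-up: disabled
 disk-access-delay: 15s
 scsi-io-timeout: 30s
 queue-depth: 32
 polling-interval: disabled
 enclosure-polling-interval: 30s
 auto-detect-swap-interval: disabled
 smart: disabled
 auto-global-spare: disabled
"""


def parse_drive_parameters(output):
    """Return a dict of the 'name: value' lines found in the captured text."""
    params = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # prompt lines and blank lines carry no colon
        params[name.strip()] = value.strip()
    return params


def auto_global_spare_enabled(output):
    """Return True unless auto-global-spare is reported as 'disabled'.

    Assumption: any value other than the literal 'disabled' is treated
    as enabled, which errs on the side of flagging the array.
    """
    params = parse_drive_parameters(output)
    if "auto-global-spare" not in params:
        raise ValueError("auto-global-spare not found in the supplied output")
    return params["auto-global-spare"].lower() != "disabled"


if __name__ == "__main__":
    if auto_global_spare_enabled(SAMPLE_OUTPUT):
        print("auto-global-spare appears enabled: apply the workaround in section 4")
    else:
        print("auto-global-spare is disabled")
```

Treating every value other than "disabled" as enabled is a deliberately conservative choice, since a false alarm costs far less here than a missed one.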