Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-77-1001052.1
Update Date:2011-02-28
Keywords:

Solution Type  Sun Alert Sure

Solution  1001052.1 :   Logical Drives May be Lost if "auto-global-spare" is Enabled  


Related Items
  • Sun Storage 3510 FC Array
  • Sun Storage 3310 Array
  • Sun Storage 3511 SATA Array
  • Sun Storage 3320 SCSI Array
Related Categories
  • GCS>Sun Microsystems>Sun Alert>Criteria Category>Data Loss
  • GCS>Sun Microsystems>Sun Alert>Release Phase>Resolved

Previously Published As
201379


Product
Sun StorageTek 3310 SCSI Array
Sun StorageTek 3510 FC Array
Sun StorageTek 3320 SCSI Array
Sun StorageTek 3511 SATA Array

Bug Id
<SUNBUG: 6423521>

Date of Workaround Release
09-NOV-2006

Date of Resolved Release
31-Mar-2008

On a Sun StorEdge 3310/3320/3510/3511 array, logical drives may be lost ... see below:

1. Impact

On a Sun StorEdge 3310/3320/3510/3511 array, logical drives may be lost if "auto-global-spare" is enabled when the drives are in a "NONE BAD" state and the array is rebooted.

Drives in existing logical drives that the controller has marked as "NONE BAD" become "Global Spares" when the array is rebooted. If the "auto-global-spare" parameter is enabled, any existing logical drives may be lost.


2. Contributing Factors

This issue can occur on the following platforms:

  • Sun StorEdge 3310 SCSI array
  • Sun StorEdge 3320 SCSI array
  • Sun StorEdge 3510 FC array
  • Sun StorEdge 3511 SATA array

This issue only occurs if "auto-global-spare" is enabled. To determine if "auto-global-spare" is enabled, the following command can be used:

    sccli> show drive-parameters
    spin-up: disabled
    disk-access-delay: 15s
    scsi-io-timeout: 30s
    queue-depth: 32
    polling-interval: disabled
    enclosure-polling-interval: 30s
    auto-detect-swap-interval: disabled
    smart: disabled
    auto-global-spare: disabled

Note: "auto-global-spare" is disabled by default.
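
The check above can also be scripted against captured "show drive-parameters" output. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that an enabled setting is reported with the literal value "enabled" (this document shows only the "disabled" case).

```python
def parse_drive_parameters(text):
    """Parse 'name: value' lines from captured sccli output into a dict."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the echoed prompt/command line.
        if not line or line.startswith(("sccli>", "%")):
            continue
        name, sep, value = line.partition(":")
        if sep:
            params[name.strip()] = value.strip()
    return params


def auto_global_spare_enabled(text):
    """Return True only if auto-global-spare is reported as 'enabled'.

    Assumption: the enabled state is printed as the word 'enabled'.
    """
    return parse_drive_parameters(text).get("auto-global-spare") == "enabled"


# Sample taken from the output shown in this document.
SAMPLE = """\
sccli> show drive-parameters
spin-up: disabled
disk-access-delay: 15s
scsi-io-timeout: 30s
queue-depth: 32
polling-interval: disabled
enclosure-polling-interval: 30s
auto-detect-swap-interval: disabled
smart: disabled
auto-global-spare: disabled
"""

if __name__ == "__main__":
    print(auto_global_spare_enabled(SAMPLE))
```

A result of True would indicate that the array is exposed to this issue and that the workaround in section 4 applies.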


3. Symptoms

If the described issue occurs, drives that the controller has flagged as "failed" have their status changed to "NONE BAD" when the array is rebooted. Because these drives were failed, the controller no longer treats them as members of any logical drive after the reboot. If "auto-global-spare" is enabled, drives that do not belong to a logical drive are flagged as "Global Spares".

In the following example, the drives are in a "NONE BAD" state, and the logical drives are dead:

    * disks
    Ch  Id     Size  Speed  LD    Status  IDs
    -----------------------------------------------------------------------
    3    0      0MB  200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601
                                          S/N 000507Q0JTVW
    3    1      0MB  200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601
                                          S/N 000507Q0JTU6
    3    2      0MB  200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601
                                          S/N 000507Q0JTWP
    3    3      0MB  200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601
                                          S/N 000507Q0JTSL
    3    4      0MB  200MB  ld0   BAD     <vendor> MAP3735F SUN72G 1601
                                          S/N 000507Q0JTU7
    3    5      0MB  200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601
                                          S/N 000507Q0JTU5
    3    6      0MB  200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601
                                          S/N 000507Q0JTYB
    3    7      0MB  200MB  NONE  BAD     <vendor> MAP3735F SUN72G 1601
                                          S/N 000507Q0JTVE
    3    8  68.37GB  200MB  NONE  USED    <vendor> MAP3735F SUN72G 1601
                                          S/N 000507Q0JTL9
    3    9  68.37GB  200MB  NONE  USED    <vendor> MAP3735F SUN72G 1601
                                          S/N 000507Q0JTLC
    3   10  68.37GB  200MB  NONE  USED    <vendor> MAP3735F SUN72G 1601
                                          S/N 000507Q0JTU1
    3   11  68.37GB  200MB  NONE  USED    <vendor> MAP3735F SUN72G 1601
                                          S/N 000507Q0JTJM
    * logical-drives
    LD   LD-ID         Size  Assigned  Type   Disks  Spare  Failed  Status
    -------------------------------------------------------------------------
    ld0  1AC556BB  272.47GB  Primary   RAID5      3      0       1  Dead
    ld1  7F9563E1  272.47GB  Primary   RAID5      3      0       1  Dead

The drives that were marked "BAD" in logical drives 0 and 1 were moved to "Global Spares" when the array was rebooted. If "auto-global-spare" is enabled, this reassignment rules out every option for recovering the logical drives.
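
The symptom pattern above (disks listed with LD "NONE" and status "BAD", alongside logical drives reported as "Dead") can be picked out of a captured listing with a short script. The Python sketch below is a rough illustration, not a supported tool: the function names are hypothetical, and it assumes the whitespace-separated column order shown in the example (Ch, Id, Size, Speed, LD, Status) with a plain numeric channel field.

```python
def find_orphaned_bad_disks(text):
    """Return (channel, id) pairs for disks with LD 'NONE' and status 'BAD'."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        # Disk rows start with a numeric channel and ID; header, separator,
        # and S/N continuation lines do not, so they are skipped.
        if len(fields) >= 6 and fields[0].isdigit() and fields[1].isdigit():
            ld, status = fields[4], fields[5]
            if ld == "NONE" and status == "BAD":
                hits.append((int(fields[0]), int(fields[1])))
    return hits


def find_dead_logical_drives(text):
    """Return the names of logical drives whose last column is 'Dead'."""
    dead = []
    for line in text.splitlines():
        fields = line.split()
        # Logical drive rows begin with a lowercase 'ldN' name.
        if fields and fields[0].startswith("ld") and fields[-1] == "Dead":
            dead.append(fields[0])
    return dead


# Abbreviated sample built from rows of the listing in this document.
SAMPLE = """\
* disks
Ch  Id        Size   Speed  LD     Status      IDs
3   0        0MB   200MB  NONE     BAD      <vendor> MAP3735F SUN72G 1601
S/N 000507Q0JTVW
3   4        0MB   200MB  ld0      BAD      <vendor> MAP3735F SUN72G 1601
S/N 000507Q0JTU7
3   8     68.37GB   200MB  NONE     USED    <vendor> MAP3735F SUN72G 1601
S/N 000507Q0JTL9
* logical-drives
LD    LD-ID        Size  Assigned    Type    Disks Spare  Failed Status
ld0   1AC556BB    272.47GB  Primary    RAID5     3    0    1       Dead
ld1   7F9563E1    272.47GB  Primary    RAID5     3    0    1       Dead
"""

if __name__ == "__main__":
    print(find_orphaned_bad_disks(SAMPLE))
    print(find_dead_logical_drives(SAMPLE))
```

Listings that print the channel in another form would need the row test adjusted; the sketch only covers the layout reproduced in this document.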


4. Workaround

To work around the described issue, set the value for "Auto-Global Spare" to disabled under the drive-side parameters.

From the main menu:

    Choose "view and edit Configuration Parameters -> Drive-side Parameters ->
    Auto-Assign Global Spare Drive," and set the value for "Auto-Global Spare"
    to "disable".

Disabling this setting is recommended so that available drives do not automatically become Global Spares when they are scanned in.


5. Resolution

There are no further updates planned for this Sun Alert document. If you need additional assistance regarding this issue, please contact Sun Services.


This Sun Alert notification is being provided to you on an "AS IS" basis. This Sun Alert notification may contain information provided by third parties. The issues described in this Sun Alert notification may or may not impact your system(s). Sun makes no representations, warranties, or guarantees as to the information contained herein. ANY AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This Sun Alert notification contains Sun proprietary and confidential information. It is being provided to you pursuant to the provisions of your agreement to purchase services from Sun, or, if you do not have such an agreement, the Sun.com Terms of Use. This Sun Alert notification may only be used for the purposes contemplated by these agreements.

Copyright 2000-2008 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.


Modification History
31-Mar-2008: no further updates. Resolved.


Previously Published As
102452

Internal Comments


Important Note:



An escalation needs to be raised when customers ask about this issue; the Sun Alert can then be updated as the CR is updated.



Email the submitters, Sue.Copeland@Sun.COM and Steven.Kent@Sun.COM, if you have questions.



                       ~~~~



After a backend loop failure (described below), the controller sets any drives that are scanned back in as "Global Spares" if "auto-global-spare" is enabled. Any logical drives that existed before the failure are then lost, and the only way to recover them is to restore from backup.



The backend loop failure on the array was found to be caused by two bugs. Existing information shows repeated charging cycles, a clear sign of SES bugID 6315439 <http://monaco.sfbay/detail.jsp?cr=6315439>. This bug puts LIPs on the Fibre Channel loop, which triggered Fujitsu drive bugID 6319828 <http://monaco.sfbay/detail.jsp?cr=6319828>.



With both bugs in effect, multiple drives were removed from the FC loop. When the array was rebooted, the "auto-global-spare" setting, which had been changed to enabled, brought all of the removed drives back into the loop as "Global Spares".


Internal Contributor/submitter
Sue.Copeland@Sun.COM

Internal Eng Business Unit Group
NWS (Network Storage)

Internal Eng Responsible Engineer
keqiu.chen@sun.com

Internal Services Knowledge Engineer
jeff.folla@sun.com, karen.edwards@sun.com

Internal Escalation ID
1-16830721

Internal Sun Alert & FAB Admin Info
Critical Category: Data Loss
Significant Change Date: 2006-11-09
Avoidance: Workaround
Responsible Manager: tejinder.singh@sun.com
Original Admin Info: [WF 09-Nov-2006, karen: BU did not give Jeff approval to release. Per the following email I am releasing.
From: Steven Kent
To: Karen.Edwards@Sun.COM
CC: Sue Copeland, jeff folla, Susan Chang, Tejinder Singh
Fine by me.

Rgds, Steve

Karen Edwards wrote:

> Then we must proceed with a Sun Alert.
>
> We can release the Sun Alert with what we know and
> include in the Internal section that an escalation
> needs to be raised upon customer questions and
> then the Sun Alert can be updated as the CR is updated.
>
> thanks,
> karen]
[Resubmitted 05-Nov-2006, dave m: metrics ran out of control on this, setting resubmit date to comply...]

[WF 25-Jul-2006, Jeff Folla: This Sun Alert is being put on-hold per Steve Kent's email above in the desc. section.]
Internal Sun Alert Kasp Legacy ID
102452

Product_uuid
3db30178-43d7-4d85-8bbe-551c33040f0d|Sun StorageTek 3310 SCSI Array
58553d0e-11f4-11d7-9b05-ad24fcfd42fa|Sun StorageTek 3510 FC Array
95288bce-56d3-11d8-9e3a-080020a9ed93|Sun StorageTek 3320 SCSI Array
9fdbb196-73a6-11d8-9e3a-080020a9ed93|Sun StorageTek 3511 SATA Array

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.