Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-77-1000043.1
Update Date: 2011-02-22
Keywords:

Solution Type  Sun Alert Sure

Solution  1000043.1 :   Changing the Array Volume-to-LUN Mappings May Cause Array Controllers to Initiate a Boot-Loop  


Related Items
  • Sun Storage 6540 Array
  • Sun Storage 5310 NAS Appliance
  • Sun Storage 6130 Array
  • Sun Storage 6140 Array
  • Sun Storage 5320 NAS Appliance
Related Categories
  • GCS>Sun Microsystems>Sun Alert>Criteria Category>Availability
  • GCS>Sun Microsystems>Sun Alert>Release Phase>Resolved

Previously Published As
200057


Product
Sun StorageTek 5310 NAS Appliance
Sun StorageTek 5320 NAS Appliance
Sun StorageTek 6130 Array
Sun StorageTek 6140 Array
Sun StorageTek 6540 Array

Bug Id
<SUNBUG: 6542513>

Date of Preliminary Release
03-MAY-2007

Date of Resolved Release
21-Oct-2008

Changing the Array Volume-to-LUN Mappings May Cause Array Controllers to Initiate a Boot-Loop

1. Impact

For Sun StorEdge/StorageTek arrays with certain firmware, changing the array volume-to-LUN mappings may cause the array controllers to enter a boot loop, resulting in immediate loss of access to the arrays.

2. Contributing Factors


This issue can occur on the following platforms:
  • Sun StorEdge 5310 with patch 119351-10 or later
  • Sun StorageTek 5320 with patch 119352-05 or later
  • Sun StorEdge 6130 with firmware 06.19.23.10 or later
  • Sun StorageTek 6140 with firmware 06.19.23.10 or later
  • Sun StorageTek 6540 with firmware 06.19.23.10 or later

Note: Firmware version 06.19.23.10 is released with Sun StorageTek Common Array Manager (CAM) 5.1.
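
The firmware threshold listed above for the 6130, 6140, and 6540 arrays can be checked mechanically. The following is a minimal sketch, assuming the firmware level is available as a dotted four-field string such as 06.19.23.10. The function names are hypothetical and are not part of CAM or any Sun tool, and the 5310/5320 patch levels are not covered.

```python
# First affected firmware level for the 6130/6140/6540, per this Sun Alert.
AFFECTED_FROM = "06.19.23.10"

def parse_firmware(version):
    """Split a dotted firmware string such as '06.19.23.10' into a tuple of ints."""
    return tuple(int(field) for field in version.strip().split("."))

def is_affected(version, threshold=AFFECTED_FROM):
    """Return True if the firmware level is at or above the affected threshold."""
    return parse_firmware(version) >= parse_firmware(threshold)

if __name__ == "__main__":
    # "06.16.00.00" is a made-up level used only to show the negative case.
    for fw in ("06.19.23.10", "06.19.25.16", "06.16.00.00"):
        print(fw, "affected" if is_affected(fw) else "not affected")
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes when a field is not zero-padded.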

This issue appears after the volume mappings on the array are changed through the host mapping interface, or after Remote Replication sets are created or deleted.


3. Symptoms

Data hosts using the impacted arrays will see evidence of a loss of access in the host messages file. The messages vary from platform to platform, but will indicate:
  • SCSI retries
  • Fibre Channel resets
  • Devices disappearing from the Fibre Channel switch fabric
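
As a rough aid for spotting the symptoms listed above, a host messages file (for example /var/adm/messages on Solaris) can be scanned for matching lines. The following is only a sketch: the regular expressions are assumptions, because the actual message text varies by operating system and HBA driver, and the function name is hypothetical.

```python
import re

# Hypothetical patterns: real message text differs by OS and HBA driver,
# so adjust these to what the data host actually logs.
PATTERNS = {
    "scsi_retry": re.compile(r"scsi.*retr(y|ies|ying)", re.IGNORECASE),
    "fc_reset": re.compile(r"(fibre channel|\bfc\b|\bfcp\b).*reset", re.IGNORECASE),
    "device_gone": re.compile(r"(offline|disappeared|link down)", re.IGNORECASE),
}

def count_symptoms(lines):
    """Count lines matching each symptom pattern in an iterable of log lines."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

A typical call would pass an open file object, for example `count_symptoms(open("/var/adm/messages"))`. Nonzero counts are only a hint and should be confirmed against the serial console output.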

Management hosts attempting to get status information from the array will receive a communication error. The error can vary, since the controller boot cycle can allow brief connections via the user interface. Communication errors may appear during any activity.

For 5310 and 5320 NAS Appliances, users will be unable to access shares or NFS-exported file systems.

The best way to identify the boot loop issue is to connect to the array via the serial port and observe any messages that appear. The message signatures are similar to (but not limited to) the following, and are easily identified by the scrolling nature of the messages:

    03/30/07-14:45:51 (GMT) (tRAID): NOTE:  CCM: MirrorInit, clearing MOS (2)
    03/30/07-14:45:56 (GMT) (tRAID): NOTE:  Volume discovery...
    03/30/07-14:45:56 (GMT) (tRAID): NOTE:  Scanning usable NATIVE drives
    03/30/07-14:45:59 (GMT) (ssmTimer): NOTE:  CCM BMG: Battery is OK
    03/30/07-14:46:01 (GMT) (tRAID): NOTE:  evf - Peering Disabled (Alt in SOD)
    03/30/07-14:46:01 (GMT) (tRAID): WARN:
    -=<###>=-
    Attaching interface lo0...done
    Adding 7246 symbols for standalone.

OR:

    03/30/07-14:47:44 (GMT) (tRAID): WARN:  evf - Bad Check In
    TransportException Error
    03/30/07-14:47:44 (GMT) (iacTask1): NOTE:  RPM: Unable to find RemoteResponse - msgId: 18
    03/30/07-14:47:44 (GMT) (tRAID): WARN:
    -=<###>=-
    Attaching interface lo0...done
    Adding 7246 symbols for standalone.

OR:

    03/30/07-14:49:04 (GMT) (tRAID): NOTE:  rpaBattMgr instantiation complete.
    03/30/07-14:49:07 (GMT) (tRAID): NOTE:  Inter-Controller Communication Channels Opened
    03/30/07-14:49:07 (GMT) (utlTimer): NOTE:  fcnChannelReport ==> ~0 ~1
    -=<###>=-
    Attaching interface lo0...done
    Adding 7246 symbols for standalone.
    -> WARNING: Reset by alternate controller
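
Because each excerpt above ends with the same boot banner, a saved serial console capture can be checked for repeated boots. This is a minimal sketch, assuming the capture is available as text. The function names and the `min_boots` threshold are hypothetical and do not come from this Sun Alert.

```python
import re

# Boot banner line seen in each excerpt above. The symbol count may differ
# between firmware builds, so any number is accepted.
BOOT_BANNER = re.compile(r"Adding \d+ symbols for standalone")

def count_boots(capture):
    """Count controller boot banners in a serial console capture."""
    return sum(1 for line in capture.splitlines() if BOOT_BANNER.search(line))

def looks_like_boot_loop(capture, min_boots=3):
    """Heuristic: several boot banners in one capture suggest a boot loop.

    min_boots is an arbitrary assumption, not a value from this Sun Alert.
    """
    return count_boots(capture) >= min_boots
```

A single banner is expected after any normal controller restart, so only repeated banners in a short capture are treated as suspicious.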

4. Workaround

Update to firmware 06.19.25.16 as soon as possible, as it includes additional debugging for this issue.

Note: Reverting to an earlier revision of the firmware is not advised, since doing so would lose the other fixes bundled into this version of the firmware.

Note: With the above-mentioned update, the array captures a Diagnostic Data Collection when the condition occurs. The data can be collected after the issue is reported and corrected. By design, the data collection turns on the global fault LED on the array when data is available for download. If your system has experienced this issue, capture a Diagnostic Data Collection and contact Sun Support.


5. Resolution


Please see the "Workaround" section above.

This Sun Alert notification is being provided to you on an "AS IS" basis. This Sun Alert notification may contain information provided by third parties. The issues described in this Sun Alert notification may or may not impact your system(s). Sun makes no representations, warranties, or guarantees as to the information contained herein. ANY AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This Sun Alert notification contains Sun proprietary and confidential information. It is being provided to you pursuant to the provisions of your agreement to purchase services from Sun, or, if you do not have such an agreement, the Sun.com Terms of Use. This Sun Alert notification may only be used for the purposes contemplated by these agreements.
Copyright 2000-2008 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.

Modification History
16-Oct-2007: Updated Workaround section
21-Oct-2008: Updated Workaround section for Resolution; Resolved



Previously Published As
102904

Internal Contributor/submitter
curtis.decotis@sun.com, jose.sousa@sun.com

Internal Eng Business Unit Group
NWS (Network Storage)

Internal Eng Responsible Engineer
curtis.decotis@sun.com

Internal Services Knowledge Engineer
david.mariotto@sun.com

Internal Escalation ID
1-21426863, 1-21504582

Internal Sun Alert Kasp Legacy ID
102904

Internal Sun Alert & FAB Admin Info
Critical Category: Availability ==> Severe
Significant Change Date: 2007-05-03
Avoidance: None
Responsible Manager: larry.robinson@sun.com
Original Admin Info: [WF 03-May-2007, dave m: review ok, send for release]
[WF 02-May-2007, dave m: draft created, waiting on more info, send for review]
21-Oct-2008, david m: Rewrite Workaround as Resolution, republish Resolved


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.