Hosts Utilizing Sun StorEdge 6920 Arrays May Lose Access to the Array During a Volume Snapshot Device Failure

Asset ID:	1-77-1000469.1
Update Date:	2011-02-28
Keywords:

Solution Type Sun Alert Sure

Solution 1000469.1 : Hosts Utilizing Sun StorEdge 6920 Arrays May Lose Access to the Array During a Volume Snapshot Device Failure

Related Items


Sun Storage 6920 System

Related Categories


GCS>Sun Microsystems>Sun Alert>Criteria Category>Availability
 GCS>Sun Microsystems>Sun Alert>Release Phase>Resolved

Previously Published As
200607

Product
Sun StorageTek 6920 System

Bug Id
<SUNBUG: 6293457>

Date of Workaround Release
09-SEP-2005

Date of Resolved Release
03-JAN-2007

Impact

Hosts utilizing Sun StorEdge 6920 arrays may lose access to the array during a volume snapshot device failure.

Contributing Factors

This issue can occur on the following platforms:

Sun StorEdge 6920 with Data Services Platform (DSP) 1000 firmware earlier than version 3.0.1.13

Symptoms

Should the described issue occur, StorADE for the Sun StorEdge 6920 array will report messages similar to the following:

    May 18 10:24:55 dsp00  05/18/2005 09:24:47 LOG_CRIT     (SNAPSHOT: 0-0)
    Snapshot (vol/SRVol-02) is 100 percent full.
    May 18 10:30:11 dsp00  05/18/2005 09:30:02 LOG_CRIT     (VSM: 4-4)
    Snapshot failure (snpIOIntentInstWrite/snpIOContextCopyOpWrite), WWN =
    fffffff0-0002c400-00002c40-00000706 (dev = 0x821

The error messages above may be followed by application failure messages.
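
As a rough aid for spotting these symptoms, the sketch below searches a saved log file for the two message texts quoted above. The function name and the log file path are hypothetical; this Sun Alert does not say where StorADE stores its messages, so the caller must supply the path. It assumes a POSIX grep that supports -E.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a log file that match the two
# symptom messages quoted in this Sun Alert. The log path is an argument
# because no default location is documented here.
find_snapshot_symptoms() {
    logfile=$1
    # Pattern 1: "Snapshot (<name>) is 100 percent full"
    # Pattern 2: "Snapshot failure"
    grep -E 'Snapshot \(.*\) is 100 percent full|Snapshot failure' "$logfile"
}
```

For example, find_snapshot_symptoms /path/to/saved.log prints any matching lines and returns a non-zero status if none are found.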

Workaround

To work around the described issue, increase the percentage change that is allowed for the snapshot by using either the "Configuration Services" graphical user interface (GUI) or the sscs(1M) command line. This process requires that the existing snapshot and snapshot reserve space be deleted.

Note: The workaround below does not address snapshot failures that result from a hardware component failure in the Snapshot Reserve Pool.

For Configuration Services:

1. Enter the Service Processor URL in a web browser.

2. Log in to the Service Processor GUI.

3. Click on the "Configuration Services" link.

4. Click on the "Logical" Tab.

5. Click on the "Volumes" Tab.

6. Click on the source volume name for the snapshot(s) to get to the "Volume Details" page for this volume.

7. Scroll down to the "Additional Information" section and click on the "Snapshots" link.

8. Click on each snapshot to access the volume details for that snapshot, and then do the following:

a. Click "Unmap" if highlighted

b. Click "Delete"

9. Go back to the "Volume Details" page for the source volume, and click the "Delete Snapshot Reserve" button.

10. Click on the "Snapshot" button to open the "New Snapshot Creation Wizard".

11. For Step 1, enter the new snapshot name and description in the appropriate fields and click "Next".

12. For Step 2, enter the number of snapshots, set the % difference to 100, and click "Next".

Note: A % difference of 100 corresponds to a level (-L) of "full" on the sscs(1M) command line and allocates storage space equal to the size of the specified source volume. The amount of space is calculated as:

    Snapshot Count * Size of Source = Snapshot Reserve Space

13. If you wish to map the snapshot to an initiator, do so in Step 3 and click "Next".

14. Review and click "Finish".

This will create a single snapshot. To create another snapshot using the GUI, click on the "Snapshot" button, and repeat the above steps.

Note: The "Configuration Services" help menu provides additional information about the snapshot utility.

    Help >> Configuring System >> Configuring Volume Snapshot >> Managing Volume Snapshots

From the sscs(1M) command line:

1. Log in to sscs(1M):

    % sscs login -h <management host> -u storage

2. Unmap the snapshot volume:

    % sscs unmap -S <domain> volume <snapshot volume>

3. Delete the snapshot volume:

    % sscs delete -S <domain> volume <snapshot volume>

4. Delete the snapshot reserve:

    % sscs snapshot -r volume <source_volume>

5. Create a new snapshot reserve with 100% space:

    % sscs snapshot -v <source_volume> -C <#snapshots> -L full volume <snapshot volume name>

This will create a single snapshot with the above name. For additional snapshots, do the following:

    % sscs snapshot -v <source volume> volume <new_snapshot_name>

Note: As indicated, a level (-L) of "full" allocates storage space equal to the size of the specified source volume. The amount of space is calculated as:

    Snapshot Count * Size of Source = Snapshot Reserve Space
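
As a worked instance of this formula, the sketch below multiplies a snapshot count by a source volume size. The function name and the sample values (three snapshots of a 50 GB source volume) are hypothetical and serve only to illustrate the arithmetic.

```shell
# Hypothetical helper: reserve_space COUNT SIZE
# Returns Snapshot Count * Size of Source. The result is in whatever
# unit SIZE is given in (GB in the example below).
reserve_space() {
    echo $(( $1 * $2 ))
}

reserve_space 3 50   # three snapshots of a 50 GB source: prints 150
```

In this example, a full-level reserve for three snapshots of a 50 GB volume would need 150 GB of space.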

Use the man page for sscs(1M) to supplement this procedure, if required.
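
The command-line steps above can be collected into a small dry-run sketch. It only prints the sscs commands from steps 1 through 5 in order and does not execute them, because deleting a snapshot and its reserve is destructive and each command should be reviewed first. The function name and all argument values are hypothetical placeholders; the command forms are copied from the steps above.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the sscs commands from steps 1-5.
# Arguments: management host, domain, source volume, existing snapshot
# volume, new snapshot volume name, number of snapshots.
print_snapshot_rebuild() {
    mgmt_host=$1
    domain=$2
    source_vol=$3
    old_snap=$4
    new_snap=$5
    count=$6
    echo "sscs login -h $mgmt_host -u storage"
    echo "sscs unmap -S $domain volume $old_snap"
    echo "sscs delete -S $domain volume $old_snap"
    echo "sscs snapshot -r volume $source_vol"
    echo "sscs snapshot -v $source_vol -C $count -L full volume $new_snap"
}

# Example with made-up names; nothing is run against the array.
print_snapshot_rebuild mgmthost mydomain vol1 snap1 snap1_new 1
```

Printing the commands first makes it easy to compare them against the sscs(1M) man page before running them by hand.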

Resolution

This issue is addressed in the following releases:

Sun StorEdge 6920 with Data Services Platform (DSP) 1000 firmware version 3.0.1.13 or later

Modification History
Date: 03-JAN-2007

State: Resolved
Updated Contributing Factors and Relief/Workaround sections

Previously Published As
101912
Internal Comments

This issue occurs because the ASC/ASCQ code returned for the snapshot failure causes the host pathing software to retry and fail in less time than the DSP requires to reconfigure the ports after the snapshot failure.

We will now return a different, more generic code, which will cause the host to time out and retry at a much longer interval (up to ten minutes).

Internal Contributor/submitter
Curtis.DeCotis@sun.com

Internal Eng Business Unit Group
NWS (Network Storage)

Internal Eng Responsible Engineer
alex.iannicelli@sun.com

Internal Services Knowledge Engineer
jeff.folla@sun.com

Internal Escalation ID
1-9201973

Internal Sun Alert Kasp Legacy ID
101912

Internal Sun Alert & FAB Admin Info
Critical Category: Availability ==> HA-Failure
Significant Change Date: 2005-09-09, 2007-01-03
Avoidance: Upgrade, Workaround
Responsible Manager: peter.s.thompson@sun.com
Original Admin Info: [WF 07-Sep-2005, Jeff Folla: Sent for release.]

[WF 07-Sep-2005, Jeff Folla: Sent for review.]

Product_uuid
67794720-356d-11d7-8ef2-ce2ac2bc9136|Sun StorageTek 6920 System
