Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-77-1000356.1
Update Date: 2011-02-25
Keywords:

Solution Type  Sun Alert Sure

Solution  1000356.1 :   Certain sscs(1M) Commands, Array/StorEdge 3900SL CLI Commands, or Certain StorEdge 3900SL/6320/6130 GUI Actions May Cause Loss of Connectivity to a Host(s)  


Related Items
  • Sun Storage 6320 System
  • Sun Storage T3+ Array
  • Sun Storage T3 Array
  • Sun Storage 6120 Array
  • Sun Storage 6130 Array
  • Sun Storage 3910 Array
Related Categories
  • GCS>Sun Microsystems>Sun Alert>Criteria Category>Availability
  • GCS>Sun Microsystems>Sun Alert>Release Phase>Resolved

Previously Published As
200478


Product
Sun StorageTek 3900 Series
Sun StorageTek T3 Array
Sun StorageTek T3+ Array
Sun StorageTek 6120 Array
Sun StorageTek 6320 System
Sun StorageTek 6130 Array

Bug Id
<SUNBUG: 6197128>, <SUNBUG: 6202414>

Date of Workaround Release
14-JAN-2005

Date of Resolved Release
01-JUN-2005

Impact

Under rare conditions, certain sscs(1M) commands, array/StorEdge 3900SL CLI commands, or StorEdge 3900SL/6320/6130 GUI actions used to manage a Sun StorEdge 3900SL/6120/6130/6320/T3+ Array may cause loss of connectivity to one or more hosts. The issue applies when the array is attached via certain Fibre Channel (FC) switches running certain switch firmware releases (both listed below), and the host uses Host Bus Adapters (HBAs) with the Sun QLC HBA driver. Under these conditions the commands can cause a path failure, which could lead to a complete loss of host access to the array.


Contributing Factors

This issue can occur on the following platforms:

SPARC Platform

  • Sun StorEdge 3900SL Array
  • Sun StorEdge 6120/6130/6320 Arrays
  • Sun StorEdge T3+ Array

connected to the following switch models:

  • SG-XSWBRO3200 - 3200 switch with 8 ports with FabOS 3.1.2a (as delivered with firmware patch 115360-03) and without patch 115360-05
  • SG-XSWBRO3200 - 3200 switch with 8 ports with FabOS 3.1.3 (as delivered with firmware patch 115360-04) and without patch 115360-05
  • SG-XSWBRO3800 - 3800 switch with 16 ports with FabOS 3.1.2a (as delivered with firmware patch 115360-03) and without patch 115360-05
  • SG-XSWBRO3800 - 3800 switch with 16 ports with FabOS 3.1.3 (as delivered with firmware patch 115360-04) and without patch 115360-05
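
As a rough aid when auditing a fabric against the switch list above, the check can be sketched in Python. This is a minimal sketch and not a Sun tool: the function name, its parameters, and the choice to pass the installed revision of patch 115360 as an integer are assumptions made for illustration. Only the models, FabOS versions, and patch revision named in the list are encoded.

```python
# Values copied from the switch list in this alert.
AFFECTED_MODELS = {"SG-XSWBRO3200", "SG-XSWBRO3800"}
AFFECTED_FABOS = {"3.1.2a", "3.1.3"}
FIXED_PATCH_REVISION = 5  # patch 115360-05


def is_affected_switch(model: str, fabos_version: str, patch_revision: int) -> bool:
    """Return True if a switch matches a configuration listed in this alert.

    Hypothetical helper. patch_revision is the installed revision of
    patch 115360, for example 3 for 115360-03.
    """
    if model not in AFFECTED_MODELS:
        return False
    if patch_revision >= FIXED_PATCH_REVISION:
        return False
    return fabos_version in AFFECTED_FABOS


if __name__ == "__main__":
    print(is_affected_switch("SG-XSWBRO3800", "3.1.3", 4))   # True
    print(is_affected_switch("SG-XSWBRO3800", "3.1.3", 5))   # False
```

A switch that matches still needs the other conditions in this section (array type, QLC HBA driver, and one of the listed commands) before the issue can occur.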

The described issue may occur in the configurations described above when any of the following sscs(1M) commands or array/StorEdge 3900SL CLI commands is issued:

sscs(1M) commands:

  • sscs modify volgroup
  • sscs create volume
  • sscs create initiator
  • sscs create pool
  • sscs modify array
  • sscs add initgroup
  • sscs map

StorEdge 6120/T3+ telnet(1) commands:

  • lun perm
  • hwwn
  • volslice
  • vol mount
  • sys mp_support

StorEdge 3900SL Service Processor (SP) CLI commands:

The following menu options in the program "/opt/SUNWsecfg/runsecfg":

  • 3) Configure Sun StorEdge T3+ Array(s)
  • 6) Modify Sun StorEdge T3+ Array Sys Parameters
  • 8) Manage Sun StorEdge T3+ Array LUN Slicing
  • 9) Manage Sun StorEdge T3+ Array LUN Masking

The following commands from the directory "/opt/SUNWsecfg/bin" on the Service Processor (SP):

  • createt3group
  • addtot3group
  • delfromt3group
  • rmt3group
  • createt3slice
  • rmt3slice
  • modifyt3config
  • savet3config
  • modifyt3params
  • sett3lunperm

Notes:

1. StorEdge 3900SL/6130/6320 GUI actions equivalent to these commands may also cause the issue to occur.

2. The following read-only commands will not trigger the described issue:

  • lun perm list
  • hwwn list
  • hwwn listgrp
  • volslice list

3. The described issue may be encountered only under the above mentioned conditions.
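
The command lists above lend themselves to a simple pre-flight check before a change window. The following Python sketch is illustrative only: the function names are hypothetical, matching is done on whole leading words of the command line, and the runsecfg menu options and GUI actions are not covered because they are not command lines.

```python
import os

# Command prefixes copied from the lists in this section.
TRIGGER_COMMANDS = [
    "sscs modify volgroup", "sscs create volume", "sscs create initiator",
    "sscs create pool", "sscs modify array", "sscs add initgroup", "sscs map",
    "lun perm", "hwwn", "volslice", "vol mount", "sys mp_support",
    "createt3group", "addtot3group", "delfromt3group", "rmt3group",
    "createt3slice", "rmt3slice", "modifyt3config", "savet3config",
    "modifyt3params", "sett3lunperm",
]
# Read-only commands listed in note 2; these do not trigger the issue.
READ_ONLY_COMMANDS = ["lun perm list", "hwwn list", "hwwn listgrp", "volslice list"]


def _starts_with(words, prefix):
    """True if the command's leading words equal the words of prefix."""
    parts = prefix.split()
    return words[:len(parts)] == parts


def may_trigger_issue(command_line: str) -> bool:
    """Hypothetical helper: is this command on the alert's trigger list?"""
    words = command_line.split()
    if not words:
        return False
    # Accept a full path such as /opt/SUNWsecfg/bin/createt3group.
    words[0] = os.path.basename(words[0])
    if any(_starts_with(words, ro) for ro in READ_ONLY_COMMANDS):
        return False
    return any(_starts_with(words, cmd) for cmd in TRIGGER_COMMANDS)


if __name__ == "__main__":
    for cmd in ("sscs create volume", "lun perm list", "/opt/SUNWsecfg/bin/rmt3slice"):
        print(cmd, "->", may_trigger_issue(cmd))
```

The read-only list is checked first so that "hwwn list" is not caught by the broader "hwwn" prefix.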


Symptoms

If the described issue occurs on hosts running the Sun "fp" and "mpxio" drivers, the following will be displayed in the array syslog: "PLOGI timeout" messages, and host messages from STMS reporting that LUNs are being offlined and that the paths allowing access to those LUNs are now degraded due to the loss of one path.

Example message with FabOS 3.1.2a:

    [date time hostname] fp: [ID 517869 kern.info] NOTICE: fp(1): PLOGI to 10f00 failed state=Timeout,
    reason=Hardware Error...
    [date time hostname]       PLOGI to D_ID=0x10f00 failed: State:Timeout, Reason:Hardware Error. Giving up
    [date time hostname] scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp1):
    [date time hostname]       offlining lun=b (trace=0), target=10f00 (trace=2800101)
    [date time hostname] mpxio: [ID 669396 kern.info] /scsi_vhci/ssd@g60003ba4e7fbe00041862b4700047ffc (ssd1)
    multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp1) to target address: 20030003ba4e7fbe,b
    is offline. Load balancing: round-robin

Note: The above are examples only. On each system, the LUN numbers, target numbers, and device paths will vary. To confirm that this issue is being seen, check the target trace value ("trace=2800101" above) and the overall sequence of events: many LUNs fail over and a path is reported as "offline" after any of the commands listed in the Contributing Factors section is performed.
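
The check described in this note can be sketched as a small log scan. This is a minimal sketch, assuming the messages have been saved to a text string; the function names and regular expressions are hypothetical and were written against the example messages above only, and treating "2800101" as the value to look for follows the note rather than any published specification.

```python
import re

# Patterns written against the example messages in this alert (assumptions).
PLOGI_RE = re.compile(r"PLOGI to \S+ failed.*Timeout")
OFFLINE_RE = re.compile(r"offlining lun=(\w+) \(trace=\w+\), target=(\w+) \(trace=(\w+)\)")
DEGRADED_RE = re.compile(r"multipath status: degraded, path (\S+)")
TARGET_TRACE = "2800101"  # target trace value called out in the note

SAMPLE = """\
[date time hostname] fp: [ID 517869 kern.info] NOTICE: fp(1): PLOGI to 10f00 failed state=Timeout,
reason=Hardware Error...
[date time hostname]       PLOGI to D_ID=0x10f00 failed: State:Timeout, Reason:Hardware Error. Giving up
[date time hostname] scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp1):
[date time hostname]       offlining lun=b (trace=0), target=10f00 (trace=2800101)
[date time hostname] mpxio: [ID 669396 kern.info] /scsi_vhci/ssd@g60003ba4e7fbe00041862b4700047ffc (ssd1)
multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp1) to target address: 20030003ba4e7fbe,b
is offline. Load balancing: round-robin
"""


def scan_messages(text: str) -> dict:
    """Collect the three kinds of message described in the Symptoms section."""
    return {
        "plogi_timeouts": len(PLOGI_RE.findall(text)),
        "offlined": OFFLINE_RE.findall(text),        # (lun, target, target trace)
        "degraded_paths": DEGRADED_RE.findall(text),
    }


def matches_signature(result: dict) -> bool:
    """True if PLOGI timeouts, the trace value, and a degraded path all appear."""
    has_trace = any(trace == TARGET_TRACE for _, _, trace in result["offlined"])
    return bool(result["plogi_timeouts"] and has_trace and result["degraded_paths"])


if __name__ == "__main__":
    found = scan_messages(SAMPLE)
    print(found["degraded_paths"])
    print(matches_signature(found))
```

A match is only a hint. The sequence of events, in particular whether one of the listed commands was run shortly beforehand, still has to be confirmed by hand.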

Alternatively, on a switch running FabOS 3.1.3, the switch port to the host and storage may become unresponsive. To identify this, issue a "cfgadm -al -o show_FCP_dev" command: devices reached via this path may be marked with a Condition of "failing", or the controller itself may revert to a Type of "fc". The switch CLI also shows symptoms: if "switchshow" is run, the command does not return and "freezes" on the port that was connected to the storage being manipulated.

Example message with FabOS 3.1.3:

    Ap_Id                          Type         Receptacle   Occupant     Condition
    c3                             fc-fabric    connected    configured   unknown
    c3::20030003ba4e7fbe,0         disk         connected    configured   failing
    c3::20030003ba4e7fbe,1         disk         connected    configured   failing
    c3::20030003ba4e7fbe,2         disk         connected    configured   failing
    c3::20030003ba4e7fbe,3         disk         connected    configured   failing
    c3::20030003ba4e7fbe,4         disk         connected    configured   failing
    c3::20030003ba4e7fbe,5         disk         connected    configured   failing
    c3::20030003ba4e7fbe,6         disk         connected    configured   failing
    c3::20030003ba4e7fbe,7         disk         connected    configured   failing
    c3::20030003ba4e7fbe,8         disk         connected    configured   failing
    c3::20030003ba4e7fbe,9         disk         connected    configured   failing
    c4                             fc-fabric    connected    configured   unknown
    c4::20030003ba4e8ad1,0         disk         connected    configured   unknown
    c4::20030003ba4e8ad1,1         disk         connected    configured   unknown
    c4::20030003ba4e8ad1,2         disk         connected    configured   unknown
    c4::20030003ba4e8ad1,3         disk         connected    configured   unknown
    c4::20030003ba4e8ad1,4         disk         connected    configured   unknown
    c4::20030003ba4e8ad1,5         disk         connected    configured   unknown
    c4::20030003ba4e8ad1,6         disk         connected    configured   unknown
    c4::20030003ba4e8ad1,7         disk         connected    configured   unknown
    c4::20030003ba4e8ad1,8         disk         connected    configured   unknown
    c4::20030003ba4e8ad1,9         disk         connected    configured   unknown
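
Output like this can be filtered rather than read by eye. The sketch below assumes the output of "cfgadm -al -o show_FCP_dev" has been captured as text and that every data row has the five columns shown in the example; the function names are hypothetical.

```python
from typing import Iterator, List

SAMPLE = """\
Ap_Id                          Type         Receptacle   Occupant     Condition
c3                             fc-fabric    connected    configured   unknown
c3::20030003ba4e7fbe,0         disk         connected    configured   failing
c3::20030003ba4e7fbe,1         disk         connected    configured   failing
c4                             fc-fabric    connected    configured   unknown
c4::20030003ba4e8ad1,0         disk         connected    configured   unknown
"""


def _rows(cfgadm_output: str) -> Iterator[List[str]]:
    """Yield data rows as [Ap_Id, Type, Receptacle, Occupant, Condition]."""
    for line in cfgadm_output.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[0] != "Ap_Id":
            yield fields


def failing_ap_ids(cfgadm_output: str) -> List[str]:
    """Ap_Ids whose Condition column reads "failing"."""
    return [row[0] for row in _rows(cfgadm_output) if row[4] == "failing"]


def controllers_reverted_to_fc(cfgadm_output: str) -> List[str]:
    """Controller Ap_Ids (no "::" part) whose Type column reads "fc"."""
    return [row[0] for row in _rows(cfgadm_output)
            if "::" not in row[0] and row[1] == "fc"]


if __name__ == "__main__":
    print(failing_ap_ids(SAMPLE))
    print(controllers_reverted_to_fc(SAMPLE))
```

Either list being non-empty corresponds to one of the two cfgadm symptoms described for FabOS 3.1.3.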

Example message with FabOS 3.1.3 switch CLI:

    Switch:admin> switchshow
    switchName:     Switch
    switchType:     9.2
    switchState:    Online
    switchMode:     Native
    switchRole:     Principal
    switchDomain:   1
    switchId:       fffc01
    switchWwn:      10:00:00:60:69:51:8a:a3
    switchBeacon:   OFF
    Zoning:         ON (example)
    port  0: id N1 Online         F-Port 21:00:00:e0:8b:0c:12:15
    port  1: id N1 Online         F-Port 21:00:00:e0:8b:0c:6e:16
    port  2: id AN No_Sync
    port  3: -- N2 No_Module
    port  4: id N2 Online         F-Port 21:01:00:e0:8b:27:81:b4
    port  5: -- N2 No_Module
    port  6: id 1G Online         F-Port 50:02:0f:23:00:00:06:f2
    port  7: id N1 Online         F-Port 50:02:0f:23:00:01:08:bb
    port  8: id N2 No_Light
    port  9: id N2 No_Light
    port 10: id AN No_Sync
    port 11: id AN No_Sync
    port 12: id N2 Online         F-Port 21:01:00:e0:8b:37:6b:12
    port 13: id N2 Online         F-Port 21:00:00:e0:8b:17:3b:14
    port 14: id N2 Online         F-Port 21:00:00:e0:8b:17:6b:18

Note: The command did not return to the CLI prompt, and port 15 (attached to the storage being manipulated) is missing from the output.
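
If the partial "switchshow" output has been captured (for example from a terminal log), the missing port can be found mechanically. This is a minimal sketch: the function name is hypothetical, and it assumes ports are numbered from 0 and that the port count of the switch model (16 for the 3800 in this example) is supplied by the caller.

```python
import re
from typing import List

# Matches lines such as "port  0: id N1 Online ..." and captures the port number.
PORT_RE = re.compile(r"^\s*port\s+(\d+):", re.MULTILINE)

SAMPLE = """\
port  0: id N1 Online         F-Port 21:00:00:e0:8b:0c:12:15
port  1: id N1 Online         F-Port 21:00:00:e0:8b:0c:6e:16
port  2: id AN No_Sync
port  3: -- N2 No_Module
port  4: id N2 Online         F-Port 21:01:00:e0:8b:27:81:b4
port  5: -- N2 No_Module
port  6: id 1G Online         F-Port 50:02:0f:23:00:00:06:f2
port  7: id N1 Online         F-Port 50:02:0f:23:00:01:08:bb
port  8: id N2 No_Light
port  9: id N2 No_Light
port 10: id AN No_Sync
port 11: id AN No_Sync
port 12: id N2 Online         F-Port 21:01:00:e0:8b:37:6b:12
port 13: id N2 Online         F-Port 21:00:00:e0:8b:17:3b:14
port 14: id N2 Online         F-Port 21:00:00:e0:8b:17:6b:18
"""


def missing_ports(switchshow_output: str, port_count: int) -> List[int]:
    """Return port numbers in 0..port_count-1 that have no line in the output."""
    seen = {int(number) for number in PORT_RE.findall(switchshow_output)}
    return [port for port in range(port_count) if port not in seen]


if __name__ == "__main__":
    print(missing_ports(SAMPLE, 16))  # [15]
```

A missing port on its own is not proof of this issue; it matters in combination with the command failing to return to the prompt.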


Workaround

If the above issue occurs, wait for LUN failovers to complete and follow the recommendations shown below:

On the host(s) where the above STMS "offlining lun" and "multipath status: degraded" messages were seen, run the following luxadm(1M) command as root:

    # luxadm -e port
    Found path to 2 HBA ports
    /devices/pci@1d,700000/SUNW,qlc@1/fp@0,0:devctl                CONNECTED
    /devices/pci@1d,700000/SUNW,qlc@1,1/fp@0,0:devctl              NOT CONNECTED

To reconnect the path, issue a "luxadm -e forcelip" command for the path that the STMS error message reported as "multipath status: degraded".

In the example above, the error occurred on "/pci@1d,700000/SUNW,qlc@1,1/fp@0,0" so the following command is used:

    # luxadm -e forcelip /devices/pci@1d,700000/SUNW,qlc@1,1/fp@0,0:devctl

After running "luxadm -e forcelip" on the path(s) required above, you can confirm that all paths are now usable by running "luxadm -e port" again as shown below:

    # luxadm -e port
    Found path to 2 HBA ports
    /devices/pci@1d,700000/SUNW,qlc@1/fp@0,0:devctl         CONNECTED
    /devices/pci@1d,700000/SUNW,qlc@1,1/fp@0,0:devctl       CONNECTED
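
On hosts with many HBA ports, the workaround steps can be prepared from captured "luxadm -e port" output. The following is a minimal sketch with hypothetical function names. It only builds the "luxadm -e forcelip" command lines for ports reported as NOT CONNECTED and does not run anything, so an administrator can compare each path against the STMS messages first.

```python
from typing import List

SAMPLE = """\
Found path to 2 HBA ports
/devices/pci@1d,700000/SUNW,qlc@1/fp@0,0:devctl                CONNECTED
/devices/pci@1d,700000/SUNW,qlc@1,1/fp@0,0:devctl              NOT CONNECTED
"""


def disconnected_ports(luxadm_port_output: str) -> List[str]:
    """Return device paths whose status column reads NOT CONNECTED."""
    paths = []
    for line in luxadm_port_output.splitlines():
        # Split into the device path and the remainder (the status text).
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0].startswith("/devices/"):
            if fields[1].strip() == "NOT CONNECTED":
                paths.append(fields[0])
    return paths


def forcelip_commands(luxadm_port_output: str) -> List[str]:
    """Build (but do not execute) one forcelip command per disconnected port."""
    return ["luxadm -e forcelip " + path
            for path in disconnected_ports(luxadm_port_output)]


if __name__ == "__main__":
    for command in forcelip_commands(SAMPLE):
        print(command)
```

Running "luxadm -e port" again afterwards, as shown above, remains the way to confirm that every path reports CONNECTED.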

With FabOS version 3.1.3, if the switch port has become unresponsive, the switch requires a reboot to restore connectivity on this path. A reboot may affect the connectivity of other hosts to the storage, so before resetting the affected switch, confirm that the switch has indeed become unresponsive and that every other host on this switch has an alternative path to the storage.


Resolution

This issue is addressed in the following releases:

SPARC Platform

  • Sun StorEdge 3900SL Array
  • Sun StorEdge 6120/6130/6320 Arrays
  • Sun StorEdge T3+ Array

connected to the following switch models:

  • SG-XSWBRO3200 - 3200 switch with 8 ports with patch 115360-05
  • SG-XSWBRO3800 - 3800 switch with 16 ports with patch 115360-05


Modification History
Date: 01-JUN-2005
  • State: Resolved
  • Updated Contributing Factors and Relief/Workaround sections

Date: 15-JUL-2005
  • Updated the Contributing Factors and the Resolution sections


References

<SUNPATCH: 115360-05>

Previously Published As
101614
Internal Comments



Issue 6197128 (Brocade defect 46912) is identified in FabOS versions 3.1.2a and 3.1.3.

Issue 6202414 (Brocade defect 51204) is identified in FabOS versions 3.1.3 and 3.1.3a.

At the time of writing, FabOS 3.1.3 is the highest firmware version qualified by Sun, in patch 115360-04.

At the time of writing, it is not known (untested) whether this issue affects Sun JNI HBAs running the jfca driver, or other HBA and driver combinations.

See CR/bug 6197128 (Brocade defect 46912) and CR/bug 6202414 (Brocade defect 51204) for more details.



Internal Contributor/submitter
brian.austin@sun.com

Internal Eng Business Unit Group
NWS (Network Storage)

Internal Eng Responsible Engineer
brian.austin@sun.com

Internal Services Knowledge Engineer
jeff.folla@sun.com

Internal Escalation ID
1-4337327

Internal Resolution Patches
115360-05

Internal Sun Alert Kasp Legacy ID
101614, 57695 (Sun Alert)

Internal Sun Alert & FAB Admin Info
Critical Category: Availability ==> Pervasive
Significant Change Date: 2005-01-14, 2005-06-01
Avoidance: Patch, Workaround
Responsible Manager: Karl-Heinz.Wegener@Sun.COM
Original Admin Info: [WF 13-Jul-2005, Jeff Folla: Updated the Sun Alert to include the 6130 per Matt.Ascroft@Sun.COM e-mail.]

[WF 01-Jun-2005, Jeff Folla: Patch is now available. Updating to Resolved and sending for re-release.]

This document has been imported from KMS Creator and may need adjustment before re-publishing.

This imported document has been reviewed/adjusted by:
Review Name:
Review Date:

Original KMS Creator attributes below:

--- PLEASE DO NOT MAKE ANY CHANGES BELOW THIS LINE! ---

Sun Alert ID: 57695
Synopsis: Certain sscs(1M) Commands, Array/StorEdge 3900SL CLI Commands, or Certain StorEdge 3900SL/6320 GUI Actions May Cause Loss of Connectivity to a Host(s)
Category: Availability
Product: Sun StorEdge 3900SL Array, Sun StorEdge 6120/6320 Array, Sun StorEdge T3+ Array
BugIDs: 6197128, 6202414
Avoidance: Workaround
State: Committed
Date Released: 14-Jan-2005
Date Closed:
Date Modified:
Escalation IDs: 1-4337327
Pending Patches:
Resolution Patches:
FIN:
FCO:
Date Submitted: 02-Dec-2004, 11-Jan-2005
Submitter: brian.austin@sun.com
Responsible Engineer: brian.austin@sun.com
Responsible Manager: Karl-Heinz.Wegener@Sun.COM
CTE group: NWS
Responsible Writer: jeff.folla@sun.com
Distribution: Contract SunSolve

Workflow History:

WF State: Issued, 07-Mar-2005, Karen Edwards
WF Note: The Synopsis exceeded 256 allowed for our migration. so i hacked down the synopsis


WF State: Issued, 14-Jan-2005, Jeff Folla
WF Note: Sent for release.


WF State: Draft, 07-Jan-2005, Jeff Folla
WF Note: Joseph Poon requested that we remove the reference to Brocade from the
sun alert and continue the release process.

WF State: Draft, 10-Dec-2004, Jeff Folla
WF Note: Sent for review on 12/8. Waiting for Brocade to review and approve for release.

WF State: Draft, 06-Dec-2004, Jeff Folla
WF Note: Sent for review.

Exported from KMS Creator Sat May 21 09:15:07 2005 GMT, olaf.reineke@sun.com
Internal SA-FAB Eng Submission
Certain sscs(1M) Commands, Array/StorEdge 3900SL CLI Commands, or Certain StorEdge 3900SL/6320 GUI Actions May Cause Loss of Connectivity to a Host(s)

Product_uuid
04ccc2c2-16a1-11d7-9f9a-f83fdd2e2f1b|Sun StorageTek 3900 Series
2a6d7d50-0a18-11d6-8e0b-f0bd33b24928|Sun StorageTek T3 Array
2a714b10-0a18-11d6-86e2-d56b387d4fbf|Sun StorageTek T3+ Array
2cd2e7d2-2980-11d7-9c3f-c506fe37b7ef|Sun StorageTek 6120 Array
4de60cc2-a08e-4610-b8bf-6a1881cb59c6|Sun StorageTek 6320 System
8252cb91-d771-11d8-ab52-080020a9ed93|Sun StorageTek 6130 Array

  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.