Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Sun Alert Sure

Solution 1000356.1: Certain sscs(1M) Commands, Array/StorEdge 3900SL CLI Commands, or Certain StorEdge 3900SL/6320/6130 GUI Actions May Cause Loss of Connectivity to a Host(s)
PreviouslyPublishedAs: 200478
Product: Sun StorageTek 3900 Series, Sun StorageTek T3 Array, Sun StorageTek T3+ Array, Sun StorageTek 6120 Array, Sun StorageTek 6320 System, Sun StorageTek 6130 Array
Bug Id: <SUNBUG: 6197128>, <SUNBUG: 6202414>
Date of Workaround Release: 14-JAN-2005
Date of Resolved Release: 01-JUN-2005

Impact

Under rare conditions, the use of certain sscs(1M) commands, array/StorEdge 3900SL CLI commands, or certain StorEdge 3900SL/6320/6130 GUI actions to manage a Sun StorEdge 3900SL/6120/6130/6320/T3+ Array may cause loss of connectivity to one or more hosts. This applies when the array is attached via certain Fibre Channel (FC) switches (listed below), on certain switch firmware releases (listed below), and with Host Bus Adapters (HBAs) using the Sun QLC HBA driver. These commands can therefore cause a path failure, which could lead to a complete loss of host access to the array.

Contributing Factors

This issue can occur in the following platforms: SPARC Platform
connected to the following switch models:
The described issue may occur in the configurations described above when the following sscs(1M) commands or array/StorEdge 3900SL CLI commands are issued:

sscs(1M) commands:
StorEdge 6120/T3+ telnet(1) commands:
StorEdge 3900SL Service Processor (SP) CLI commands:

The following menu options in the program "/opt/SUNWsecfg/runsecfg":
The following commands from the directory "/opt/SUNWsecfg/bin" on the Service Processor (SP):
Notes:

1. StorEdge 3900SL/6130/6320 GUI actions equivalent to these commands may also cause the issue to occur.

2. The following Read-Only commands will not trigger the described issue:
3. The described issue may be encountered only under the above-mentioned conditions.

Symptoms

If the described issue occurs on hosts running the Sun "fp" and "mpxio" drivers, "PLOGI timeout" messages will be displayed in the array syslog, together with host messages from STMS reporting that LUNs are being offlined and that the paths allowing access to those LUNs are now degraded due to the loss of one path.

Example message with FabOS 3.1.2a:

    [date time hostname] fp: [ID 517869 kern.info] NOTICE: fp(1): PLOGI to 10f00 failed state=Timeout, reason=Hardware Error...
    [date time hostname] PLOGI to D_ID=0x10f00 failed: State:Timeout, Reason:Hardware Error. Giving up
    [date time hostname] scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp1):
    [date time hostname] offlining lun=b (trace=0), target=10f00 (trace=2800101)
    [date time hostname] mpxio: [ID 669396 kern.info] /scsi_vhci/ssd@g60003ba4e7fbe00041862b4700047ffc (ssd1) multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp1) to target address: 20030003ba4e7fbe,b is offline. Load balancing: round-robin

Note: The above are examples only. On each system, the LUN numbers, target numbers, and device paths will vary. To confirm that this issue is being seen, check the target trace value ("trace=2800101" above) and the overall sequence of events, in which many LUNs fail over and a path is reported to be "offline" after performing any of the commands shown in section 2.

Alternatively, on a switch running FabOS 3.1.3, the switch port to the host and storage may become unresponsive. This can be identified by issuing a "cfgadm -al -o show_FCP_dev" command: the devices listed via this path may be marked with a Condition of "failing", or the controller itself may revert to a Type of "fc". The switch CLI will also show symptoms: if a "switchshow" is performed, the command will not return and will "freeze" on the port that was connected to the storage being manipulated.
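The host log signature described above for FabOS 3.1.2a can be checked with a short script. The following is a minimal sketch, not a supported tool: the regular expressions are derived only from the example messages in this alert, each message is assumed to occupy a single line, and the names `scan_messages` and `looks_like_this_issue` are hypothetical.

```python
import re

# Patterns derived from the example messages in this alert (assumption:
# real messages keep the same wording; LUNs, targets, and paths will vary).
PLOGI_RE = re.compile(r"PLOGI to (?:D_ID=0x)?([0-9a-fA-F]+) failed")
OFFLINE_RE = re.compile(
    r"offlining lun=([0-9a-fA-F]+) \(trace=[0-9a-fA-F]+\), "
    r"target=([0-9a-fA-F]+) \(trace=([0-9a-fA-F]+)\)"
)
DEGRADED_RE = re.compile(
    r"multipath status: degraded, path (\S+) \((\w+)\) "
    r"to target address: (\S+) is offline"
)

# Target trace value called out in the alert's example.
SUSPECT_TARGET_TRACE = "2800101"


def scan_messages(lines):
    """Collect the symptom markers described in the alert from log lines."""
    report = {"plogi_targets": set(), "offlined_luns": [], "degraded_paths": set()}
    for line in lines:
        m = PLOGI_RE.search(line)
        if m:
            report["plogi_targets"].add(m.group(1).lower())
        m = OFFLINE_RE.search(line)
        if m and m.group(3) == SUSPECT_TARGET_TRACE:
            # Store (target, lun) for each offlined LUN with the suspect trace.
            report["offlined_luns"].append((m.group(2), m.group(1)))
        m = DEGRADED_RE.search(line)
        if m:
            report["degraded_paths"].add(m.group(1))
    return report


def looks_like_this_issue(report):
    """True only when all three markers from the alert are present."""
    return bool(
        report["plogi_targets"]
        and report["offlined_luns"]
        and report["degraded_paths"]
    )


# Sample lines copied from the alert's example (timestamps left as placeholders).
SAMPLE = [
    "[date time hostname] fp: [ID 517869 kern.info] NOTICE: fp(1): "
    "PLOGI to 10f00 failed state=Timeout, reason=Hardware Error...",
    "[date time hostname] PLOGI to D_ID=0x10f00 failed: State:Timeout, "
    "Reason:Hardware Error. Giving up",
    "[date time hostname] offlining lun=b (trace=0), target=10f00 (trace=2800101)",
    "[date time hostname] mpxio: [ID 669396 kern.info] "
    "/scsi_vhci/ssd@g60003ba4e7fbe00041862b4700047ffc (ssd1) multipath status: "
    "degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp1) to target address: "
    "20030003ba4e7fbe,b is offline. Load balancing: round-robin",
]

if __name__ == "__main__":
    result = scan_messages(SAMPLE)
    print("matches alert signature:", looks_like_this_issue(result))
    for path in sorted(result["degraded_paths"]):
        print("degraded path:", path)
```

In practice the lines would come from the host's system log (on Solaris this is typically /var/adm/messages, which is an assumption about the local configuration). A match only suggests the issue; the alert also asks you to confirm that the messages followed one of the listed management commands.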
Example message with FabOS 3.1.3:

    Ap_Id                    Type        Receptacle  Occupant    Condition
    c3                       fc-fabric   connected   configured  unknown
    c3::20030003ba4e7fbe,0   disk        connected   configured  failing
    c3::20030003ba4e7fbe,1   disk        connected   configured  failing
    c3::20030003ba4e7fbe,2   disk        connected   configured  failing
    c3::20030003ba4e7fbe,3   disk        connected   configured  failing
    c3::20030003ba4e7fbe,4   disk        connected   configured  failing
    c3::20030003ba4e7fbe,5   disk        connected   configured  failing
    c3::20030003ba4e7fbe,6   disk        connected   configured  failing
    c3::20030003ba4e7fbe,7   disk        connected   configured  failing
    c3::20030003ba4e7fbe,8   disk        connected   configured  failing
    c3::20030003ba4e7fbe,9   disk        connected   configured  failing
    c4                       fc-fabric   connected   configured  unknown
    c4::20030003ba4e8ad1,0   disk        connected   configured  unknown
    c4::20030003ba4e8ad1,1   disk        connected   configured  unknown
    c4::20030003ba4e8ad1,2   disk        connected   configured  unknown
    c4::20030003ba4e8ad1,3   disk        connected   configured  unknown
    c4::20030003ba4e8ad1,4   disk        connected   configured  unknown
    c4::20030003ba4e8ad1,5   disk        connected   configured  unknown
    c4::20030003ba4e8ad1,6   disk        connected   configured  unknown
    c4::20030003ba4e8ad1,7   disk        connected   configured  unknown
    c4::20030003ba4e8ad1,8   disk        connected   configured  unknown
    c4::20030003ba4e8ad1,9   disk        connected   configured  unknown

Example Message with 3.1.3 CLI:

    Switch:admin> switchshow
    switchName:     Switch
    switchType:     9.2
    switchState:    Online
    switchMode:     Native
    switchRole:     Principal
    switchDomain:   1
    switchId:       fffc01
    switchWwn:      10:00:00:60:69:51:8a:a3
    switchBeacon:   OFF
    Zoning:         ON (example)
    port  0: id N1 Online    F-Port 21:00:00:e0:8b:0c:12:15
    port  1: id N1 Online    F-Port 21:00:00:e0:8b:0c:6e:16
    port  2: id AN No_Sync
    port  3: -- N2 No_Module
    port  4: id N2 Online    F-Port 21:01:00:e0:8b:27:81:b4
    port  5: -- N2 No_Module
    port  6: id 1G Online    F-Port 50:02:0f:23:00:00:06:f2
    port  7: id N1 Online    F-Port 50:02:0f:23:00:01:08:bb
    port  8: id N2 No_Light
    port  9: id N2 No_Light
    port 10: id AN No_Sync
    port 11: id AN No_Sync
    port 12: id N2 Online    F-Port 21:01:00:e0:8b:37:6b:12
    port 13: id N2 Online    F-Port 21:00:00:e0:8b:17:3b:14
    port 14: id N2 Online    F-Port 21:00:00:e0:8b:17:6b:18

**Note that the command did not return to the CLI prompt and that port 15 (attached to the storage being manipulated) is missing.**

Workaround

If the above issue occurs, wait for LUN failovers to complete and follow the recommendations shown below.

On the host(s) where the above STMS "offlining lun" and "multipath status: degraded" messages were seen, run the following luxadm(1M) command as root:

    # luxadm -e port
    Found path to 2 HBA ports
    /devices/pci@1d,700000/SUNW,qlc@1/fp@0,0:devctl CONNECTED
    /devices/pci@1d,700000/SUNW,qlc@1,1/fp@0,0:devctl NOT CONNECTED

To reconnect the path, issue the following "luxadm -e forcelip" command for the path that the STMS error message reported as "multipath status: degraded". In the example above, the error occurred on "/pci@1d,700000/SUNW,qlc@1,1/fp@0,0", so the following command is used:

    # luxadm -e forcelip /devices/pci@1d,700000/SUNW,qlc@1,1/fp@0,0:devctl

After running "luxadm -e forcelip" on the required path(s), confirm that all paths are now usable by running "luxadm -e port" again, as shown below:

    # luxadm -e port
    Found path to 2 HBA ports
    /devices/pci@1d,700000/SUNW,qlc@1/fp@0,0:devctl CONNECTED
    /devices/pci@1d,700000/SUNW,qlc@1,1/fp@0,0:devctl CONNECTED

With FabOS version 3.1.3, if the switch port has become unresponsive, the switch will require a reboot to restore connectivity on this path. This action may affect the connectivity of other hosts to the storage. Before resetting the affected switch, it is imperative to confirm that the switch has indeed become unresponsive and that every other host on this switch has an alternative path to the storage.

Resolution

This issue is addressed in the following releases: SPARC Platform
connected to the following switch models:
Modification History

Date: 01-JUN-2005
Date: 15-JUL-2005
References: <SUNPATCH: 115360-05>
Previously Published As: 101614

Internal Comments
Internal Contributor/submitter: brian.austin@sun.com
Internal Eng Business Unit Group: NWS (Network Storage)
Internal Eng Responsible Engineer: brian.austin@sun.com
Internal Services Knowledge Engineer: jeff.folla@sun.com
Internal Escalation ID: 1-4337327
Internal Resolution Patches: 115360-05
Internal Sun Alert Kasp Legacy ID: 101614, 57695 (Sun Alert)

Internal Sun Alert & FAB Admin Info
Critical Category: Availability ==> Pervasive
Significant Change Date: 2005-01-14, 2005-06-01
Avoidance: Patch, Workaround
Responsible Manager: Karl-Heinz.Wegener@Sun.COM
Original Admin Info:
[WF 13-Jul-2005, Jeff Folla: Updated the Sun Alert to include the 6130 per Matt.Ascroft@Sun.COM e-mail.]
[WF 01-Jun-2005, Jeff Folla: Patch is now available. Updating to Resolved and sending for re-release.]

This document has been imported from KMS Creator and may need adjustment before re-publishing.
This imported document has been reviewed/adjusted by:
Review Name:
Review Date:

Original KMS Creator attributes below:
--- PLEASE DO NOT MAKE ANY CHANGES BELOW THIS LINE! ---
Sun Alert ID: 57695
Synopsis: Certain sscs(1M) Commands, Array/StorEdge 3900SL CLI Commands, or Certain StorEdge 3900SL/6320 GUI Actions May Cause Loss of Connectivity to a Host(s)
Category: Availability
Product: Sun StorEdge 3900SL Array, Sun StorEdge 6120/6320 Array, Sun StorEdge T3+ Array
BugIDs: 6197128, 6202414
Avoidance: Workaround
State: Committed
Date Released: 14-Jan-2005
Date Closed:
Date Modified:
Escalation IDs: 1-4337327
Pending Patches:
Resolution Patches:
FIN:
FCO:
Date Submitted: 02-Dec-2004, 11-Jan-2005
Submitter: brian.austin@sun.com
Responsible Engineer: brian.austin@sun.com
Responsible Manager: Karl-Heinz.Wegener@Sun.COM
CTE group: NWS
Responsible Writer: jeff.folla@sun.com
Distribution: Contract SunSolve

Workflow History:
WF State: Issued, 07-Mar-2005, Karen Edwards
WF Note: The Synopsis exceeded 256 allowed for our migration. so i hacked down the synopsis
WF State: Issued, 14-Jan-2005, Jeff Folla
WF Note: Sent for release.
WF State: Draft, 07-Jan-2005, Jeff Folla
WF Note: Joseph Poon requested that we remove the reference to Brocade from the sun alert and continue the release process.
WF State: Draft, 10-Dec-2004, Jeff Folla
WF Note: Sent for review on 12/8. Waiting for Brocade to review and approve for release.
WF State: Draft, 06-Dec-2004, Jeff Folla
WF Note: Sent for review.

Exported from KMS Creator Sat May 21 09:15:07 2005 GMT, olaf.reineke@sun.com

Internal SA-FAB Eng Submission: Certain sscs(1M) Commands, Array/StorEdge 3900SL CLI Commands, or Certain StorEdge 3900SL/6320 GUI Actions May Cause Loss of Connectivity to a Host(s)

Product_uuid:
04ccc2c2-16a1-11d7-9f9a-f83fdd2e2f1b|Sun StorageTek 3900 Series
2a6d7d50-0a18-11d6-8e0b-f0bd33b24928|Sun StorageTek T3 Array
2a714b10-0a18-11d6-86e2-d56b387d4fbf|Sun StorageTek T3+ Array
2cd2e7d2-2980-11d7-9c3f-c506fe37b7ef|Sun StorageTek 6120 Array
4de60cc2-a08e-4610-b8bf-6a1881cb59c6|Sun StorageTek 6320 System
8252cb91-d771-11d8-ab52-080020a9ed93|Sun StorageTek 6130 Array

References: SUNPATCH:115360-05
Attachments: This solution has no attachment