Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Sun Alert Sure Solution

1000879.1 : Loss of SAN Storage Path Following Link Event or Configuration Change

Previously Published As: 201160
Product: Sun StorageTek 3900 Series
Bug Id: <SUNBUG: 6221750>
Date of Resolved Release: 27-APR-2005

Impact

Brocade SilkWorm switches running certain versions of the Fabric Operating System (FabOS) may experience a loss of access to storage via one or more paths.

Contributing Factors

This issue can occur on the following platforms:
Note: This issue only occurs on Brocade SilkWorm switches running FabOS versions prior to 4.4.0b, following a "link event" (an online/offline/online transition or an array/switch reconfiguration) that requires the host to issue a port login (PLOGI) to the storage device.

Symptoms

On hosts running the Sun "fp" and "MPxIO" drivers, "PLOGI timeout" messages and host messages from STMS will report that LUNs are being offlined and that the paths allowing access to those LUNs are now degraded due to the loss of one or more paths. These messages will be displayed in the array syslog, similar to the following example:

    Jan 24 11:34:27 [date time hostname] fp: [ID 517869 kern.info] NOTICE: fp(1): PLOGI to 21200 failed state=Timeout, reason=Hardware Error
    Jan 24 11:34:27 [date time hostname] scsi: [ID 243001 kern.warning] WARNING: /sbus@54,0/SUNW,qlc@1,30400/fp@0,0 (fcp1):
    Jan 24 11:34:27 [date time hostname] PLOGI to D_ID=0x21200 failed: State:Timeout, Reason:Hardware Error. Giving up
    Jan 24 11:36:11 [date time hostname] fp: [ID 517869 kern.info] NOTICE: fp(1): PLOGI to 21200 failed state=Timeout, reason=Hardware Error
    Jan 24 11:36:11 [date time hostname] scsi: [ID 243001 kern.warning] WARNING: /sbus@54,0/SUNW,qlc@1,30400/fp@0,0 (fcp1):
    Jan 24 11:36:11 [date time hostname] PLOGI to D_ID=0x21200 failed: State:Timeout, Reason:Hardware Error. Giving up
    Jan 24 11:36:31 [date time hostname] scsi: [ID 243001 kern.info] /sbus@54,0/SUNW,qlc@1,30400/fp@0,0 (fcp1):
    Jan 24 11:36:31 [date time hostname] offlining lun=1 (trace=0), target=21200 (trace=2800101)
    Jan 24 11:36:31 [date time hostname] mpxio: [ID 779286 kern.info] /scsi_vhci/ssd@g60020f20000097ab41b6d08700036e53 (ssd33) multipath status: degraded, path /sbus@54,0/SUNW,qlc@1,30400/fp@0,0 (fp1) to target address: 50020f23000097ab,1 is offline

The above are examples only. On each system, the LUN numbers, target numbers, and device paths will vary.
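As a rough aid for spotting this pattern in a large log, the following Python sketch scans log lines for a PLOGI timeout followed by an "offlining lun" message for the same target. It is only an illustration: the function name find_plogi_timeout_offlines and both regular expressions are assumptions modeled on the example messages above, and message formats on a given system may differ.

```python
import re

# Patterns modeled on the example messages in this alert (an assumption;
# real message layouts may vary between driver versions).
PLOGI_TIMEOUT = re.compile(
    r"PLOGI to (?:D_ID=0x)?([0-9a-f]+) failed:? state[=:]timeout",
    re.IGNORECASE,
)
OFFLINING = re.compile(
    r"offlining lun=(\d+) \(trace=[0-9a-f]+\), "
    r"target=([0-9a-f]+) \(trace=([0-9a-f]+)\)",
    re.IGNORECASE,
)


def find_plogi_timeout_offlines(lines):
    """Return (target, lun, target_trace) tuples for every LUN that was
    offlined after a PLOGI timeout had been logged for the same target."""
    timed_out = set()   # targets with at least one PLOGI timeout so far
    hits = []
    for line in lines:
        m = PLOGI_TIMEOUT.search(line)
        if m:
            timed_out.add(m.group(1).lower())
            continue
        m = OFFLINING.search(line)
        if m and m.group(2).lower() in timed_out:
            hits.append((m.group(2).lower(), int(m.group(1)), m.group(3).lower()))
    return hits
```

The function takes any iterable of strings, so it can be fed an open log file object directly. A hit only indicates that the message sequence matches; it does not rule out the configuration causes of PLOGI errors.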
To identify this issue, check that the PLOGI failure state was "Timeout", check the target trace value ("trace=2800101" above), and check the overall sequence of events.

Note: It is important to distinguish configuration issues that prevent port login (and so cause PLOGI errors) from instances where hosts should be able to PLOGI to SAN storage that they have previously used. Certain HBA drivers will not allow port login (PLOGI) from another host, generating misleading PLOGI errors. Certain storage arrays (e.g. SE99x0) that have no configured LUNs available to the host in question will not allow port login (PLOGI), thereby generating misleading PLOGI errors.

Workaround

Should the described issue occur, wait for any LUN failovers to complete (if applicable) and proceed with the following recommendations:

On the host(s) where the above STMS "offlining lun" and "multipath status: degraded" messages were seen immediately following the "PLOGI" timeout to that device, run the luxadm(1M) command as "root" against the identified World Wide Name (WWN). In the example shown above in "Symptoms", the error occurred on the path to the storage device with a WWN of 50020f23000097ab, so the command to run would be:

    # luxadm -e forcelip 50020f23000097ab

Following the above luxadm(1M) command, check for storage connectivity via the same path displayed:

    # luxadm display 50020f23000097ab

This should show the path to storage as either ONLINE or STANDBY. If the path shown is neither ONLINE nor STANDBY, and the "luxadm -e forcelip <WWN>" command resulted in additional PLOGI errors being recorded, then the attempted recovery has not worked and recovery will require rebooting the switch itself.
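The ONLINE/STANDBY check on the "luxadm display" output could be scripted along the following lines. This is a minimal sketch under stated assumptions: it assumes each path in the output is reported on a line of the form "State <value>", which is not specified in this alert, and the names path_states and recovery_succeeded are hypothetical. Capturing the command output is left to the caller.

```python
import re

# Assumption: each path's status appears on its own line as "State <VALUE>".
STATE_LINE = re.compile(r"^\s*State\s+(\S+)\s*$", re.MULTILINE)

# Path states this alert treats as a successful recovery.
HEALTHY = {"ONLINE", "STANDBY"}


def path_states(display_output):
    """Return the state reported for each path, in order of appearance."""
    return [state.upper() for state in STATE_LINE.findall(display_output)]


def recovery_succeeded(display_output):
    """True only if at least one path is listed and every listed path
    is ONLINE or STANDBY."""
    states = path_states(display_output)
    return bool(states) and all(state in HEALTHY for state in states)
```

An empty result is treated as a failure on purpose, so that unparseable or missing output is never mistaken for a healthy path.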
Rebooting the switch can affect the connectivity of other hosts to the storage, so before resetting the affected switch it is imperative to ensure that the issue is not due to incorrect configuration (SAN zoning or LUN mapping) and that every other host on this switch has an alternative path to the storage.

Resolution

This issue is addressed on the following platforms:
Note: The appropriateness of any switch FW version for host and storage connectivity should be investigated prior to installation.

Modification History

References

<SUNPATCH: 115361-06>

Previously Published As: 101680

Internal Comments
Internal Contributor/submitter: brian.austin@sun.com
Internal Eng Business Unit Group: NWS (Network Storage)
Internal Eng Responsible Engineer: brian.austin@sun.com
Internal Services Knowledge Engineer: david.mariotto@sun.com
Internal Escalation ID: 1-4710999
Internal Resolution Patches: 115361-06
Internal Sun Alert Kasp Legacy ID: 101680, 57773 (Sun Alert)

Internal Sun Alert & FAB Admin Info
Critical Category: Availability ==> Pervasive
Significant Change Date: 2005-04-27
Avoidance: Patch
Responsible Manager: Karl-Heinz.Wegener@Sun.COM

Original Admin Info:
This document has been imported from KMS Creator and may need adjustment before re-publishing.
This imported document has been reviewed/adjusted by:
Review Name:
Review Date:

Original KMS Creator attributes below:
--- PLEASE DO NOT MAKE ANY CHANGES BELOW THIS LINE! ---
Sun Alert ID: 57773
Synopsis: Loss of SAN Storage Path Following Link Event or Configuration Change
Category: Availability
Product: Brocade SilkWorm Switch 12000, 24000, 3250, 3850, 3900
BugIDs: 6221750
Avoidance: Patch
State: Resolved
Date Released: 27-Apr-2005
Date Closed: 27-Apr-2005
Date Modified:
Escalation IDs: 1-4710999
Pending Patches:
Resolution Patches: 115361-06
FIN:
FCO:
Date Submitted: 19-Apr-2005
Submitter: brian.austin@sun.com
Responsible Engineer: brian.austin@sun.com
Responsible Manager: Karl-Heinz.Wegener@Sun.COM
CTE group: NWS
Responsible Writer: david.mariotto@sun.com
Distribution: Public SunSolve

Workflow History:
WF State: Issued, 27-Apr-2005, David Mariotto
WF Note: sending for release
WF State: Draft, 27-Apr-2005, David Mariotto
WF Note: Brocade approved per Jerry Shen. send for release
WF State: Draft, 26-Apr-2005, David Mariotto
WF Note: Sending to Jerry Shen to get Brocade approval. (9am)
WF State: Draft, 25-Apr-2005, David Mariotto
WF Note: sending for review 11:00am. Send to Jerry Shen tomorrow for Brocade approval.
WF State: Draft, 22-Apr-2005, David Mariotto
WF Note: OK to send for review per Roberta - send on Monday
WF State: Draft, 22-Apr-2005, David Mariotto
WF Note: Roberta will be on vacation starting Monday - turned over approval to team - Have not heard anything back yet for approval from team members Bill Stearn and Mark Keener
WF State: Draft, 20-Apr-2005, David Mariotto
WF Note: waiting on BU draft approval (Roberta), notified Jerry Shen we will need Brocade approval
WF State: Draft, 19-Apr-2005, David Mariotto
WF Note: Article created.

Exported from KMS Creator Sat May 21 09:17:57 2005 GMT, olaf.reineke@sun.com

Internal SA-FAB Eng Submission: Loss of SAN Storage Path Following Link Event or Configuration Change
Product_uuid: 04ccc2c2-16a1-11d7-9f9a-f83fdd2e2f1b|Sun StorageTek 3900 Series
References: SUNPATCH:115361-06
Attachments: This solution has no attachment