Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-75-1008193.1
Update Date:2011-01-31
Keywords:

Solution Type  Troubleshooting Sure

Solution  1008193.1 :   Troubleshooting Sun Storage[TM] 351x Cabling  


Related Items
  • Sun Storage 3511 SATA Array
Related Categories
  • GCS>Sun Microsystems>Storage - Disk>Modular Disk - 3xxx Arrays

PreviouslyPublishedAs
211252
Troubleshooting Sun Storage[TM] 351x Cabling.

Applies to:

Sun Storage 3511 SATA Array
All Platforms

Purpose

Description

This document will assist if the following symptoms are seen with a Sun Storage 3510 or 3511 Array:

- Path degraded/offline

- Loop offline

- SCSI time-outs

- Drives and/or controllers missing or failed

- No access to LUNs

- LED unlit

Sample warnings from /var/adm/messages:

Aug  4 23:25:09 edb mpxio: [ID 779286 kern.info] /scsi_vhci/ssd@g600c0ff000000000002eb07e180f0300 (ssd6) multipath status: degraded, path /pci@9,600000/SUNW,qlc@1/fp@0,0 (fp2) to target address: 226000c0ffb02eb0,0 is offline
Aug  4 23:25:09 edb mpxio: [ID 779286 kern.info] /scsi_vhci/ssd@g600c0ff000000000002eb07e180f0300 (ssd6) multipath status: failed, path /pci@8,700000/SUNW,qlc@4/fp@0,0 (fp3) to target address: 216000c0ff802eb0,0 is offline

Timeouts and SCSI transport errors:

Nov  8 10:33:20 lccdb4 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/SUNW,qlc@1,1/fp@0,0/ssd@w256000c0ffc83b5c,2 (ssd2):
Nov 8 10:33:20 lccdb4 SCSI transport failed: reason 'timeout': retrying command
Nov 8 13:57:54 lccdb4 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/SUNW,qlc@1,1/fp@0,0/ssd@w256000c0ffc83b5c,0 (ssd4):
Nov 8 13:57:54 lccdb4 SCSI transport failed: reason 'timeout': retrying command
Nov 8 15:08:49 lccdb4 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/SUNW,qlc@1,1/fp@0,0/ssd@w256000c0ffc83b5c,2 (ssd2):
Nov 8 15:08:49 lccdb4 SCSI transport failed: reason 'timeout': giving up

Loop Offline:

  qlc: [ID 686697 kern.info] NOTICE: Qlogic qlc(2): Loop OFFLINE
fctl: [ID 517869 kern.warning] WARNING: 162=>fp(2)::OFFLINE timeout
scsi: [ID 107833 kern.warning] WARNING: /pci@8,700000/SUNW,qlc@5,1/fp@0,0/ssd@w226000c0ff901ef4,0 (ssd12):
transport rejected (-2)
scsi: [ID 243001 kern.info] /pci@8,700000/SUNW,qlc@5,1/fp@0,0 (fcp2):
offlining lun=0 (trace=0), target=a6 (trace=2800004)

From array event log:

Wed Feb 18 10:13:31 2004
[0111] #4: StorEdge Array SN#8002658 Controller ALERT: redundant
controller failure detected
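
The host-side symptoms above can be spotted quickly by scanning the messages file for a few key phrases. The following is a minimal Python sketch, not a Sun-supplied tool: the function name classify_lines and the pattern labels are hypothetical, and the patterns are derived only from the sample messages shown in this document, so other driver versions may word them differently.

```python
import re

# Patterns derived from the sample messages in this document (assumption:
# other driver or OS versions may phrase these messages differently).
SYMPTOM_PATTERNS = {
    "path offline": re.compile(r"multipath status: (degraded|failed).* is offline"),
    "scsi timeout": re.compile(r"SCSI transport failed: reason 'timeout'"),
    "loop offline": re.compile(r"Loop OFFLINE"),
}


def classify_lines(lines):
    """Count how many lines match each symptom pattern."""
    counts = {name: 0 for name in SYMPTOM_PATTERNS}
    for line in lines:
        for name, pattern in SYMPTOM_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Reads the standard Solaris messages file; adjust the path as needed.
    with open("/var/adm/messages", errors="replace") as messages:
        for symptom, count in classify_lines(messages).items():
            print(f"{symptom}: {count}")
```

A non-zero count only indicates that a symptom is present; the cabling steps below are still needed to find the cause.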

Last Review Date

January 31, 2011

Instructions for the Reader

A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details



Steps to Follow:

NOTE: This is a subset of "Troubleshooting Sun Storage[TM] 33x0/351x Hardware."

1. Confirm that a supported cabling configuration is used:

http://download.oracle.com/docs/cd/E19236-01/816-7300-21/appb_jbod.html#pgfId-1000275

See Sun Storage[TM] 3510 FC JBOD Array Attached to a Single HBA Port:

http://download.oracle.com/docs/cd/E19236-01/816-7300-21/appb_jbod.html#pgfId-1000285

Note: 3511 FC does not support the use of JBOD arrays:

http://download.oracle.com/docs/cd/E19236-01/816-7300-21/appb_jbod.html#pgfId-1000092

See Cabling to Expansion Units:

http://download.oracle.com/docs/cd/E19236-01/816-7300-21/ch04_cable.html#20126

For configurations with Sun Cluster, see Sun Cluster 3.x With Sun Storage[TM] 3510 or 3511
FC RAID Array Manual.

2. Confirm that the appropriate ports are in use:

Dual controller        3510                                          3511
---------------------  --------------------------------------------  --------------------------------------------
Total number of ports  12                                            16
Channel 0              1 FC0 host or drive port; default: host port  2 FC0 dedicated host ports
Channel 1              1 FC1 host or drive port; default: host port  2 FC1 dedicated host ports
Channel 2              1 FC2 dedicated drive port                    1 FC2 dedicated drive port
Channel 3              1 FC3 dedicated drive port                    1 FC3 dedicated drive port
Channel 4              1 FC4 host or drive port; default: host port  1 FC4 host or drive port; default: host port
Channel 5              1 FC5 host or drive port; default: host port  1 FC5 host or drive port; default: host port

For Sun Storage[TM] 3511 SATA arrays, channels 0 and 1 automatically configure their ports to match the transfer speed and communication method of each connection. Channels 4 and 5 support only a 2-Gbit/sec transfer rate. If you connect two servers to channel 0 or to channel 1, use host filtering if you want to control host access to logical drives.

 

3. Confirm that the connected SFPs are green (if visual inspection of the cable and SFP connection is possible).

SFP link status:

Solid green - Active, good FC connection

Off - Empty or failed FC connection

4. Verify that the correct cable is used.

Cable Part Numbers:

3510:

https://support.oracle.com/handbook_private/Systems/3510_R/components.html#Cables

3511:

https://support.oracle.com/handbook_private/Systems/3511/components.html#Cables

5. Verify the channel/port connection type and speed by issuing a show channels command from the sccli prompt:

3510 example:

sccli> show channels

Ch  Type     Media  Speed  Width   PID / SID
--------------------------------------------
0   Host     FC(L)  2G     Serial  40 / 41
1   Host     FC(L)  2G     Serial  43 / 42
2   DRV+RCC  FC(L)  2G     Serial  14 / 15
3   DRV+RCC  FC(L)  2G     Serial  14 / 15
4   Host     FC(L)  2G     Serial  44 / 45
5   Host     FC(L)  2G     Serial  47 / 46
6   Host     LAN    N/A    Serial  NA / NA

In the 3510 output above, the Speed column shows that hosts are attached to channels 0, 1, 4 and 5 at a 2-Gbit/sec transfer speed. No host is connected to channel 6 (Speed = N/A).

Note - Speed values are displayed for the primary controller only. Therefore, if a user maps one LUN to the primary controller, and another LUN to a secondary controller, only the established connection to the primary controller is displayed. As a result, if a primary ID is not mapped to a channel, and a secondary ID is mapped, "Async" displays in the Speed field.

Note - For FC or SATA, a speed value of Async may mean no link or link down if it is on a channel with a PID.
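
The check described in the notes above can be sketched in Python by parsing saved show channels output and flagging any channel that has a PID but reports a speed of Async. This is an illustrative sketch only: the function names are hypothetical, the parser assumes the eight-field row layout of the 3510 example in this document, and it assumes an unmapped PID is printed as NA, as in that example.

```python
SAMPLE = """\
Ch Type Media Speed Width PID / SID
--------------------------------------------
0 Host FC(L) 2G Serial 40 / 41
1 Host FC(L) 2G Serial 43 / 42
2 DRV+RCC FC(L) 2G Serial 14 / 15
3 DRV+RCC FC(L) 2G Serial 14 / 15
4 Host FC(L) 2G Serial 44 / 45
5 Host FC(L) 2G Serial 47 / 46
6 Host LAN N/A Serial NA / NA
"""


def parse_channels(text):
    """Parse 'show channels' rows into dicts (assumes the layout shown above)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows begin with the channel number and have 8 fields:
        # Ch Type Media Speed Width PID / SID
        if len(fields) == 8 and fields[0].isdigit():
            ch, ch_type, media, speed, _width, pid, _slash, sid = fields
            rows.append({"ch": int(ch), "type": ch_type, "media": media,
                         "speed": speed, "pid": pid, "sid": sid})
    return rows


def suspect_channels(rows):
    """Channels with a PID whose speed is Async (possible no link / link down)."""
    return [r["ch"] for r in rows if r["speed"] == "Async" and r["pid"] != "NA"]


if __name__ == "__main__":
    print(suspect_channels(parse_channels(SAMPLE)))
```

A flagged channel is only a candidate for a link problem; confirm it with the SFP checks in the next step.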

6. Confirm the SFP connection by using the sccli show bypass command.

The show bypass sfp command displays the bypass status of all small form-factor pluggable (SFP) transceivers on a specified loop.

Note - Loop A and Loop B refer to the redundant FC loops that each device is connected to. The SES device in the top slot of the chassis is connected to Loop A, which is the first drive channel. The bottom SES device is connected to Loop B, which is the second drive channel.

sccli> show bypass sfp ses-channel 2 loop loopa
PORT  ENCL-ID  ENCL-TYPE  LOOP    BYP-STATUS     ATTRIBUTES
----  -------  ---------  ----    ----------     SH
0     0        RAID       LOOP-A  Unbypassed     --
1     0        RAID       LOOP-A  Not-Installed  --
L     0        RAID       LOOP-A  Bypassed       -H
R     0        RAID       LOOP-A  Not-Installed  --
4     0        RAID       LOOP-A  Not-Installed  --
5     0        RAID       LOOP-A  Bypassed       -H

Note: The L and R ports shown above, or the ports of any designated drive channel, must not have an unused SFP installed. An unused SFP shows a BYP-STATUS of Bypassed and ATTRIBUTES of -H.

LOOP-A refers to the controller in  the top slot and LOOP-B is the controller in the bottom slot.

If a device is bypassed, the Attributes returned values include S, F, or H.

*  An S means the device was bypassed due to a Sun Storage CLI command.

*  An F means a drive fault caused the bypass.

*  An H means the device was bypassed due to a hardware problem (no signal was present).
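
As a small illustration of the attribute codes listed above, the following Python sketch maps an ATTRIBUTES string to its meanings. The function name decode_attributes is hypothetical, and the sketch assumes only that the letters S, F and H appear somewhere in the string and that dashes mean "not set"; it does not assume any particular column position.

```python
# Meanings taken from the list above.
ATTRIBUTE_MEANINGS = {
    "S": "bypassed by a Sun Storage CLI command",
    "F": "bypassed because of a drive fault",
    "H": "bypassed because of a hardware problem (no signal present)",
}


def decode_attributes(attributes):
    """Return the meaning of each S, F or H flag found in an ATTRIBUTES value."""
    return [ATTRIBUTE_MEANINGS[flag] for flag in attributes
            if flag in ATTRIBUTE_MEANINGS]


if __name__ == "__main__":
    for value in ("--", "-H"):
        print(value, decode_attributes(value))
```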

The following example, run on channel 2, shows the bypass information for a Sun Storage 3511 SATA array on loop A:

sccli> show bypass sfp ses-channel 2 loop loopa

PORT  ENCL-ID  ENCL-TYPE  LOOP    BYP-STATUS     ATTRIBUTES
----  -------  ---------  ----    ----------     SH--------
0L    0        RAID       LOOP-A  Unbypassed     --
0R    0        RAID       LOOP-A  Unbypassed     --
1L    0        RAID       LOOP-A  Not-Installed  --
1R    0        RAID       LOOP-A  Not-Installed  --
2     0        RAID       LOOP-A  Bypassed       -H
3     0        RAID       LOOP-A  Not-Installed  --
4     0        RAID       LOOP-A  Not-Installed  --
5     0        RAID       LOOP-A  Bypassed       -H
AL    1        JBOD       LOOP-A  Unbypassed     --
AR    1        JBOD       LOOP-A  Unbypassed     --
BL    1        JBOD       LOOP-A  Unbypassed     --
BR    1        JBOD       LOOP-A  Bypassed       -H

The Port returned values indicate the type of device, FC or SATA, that is attached to the loop.

On a Sun Storage 3510 FC RAID IOM board, from left to right, there are six ports: channel 0, channel 1, channel 2(3) Left, channel 2(3) Right, channel 4 and channel 5. Valid values for the Sun Storage 3510 FC RAID IOM board include 0, 1, 4, 5, L and R.

On a Sun Storage 3510 FC JBOD IOM board, from left to right, there are two ports: Left and Right. Valid values for port include L and R.

On a Sun Storage[TM] 3511 SATA RAID IOM board, from left to right, there are eight ports: channel 0 left, channel 0 right, channel 1 left, channel 1 right, channel 2, channel 3, channel 4 and channel 5. Valid values for the Sun StorEdge 3511 SATA RAID IOM board include 0L, 0R, 1L, 1R, 2, 3, 4 and 5.

On a Sun Storage[TM] 3511 SATA JBOD IOM board, from left to right, there are four ports: loop A left, loop A right, loop B left and loop B right. Valid port values for the Sun Storage 3511 SATA JBOD IOM board include AL, AR, BL and BR.
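
The valid port values described above can be collected into a lookup table, which is handy when checking that a PORT value in show bypass output matches the expected board type. This is a minimal Python sketch; the names VALID_PORTS and is_valid_port and the ("model", "enclosure type") key format are assumptions made for illustration.

```python
# Valid PORT values per IOM board, as listed in the paragraphs above.
VALID_PORTS = {
    ("3510", "RAID"): {"0", "1", "4", "5", "L", "R"},
    ("3510", "JBOD"): {"L", "R"},
    ("3511", "RAID"): {"0L", "0R", "1L", "1R", "2", "3", "4", "5"},
    ("3511", "JBOD"): {"AL", "AR", "BL", "BR"},
}


def is_valid_port(model, encl_type, port):
    """Return True if the port name is valid for the given array model and board."""
    return port in VALID_PORTS.get((model, encl_type), set())


if __name__ == "__main__":
    print(is_valid_port("3511", "RAID", "0L"))
    print(is_valid_port("3510", "RAID", "0L"))
```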

More Examples:
Connections are detected on channels 0 and 1 (host ports) to a server HBA. The left (L) drive channel port is connected to a JBOD:
sccli> show bypass sfp ses-channel 2 loop a
PORT  ENCL-ID  ENCL-TYPE  LOOP    BYP-STATUS     ATTRIBUTES
----  -------  ---------  ----    ----------     SH--------
0     0        RAID       LOOP-A  Unbypassed     --
1     0        RAID       LOOP-A  Unbypassed     --
L     0        RAID       LOOP-A  Unbypassed     --
R     0        RAID       LOOP-A  Bypassed       -H
4     0        RAID       LOOP-A  Bypassed       -H
5     0        RAID       LOOP-A  Bypassed       -H
L     1        JBOD       LOOP-A  Unbypassed     --
R     1        JBOD       LOOP-A  Not-Installed  --
With the cables removed, all connections show as hardware bypassed (-H):
sccli> show bypass sfp ses-channel 2 loop a
PORT  ENCL-ID  ENCL-TYPE  LOOP    BYP-STATUS     ATTRIBUTES
----  -------  ---------  ----    ----------     SH--------
0     0        RAID       LOOP-A  Bypassed       -H
1     0        RAID       LOOP-A  Bypassed       -H
L     0        RAID       LOOP-A  Bypassed       -H
R     0        RAID       LOOP-A  Bypassed       -H
4     0        RAID       LOOP-A  Bypassed       -H
5     0        RAID       LOOP-A  Bypassed       -H
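
When comparing several captures like the ones above, it can help to list only the ports that are hardware bypassed. The following Python sketch parses saved show bypass sfp output and returns those ports. The function names are hypothetical, and the parser assumes the six-column row layout shown in this document (PORT, ENCL-ID, ENCL-TYPE, LOOP, BYP-STATUS, ATTRIBUTES).

```python
SAMPLE = """\
PORT ENCL-ID ENCL-TYPE LOOP BYP-STATUS ATTRIBUTES
---- ------- --------- ---- ---------- SH--------
0 0 RAID LOOP-A Unbypassed --
1 0 RAID LOOP-A Unbypassed --
L 0 RAID LOOP-A Unbypassed --
R 0 RAID LOOP-A Bypassed -H
4 0 RAID LOOP-A Bypassed -H
5 0 RAID LOOP-A Bypassed -H
L 1 JBOD LOOP-A Unbypassed --
R 1 JBOD LOOP-A Not-Installed --
"""


def parse_bypass(text):
    """Parse 'show bypass sfp' rows into dicts (assumes the layout shown above)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have six fields and a LOOP value such as LOOP-A or LOOP-B;
        # this skips the header and separator lines.
        if len(fields) == 6 and fields[3].startswith("LOOP-"):
            port, encl_id, encl_type, loop, status, attrs = fields
            rows.append({"port": port, "encl_id": encl_id, "encl_type": encl_type,
                         "loop": loop, "status": status, "attrs": attrs})
    return rows


def hardware_bypassed(rows):
    """Return (enclosure ID, port) pairs that are Bypassed with the H attribute."""
    return [(r["encl_id"], r["port"]) for r in rows
            if r["status"] == "Bypassed" and "H" in r["attrs"]]


if __name__ == "__main__":
    print(hardware_bypassed(parse_bypass(SAMPLE)))
```

Ports that are expected to be cabled but appear in this list are the ones to check first for a loose, wrong, or failed cable or SFP.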

If the port remains Bypassed after the connection has been verified and the cable and SFP have been swapped or replaced, run Explorer on the host and escalate. More advanced diagnostic commands can be run to test the bypass status.

Note: The sccli show fru output, the event log, and /var/adm/messages will not report a cable or SFP failure. Additional troubleshooting may be required to identify a marginally operating or failed component or connection. See: Troubleshooting Fibre Channel Devices from the OS.

7. If no problems are found during the course of this document, refer back to Troubleshooting Sun Storage[TM] 33x0/351x Hardware.






Change History
Date: 2011-01-31
User Name: susan.copeland@oracle.com
Action: Update doc links





Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.