Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-77-1000314.1
Update Date: 2011-03-01
Keywords:

Solution Type  Sun Alert Sure

Solution  1000314.1 :   Sun StorEdge 3310 SCSI Array and 3510/3511 FC Array Event Hardware Failure Messages May Result in Incorrect FRU Replacement  


Related Items
  • Sun Storage 3510 FC Array
  • Sun Storage 3310 Array
  • Sun Storage 3511 SATA Array
Related Categories
  • GCS>Sun Microsystems>Sun Alert>Criteria Category>Availability
  • GCS>Sun Microsystems>Sun Alert>Release Phase>Resolved

PreviouslyPublishedAs
200432


Product
Sun StorageTek 3310 SCSI Array
Sun StorageTek 3510 FC Array
Sun StorageTek 3511 SATA Array

Bug Id
<SUNBUG: 4995551>

Date of Workaround Release
18-MAR-2005

Date of Resolved Release
29-AUG-2005

Impact

Event hardware failure messages without a unique FRU identifier may result in the replacement of the wrong FRU on a Sun StorEdge 3310 SCSI or 3510/3511 FC array.

The serial number in the event messages is always that of the primary controller, regardless of which controller's CPU has the fault. As a result, the good FRU whose serial number matches may be replaced by mistake while the bad FRU remains in the system. If an event occurs on the bad controller while the good controller is being replaced, a full array outage is possible.


Contributing Factors

This issue can occur on the following platforms:

SPARC Platform

  • Sun StorEdge 3310 SCSI Array without patch 113722-10
  • Sun StorEdge 3510 FC Array without patch 113723-10
  • Sun StorEdge 3511 FC Array without patch 113724-04

Symptoms

Example event messages:

    [0123] #298: StorEdge Array SN#26103 Peripheral Device ALERT: CPU high temperature threshold exceeded ( 96.00C)
    Mon Feb 14 12:45:31 2005 [01A3] #299: StorEdge Array SN#26103 Peripheral Device NOTICE: CPU temperature high temperature now back to non-critical level ( 92.00C)

Based upon the serial number, the primary controller was replaced. The above messages are from the primary controller AFTER the replacement, indicating that the wrong FRU was replaced. Aside from the serial number, there is no identifying FRU information contained in the message.
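
To illustrate why the serial number alone is not enough, the following minimal Python sketch splits a pre-patch event message into its fields. The regular expression and the parse_old_event helper are assumptions inferred from the two sample messages above; they are not a documented message format or a Sun-supplied tool.

```python
import re

# Pattern inferred from the sample messages above (an assumption, not a
# documented format): [code] #sequence: StorEdge Array SN#serial text
OLD_EVENT = re.compile(
    r"\[(?P<code>[0-9A-Fa-f]{4})\]\s+#(?P<seq>\d+):\s+"
    r"StorEdge Array SN#(?P<serial>\d+)\s+"
    r"(?P<text>.*)"
)

def parse_old_event(line):
    """Return the fields of a pre-patch event message, or None if no match."""
    match = OLD_EVENT.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The serial number is always that of the primary controller, so it
    # cannot be used to decide which controller raised the event.
    fields["reporting_controller"] = "unknown"
    return fields

sample = (
    "[0123] #298: StorEdge Array SN#26103 Peripheral Device ALERT: "
    "CPU high temperature threshold exceeded ( 96.00C)"
)
event = parse_old_event(sample)
print(event["serial"], event["reporting_controller"])
```

Every field that can be extracted is the same whichever controller is faulty, so the sketch can only report the controller as unknown.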


Workaround

There is no workaround. Please see the "Resolution" section below.


Resolution

This issue is addressed in the following releases:

SPARC Platform

  • Sun StorEdge 3310 SCSI Array with patch 113722-10 or later
  • Sun StorEdge 3510 FC Array with patch 113723-10 or later
  • Sun StorEdge 3511 FC Array with patch 113724-04 or later
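
As a rough aid, the patch levels above can be compared with a short Python sketch such as the following. The has_fix helper and the FIXED_PATCHES table are hypothetical names introduced here for illustration. The sketch assumes the installed patch ID is already known in the usual "base-revision" form; how that ID is obtained from the array is outside its scope.

```python
# Minimum patch revisions listed in the Resolution section.
FIXED_PATCHES = {
    "113722": 10,  # Sun StorEdge 3310 SCSI Array
    "113723": 10,  # Sun StorEdge 3510 FC Array
    "113724": 4,   # Sun StorEdge 3511 FC Array
}

def has_fix(patch_id):
    """Return True if patch_id (e.g. "113723-10") is at or above the fixed revision."""
    base, _, revision = patch_id.strip().partition("-")
    if base not in FIXED_PATCHES or not revision.isdigit():
        raise ValueError(f"not a patch covered by this alert: {patch_id!r}")
    return int(revision) >= FIXED_PATCHES[base]

# Example patch IDs chosen for illustration only.
for installed in ("113722-09", "113723-10", "113724-05"):
    print(installed, "fixed" if has_fix(installed) else "affected")
```

Revisions are compared as integers rather than strings so that, for example, revision 04 and revision 10 are ordered correctly.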


Modification History
Date: 16-MAY-2005
  • Updated Contributing Factors and Resolution sections

Date: 14-JUN-2005
  • Updated Contributing Factors

Date: 16-JUN-2005
  • Updated Contributing Factors and Resolution sections

Date: 29-AUG-2005
  • State: Resolved
  • Updated Contributing Factors and Resolution sections


References

<SUNPATCH: 113723-10>
<SUNPATCH: 113724-04>
<SUNPATCH: 113722-10>

Previously Published As
101663
Internal Comments

The 4.11/2.0 release will fix this problem by providing the correct information for the controller at fault. The 4.11/2.0 release will not include the serial number in event messages, but it will indicate whether the primary or secondary controller reported the event.

Following is a sample of event logs that can be obtained through telnet:

    "Thu Mar  3 13:38:22 2005
[Primary] Alert
Peripheral Device ALERT: CPU Hot Temperature Detected(60.0(C) )

Thu Mar 3 13:39:16 2005
[Primary] Alert
SES(C2 I12)Temp Sensor 4:Over Temperature Failure

Thu Mar 3 13:39:16 2005
[Primary] Alert
SES(C2 I12)Temp Sensor 5:Over Temperature Failure

Thu Mar 3 13:39:30 2005
[Secondary] Alert
Peripheral Device ALERT: CPU Hot Temperature Detected(60.0(C) )

Thu Mar 3 13:40:15 2005
[Primary] Notification
SES(C2 I12)Temp Sensor 4:Temperature Back To Non-Critical LVs

Thu Mar 3 13:40:15 2005
[Primary] Notification
SES(C2 I12)Temp Sensor 5:Temperature Back To Non-Critical LVs
sccli>"
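
The new log layout lends itself to simple scripting. The following Python sketch groups events by reporting controller. It assumes the three-line layout seen in the sample above (timestamp, "[Primary]" or "[Secondary]" plus a severity, then the message text); the parse_events helper is a hypothetical name, and the layout is inferred from the sample, not from a format specification.

```python
import re
from collections import Counter

# Header line such as "[Primary] Alert" or "[Secondary] Notification".
HEADER = re.compile(r"^\[(?P<role>Primary|Secondary)\]\s+(?P<severity>\w+)$")

def parse_events(log_text):
    """Split a log into (timestamp, role, severity, message) tuples."""
    lines = [ln.strip() for ln in log_text.splitlines() if ln.strip()]
    events = []
    for i, line in enumerate(lines):
        match = HEADER.match(line)
        # A header needs a timestamp line before it and a message after it.
        if match and 0 < i < len(lines) - 1:
            events.append(
                (lines[i - 1], match["role"], match["severity"], lines[i + 1])
            )
    return events

# Excerpt of the sample log above, with the surrounding quotes removed.
sample_log = """\
Thu Mar  3 13:38:22 2005
[Primary] Alert
Peripheral Device ALERT: CPU Hot Temperature Detected(60.0(C) )

Thu Mar 3 13:39:30 2005
[Secondary] Alert
Peripheral Device ALERT: CPU Hot Temperature Detected(60.0(C) )

Thu Mar 3 13:40:15 2005
[Primary] Notification
SES(C2 I12)Temp Sensor 4:Temperature Back To Non-Critical LVs
"""

events = parse_events(sample_log)
alerts_by_role = Counter(
    role for _, role, severity, _ in events if severity == "Alert"
)
print(dict(alerts_by_role))
```

Because each event now carries the controller role, a CPU temperature alert can be attributed to the primary or the secondary controller without relying on the serial number.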


The Primary/Secondary controller can be identified by either of the following methods:

1. Check the controller's green LED.

The primary controller's LED blinks continuously, while the secondary controller's LED is steadily on.

2. Telnet to the controller, or use a serial connection.

In the main menu, check under "System Information", which shows whether the primary controller is in slot A or slot B.

Slot A = the top slot

Slot B = the bottom slot

Internal Contributor/submitter
michiel.bijlsma@sun.com

Internal Eng Business Unit Group
NWS (Network Storage)

Internal Eng Responsible Engineer
sushil.shirke@sun.com

Internal Services Knowledge Engineer
jeff.folla@sun.com

Internal Escalation ID
1-7008278

Internal Resolution Patches
113723-10, 113724-04, 113722-10

Internal Sun Alert Kasp Legacy ID
101663, 57753 (Sun Alert)

Internal Sun Alert & FAB Admin Info
Critical Category: Availability ==> Pervasive
Significant Change Date: 2005-03-18, 2005-08-29
Avoidance: Patch
Responsible Manager: Sunil-Kumar.Bali@Sun.COM
Original Admin Info: [WF 29-Aug-2005, Jeff Folla: Updated with final patch now available. This is now resolved. Re-published.]

[WF 14-Jun-2005, Jeff Folla: Updated sun alert with available patch 113723-10 .]

[WF 14-Jun-2005, Dave M; updated CF section for product clarification]
This document has been imported from KMS Creator and may need adjustment before re-publishing.

This imported document has been reviewed/adjusted by:
Review Name:
Review Date:

Original KMS Creator attributes below:

--- PLEASE DO NOT MAKE ANY CHANGES BELOW THIS LINE! ---

Sun Alert ID: 57753
Synopsis: Sun StorEdge 3310/3510/3511 FC Array Event Hardware Failure Messages May Result in Incorrect FRU Replacement
Category: Availability
Product: Sun StorEdge 3310/3510/3511 FC Array
BugIDs: 4995551
Avoidance: Patch
State: Engineering Complete
Date Released: 18-Mar-2005
Date Closed:
Date Modified: 16-May-2005
Escalation IDs: 1-7008278
Pending Patches: 113724-05, 113722-10
Resolution Patches: 113723-10
FIN:
FCO:
Date Submitted: 08-Mar-2005, 14-Mar-2005
Submitter: michiel.bijlsma@sun.com
Responsible Engineer: sushil.shirke@sun.com
Responsible Manager: Sunil-Kumar.Bali@Sun.COM
CTE group: NWS
Responsible Writer: jeff.folla@sun.com
Distribution: Contract SunSolve

Workflow History:

WF State: Issued, 16-May-2005, Jeff Folla
WF Note: Updating with patch available for the 3510

WF State: Rework, 16-May-2005, Jeff Folla
WF Note: Updating with available patch.

WF State: Issued, 23-Mar-2005, Jeff Folla
WF Note: This is now showing on sunsolve.

WF State: Issued, 21-Mar-2005, Jeff Folla
WF Note: This sun alert is not showing on sunsolve. Checked
with Jeff B to see if it was harvestable by tis. Jeff
confirmed it was not showing. Added a second
workflow to attempt to fix the issue. The document
still did not show as being harvestable after the
second workflow was added.

WF State: Issued, 18-Mar-2005, Jeff Folla
WF Note: Sent for release.

WF State: Draft, 17-Mar-2005, Jeff Folla
WF Note: Sent for review.



WF State: Draft, 17-Mar-2005, Jeff Folla
WF Note: Sent to submitter and responsible engineer to review on 3/16 before
sending it out for general review since there were several changes
from the original draft.

WF State: Draft, 11-Mar-2005, Jeff Folla
WF Note: Article created.

Exported from KMS Creator Sat May 21 09:17:16 2005 GMT, olaf.reineke@sun.com
Product_uuid
3db30178-43d7-4d85-8bbe-551c33040f0d|Sun StorageTek 3310 SCSI Array
58553d0e-11f4-11d7-9b05-ad24fcfd42fa|Sun StorageTek 3510 FC Array
9fdbb196-73a6-11d8-9e3a-080020a9ed93|Sun StorageTek 3511 SATA Array


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.