Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-77-1020306.1
Update Date: 2011-03-22
Keywords:

Solution Type  Sun Alert Sure

Solution  1020306.1 :   Limited Number of Sun Fire T2000 and SPARC Enterprise T2000 Servers may Experience a Shutdown with SC Alert "Chassis cover removed"  


Related Items
  • Sun Fire T2000 Server
Related Categories
  • GCS>Sun Microsystems>Sun Alert>Criteria Category>Availability
  • GCS>Sun Microsystems>Sun Alert>Release Phase>Resolved

PreviouslyPublishedAs
255948


Bug Id
<SUNBUG: 6815610>

Product
Sun Fire T2000 Server

Date of Workaround Release
27-Mar-2009

Date of Resolved Release
16-Apr-2009

A limited number of Sun Fire T2000 and SPARC Enterprise T2000 servers may experience a shutdown with SC Alert: "Chassis cover removed" (see below for details)

1. Impact

A limited number of Sun Fire T2000 and SPARC Enterprise T2000 servers may experience a system shutdown after the System Controller (SC) Alert: "Chassis cover removed" is displayed on the console, causing system downtime.

In addition, this issue may result in unnecessary hardware replacement.


2. Contributing Factors

This issue can occur on the following platforms:
  • Sun Fire T2000 Server without patch 139434-02

Note: This issue occurs rarely and has been observed only on the above-mentioned T2000 servers.


3. Symptoms

The system reports the following errors on the system console; they are also recorded in the ALOM logs.  Output from 'showlogs -v' looks similar to the following:

      02:24:25: 0004007c: "System poweron is disabled."
      02:24:25: 00040083: "Chassis cover removed."
      02:24:25: 0004000e: "SC Request to Power Off Host Immediately."
      02:24:26: 0004004f: "Indicator SYS/ACT is now STANDBY BLINK"
      02:24:27: 0004007d: "System poweron is enabled."
      02:24:31: 00040029: "Host system has shut down."

As the example shows, the key to identifying this issue is that the log line "Chassis cover removed" is followed by the line "SC Request to Power Off Host Immediately".  If the line "SC Request to Power Off Host Immediately" is absent from the log, this is a different issue and may indicate a hardware fault in the cover interlock switch.
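The identification rule above can be sketched as a small log check. This is only an illustration, not a Sun tool: the function names, the result labels, and the `window` parameter (how many following log lines count as "followed by") are assumptions, and the line format is inferred solely from the 'showlogs -v' example shown above.

```python
import re

# Matches log lines shaped like the example above, e.g.
#   02:24:25: 00040083: "Chassis cover removed."
LINE_RE = re.compile(r'^\s*(\d{2}:\d{2}:\d{2}):\s+([0-9a-fA-F]{8}):\s+"(.*)"\s*$')

COVER_MSG = "Chassis cover removed"
POWEROFF_MSG = "SC Request to Power Off Host Immediately"


def parse_log(text):
    """Return the message text of each recognizable log line, in order."""
    messages = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            messages.append(match.group(3))
    return messages


def classify(text, window=3):
    """Classify a log excerpt (labels and window size are hypothetical).

    "matches-alert":   cover message followed by the SC power-off request
    "different-issue": cover message without the SC power-off request
    "no-cover-event":  no cover message found at all
    """
    messages = parse_log(text)
    for i, msg in enumerate(messages):
        if COVER_MSG in msg:
            following = messages[i + 1:i + 1 + window]
            if any(POWEROFF_MSG in m for m in following):
                return "matches-alert"
            return "different-issue"
    return "no-cover-event"


SAMPLE = """
      02:24:25: 0004007c: "System poweron is disabled."
      02:24:25: 00040083: "Chassis cover removed."
      02:24:25: 0004000e: "SC Request to Power Off Host Immediately."
      02:24:26: 0004004f: "Indicator SYS/ACT is now STANDBY BLINK"
      02:24:27: 0004007d: "System poweron is enabled."
      02:24:31: 00040029: "Host system has shut down."
"""

print(classify(SAMPLE))  # prints: matches-alert
```

A "different-issue" result corresponds to the case described above in which the power-off request is missing and the cover interlock switch itself should be examined.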


4. Workaround

There is no workaround. Please see Resolution below.


5. Resolution

This issue is addressed on the following platform:
  • Sun Fire T2000 Server with patch 139434-02 or later
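The "139434-02 or later" condition can be read as: same patch base number, revision 02 or higher. A minimal sketch of that comparison follows; the function name is hypothetical, and how the installed patch ID is obtained from a given system is outside the scope of this example.

```python
def patch_at_least(installed, required="139434-02"):
    """Return True if `installed` has the same base number as `required`
    and an equal or higher revision (IDs have the form BASE-REV)."""
    base_i, rev_i = installed.strip().split("-")
    base_r, rev_r = required.strip().split("-")
    return base_i == base_r and int(rev_i) >= int(rev_r)


print(patch_at_least("139434-01"))  # prints: False
print(patch_at_least("139434-03"))  # prints: True
```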



This Sun Alert notification is being provided to you on an "AS IS" basis. This Sun Alert notification may contain information provided by third parties. The issues described in this Sun Alert notification may or may not impact your system(s). Sun makes no representations, warranties, or guarantees as to the information contained herein. ANY AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This Sun Alert notification contains Sun proprietary and confidential information. It is being provided to you pursuant to the provisions of your agreement to purchase services from Sun, or, if you do not have such an agreement, the Sun.com Terms of Use. This Sun Alert notification may only be used for the purposes contemplated by these agreements.

Copyright 2000-2009 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.


Modification History
01-Apr-2009: updated Workaround section
16-Apr-2009: updated Contributing Factors and Resolution sections. Now Resolved.


Internal Eng Responsible Engineer
steve.trullo@sun.com, grant.gredvig@sun.com

Internal Contributor/submitter
Dencho.Kojucharov@sun.com

Internal Services Knowledge Engineer
karen.edwards@sun.com

Internal Eng Business Unit Group
SSG WGS (Workgroup Systems)

Internal Sun Alert & FAB Admin Info
26-Mar-2009: karen, created based on FAB and request from customer. Sending to sunalert_review today.

Internal Comments (for SAs)


Root Cause


The suspected root cause is an invalid CI (Chassis Intrusion) bit read
from the ADM1026, caused either by I2C corruption or by low noise
tolerance on the ADM1026 CI pin.  In addition, the ALOM shutdown (based
on the SystemPowerON check) after a failed read from the ADM1026 should
be disabled, because in a real CI event the FPGA will have already
turned off power.

So the poweron check, in conjunction with the root cause (I2C corruption
or an over-sensitive ADM1026 CI pin), causes the host to power off with
the message "SC Request to Power Off Host Immediately".

A firmware patch is being developed to permit up to three retry reads
of the ADM1026, with a clear in between to confirm status.  If ALOM is
still reporting a chassis cover problem after 3 tries, it will display a
message, but will NOT shut down the box.



Per CR 6815610:

The fix for this will involve 4 changes:

a) There should be multiple retries for the read; 3 retries seems reasonable.

b) The error message should only be printed after the last failed retry.

c) The message text should change to "Chassis cover interlock open". It needs to be investigated if this can be done by changing just the text or if a new event needs to be created.

d) If the fault has cleared by the next monitoring cycle, a fault cleared message should be printed: "Chassis cover interlock is OK".
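The four changes above can be illustrated with a short sketch of one monitoring cycle. This is not the ALOM firmware code: `read_ci_bit`, `clear_ci_latch`, `log`, and `fault_active` are hypothetical stand-ins, three reads in total is one reading of "3 retries", and printing each message only when the state changes is an assumption.

```python
MAX_READS = 3  # assumption: "3 retries" taken as three reads in total


def check_cover(read_ci_bit, clear_ci_latch, log, fault_active):
    """Run one monitoring cycle and return the new fault state.

    read_ci_bit():    hypothetical; True if the CI bit reads as set
    clear_ci_latch(): hypothetical; clears the CI status between reads
    log(msg):         hypothetical; emits a console/log message
    fault_active:     fault state carried over from the previous cycle
    """
    for attempt in range(MAX_READS):
        if not read_ci_bit():
            # One clean read is enough to treat the cover as closed (change a).
            if fault_active:
                log("Chassis cover interlock is OK")  # change d
            return False
        if attempt < MAX_READS - 1:
            clear_ci_latch()  # clear in between reads to confirm status
    # Every read reported intrusion: report it once (changes b and c),
    # but do not request a host power-off.
    if not fault_active:
        log("Chassis cover interlock open")
    return True


# Usage: a single corrupted read followed by clean reads produces no message.
reads = iter([True, False])
messages = []
state = check_cover(lambda: next(reads), lambda: None, messages.append, False)
print(state, messages)  # prints: False []
```

Note that no branch in this sketch powers off the host, matching the statement that a genuine chassis intrusion is already handled by the FPGA.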

See also FAB 254469:
http://sunsolve.sun.com/search/document.do?assetkey=1-63-254469-1

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.