Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-77-1021044.1
Update Date: 2011-03-21
Keywords:

Solution Type  Sun Alert Sure

Solution  1021044.1 :   Improper AC Input Power Supply Redundancy Testing on Sun SPARC Enterprise M3000/M4000/M5000 Servers May Result in Domain Outages, or Cause Components to be Falsely Marked as Faulty  


Related Items
  • Sun SPARC Enterprise M5000 Server
  • Sun SPARC Enterprise M3000 Server
  • Sun SPARC Enterprise M4000 Server
Related Categories
  • GCS>Sun Microsystems>Sun Alert>Criteria Category>Availability
  • GCS>Sun Microsystems>Sun Alert>Release Phase>Resolved

PreviouslyPublishedAs
269808


Bug Id
<SUNBUG: 6729779>

Product
Sun SPARC Enterprise M3000 Server
Sun SPARC Enterprise M4000 Server
Sun SPARC Enterprise M5000 Server

Date of Resolved Release
14-Oct-2009

1. Impact

On Sun SPARC Enterprise M3000/M4000/M5000 Servers running XCP firmware versions prior to 1090, improper AC input power supply redundancy testing may result in domain outages or boot failures. In addition, certain components may be falsely marked as faulty, prompting unnecessary replacement of good components.

2. Contributing Factors

This issue can occur in the following release:

SPARC Platform
  • XCP firmware versions prior to 1090 for Sun SPARC Enterprise M3000/M4000/M5000 Servers
Note: The x86 platform is not affected by this issue. No other SPARC platform systems are affected.

This issue can occur when performing AC input power supply redundancy testing using an improper procedure, as in the following example:

1. Pull the power cord from one of the power supplies, or switch off its grid.
2. Reinsert the power cord that was removed in the previous step, or switch the grid back on.
3. Wait for a power recovery message similar to "Aug 10 19:40:38 UTC 2009 SCF:Power recovery (/PSU#0)".
4. Remove the power cord, or switch off the grid, of a different power supply.

This sequence may result in one or both of the following:
  • Domain outage or boot failure
  • Components falsely marked as faulty
To determine the version of XCP firmware installed on a system, the following command can be used at the XSCF> prompt:
XSCF> version -c xcp
XSCF#0 (Active )
XCP0 (Current): 1080
XCP1 (Reserve): 1080
XSCF>
If the value listed as "Current" is less than 1090, the system may be vulnerable to this issue.
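
The version check above can also be scripted once the command output has been captured as text. The following Python sketch is illustrative only: the function names and the constant are hypothetical, and the parsing assumes the output has the same layout as the example shown above.

```python
import re

# First XCP release that addresses this issue, according to this Sun Alert.
FIXED_XCP = 1090


def current_xcp_version(version_output: str) -> int:
    """Return the XCP version on the line marked "(Current)".

    Assumes the layout of the `version -c xcp` example in this alert.
    """
    match = re.search(
        r"^XCP\d+\s*\(Current\)\s*:\s*(\d+)", version_output, re.MULTILINE
    )
    if match is None:
        raise ValueError("no '(Current)' XCP line found in the output")
    return int(match.group(1))


def may_be_vulnerable(version_output: str) -> bool:
    """True if the current XCP version is lower than 1090."""
    return current_xcp_version(version_output) < FIXED_XCP


# Output copied from the example in this alert.
sample = """\
XSCF#0 (Active )
XCP0 (Current): 1080
XCP1 (Reserve): 1080
"""

print(may_be_vulnerable(sample))  # prints True
```

Only the "Current" bank is compared, as described above; the "Reserve" value is ignored.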

3. Symptoms

Messages similar to the following appear in the "showlogs monitor" output:
Aug 10 19:40:31 UTC 2009 Power failure (/PSU#1)
Aug 10 19:40:38 UTC 2009 Power recovery (/PSU#1)
Aug 10 19:40:52 UTC 2009 Power failure (/PSU#0)
Aug 10 19:41:23 UTC 2009 Power recovery (/PSU#0)
Aug 10 19:42:07 UTC 2009 All domains shutdown started
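
The pattern in these messages is a power failure on one PSU directly after a power recovery on a different PSU, followed by a domain shutdown. The Python sketch below looks for that pattern in captured log lines. It is a heuristic under stated assumptions, not a Sun-provided diagnostic: the function name is hypothetical, and the regular expression assumes the message format shown in the lines above.

```python
import re

# Matches the "Power failure (/PSU#n)" and "Power recovery (/PSU#n)" messages.
EVENT_RE = re.compile(r"Power (failure|recovery) \((/PSU#\d+)\)")


def find_risky_failures(log_lines):
    """Return (recovered_psu, failed_psu) pairs where a power failure on one
    PSU is the next power event after a recovery on a different PSU."""
    risky = []
    last_recovered = None
    for line in log_lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # not a PSU power event
        kind, psu = match.groups()
        if kind == "recovery":
            last_recovered = psu
        else:
            if last_recovered is not None and psu != last_recovered:
                risky.append((last_recovered, psu))
            last_recovered = None
    return risky


# Log lines copied from the example in this alert.
sample_log = [
    "Aug 10 19:40:31 UTC 2009 Power failure (/PSU#1)",
    "Aug 10 19:40:38 UTC 2009 Power recovery (/PSU#1)",
    "Aug 10 19:40:52 UTC 2009 Power failure (/PSU#0)",
    "Aug 10 19:41:23 UTC 2009 Power recovery (/PSU#0)",
    "Aug 10 19:42:07 UTC 2009 All domains shutdown started",
]

print(find_risky_failures(sample_log))  # prints [('/PSU#1', '/PSU#0')]
print(any("All domains shutdown started" in l for l in sample_log))  # prints True
```
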

4. Workaround

If the current XCP firmware version is earlier than 1090, the proper procedure for AC input power supply redundancy testing is as follows:

1.  Switch off all main line switches of AC GRID A.
2.  Confirm that the POWER LED on the operator panel is on.
3.  At the XSCF Shell, verify power failure with the showlogs event command.
4.  Switch on all the main line switches of AC GRID A (switched off in Step 1).
5.  At the XSCF Shell, verify power recovery with the showlogs event command.
6.  Check that the AC-LED and DC-LED are lit on PSU#0 (M4000 server) or PSU#0/PSU#2 (M5000 server).
7.  From the XSCF Shell, confirm that the Power Status is On by executing the showhardconf command.
8.  Switch off all main line switches of AC GRID B.
9.  Confirm that the POWER LED on the operator panel is on.
10.  At the XSCF Shell, verify power failure with the showlogs event command.
11.  Switch on all the main line switches of AC GRID B (switched off in Step 8).
12.  At the XSCF Shell, verify power recovery with the showlogs event command.

5. Resolution

This issue is addressed in the following release:

SPARC Platform
  • XCP firmware version 1090 or later for Sun SPARC Enterprise M3000/M4000/M5000 Servers
XCP 1090 packages are available from the following link:

http://www.oracle.com/technetwork/server-storage/sun-sparc-enterprise/downloads/index.html

Note: A new "Power recovery complete" message has been added to XCP release 1090 to show when it is safe to power off the next grid.
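
On XCP 1090 or later, a test script could wait for this message before moving on to the next grid. The Python sketch below shows one way to express that check over captured log lines. It is an assumption-laden illustration: the function name is hypothetical, and because this alert does not show the full format of the new message, the code only assumes that the log line contains the literal text "Power recovery complete".

```python
def safe_to_power_off_next_grid(log_lines):
    """True only if the most recent power event is "Power recovery complete".

    Returns False for a log with no power events at all, which is a
    deliberately conservative choice in this sketch.
    """
    last_state = None
    for line in log_lines:
        # Check the longer phrase first: it contains "Power recovery".
        if "Power recovery complete" in line:
            last_state = "complete"
        elif "Power failure" in line or "Power recovery" in line:
            last_state = "pending"
    return last_state == "complete"
```

A plain "Power recovery" message on its own is treated as pending, since on releases with the new message the note above identifies "Power recovery complete" as the signal that it is safe to continue.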


Modification History
12-Jan-2010: Updated Impact and Workaround sections for clarification
10-Jun-2010: Updated to include M3000 Server

Internal Comments (for SAs)
Please send technical questions to the following email:
sunalert-tech-questions@sun.com
and CC the following persons:
Internal Contributor/Submitter: michael.cootware@sun.com
Internal Eng Responsible Engineer: James.Hsieh@sun.com
Internal Services Knowledge Engineer: david.mariotto@sun.com
Internal Eng Business Unit Group
Systems Group - Enterprise Systems


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.