Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-73-1017414.1
Update Date: 2010-08-19
Keywords:

Solution Type  FAB (standard) Sure

Solution  1017414.1 :   OBP update required for Panther based Sun Fire V490 and V890 systems to enhance Processor Illegal Command Detection.  


Related Items
  • Sun Fire V890 Server
  • Sun Fire V490 Server
Related Categories
  • GCS>Sun Microsystems>Sun FAB>Standard>Controlled Proactive

Previously Published As
228509


Bug Id
<SUNBUG: 6364004>

Product
Sun Fire V490 Server
Sun Fire V890 Server

Some V490 and V890 Systems may not have illegal instruction trapping enabled (see details below).

Impact

Some V490 and V890 systems may not have illegal instruction trapping enabled.  By contrast, high-end and midrange Sun systems already have illegal command detection enabled.


Contributing Factors

Systems with UltraSPARC IV+ (Panther) CPU modules installed.


Symptoms

Illegal commands from the CPU have the potential to be very disruptive. If left undetected, they may result in random system crashes that are very difficult to diagnose.


Root Cause

On Panther-based V490 and V890 systems there is a set of bits within the DCDS data switch device that is configured by the system OBP at boot time.

One of these bits enables illegal CPU command detection, which flags cases where the CPU has sent the DCDS an undefined operation request.

This bit is disabled on V490 and V890 Systems with OBP versions prior to 4.18.8.
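
The version threshold above can be checked mechanically. The following is a minimal sketch, assuming the OBP banner has the form "OBP <version> <date> <time>" (as typically reported by `prtconf -V` on Solaris or `.version` at the ok prompt). The function names and the sample banner strings are illustrative only and are not part of this FAB.

```python
import re

# First OBP release that enables the detection bit, per this FAB.
MIN_OBP = (4, 18, 8)


def parse_obp_version(banner):
    """Extract the dotted OBP version from a banner string as a tuple of ints."""
    match = re.search(r"OBP\s+(\d+(?:\.\d+)+)", banner)
    if match is None:
        raise ValueError("no OBP version found in: %r" % banner)
    return tuple(int(part) for part in match.group(1).split("."))


def needs_obp_update(banner, minimum=MIN_OBP):
    """Return True when the reported OBP version is older than the minimum."""
    # Tuples compare element by element, so (4, 18, 10) sorts after (4, 18, 8),
    # which a plain string comparison would get wrong.
    return parse_obp_version(banner) < minimum


if __name__ == "__main__":
    # Hypothetical banners for illustration; dates and times are made up.
    for banner in ("OBP 4.16.4 2004/12/18 05:21", "OBP 4.18.8 2005/11/01 10:00"):
        state = "update required" if needs_obp_update(banner) else "OK"
        print(banner, "->", state)
```

Comparing integer tuples rather than strings avoids misordering multi-digit components such as 4.18.10.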


Resolution

Update OBP to 4.18.8 (or greater), which contains code that turns on Safari Select Error illegal command checking.  This will NOT prevent an illegal command from occurring, but if one occurs the system will correctly detect it and stop before it can progress and possibly corrupt data.

When a system running OBP 4.18.8 (or greater) detects an invalid command, the DCDS forces its error pin to change state. The error is propagated to the BBC chip, which forces a FATAL RESET of the system.

During processing of the FATAL RESET, OBP prints out all CPU and JTAG scan data. The scan chain and FATAL RESET output appear as one long dump.  This dump will show the Safari Invalid Command bit set in the DCDS chip.

NOTE:  If you are unable to confirm that you have experienced this issue via the Fatal Reset Decoder, escalate to TSC for further analysis.  If it is clear to you that the FATAL RESET was NOT an SSEL type, use normal documented troubleshooting procedures for analyzing FATAL RESET outputs.

Comments

V490/V890 systems with OBP 4.18.8 (or greater) allow the JTAG SCAN Dump/Shadow Scan Chain output to be appended to the traditional Fatal Reset console output.

Enter the entire dump into the Fatal Reset Decoder...

  http://systems-tsc/twiki/pub/Cgisafe/FatalResetDecoder/fataldecoder.exe

Search the decoded Scan Chain output for SXSEL.  If SXSEL bits are set, a Safari Select Error illegal command has been detected from the Centerplane/Motherboard slot, as in the following example:

Cheetah0 Command Error Reg.
Shadow Scan Register of device DCDS_0_Slot A, length 15 is:

SXSEL_ACC_ERR[1] = 1 (SX Accumulated Invalid Command Error)

FE_ERR[1] = 1 (First err bit is set if the first err in the DCDS was in this reg)

SXSEL_ERR[1] = 1 (SX Invalid Command Error)

In this example DCDS_0 for CPU/MEM board A (Slot A) detected an illegal instruction and initiated the Fatal Reset.  CPU/MEM board A should be replaced.  Note on the part return tag that it was a Fatal Reset of type "Safari Select Error illegal command" and include the salient portion of the Scan Chain per the example above.
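
The search for SXSEL bits can also be scripted once the decoded output has been saved as text. The following is a rough sketch, assuming the decoded output follows the layout of the example above (a "Shadow Scan Register of device ..." header followed by "NAME[n] = value" lines). The function name and regular expressions are illustrative and are not part of the Fatal Reset Decoder.

```python
import re

# Header line naming the device whose shadow scan register follows.
DEVICE_RE = re.compile(r"Shadow Scan Register of device ([^,]+)")
# Field lines such as "SXSEL_ERR[1] = 1 (...)"; only SXSEL* fields are matched.
BIT_RE = re.compile(r"^\s*(SXSEL\w*)\[\d+\]\s*=?\s*(\d+)")

# Sample text copied from the example in this document.
SAMPLE = """\
Cheetah0 Command Error Reg.
Shadow Scan Register of device DCDS_0_Slot A, length 15 is:

SXSEL_ACC_ERR[1] = 1 (SX Accumulated Invalid Command Error)

FE_ERR[1] 1 (First err bit is set if the first err in the DCDS was in this reg)

SXSEL_ERR[1] = 1 (SX Invalid Command Error)
"""


def find_sxsel_errors(decoded_text):
    """Map each device name to the list of SXSEL fields that are non-zero."""
    hits = {}
    current = None
    for line in decoded_text.splitlines():
        header = DEVICE_RE.search(line)
        if header:
            current = header.group(1).strip()
            continue
        bit = BIT_RE.match(line)
        if bit and current is not None and int(bit.group(2)) != 0:
            hits.setdefault(current, []).append(bit.group(1))
    return hits


if __name__ == "__main__":
    for device, fields in find_sxsel_errors(SAMPLE).items():
        print(device, "->", ", ".join(fields))
```

An empty result only means that no SXSEL field was found set in the supplied text. It does not by itself rule out an SSEL-type Fatal Reset, so the escalation guidance in the NOTE above still applies.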

Related Information
  • URL:
     http://sunsolve2.central.sun.com/search/document.do?assetkey=1-21-119243-02
     http://sunsolve2.central.sun.com/search/document.do?assetkey=1-21-119244-02
  • Other: RTI #305851


Modification History

28-Feb-2006: Updated doc to add the long-awaited patch numbers that include
                     the 4.18.8 firmware.
14-Feb-2007: Added decoded failure signature and additional resolution steps,
                     changed PTS to TSC, and corrected Responsible Engineer's email
                     in Contacts section.
05-Mar-2008: Removed statements referencing CPAS process in Resolution Note,
                     moved Internal Comments section into Comments section at the end
                     of the FAB Body, and updated the Responsible Engineer section.

Previously Published As
102180
Internal Contributor/submitter
david.ward@sun.com, Dan.Doyle@Sun.COM

Internal Eng Responsible Engineer
Dan.Doyle@Sun.COM, Don.Palko@Sun.COM, Dong.Kang@Sun.COM, Harry.Huynh@Sun.COM

Internal Services Knowledge Engineer
sean.hassall@sun.com, joe.davis@sun.com

Internal Eng Business Unit Group
SSG WGS

Internal Resolution Patches
119243-02 (or later) - Sun Fire V480/490
119244-02 (or later) - Sun Fire V880/890
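
The "(or later)" wording above can be read as "same patch base number, revision 02 or higher". The following is a small sketch of that check, assuming patch IDs always take the form <base>-<revision>. The table and function names are illustrative only.

```python
# Minimum patch per platform, taken from the list above.
REQUIRED_PATCHES = {
    "V480": "119243-02",
    "V490": "119243-02",
    "V880": "119244-02",
    "V890": "119244-02",
}


def split_patch_id(patch_id):
    """Split '119243-02' into its base string and integer revision."""
    base, _, rev = patch_id.strip().partition("-")
    if not (base.isdigit() and rev.isdigit()):
        raise ValueError("unexpected patch ID format: %r" % patch_id)
    return base, int(rev)


def patch_satisfies(model, candidate):
    """Return True if the candidate patch meets the requirement for the model."""
    required_base, required_rev = split_patch_id(REQUIRED_PATCHES[model])
    base, rev = split_patch_id(candidate)
    # The base must match exactly; only the revision may be newer.
    return base == required_base and rev >= required_rev


if __name__ == "__main__":
    print(patch_satisfies("V490", "119243-03"))  # a later revision of the same patch
    print(patch_satisfies("V890", "119243-02"))  # patch for the other platform family
```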

Internal Kasp FAB Legacy ID
102180

Internal Sun Alert & FAB Admin Info
Critical Category:
Significant Change Date:
Avoidance: Patch
Responsible Manager: null
Original Admin Info: WF - made changes as outlined in Change History per request from
new Responsible Engineer. - Joe 2/14/07
WF - removed some incorrect wording based on previous update that
only added new information. - Joe 2/15/07

Internal SA-FAB Eng Submission
OBP update required for Panther based Sun Fire V490 and V890 Systems to enhance Processor Illegal Command Detection.

Product_uuid
5c71fc02-5e51-11d7-8add-8938754df22a|Sun Fire V490 Server
5d2816fe-5e51-11d7-8de2-d7bc0dd226fc|Sun Fire V890 Server

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.