Document Audience:INTERNAL
Document ID:A0208-3
Title:440MHz and 450MHz UltraSPARC II Modules in Ultra 60/80, Enterprise 220R/420R and Netra t platforms may experience Red State Exception, Send Mondo Timeout and hard hangs
Copyright Notice:Copyright © 2007 Sun Microsystems, Inc. All Rights Reserved
Update Date:Mon Mar 29 00:00:00 MST 2004

             - Sun Proprietary/Confidential: Internal Use Only -

                             FIELD CHANGE ORDER
            (For Authorized Distribution by Enterprise Services)

FCO #: A0208-3
Status: inactive
Synopsis: 440MHz and 450MHz UltraSPARC II Modules in Ultra 60/80, Enterprise 220R/420R and Netra t platforms may experience Red State Exception, Send Mondo Timeout and hard hangs
Date: Mar/29/2004
SunAlert: Yes
Top FIN/FCO Report: No
Products Reference: Ultra 60/80, Enterprise 220R/420R, Netra t1120,
		   Netra t1125, Netra t1400, Netra t1405
Product Category: Server / System Component
Product Affected: 
Systems Affected:
Mkt_ID   Platform   Model   Description         Serial Number
------   --------   -----   -----------         -------------

for part number 501-6058-02 (or less):
  -        A27       All    Ultra 80            144xxxxx - 246xxxxx
  -        A23       All    Ultra 60            144xxxxx - 246xxxxx
  -        A34       All    Enterprise 220R     144xxxxx - 246xxxxx
for part number 501-6071-01: 
  -        A27       All    Ultra 80            144xxxxx - 246xxxxx
  -        A23       All    Ultra 60            144xxxxx - 246xxxxx
  -        A34       All    Enterprise 220R     144xxxxx - 246xxxxx
  -        A33       All    Enterprise 420R     144xxxxx - 246xxxxx
for part number 501-6209-02 (or less):
  -        N14       All    Netra t1405         144xxxxx - 246xxxxx
  -        N15       All    Netra t1400         144xxxxx - 246xxxxx
  -        N04       All    Netra T1120         144xxxxx - 246xxxxx
  -        N03       All    Netra T1125         144xxxxx - 246xxxxx
  -        N02       All    Netra T1120         144xxxxx - 246xxxxx

for part number 501-5682-03: 

  -        N14       All    Netra t1405         220xxxxx - 319xxxxx
  -        N15       All    Netra t1400         220xxxxx - 319xxxxx
  -        N04       All    Netra T1120         220xxxxx - 319xxxxx
  -        N03       All    Netra T1125         220xxxxx - 319xxxxx
  -        N02       All    Netra T1120         220xxxxx - 319xxxxx

X-Options Affected:

Mkt_ID   Platform   Model    Description                    Serial Number
------   --------   -----    -----------                    -------------
X1197A      -         -      440MHz UltraSPARC II module          -
X1195A      -         -      450MHz UltraSPARC II module          -
Parts Affected: 
Part Number              Description                     Model
-----------              -----------                     -----

501-6058-02 (or less)    450MHz UltraSPARC II module       -
501-6071-01              450MHz UltraSPARC II module       -
501-6209-02 (or less)    440MHz UltraSPARC II module       -
501-5682-03              440MHz UltraSPARC II module       -

(SCSI Devices)
Type   Vendor    Model     SerialNumber(Min)   SerialNumber(Max)   Firmware
----   ------    -------   -----------------   -----------------   --------
  ECO: WO_25557
  DPCO: 336.B
  ESC: 538507
  FIN: I0896-1
  FIN: I0755-1
  FIN: I0616-1
  WWStopShip: P001-20085
  Sun Alert: 49945
Issue Description: 
Change History 

A0208-3  Date Modified: Mar/26/04

. PRODUCT AFFECTED:  see "Systems Affected". Now organized by part number
. AFFECTED PARTS:    501-6209-01 was changed to 501-6209-02
. CORRECTIVE ACTION: 501-6209-01 was changed to 501-6209-02
  Also see "Identification of Suspect Part"
. COMMENTS: additional information from internal section of Sun Alert
  was added to FCO and removed from Sun Alert

A0208-2  Date Modified: Aug/04/03

. AFFECTED PARTS: Added part number 501-5682-03
. CORRECTIVE ACTION: Added "replace 501-5682-03 with 501-5682-04" 


Certain 440MHz and 450MHz UltraSPARC II Modules supported on Ultra 60/80,
Enterprise 220R/420R and Netra t platforms may experience early life
failures resulting in Red State Exception, Send Mondo Timeout and hard
hang errors.

A limited number of systems manufactured between November 01, 2001 and
November 22, 2002 may contain affected CPU modules.  The affected system
serial number range is between 144xxxxx and 246xxxxx.  The probability of
experiencing the described issue is considered low, at 4% or less.

The failures caused by this problem are not unique to this issue.  Failures
include EDP and WP panics, Red State Exception, Send Mondo Timeout panics,
machine hard hangs, and machine reboots.  Typically, after experiencing
such errors, the only means of rebooting the machine is to power cycle
it with the front power button or key switch.  To be sure that a failure
is related to this problem, the failure must be repeatable and, when data
is available, must always identify the same CPU module as being at fault.
The time between failures is typically two to five weeks.

Root cause analysis determined that the socket used on the module has a
90-day shelf life.  Modules assembled with sockets less than 90 days old
have not experienced this problem.  Once assembled, socket aging is not an
issue.  All loose sockets in manufacturing and repair that were more
than 90 days old have been scrapped, and field spares have been purged
and reworked with good sockets.

Corrective action was implemented in manufacturing by purging and reworking
all modules with new sockets (less than 90 days old) via Worldwide Stopship
Purge P001-20085 on November 22, 2002.  Modules were then dash rolled via
ECO# WO_25557 as of December 17, 2002.  Corrective action was put in place
in Sun Services via DPCO 336 as of December 17, 2002.
Parts Affected: 
February 28th, 2005
|   |   MANDATORY (Fully Pro-Active)

|   |   CONTROLLED PRO-ACTIVE (per Sun Geo Plan)

Replacement Time Estimate: 
0.5 hours
Special Considerations: 
Due to material availability issues, APac will not be fully ready
to support this FCO until approximately March 12, 2003.  All other
geographies are ready upon release of this FCO.

Because this problem manifests itself as a common system failure,
take care to first use the Best Practices Guide to help determine
the exact failure mode.  The E420R Best Practices Troubleshooting
Guide can be located at the following URL:

Corrective Action: 
NOTE! This FCO does not authorize the proactive replacement of any hardware.
      For customers who insist on having their modules proactively
      replaced, please address this through the CIC program.  Refer to
      the following URL for more information:


Upon repeat failure, and after unsuccessful attempts to correct the failure
using the above-mentioned Best Practices, the failing module should be
replaced.

Upon failure, replace as follows:

 - replace 501-6058-02 (or less) with 501-6058-03 (or above) or
   501-6058-02 (with DPCO 336 label), or 501-6071-02, or
   501-6071-01 (with DPCO 336 label)

 - replace 501-6071-01 with 501-6071-02 (or above) or with
   501-6071-01 (with DPCO 336 label)

 - replace 501-6209-02 (or less) with 501-6209-03 (or above) or
   501-6209-02 (with DPCO 336 label)

 - replace 501-5682-03 with 501-5682-04
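
The replacement list above can be expressed as a small lookup table.  The
following Python sketch is illustrative only; the names RULES and
replacement_options are hypothetical and not part of any Sun tool.  It
assumes the part number is written as 501-NNNN-DD and does not model the
"DPCO 336" label, which must still be checked by eye on the module.

```python
# Hypothetical lookup of the "replace upon failure" list in this FCO.
# Each rule: (affected dash level, exact match only?, acceptable replacements).
RULES = {
    "501-6058": (2, False, ["501-6058-03 (or above)",
                            "501-6058-02 (with DPCO 336 label)",
                            "501-6071-02",
                            "501-6071-01 (with DPCO 336 label)"]),
    "501-6071": (1, True, ["501-6071-02 (or above)",
                           "501-6071-01 (with DPCO 336 label)"]),
    "501-6209": (2, False, ["501-6209-03 (or above)",
                            "501-6209-02 (with DPCO 336 label)"]),
    "501-5682": (3, True, ["501-5682-04"]),
}


def replacement_options(part_number):
    """Return the acceptable replacements for a failed module, or [] if the
    part number is not one of the affected parts listed in this FCO."""
    base, _, dash = part_number.strip().rpartition("-")
    rule = RULES.get(base)
    if rule is None:
        return []
    affected_dash, exact_only, options = rule
    level = int(dash)
    # "(or less)" parts match any dash level up to the listed one;
    # the others match only the exact dash level.
    affected = level == affected_dash if exact_only else level <= affected_dash
    return list(options) if affected else []


print(replacement_options("501-5682-03"))
```

A part that is not listed, or that is already at a corrected dash level
(for example 501-6071-02), yields an empty list.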

Identification of Suspect Part

For Serial Numbers: 144xxxxx - 246xxxxx:

Systems manufactured between November 2001 and November 2002 may contain
potentially affected CPU modules.  The affected system serial number range is
144XXXXX through 246XXXXX.  (the first 3 digits reflect the year and week of

Locate the CPU module part number.  If it is 501-6071-01 or 501-6058-02 (or
earlier) and without a "DPCO 336" sticker, the system is possibly affected by
the described issue.  

If it is 501-6209-02 (or earlier), please locate the CPU module serial number.
If it is 0 through 1000 or 2000 through 3500, it may be affected.  If it is
1001 through 1999, or greater than 3500, it is not affected.  To find the CPU
module serial number, look at the barcoded label on the CPU module.  There will
be a number above the barcode, for example 50162091300.  The last 4 digits
(1300 in this case) are the serial number.  In this example, 1300 is within the
range of modules that are not affected.
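
As an illustration of the serial number rule above for 501-6209-02 (or
earlier) modules, the following Python sketch extracts the last 4 digits
of the barcode number and tests them against the stated ranges.  The
function names are hypothetical, and the sketch assumes the number above
the barcode is entered as a plain string of digits.

```python
def module_serial_6209(barcode_number):
    """Return the module serial number: the last 4 digits of the number
    printed above the barcode on a 501-6209 module."""
    digits = barcode_number.strip()
    if not digits.isdigit() or len(digits) < 4:
        raise ValueError("unexpected barcode number: %r" % barcode_number)
    return int(digits[-4:])


def may_be_affected_6209(barcode_number):
    """True if the serial falls in 0-1000 or 2000-3500 (possibly affected);
    False for 1001-1999 or anything above 3500 (not affected)."""
    serial = module_serial_6209(barcode_number)
    return 0 <= serial <= 1000 or 2000 <= serial <= 3500


# The example from the text: serial 1300 is in the unaffected range.
print(may_be_affected_6209("50162091300"))  # False
```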

For Serial Numbers: 220xxxxx - 319xxxxx: 

Systems manufactured between May 2002 and May 2003 may contain potentially
affected CPU modules.  The affected system serial number range is 220XXXXX
through 319XXXXX.  (the first 3 digits reflect the year and week of

For CPU module part number 501-5682-03, please locate the CPU module serial
number.  If it is 103XXX or above, it may be affected.  To find the CPU module
serial number, look at the barcoded label on the CPU module.  There will be a
number above the barcode, for example 5015682106778.  The last 6 digits
(106778 in this case) are the serial number.  In this example, 106778 is
greater than 103XXX and the module is therefore possibly affected.
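
The same check for 501-5682-03 modules can be sketched in Python as
follows.  The function names are hypothetical, and "103XXX and above" is
read here as a serial number of 103000 or greater, which is an assumption
about how the wildcard digits are meant.

```python
def module_serial_5682(barcode_number):
    """Return the module serial number: the last 6 digits of the number
    printed above the barcode on a 501-5682 module."""
    digits = barcode_number.strip()
    if not digits.isdigit() or len(digits) < 6:
        raise ValueError("unexpected barcode number: %r" % barcode_number)
    return int(digits[-6:])


def may_be_affected_5682(barcode_number):
    """True if the serial is 103000 or above (assumed reading of 103XXX)."""
    return module_serial_5682(barcode_number) >= 103000


# The example from the text: serial 106778 is possibly affected.
print(may_be_affected_5682("5015682106778"))  # True
```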

Observed average system MTBF has decreased by approximately 500 hours
on systems containing affected 450MHz CPU modules manufactured between
November 02, 2001 and November 22, 2002.

It is important to note:

1.  These failures have been attributable to manufacturing variability, not to
the design of the CPU, the module, or the socket.

2.  The vast majority of sockets are still within design specification and will
function as expected.

Additional data that supports a bounded 90-day early life failure condition was
extracted from actual units returned from the field which had failed for a
socket-related symptom.  In these cases modules were taken back through extended
system testing under varying corner conditions of temperature, voltage and
frequency margining.  Modules which failed for a socket contact related issue
showed a time to failure (TTF) averaging less than 38 days of continuous
power-on run time.

Sun has examined the socket subtier supplier process and imposed the
following changes:

1.  A tightened column height tolerance has been implemented on all new sockets
used in all Sun modules.  The contact column height tolerance has been changed
from 0.026-0.035" to 0.029-0.035" to ensure a column height that provides
optimum ohmic contact under compression.

2.  Inspection/measurement points have been integrated into the socket
manufacturing process flow to ensure sockets are made to Sun's required

"Wholesale" replacement of modules has been shown to create potential for
mechanical contact failures and introduce instability into an already stable
Billing Type: 
 Warranty: Sun will provide parts at no charge under Warranty
           Service. On-Site Labor Rates are based on how the
           system was initially installed.

 Contract: Sun will provide parts at no charge. On-Site Labor Rates
           are based on the type of service contract.

 Non Contract: Sun will provide parts at no charge. Installation by
               Sun is available based on the On-Site Labor Rates
               defined in the Price List.

Implementation Footnote: 

i)   In case of Mandatory FCOs, Sun Services will attempt to contact
     all known customers to recommend the part upgrade.

ii)  For controlled proactive swap FCOs, Sun Services mission critical
     support teams will initiate proactive swap efforts for their respective
     accounts, as required.

iii) For Replace upon Failure FCOs, Sun Services partners will implement
     the necessary corrective actions as and when they are required.


All released FINs and FCOs can be accessed using your favorite network
browser as follows:

SunWeb Access:

* Access the top level URL of http://sdpsweb.Central/FIN_FCO/

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.

SunSolve Online Access:

* Access the SunSolve Online URL at http://sunsolve.Central/

* From there, select the appropriate link to browse the FIN or FCO index.

Internet Access:

* Access the top level URL of


Send questions or comments to finfco-manager@Sun.COM
