Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1003245.1
Update Date:2010-01-07
Keywords:

Solution Type  Technical Instruction Sure

Solution  1003245.1 :   Sun Fire[TM] 3800-6900: System Controller failover functionality  


Related Items
  • Sun Fire E6900 Server
  • Sun Fire 3800 Server
  • Sun Fire 6800 Server
  • Sun Fire E4900 Server
  • Sun Fire 4800 Server
  • Sun Fire E2900 Server
  • Sun Fire 4810 Server
Related Categories
  • GCS>Sun Microsystems>Servers>Midrange Servers

PreviouslyPublishedAs
204459


Description
The Sun Fire[TM] System Controller (SSC) provides management functionality and clock to the Sun Fire platform.

If the system includes two SSCs and the main SSC (usually SSC0) fails, clock and management functionality must fail over to the spare SSC.

This document explains the commands added in firmware 5.13.X and later that allow manual control over failover between the main and spare SSC.



Steps to Follow
Starting with firmware 5.13.X on Sun Fire[TM] 6800/4810/4800/3800 systems and firmware 5.16.X on Sun Fire[TM] E4900/E6900 systems, a platform can be configured with two system controllers for high availability. In a high-availability System Controller* (SC) configuration, one SC serves as the main SC, which manages all the system resources, while the other SC serves as a spare.

When certain conditions cause the main SC to fail, a switchover (failover) from the main SC to the spare is triggered automatically, without operator intervention. The spare SC assumes the role of the main SC and takes over all system controller responsibilities.

*Note: The Sun Fire[TM] System Controller (SSC) is also referred to as a System Controller (SC).

New commands have been added to manage this functionality: setfailover and showfailover.

setfailover -- set automatic/manual SC failover
Usage: setfailover [-y|-n] off|on|force
       setfailover -h
off         Prevents a failover until the failover feature is re-enabled.
on          Enables failover for systems that previously had failover disabled due to a failover or an operator request.
force       Causes a forced failover to the spare SC.
-y          Do not prompt for confirmation
-n          Do not execute command if confirmation is requested
-h          Display the help message for this command

This command enables you to control automatic or manual SC failover. Be aware that if you force a failover using this command, SC failover is disabled after the manual failover occurs and must be re-enabled manually using the command setfailover on.
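
As an illustration of the usage line above, the following Python sketch assembles a setfailover command string and rejects anything outside the documented syntax. The helper name build_setfailover and its parameters are hypothetical and not part of the SC firmware; the sketch only builds the text you would type at the platform shell.

```python
from typing import Optional

# Arguments taken from the usage text: setfailover [-y|-n] off|on|force
VALID_ACTIONS = ("off", "on", "force")
VALID_ANSWERS = ("y", "n")


def build_setfailover(action: str, answer: Optional[str] = None) -> str:
    """Return a setfailover command line (hypothetical helper).

    action: one of off, on, force.
    answer: None, "y" (do not prompt) or "n" (do not execute if a
            confirmation would be requested).
    """
    if action not in VALID_ACTIONS:
        raise ValueError("action must be one of: " + ", ".join(VALID_ACTIONS))
    if answer is not None and answer not in VALID_ANSWERS:
        raise ValueError("answer must be 'y', 'n' or None")
    parts = ["setfailover"]
    if answer is not None:
        parts.append("-" + answer)
    parts.append(action)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_setfailover("on"))
    print(build_setfailover("force", answer="y"))
```

Keeping the option check in one place makes it harder to send an unsupported combination to the SC from a script.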

showfailover -- Enables you to monitor the state of the SC and clock failover. 
Usage: showfailover [-v] 
       showfailover -h 
-v          Verbose mode. Displays all available command information, which includes both SC and clock failover status. 
-h          Display this help message 

The SC failover state can be one of the following:  

  • enabled and active - SC failover is enabled and functioning normally.

  • disabled - SC failover has been disabled due to an operator request (setfailover off) or because a failover has occurred.

  • enabled but not active - SC failover is enabled, but certain components, such as the spare SC or the centerplane between the main and spare, are not in a failover-ready state (available and responding).

The clock failover state can be one of the following:

  • enabled - Clock failover is enabled.

  • disabled - Clock failover has been automatically disabled due to a hardware problem.
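
If showfailover output is captured by a script (for example from a console log), the SC failover state can be picked out with a few lines of Python. This is a minimal sketch, assuming the output contains a line of the form "SC Failover: <state>" as in the examples in this document; the function names are hypothetical.

```python
import re
from typing import Optional

# The three SC failover states listed in this document.
SC_STATES = ("enabled and active", "enabled but not active", "disabled")


def parse_sc_failover_state(output: str) -> Optional[str]:
    """Return the first SC failover state found in the text, or None."""
    for line in output.splitlines():
        match = re.search(r"SC Failover:\s*(.+)", line)
        if not match:
            continue
        text = match.group(1).strip().lower()
        for state in SC_STATES:
            if text.startswith(state):
                return state
    return None


def is_failover_ready(output: str) -> bool:
    """True only when SC failover is enabled and active."""
    return parse_sc_failover_state(output) == "enabled and active"


if __name__ == "__main__":
    sample = "SC: SSC1\nSpare System Controller\nSC Failover: disabled\n"
    print(parse_sc_failover_state(sample))
```

Lines that mention SC failover without one of the three states (such as "becoming main SC") are skipped, so the function reports the first recognized state.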

Example of how to force a manual failover:

1. Connect to the System Controller (failover can be initiated from either SC).

System Controller 'sunfire12-sc1': 
    Type  0  for Platform Shell 
    Input:  0  
Platform Shell - Spare System Controller
sunfire12-sc1:sc> 

2. Verify that failover is enabled and active.

sunfire12-sc1:sc>  showfailover -v  
SC: SSC1  
Spare System Controller 
SC Failover: disabled <---failover is disabled, so we must enable it. 
sunfire12-sc1:sc>  setfailover on  
Dec 12 16:06:51 sunfire12-sc1 Platform.SC: SC Failover: enabled but not active. 
SC Failover: enabled but not active. 
sunfire12-sc1:sc> Dec 12 16:07:09 sunfire12-sc1 Platform.SC: SC Failover: enabled and active. 
sunfire12-sc1:sc>  showfailover  
SC Failover: enabled and active. <---now failover is enabled and active. 

3. Manually force the Spare System Controller to become the Main System Controller.

sunfire12-sc1:sc>  setfailover force  
SC: SSC1 
Spare System Controller 
SC Failover: enabled and active. 
Clock failover enabled. 
This will abruptly interrupt operations on the other System Controller. 
This System Controller will become the main System Controller. 
Do you want to continue  [no]  yes  
Dec 12 16:10:18 sunfire12-sc1 Platform.SC: SC Failover: becoming main SC 
sunfire12-sc1:sc> Dec 12 16:10:19 sunfire12-sc1 Platform.SC: SC Failover: disabled 
Dec 12 16:10:26 sunfire12-sc1 Platform.SC: Chassis is in single partition mode. 

4. Verify the status of the system controller and re-enable failover.

sunfire12-sc1:SC>  showfailover  
SC Failover: disabled  <---failover is disabled after a failover has occurred and must be re-enabled.
sunfire12-sc1:SC>  setfailover on  
Dec 12 16:14:14 sunfire12-sc1 Platform.SC: SC Failover: enabled but not active. 
SC Failover: enabled but not active. 
sunfire12-sc1:SC> Dec 12 16:14:18 sunfire12-sc1 Platform.SC: SC Failover: enabled and active. 
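
The four steps above can be summarized as a command sequence that depends on the initial failover state. The Python sketch below is illustrative only: the function name plan_manual_failover is hypothetical, and treating "enabled but not active" as a stop condition is an assumption based on step 2, which asks for failover to be enabled and active before forcing it.

```python
from typing import List


def plan_manual_failover(sc_state: str) -> List[str]:
    """Return the platform shell commands for a manual failover.

    sc_state is the SC failover state reported by showfailover.
    """
    steps: List[str] = []
    if sc_state == "disabled":
        # Failover must be enabled before it can be forced.
        steps.append("setfailover on")
    elif sc_state == "enabled but not active":
        # Assumption: do not force a failover until the spare SC and
        # related components report a failover-ready state.
        raise ValueError("failover is enabled but not active; investigate first")
    elif sc_state != "enabled and active":
        raise ValueError("unknown SC failover state: " + repr(sc_state))
    steps.append("showfailover")      # confirm enabled and active
    steps.append("setfailover force") # spare becomes main
    steps.append("showfailover")      # failover is now disabled
    steps.append("setfailover on")    # re-enable failover
    return steps


if __name__ == "__main__":
    for command in plan_manual_failover("disabled"):
        print(command)
```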

For Firmware 5.15.X and above

Starting with firmware 5.15.X, Sun Fire 6800/4810/4800/3800 systems cannot execute a manual failover if the date and time on the two System Controllers differ by more than 1 minute.

See the example below:

ssc0:SC>  showdate
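
The 1-minute rule can be expressed as a simple comparison once the date and time of each SC are known. The Python sketch below assumes the two values have already been read from each SC and converted to datetime objects; it does not parse showdate output, and the function name is hypothetical.

```python
from datetime import datetime, timedelta

# Limit stated in this document: a difference of more than 1 minute
# between the System Controllers blocks a manual failover.
MAX_SKEW = timedelta(minutes=1)


def manual_failover_allowed(main_time: datetime, spare_time: datetime,
                            max_skew: timedelta = MAX_SKEW) -> bool:
    """True if the two SC clocks differ by no more than max_skew."""
    return abs(main_time - spare_time) <= max_skew


if __name__ == "__main__":
    main_sc = datetime(2005, 3, 1, 12, 0, 0)
    spare_sc = datetime(2005, 3, 1, 12, 2, 30)
    print(manual_failover_allowed(main_sc, spare_sc))
```

If the check fails, bring the SC dates back in line before retrying the manual failover.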

Internal Comments
The following is strictly for the use of Sun employees:

Check the ReadMe of Patch 800054 and/or the Release Notes of future firmware versions for SSC failover changes.

For more information about System Controller Failover with 5.13.X firmware check the following resources:

Sun Fire 6800/4810/4800/3800 Systems Platform Administration Manual (Firmware Version 5.13.0), http://docs-pdf.sun.com/816-2970-10/816-2970-10.pdf

Sun Fire 6800/4810/4800/3800 System Controller Command Reference Manual (Firmware Version 5.13.0), http://docs-pdf.sun.com/816-2971-10/816-2971-10.pdf

Troubleshooting System Controller Failover issues Technical Instruction < Solution: 206783 >


Audited/updated 01/07/10 - roberto.morandi@Sun.COM, Mid-Range Server Content Team


SunFire, controller, failover, SCC, setfailover, showfailover
Previously Published As
26286

Change History
Date: 2005-03-04
User Name: 71396
Action: Approved
Comment: Performed final review of article.

No changes required.

Publishing.
Version: 8
Date: 2005-03-01

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.