Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1008676.1
Update Date:2011-02-23
Keywords:

Solution Type  Technical Instruction Sure

Solution  1008676.1 :   'Best Practices' and configuring loghost on Sun Fire[TM] 3800,4800,4900,6800, and E6900 servers [Video]  


Related Items
  • Sun Fire E6900 Server
  • Sun Fire 6800 Server
  • Sun Fire 3800 Server
  • Sun Fire E4900 Server
  • Sun Fire 4800 Server
  • Sun Fire 4810 Server
Related Categories
  • GCS>Sun Microsystems>Servers>Midrange Servers
  • GCS>Support>KM>Content>Video

PreviouslyPublishedAs
211896


Applies to:

Sun Fire 3800 Server
Sun Fire 4800 Server
Sun Fire 4810 Server
Sun Fire 6800 Server
Sun Fire E4900 Server
All Platforms

Goal

This document gives 'Best Practices' and step-by-step instructions on how to configure a loghost on the Sun Fire[TM] 3800-6800 and E4900/E6900 servers.

A video tutorial is available for this topic. These brief how-to video tutorials provide step-by-step instructions answering Sun's most frequently asked questions. View the video answer and/or follow the detailed instructions below.

Video - How to Archive SC Logs (05:30)

Solution

It is considered a 'Best Practice' to configure a 'loghost' for each server and production domain.
A 'loghost' on a Solaris platform can permanently save messages that are logged in the System Controller's NVRAM buffer. This ensures that they are not lost, either through a power event or by rolling off the small first-in, first-out buffer in the system controller. Properly stored, they can be accessed quickly if a domain outage occurs, even if the server controlling the domain is unresponsive. When sent to Sun engineers, these files can speed up troubleshooting and help resolve problems quickly and accurately.

Configuration Best Practices


1) Log to an independent Solaris platform, not to the domain(s) of the server being monitored. If the domain that collects the loghost data is the one that goes down, two problems result. First, the data needed for troubleshooting becomes inaccessible. Second, as soon as Solaris goes down on the logging domain, all further messages from the system controller are lost. Failure to follow this practice increases the time it takes to troubleshoot severe problems on the server and could drastically increase downtime.

2) The loghost can be any type of Solaris machine on the same subnet as the system controllers. If the machine can ping the system controllers, it can generally be used.

3) Capture both platform and domain logs on the loghost. Platform logs give a snapshot of what is happening globally in the server, and domain logs give a much more detailed report of the errors occurring on a failing domain.


Getting Started:
The first step in creating a loghost is to configure the loghost setting on the System Controller (SC), specifying the IP address of the admin workstation. Then configure the syslog.conf file on the admin workstation itself.

1. On the System Controller (SC) of the Sun Fire platform, do the following:

First, log on to the SC, either by telnet, ssh or through the serial port, then
issue the following command:

      r12-1a:SC> setupplatform -p loghost
      Loghosts
      --------
      Loghost []: 172.16.40.10
      Log Facility [local0]: local0

Note: You must not use a domain on the platform itself as the loghost.

The format is slightly different in firmware below 5.12.5; on such systems an OBP upgrade is long overdue.

<Document: 1003034.1> has links to the latest OBP patches.

This setting sends platform messages with facility local0 to the admin
workstation at IP address 172.16.40.10. The local0 facility is used to
differentiate messages coming from the platform.

Log Facility defaults to local0. This setting allows syslogd on the admin
workstation to determine which file to log each message to.

For domains, set up the loghost the same way, but use the 'setupdomain' command.
For instance, to configure the loghost and log facility for domains A and B:

      r12-1a:A> setupdomain -p loghost
      Loghosts
      --------
      Loghost []: 172.16.40.10
      Log Facility [local0]: local1
      r12-1a:B> setupdomain -p loghost
      Loghosts
      --------
      Loghost []: 172.16.40.10
      Log Facility [local0]: local2

NOTE: local8 and higher are not allowed and won't work - there are 8 local
facilities, local0-local7. See the syslog.conf(4) man page for more information.
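
When facility assignments are scripted or recorded in a checklist, the rule above can be expressed as a small shell check. This is a minimal sketch, not part of the SC firmware or Solaris: the function name is_valid_local_facility is hypothetical, and it only validates the name locally without contacting the System Controller.

```shell
# Hypothetical helper (not a Solaris or SC command): succeeds only for the
# eight local facilities, local0 through local7.
is_valid_local_facility() {
    case "$1" in
        local[0-7]) return 0 ;;   # exactly "local" plus one digit 0-7
        *)          return 1 ;;   # local8, local10, LOCAL0, empty, etc.
    esac
}

# Example: check the facilities planned for the platform and two domains,
# plus one deliberately invalid name.
for f in local0 local1 local2 local8; do
    if is_valid_local_facility "$f"; then
        echo "$f: ok"
    else
        echo "$f: not a valid local facility"
    fi
done
```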

It's also possible to leave the Log Facility set to its default of "local0",
but if this is done it will be more difficult to separate messages coming from the platform and the domains into different messages files on the admin workstation.


2. On the admin workstation, do the following:

Now, configure the syslog.conf file on the admin workstation, to place the
messages into specific messages files. To do this, create the following message
files, and simply add the following lines in the file /etc/syslog.conf:

      % ssh -l root 172.16.40.10
      # touch /var/adm/messages.platform
      # touch /var/adm/messages.domainA
      # touch /var/adm/messages.domainB
      # vi /etc/syslog.conf
      [...]
      #
      # non-loghost machines will use the following lines to cause "user"
      # log messages to be logged locally.
      #
      ifdef(`LOGHOST', ,
      user.err	/dev/sysmsg
      user.err	/var/adm/messages
      user.alert	`root, operator'
      user.emerg	*
      )
      local0.notice	/var/adm/messages.platform   <= ADD THIS LINE
      local1.notice	/var/adm/messages.domainA    <= ADD THIS LINE
      local2.notice	/var/adm/messages.domainB    <= ADD THIS LINE

NOTE: You must use TABs between the two fields on each line; spaces will not work.
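
Because an entry typed with spaces is easy to miss by eye, it may help to generate the new lines with an explicit TAB and then scan the file for entries that lack one. The following is a sketch only, assuming a POSIX shell (for example ksh or /usr/xpg4/bin/sh on Solaris). The function names syslog_entry and check_syslog_tabs are hypothetical helpers, not Solaris commands, and the check is a heuristic that only looks at local0-local7 entries.

```shell
# Hypothetical helper: print one syslog.conf entry with a literal TAB
# between the selector (facility.notice) and the action (log file).
syslog_entry() {
    printf '%s.notice\t%s\n' "$1" "$2"
}

# Hypothetical helper: report localN entries in file $1 that contain no TAB.
# Returns 0 when every such entry has a TAB, 1 when at least one does not,
# and 2 when the file cannot be read.
check_syslog_tabs() {
    [ -r "$1" ] || return 2
    tab=$(printf '\t')
    # Keep only local0-local7 entries, then keep those with no TAB at all.
    bad=$(grep '^local[0-7]\.' "$1" | grep -v "$tab" || true)
    if [ -n "$bad" ]; then
        printf 'Entries without a TAB separator:\n%s\n' "$bad"
        return 1
    fi
    return 0
}
```

For example, `syslog_entry local0 /var/adm/messages.platform` prints the platform entry ready to be added to /etc/syslog.conf, and `check_syslog_tabs /etc/syslog.conf` flags any local0-local7 entry that was typed with spaces only.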

Then restart the syslog daemon:

      # /etc/init.d/syslog stop
      # /etc/init.d/syslog start

or just force it to re-read the configuration file:

      # pkill -HUP syslogd
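
Before testing from the System Controller, it can be useful to confirm that syslogd on the admin workstation routes each facility to the intended file. One way is to send a local test message per facility with the standard logger(1) command. This is a sketch under assumptions: it presumes a POSIX shell (for example ksh or /usr/xpg4/bin/sh on Solaris) and the facility-to-file mapping from the syslog.conf example above. The function name send_test_messages and the LOGGER override variable are hypothetical, introduced here only to allow a dry run.

```shell
# Hypothetical helper: send one notice-level test message per facility.
# Each pair is facility:suffix, where the suffix matches the file
# /var/adm/messages.<suffix> from the syslog.conf example.
send_test_messages() {
    for pair in local0:platform local1:domainA local2:domainB; do
        facility=${pair%%:*}   # text before the colon
        name=${pair#*:}        # text after the colon
        # LOGGER is an assumption for dry runs; it defaults to logger(1).
        "${LOGGER:-logger}" -p "$facility.notice" "loghost test for $name"
    done
}

# Dry run: print the logger arguments instead of sending anything to syslogd.
LOGGER=echo send_test_messages
```

Run on the admin workstation without LOGGER set, the function sends the messages for real; each file can then be inspected, for example with `tail -1 /var/adm/messages.platform`. This exercises only the local syslogd configuration. It does not prove that messages from the SC reach the workstation, which the tests in step 3 cover.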

3. To test that your loghost configurations are working correctly, do the
following:

To test the platform shell loghost file, run "setfailover off" followed by
"setfailover on" on the main SC. You should see the messages appear in the
/var/adm/messages.platform file.

To test the domain shell loghost files, there are three methods. The first method
requires an outage of the domain whereas the second does not. The third one
is very simple and non-intrusive.

Method 1

From the domain shell, perform a "setkeyswitch on". Note, the keyswitch must be
initially in the off position. If the keyswitch is initially in the on position,
you will need to perform a "setkeyswitch off" followed by a "setkeyswitch on".
If the domain loghost setup is correct, you should see the output from LPOST
appear in the domain shell's loghost file.

Method 2

If an outage is not acceptable, the following procedure may be performed. This
procedure relies on a spare System Board being available, or a Dynamic
Reconfiguration (DR) operation.

Spare System Board available

If there is a spare System Board available perform the following:

1) First, add the spare System Board to the domain for which you wish to test
the loghost setup, at the platform shell.

2) Power off the System Board if it is powered on. If it is powered off already,
ignore this step.

3) Power on the System Board

4) Perform a "testboard SB#" in the domain shell you are testing, where # is the
System Board number.

The testboard command causes an LPOST to be run on the specified System
Board only, and causes output to be sent to the domain loghost (if
configured correctly), as well as to the console.

5) You can then re-assign the System Board to another domain for testing, and
follow the steps above, for each domain that is to be tested.

No spare System Boards, DR can be performed

If there are no spare System Boards, and a DR operation can be performed on a
System Board in one of the domains, perform the following:

1) DR a System Board out from a domain on the system, via the cfgadm command
from Solaris.

2) Ensure the System Board is assigned to the domain you wish to test. You may
need to perform a deleteboard/addboard operation at the platform level.

3) Power off the System Board if it is powered on. If it is powered off already,
ignore this step.

4) Power on the System Board

5) Perform a "testboard SB#" in the domain shell you are testing, where # is the
System Board number.

The testboard command causes an LPOST to be run on the specified System
Board only, and causes output to be sent to the domain loghost (if
configured correctly), as well as to the console.

6) You can then re-assign the System Board to another domain for testing, and
follow the steps above for each domain that is to be tested.

7) Once finished, assign the System Board to the original domain, and DR the
board back into the domain.

Method 3

If an outage is not acceptable and you do not have a spare board or do not want
to use DR, then do the following:

1) Look at the current keyswitch setting in the domain shell

2) If it is 'on' then perform "setkeyswitch secure" and then "setkeyswitch on".
If it is 'secure' then perform "setkeyswitch on" and then "setkeyswitch secure".

The setkeyswitch command causes the keyswitch to transition between the 'on' and
'secure' positions and causes output to be sent to the domain loghost
(if configured correctly), as well as to the domain buffer (which you can check
with the "showlogs" command).



Product
Sun Fire 6800 Server
Sun Fire 4810 Server
Sun Fire 4800 Server
Sun Fire 3800 Server
Sun Fire E6900 Server
Sun Fire E4900 Server

Internal Comments
Audited/updated 11/10/09 - roberto.morandi@Sun.COM, Mid-Range Server Content Team


The pre-5.12.5 loghost setup looked as shown below. Note that running firmware lower than 5.12.5 means no patching since January 2002 and is thus highly unlikely.

     For firmware 5.12.4 and lower:

     r12-1a:SC> setupplatform -p loghost

     Loghosts
     --------
     Loghost []: 172.16.40.10:local0

loghost, system controller, test, configuration, sunfire
Previously Published As
27553

Change History
Date: 2006-08-22
User Name: 97961
Action: Update Canceled
Comment: *** Restored Published Content *** - Audience changed to "Contract" per FvF http://kmo.central/howto/FvF.html
Also refer to: http://kmo.central/howto/content/voyager-contributor-standards.html
Version: 0
Date: 2006-08-22
User Name: 97961
Action: Update Started
Comment: Need to change audience to 'Contract' to abide by FvF policy.
Version: 0

Date: 2006-08-15
User Name: 97961
Action: Approved
Comment: Publishing. No further edits required.
Version: 19

Date: 2006-08-15
User Name: 97961
Action: Accept
Comment:
Version: 0

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.