Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution
1018867.1: Sun Fire[TM] 12K/15K/E20K/E25K: Mailbox Framework Failures, Even Without DR Operation
Previously Published As: 230667
Applies to:
Sun Fire 12K Server
Sun Fire 15K Server
Sun Fire E20K Server - Version: Not Applicable and later [Release: N/A and later]
Sun Fire E25K Server - Version: Not Applicable and later [Release: N/A and later]
All Platforms

Symptoms
Sun Fire[TM] 12K/15K domains can experience the following error messages, even if there are no DR operations in progress:

drmach: [ID 757311 kern.warning] WARNING: mboxsc_putmsg failed: 0x91

Cause
See Solution.

Solution
Resolution Steps
- Check if any "cfgadm based" programs running on the Sun Fire 12K/15K domain(s) are reporting the warnings. Examples :
- Check for signs of lock contention reported on the System Controllers. The locking issues can be identified in the platform messages files /var/opt/SUNWSMS/SMS/adm/platform/messages*:
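
To see whether the domain-side warning is occasional or persistent, it can help to count how often it appears per hour. The following is a minimal sketch, not part of the original solution: it matches only the `mboxsc_putmsg failed` text quoted under Symptoms, assumes a syslog-style timestamp prefix (`Mon DD HH:MM:SS`), and the function name `count_mailbox_warnings` is hypothetical. It does not attempt to match the lock contention messages on the System Controllers, whose exact wording is not given here.

```python
import re
from collections import Counter

# Warning text as quoted under Symptoms; the error code is captured.
WARNING_RE = re.compile(r"mboxsc_putmsg failed: (0x[0-9a-fA-F]+)")
# Assumption: lines start with a syslog-style stamp such as "Jun  1 10:15:32".
# Group 1 keeps month, day and hour so warnings can be bucketed per hour.
STAMP_RE = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{2}):\d{2}:\d{2}")


def count_mailbox_warnings(lines):
    """Return (count per error code, count per hour) for mboxsc_putmsg warnings."""
    by_code = Counter()
    by_hour = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match is None:
            continue  # not a mailbox warning
        by_code[match.group(1).lower()] += 1
        stamp = STAMP_RE.match(line)
        by_hour[stamp.group(1) if stamp else "unknown"] += 1
    return by_code, by_hour
```

One possible use is to feed it the lines of a domain's messages file (for example `/var/adm/messages` on Solaris, an assumption about where the warning is logged) and compare the busy hours with the timestamps of any lock contention messages found on the System Controllers.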
Any SMS version can be affected. It is strongly recommended that you upgrade to SMS 1.6 and apply all available patches for the SMS version you are currently running.

Relief/Workaround
A short-term workaround may be to fail over the SCs. The following steps are needed to perform the failover safely and avoid an uncontrolled state (no SC available).
You can rerun the same procedure to fail back to the former main SC.

Additional Information
Even when there is no DR operation in progress, cfgadm or libcfgadm can be in use. Using cfgadm or libcfgadm implies communication with the System Controller via mailboxes, as well as requests to hwad, which manages the locks that protect hardware accesses. If, for any reason, hwad experiences lock contention (under certain loads, it can take a long time to access shared hardware resources: DARBs, ...), there can be delays in responding to requests, including requests from cfgadm. Sun Fire 12K/15K domains that are running programs based on 'cfgadm' or 'libcfgadm' display these warnings because such requests are made on a regular basis. The lock contention is not necessarily a permanent situation; it can be a consequence of hwad being very busy at that moment.

@Internal Comments
For Internal Use Only
References and bug IDs:
Escalations: 549864, 543000
Internal Bug Reports: 4931012 - memory allocation failure from hwad
Internal Bug Reports: 4966974 - WARNING: Mailbox framework failure:outgoing Failure 0x91 "ERESTART" stands for restartable system call.
Note that in Solaris <sys/errno.h>, ERESTART is decimal 91; read as hexadecimal, as the 0x prefix suggests, 0x91 is decimal 145, which is ETIMEDOUT.
This subject has been discussed on the starcat-support alias and logged to the Techmail Archives.
Note: SRDB 73129 was combined with this document.
stephane.dutilleul@sun.com

Product
Sun Fire 12K Server
Sun Fire 15K Server
Sun Fire E20K Server
Sun Fire E25K Server

Keywords
dr, mailbox, framework, failure, cfgadm, cst, lock, outgoing, 0x91, mboxsc, hwad, 15K, 12K, SF15K, SF12K, Starcat, E25k, E20K

@Previously Published As 73116

@Change History
Date: 2010-05-04
User Name: Cootware
Action: Content Team Review
Comment: Updated products list, content, and keywords.

Attachments
This solution has no attachment