Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Solution Type: Problem Resolution Sure

Solution 1002133.1: Sun Fire[TM] Midrange Servers: Fast Data Access MMU Miss when issuing probe-scsi-all or boot

PreviouslyPublishedAs 203034

Symptoms

In a specific configuration, a Sun Fire[TM] domain may encounter a "Fast Data Access MMU Miss" when the probe-scsi-all or boot command is issued at OBP (OpenBoot PROM).

This document was written for a specific customer configuration that showed this problem. The configuration included a Cauldron-S card (501-6635) located in slot 0 of an I/O Board. A D240 storage array, which contains the domain boot device, was attached to Cauldron-S scsi port 0. It is not known at this time whether other I/O cards or storage arrays are affected by this issue. In this specific case, both probe-scsi-all and boot failed for the same reason.

In some configurations, the device path that fails in probe-scsi-all may be different from the device path used for booting. In these configurations, boot should not fail with the same errors as probe-scsi-all. This document may still apply in such cases, where boot succeeds and only probe-scsi-all fails.

See the "Additional Information" section for more details regarding the failure and identifying if this document applies to your situation.

The problem appeared like this in the domain console when trying to boot:

    {10} ok boot /ssm@0,0/pci@18,700000/pci@1/scsi@2/disk@0,0:a
    TL = 1, TT = 68.
    TSTATE= 0x1402 asi = 0x0, pstate = 0x14, cwp = 0x2]
    TPC= 00000000f0035664 TNPC= 00000000f0035668
    SFSR= 0000000000808008, TAGACCESS = 00000000ffffe000
    D-SFAR = 00000000ffffffff
    TICK= 80000021216428ec, TICKCMP = ffffffffffffffff

You may see the full or partial error message "Fast Data Access MMU Miss" within the above output.

Partial output of this error appeared like this when issuing a probe-scsi-all:

    {10} ok probe-scsi-all
    /ssm@0,0/pci@18,700000/pci@1/scsi@2,1
    /ssm@0,0/pci@18,700000/pci@1/scsi@2
    TL = 1, TT = 68.
    ata Access MMU Miss
    TSTATE= 0x1404 r = 0x0, asi = 0x0, pstate = 0x14, cwp = 0x4]
    TPC= 00000000f0035664 TNPC= 00000000f0035668
    SFSR= 0000000000808008, TAGACCESS = 00000000ffffe000
    D-SFAR = 00000000ffffffff
    TICK= 800000214f848d51, TICKCMP = ffffffffffffffff

Resolution

At this time, the issue is known to have occurred at only one customer site. The configuration is specific: this document applies to failures of probe-scsi-all or the boot command when an I/O card is located in slot 0 of an I/O Board. As stated earlier, boot will fail for this reason only if the boot device is attached to that same I/O card in slot 0.

At this time, the only I/O adapter reported to have this issue is the Cauldron-S (501-6635), and the issue has only occurred when a storage array (D240) was attached to the card's scsi port 0 (top port; device path is "scsi@2" in probe-scsi-all). If you encounter a similar issue using a different I/O card, please contact Sun Support Services with the failure details; the CR and this SRDB will need to be updated with the new failure symptoms and information.

Below is the process which will confirm that this document and CR 6238924 apply to your failure situation. It shows how to reproduce the issue and also how to test the workarounds. If the issue cannot be reproduced, or the workarounds do not resolve your specific situation, then this document and CR may not apply to your situation. Please make sure that this process is executed in the domain console and that console logging is enabled, so that this information can be sent to Sun Support Services for documentation purposes.

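Because the process relies on a captured console log, it can help to check the log for the failure signature before sending it in. The following is a minimal sketch, not a Sun tool: the function name `find_mmu_miss` and the regular expressions are hypothetical and are based only on the console output quoted in this document. It assumes the OBP prompt looks like `{10} ok` and that each command appears on its own line in the log. Since the error text may be printed only partially (for example "ata Access MMU Miss"), the match is on the common tail of the message.

```python
import re

# Patterns derived from the console output quoted in this document.
# These are assumptions about the log format, not a documented OBP format.
CMD_RE = re.compile(r"^\{[0-9a-fA-F]+\}\s+ok\s+(probe-scsi-all|boot|reset-all)\b")
TRAP_RE = re.compile(r"TL\s*=\s*(\d+),\s*TT\s*=\s*([0-9A-Fa-f]+)")
MISS_RE = re.compile(r"Access MMU Miss")  # tail of the message; the start may be cut off


def find_mmu_miss(log_text):
    """Return one dict per trap dump found in a captured console log.

    Each dict records the last OBP command seen before the dump, the TT
    value as printed, and whether the (possibly partial) error message
    was seen on the trap line or on a later line of the same dump.
    """
    hits = []
    last_cmd = None
    for raw in log_text.splitlines():
        line = raw.strip()
        cmd = CMD_RE.match(line)
        if cmd:
            last_cmd = cmd.group(1)
            continue
        trap = TRAP_RE.search(line)
        if trap:
            hits.append({
                "command": last_cmd,
                "tt": trap.group(2),
                "message": MISS_RE.search(line) is not None,
            })
            continue
        if hits and MISS_RE.search(line):
            # Message printed on its own line after the "TL = ..., TT = ..." line.
            hits[-1]["message"] = True
    return hits
```

A dump that follows probe-scsi-all or boot and reports the TT value shown in the Symptoms section is a candidate for this document; the message flag is informational only, because the boot example quoted above shows the dump without the message text.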
Please validate that each troubleshooting step below is true for your environment. Each step will provide instructions via a link to the document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.

Process:

1) I/O card is located in I/O Board slot 0 (I/O Board number is not relevant). Card scsi port 0 is attached to a scsi device (assumes a Cauldron-S card with a D240 array attached).
   - It is not known whether the Cauldron-S is the only card susceptible to this.
2) Setkeyswitch on.
3) At OBP, set the diag-switch? variable to true.
   - This will show "probing for device" information as the domain exits POST
4) Issue probe-scsi-all (or boot if the device is the boot device).
   - If you get a "Fast Data Access MMU Miss" as shown in the "Problem
5) Setkeyswitch standby or off.
6) Setkeyswitch on.
7) Issue reset-all.
8) Issue probe-scsi-all (or boot if the device is the boot device).
   - If errors persist, your issue is different than the one described in this
9) Setkeyswitch standby/off.
10) Disconnect the cable from scsi interface 0 and attach it to scsi interface 1 on the I/O card (assumes Cauldron-S; SCSI 0 is top port, SCSI 1 is bottom port).
11) Setkeyswitch on.
12) Issue probe-scsi-all (or boot if the boot device, but note that the device path has changed).
   - If errors persist, your issue is different than the one described in this
13) Setkeyswitch standby/off.
14) Move the I/O card to an empty slot on the same I/O Board. Attach the array to either scsi port (or repeat this later to test each port).
15) Setkeyswitch on.
16) Issue probe-scsi-all (or boot if the boot device, but note that the device path has changed).
   - If errors persist, your issue is different than the one described in this

At this point, if you have validated that each troubleshooting step above is true for your environment, and the issue still exists, further troubleshooting is required. Contact Sun Support.

Relief/Workaround

After implementing these workarounds, the probe-scsi-all and boot commands will function normally. Remember that boot fails only when the boot device is on the same scsi device path that fails in probe-scsi-all.

The workarounds are:

1) Leave the I/O card in slot 0 on the I/O Board but attach the array to the other scsi interface on the card (assumes a Cauldron-S card, which has two interfaces). The array would then use scsi interface 1 on the card (device path "scsi@2,1"; the bottom port of the card).
2) Move the I/O card to a different slot on the I/O Board. The problem has only been reported when the card is in slot 0 and card port 0 is used. It has not been reported when the card is located in other I/O Board slots (regardless of the card port used).
   NOTE: Options 1 and 2 change the I/O device path for the array, thus you will
3) Leave the I/O card in slot 0 and the array attached to scsi port 0 (the "bad" port). Issue a reset-all before issuing the probe-scsi-all or boot command. The reset-all needs to be executed before each of these commands. See below for details:
   1) reset-all

See the "Additional Information" section for details on the workarounds.

Additional Information

If you have reproduced the problem and confirmed the workaround, then please follow the directions listed in the "Resolution" section of this document to log a Sun Support Services Request and have the CR re-opened for further investigation. Use one of the workarounds provided in the "Relief/Workaround" section of this document while a permanent resolution to the problem is being developed.

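Workaround 1 changes the device path for the array because the two Cauldron-S ports appear as different nodes ("scsi@2" for port 0, "scsi@2,1" for port 1). The following minimal sketch shows how a boot path might be rewritten after recabling. The helper name `move_to_port` is hypothetical, and the sketch assumes that only the scsi node changes and that the rest of the path (including the disk target) stays the same, which this document does not state. It does not cover workaround 2, because the path component that changes when the card moves to another slot is not given here.

```python
# Port-to-node mapping as stated in this document for the Cauldron-S card:
# SCSI port 0 (top) is "scsi@2", SCSI port 1 (bottom) is "scsi@2,1".
PORT_NODE = {0: "scsi@2", 1: "scsi@2,1"}


def move_to_port(device_path, port):
    """Return device_path with its Cauldron-S scsi node switched to `port`.

    Assumption: everything except the scsi node is unchanged by recabling.
    Raises ValueError if the port is unknown or no scsi node is found.
    """
    if port not in PORT_NODE:
        raise ValueError("unknown scsi port: %r" % (port,))
    parts = device_path.split("/")
    for i, part in enumerate(parts):
        # Compare whole path components so "scsi@2" never matches inside "scsi@2,1".
        if part in PORT_NODE.values():
            parts[i] = PORT_NODE[port]
            return "/".join(parts)
    raise ValueError("no scsi@2 or scsi@2,1 node in path: %r" % (device_path,))
```

For the boot path quoted in the Symptoms section, `move_to_port(path, 1)` would yield the same path with "scsi@2" replaced by "scsi@2,1". Confirm the actual path with probe-scsi-all (or show-disks) at OBP before relying on it for boot.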
Product
Sun Fire V1280 Server
Sun Fire E6900 Server
Sun Fire E4900 Server
Sun Fire E2900 Server
Sun Fire 6800 Server
Sun Fire 4810 Server
Sun Fire 4800 Server
Sun Fire 3800 Server
Sun StorageTek D240 Media Tray

Internal Comments
Feedback email alias: Feedback_MSG_Domain@sun.com
Link to Cr 6238924 Cr#<SUNBUG: 6238924>

Cauldron, probe-scsi-all, reset-all, boot, Fast Data Access MMU Miss, IO, slot, scsi, normalized

Change History

Date: 2009-11-25
User Name: Josh Freeman
Action: Refreshed
Comment: No changes - doc good as is. ESG Content Team update.

Date: 2008-11-19
User Name: T230884
Action: Quality Review

Date: 2007-03-12
User Name: 7058
Action: Approved
Comment: Keywords and disclaimer only. OK to republish.
Version: 9

Date: 2007-03-12
User Name: 67850
Action: Approved
Serengeti is a codename, not a product name in the tmark database. Please remove Serengeti. I think it's perfectly acceptable for Sun Fire to stand on its own.

Attachments
This solution has no attachment