Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type FAB (standard) Sure

Solution 1000786.1 : Recommended PCI Card Slotting on V480/V490 to prevent NVRAM Corruption of FCAL HBAs.
PreviouslyPublishedAs 201063

Product
Sun Fire V490 Server
Sun Fire V480 Server

Bug Id <SUNBUG: 6460429>

Impact
FCAL NVRAM corruption can occur at power cycle time. When NVRAM is corrupted, even by a single bit, the driver does not trust any of its contents, including the World Wide Name (WWN), and falls back to default values.

If the leadville driver is in use, the impact is minimal because leadville generates a WWN for the card/port based on the hostid. If the QLogic native driver is in use, the WWN is zeroed. In that case, two affected systems on the same SAN will present the same WWN, and the two hosts cannot log in to the SAN because the fabric cannot tell which host should receive traffic addressed to that WWN.

The WWN is similar to a MAC address in network communications: it should be unique on a SAN.

Contributing Factors
This issue can occur on the following platforms:
The following X-Options are affected:
X6768 SG-XPCI2FC-QF2(Z) QLogic QLA2342-SUN PCI/PCI-X 2Gigabit/Sec PCI Dual FC Host Adapter
or
SG-XPCI1FC-QL2 QLogic QLA2340-CK PCI/PCI-X Amber2A* 2Gigabit/Sec PCI-X Single FC Host Adapter
* There is no Xoption # for Amber2A

The following parts are affected:
Notes:

1) The QLogic Native FCAL HBA is not sold or serviced by Sun. Sun Service does not stock this HBA, and there is no Sun FRU/CRU part number for it. The affected QLogic HBA models are QLA2340, QLA2342 and QLA2344.

2) The issue is directly related to PCI card slotting. If a 3.3V/5V PCI card is in a slot on the same PCI-Reset line as the slot holding the FCAL HBA, there is potential for this issue to occur. The V480 and V490 have 6 PCI slots:
- Slots 0 and 1 are on the same PCI-Reset line
- Slots 2 and 3 are on another PCI-Reset line
- Slots 4 and 5 are on the third PCI-Reset line

3) HBA onboard NVRAM corruption cannot be detected by visual inspection of the HBA. Explorer output contains the results of luxadm -e dump_map and prtpicl -v. Other methods of determining whether the system contains QLogic-based FCAL HBAs, and whether they are QLogic Native or Sun Branded, can be found in INTERNAL Infodoc 74531.

Symptoms

1) If the leadville driver is loaded, the error message is:
50 ohm is not set

2) With QLogic Native HBA:
If the QLogic native driver "qla" is 4.08, the last 6 hex digits of the WWN are set to 0, and the system boot messages show the HBA's WWN ending in zeros.
Here are 2 examples, HBA0 and HBA1, with corrupt NVRAM:

  qsxbat08 qla2300: [ID 364886 kern.info] qla2300-hba0-adapter-node-name="200000e08b000000";
  qsxbat08 qla2300: [ID 358785 kern.info] qla2300-hba0-adapter-port-name="210000e08b000000";
  qsxbat08 qla2300: [ID 332924 kern.info] qla2300-hba0-adapter-port-id="000000";
  qsxbat08 qla2300: [ID 364886 kern.info] qla2300-hba1-adapter-node-name="200000e08b000000";
  qsxbat08 qla2300: [ID 358785 kern.info] qla2300-hba1-adapter-port-name="210000e08b000000";
  qsxbat08 qla2300: [ID 332924 kern.info] qla2300-hba1-adapter-port-id="000000";

A non-corrupted NVRAM would appear as follows:

  qla2300: [ID 364886 kern.info] qla2300-hba0-adapter-node-name="200000e08b1e5e6c";

If the QLogic native driver "qla" is 4.18 or later, then as the driver attaches to the HBA, the system boot messages will include the string "NVRAM Corrupt" in addition to showing the WWN with the last 6 digits set to 0.

3) With Sun Branded HBA:
On a Sun Branded HBA, onboard NVRAM corruption shows up as a modified WWN in the output of OBP commands, luxadm, prtpicl, and system boot messages.

At the ok prompt, "cd" to the HBA node; a ".properties" command will then show that the rightmost part of the HBA's WWN has been changed to the system's HostID.

You can also use:

  luxadm -e dump_map
  prtpicl -v -c scsi-fp

prtdiag -v will identify the QLogic Native HBAs. prtdiag -v will list the Sun Branded HBAs as follows, for example:

  SUNW,qlc-pci1077,2312.1077.10a.2+ <-- x6768 Sun/Qlogic Crystal-2A
  SUNW,qlc-pci1077,2312.1077.149.2+ <-- Sun/Qlogic Amber2A

At system boot time, when the Sun "qlc" driver attaches to the HBA, it will report:
50 ohm is not set
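The boot-message symptoms described above for the QLogic native driver can be checked with a short script. The following is a minimal sketch in Python, not a Sun-supplied tool. It assumes the messages have the same form as the qla2300 examples quoted above, and the function names find_suspect_hbas and has_nvram_corrupt_message are hypothetical. Other driver versions may format these lines differently.

```python
import re

# Pattern modeled on the qla2300 boot messages quoted in this document.
# Assumption: node-name and port-name lines always carry a 16-hex-digit WWN.
WWN_LINE = re.compile(
    r'qla2300-hba(\d+)-adapter-(node|port)-name="([0-9a-fA-F]{16})"'
)


def find_suspect_hbas(lines):
    """Return sorted HBA instance numbers whose WWN ends in six zeros."""
    suspects = set()
    for line in lines:
        match = WWN_LINE.search(line)
        if match and match.group(3).endswith("000000"):
            suspects.add(int(match.group(1)))
    return sorted(suspects)


def has_nvram_corrupt_message(lines):
    """Return True if any line contains the string reported by qla 4.18 or later."""
    return any("NVRAM Corrupt" in line for line in lines)
```

Both functions accept any iterable of text lines, so an open file handle on a saved copy of the system boot messages can be passed in directly. A match is only a hint that the HBA should be examined with the commands listed above.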
Root Cause
FCAL NVRAM corruption is caused by interference with the correct operation of the PCI_RESET lines within the PCI bus. It can occur randomly on system power up and power down. PCI_RESET nets are shared between slots 0 and 1, slots 2 and 3, and slots 4 and 5. The HBA onboard NVRAM corruption occurs when PCI_RESET is deasserted during the power up or power down of a particular QLogic ASIC on the HBA, resulting in unpredictable accesses to the NVRAM. Other PCI cards configured on the PCI bus can cause the PCI_RESET line to be deasserted.

Workaround
Until a final resolution is available, the following is recommended:

If the affected HBA is in slot 0, 2 or 4, leave the other slot of the associated PCI_RESET slot pair empty (i.e., slot 1, 3 or 5, respectively).

Alternatively, configure the PCI card slotting so that both slots of a PCI_RESET slot pair are occupied by affected HBAs, in any combination of the affected Sun Branded FCAL HBAs and the QLogic Native FCAL HBA. For example, a QLogic Native FCAL HBA in PCI slot 0 and a Sun Branded Crystal-2A in PCI slot 1 prevents this issue from occurring.

Resolution
At this time there is no engineering fix for this issue.
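The slotting rule in the Workaround can be checked mechanically against a planned configuration. The following is a minimal sketch in Python, assuming the PCI_RESET pairing of slots 0/1, 2/3 and 4/5 given in this document. The function name risky_pairs and the labels "hba" (an affected FCAL HBA) and "other" (any other PCI card) are hypothetical and chosen only for illustration.

```python
# PCI_RESET slot pairs on the V480/V490, as listed in this document.
RESET_PAIRS = [(0, 1), (2, 3), (4, 5)]


def risky_pairs(slots):
    """Return the PCI_RESET pairs where an affected HBA shares a line with another card.

    slots maps a slot number (0-5) to "hba" for an affected FCAL HBA or
    "other" for any other PCI card. Empty slots are simply left out.
    """
    risky = []
    for first, second in RESET_PAIRS:
        kinds = {slots.get(first), slots.get(second)}
        # A pair is at risk only when it mixes an affected HBA with a non-HBA card.
        if "hba" in kinds and "other" in kinds:
            risky.append((first, second))
    return risky
```

For example, risky_pairs({0: "hba", 1: "other"}) reports the pair (0, 1), while a layout with HBAs in slots 0 and 1 and other cards in slots 2 and 3 reports nothing. An empty result means only that the layout follows the workaround described here.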
Previously Published As 102755

Internal Contributor/submitter Don.Palko@sun.com, Craig.Gates@sun.com, David.Ward@sun.com
Internal Eng Business Unit Group SSG WGS (Workgroup Systems)
Internal Eng Responsible Engineer alexander.rabinovich@sun.com
Internal Services Knowledge Engineer karen.edwards@sun.com
Internal Escalation ID 1-18894137
Internal Kasp FAB Legacy ID 102755

Internal Sun Alert & FAB Admin Info
Critical Category:
Significant Change Date: 2006-12-19
Avoidance: Reconfiguration
Responsible Manager: Steven Doherty
Original Admin Info:
[WF 19-Dec-2006, karened: releaseing after Ext review and update]
[WF 15-Dec-2006, karened: took me about a week to rewrite from 2 different submittals]

Product_uuid
5c71fc02-5e51-11d7-8add-8938754df22a|Sun Fire V490 Server
a2b9bc2b-52c6-45c2-a3e0-f19bd2c86953|Sun Fire V480 Server

Attachments
This solution has no attachment