Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Sun Alert Sure Solution

1000088.1: 500GB SATA Drives in Sun StorageTek 6140 and 6540 Arrays May Have the Incorrect Interposer Card Firmware

PreviouslyPublishedAs: 200104
Product: Sun StorageTek 6140 Array, Sun StorageTek 6540 Array
Bug Id: <SUNBUG: 6502044>
Date of Workaround Release: 11-DEC-2006
Date of Resolved Release: 16-JAN-2007

Impact
Systems might experience a high number of SATA drive failures, possibly resulting in data loss.

Contributing Factors
This issue can occur on the following platforms:
Note: This issue occurs only if the interposer firmware is not "LP1145". Drives that originally shipped in trays are not affected.

To identify possibly affected drives, check the firmware that Common Array Manager (CAM) reports for each drive. CAM shows the interposer firmware of an affected drive as LP1131B-K2A0AJ0A. Users can check this by reviewing the Array->Physical->Drives page in the Browser Interface, or by using the sscs(1M) command:

    sscs list -a array disk t51d10,t51d11

    Tray: 51 Disk: t51d10
    Capacity: 465.761 GB
    Type: SATA
    Speed (RPM): 7200
    Status: Optimal
    State: Enabled
    Role: Data
    Virtual Disk: 6
    Firmware: LP1131b-K2AOAJ0A <--- Affected
    Serial number: KRVN65ZAJGMAAF
    WWN: 20:00:00:A0:B8:25:CB:34

    Tray: 51 Disk: t51d11
    Capacity: 465.761 GB
    Type: SATA
    Speed (RPM): 7200
    Status: Optimal
    State: Enabled
    Role: Data
    Virtual Disk: 6
    Firmware: LP1145-K2AOAJ0A <--- Not Affected
    Serial number: KRVP65ZAJPT04F
    WWN: 20:00:00:A0:B8:25:9F:DE

The first string in the Firmware field, before the "-", is the interposer firmware. Drives with LP1131b for interposer firmware are affected.

Symptoms
Systems experience drive failures due to communication failures between the Interposer, Tray IO Module, and Array RAID Controller. Failures can occur during normal drive read or write IO, during drive replacement, or during hot sparing of Virtual Disk data to one of the allocated Global Hotspares in the array.

Workaround
Please see the Resolution below.

Resolution
This issue is addressed in the following releases:
Please read the patch README for instructions on how to upgrade the SATA Interposer firmware for your affected drives to version "LP1145". This firmware update is an offline process; IO must be quiesced before the drive firmware is updated. See the patch README for details.

Modification History
Date: 16-JAN-2007
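When many drives need checking, the Firmware field rule described under Contributing Factors (the string before the first "-" is the interposer firmware, and any level other than LP1145 is affected) could be applied to saved sscs output with a short script. The following is a minimal sketch, not a Sun-supplied tool. It assumes the output of "sscs list -a array disk ..." has been captured as text in the layout shown in this alert; the function names interposer_levels and affected_disks are hypothetical.

```python
import re

# Interposer firmware level that this alert treats as not affected.
FIXED_INTERPOSER = "LP1145"

# Matches "Disk: <name>" and "Firmware: <value>" labels, but skips the
# unrelated "Virtual Disk: <n>" field that appears in the same listing.
FIELD = re.compile(r"(?<!Virtual )\b(Disk|Firmware):\s*(\S+)")


def interposer_levels(sscs_output):
    """Map each disk name to the interposer part of its Firmware field."""
    levels = {}
    disk = None
    for label, value in FIELD.findall(sscs_output):
        if label == "Disk":
            disk = value
        elif disk is not None:
            # The string before the first "-" is the interposer firmware.
            levels[disk] = value.split("-", 1)[0]
    return levels


def affected_disks(sscs_output):
    """Return disks whose interposer firmware is anything but LP1145."""
    levels = interposer_levels(sscs_output)
    return sorted(
        disk for disk, level in levels.items()
        if level.upper() != FIXED_INTERPOSER
    )


if __name__ == "__main__":
    # Abbreviated copy of the listing quoted in this alert.
    sample = (
        "Tray: 51 Disk: t51d10 Virtual Disk: 6 Firmware: LP1131b-K2AOAJ0A\n"
        "Tray: 51 Disk: t51d11 Virtual Disk: 6 Firmware: LP1145-K2AOAJ0A\n"
    )
    print(affected_disks(sample))  # prints ['t51d10']
```

The comparison is case-insensitive because the alert itself writes the affected level both as LP1131B and as LP1131b. Any drive the script flags should still be confirmed in CAM before an outage is scheduled.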
References
<SUNPATCH: 124945-01>
<SUNPATCH: 124955-01>

Previously Published As: 102748

Internal Comments
The firmware upgrade will require an outage, as the drive update process is an offline process. The upgrade will also cause one or both of the Array RAID Controllers to panic, but will result in the appropriate firmware update. This expectation must be set by PTS/Backline or escalating Front Line engineers and Field Support.

Additional Symptoms:
The Major Event Log can show event type 100A (Drive CHECK CONDITION) with event specific codes b/88/1 and 4/88/0 for SATA drives.

Example:

    Date/Time: 11/28/06 1:02:54 PM
    Sequence number: 1538
    Event type: 100A
    Event category: Error
    Priority: Informational
    Description: Drive returned CHECK CONDITION
    Event specific codes: b/88/1
    Component type: Drive
    Component location: Tray 85, Slot 6
    Logged by: Controller in slot A
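Event type 100A entries that carry the event specific codes named in this alert could be picked out of a saved log with a short script. The sketch below is illustrative only. It assumes the Major Event Log has been exported as plain text with one "Field: value" pair per line and a blank line between entries, using the field names from the entry quoted in this alert; the export format and the function names parse_entries and check_condition_events are assumptions, not part of CAM.

```python
import re

# Event specific codes that this alert associates with the issue.
WATCHED_CODES = {"b/88/1", "4/88/0"}


def parse_entries(log_text):
    """Split a text export into entries, each a dict of field -> value."""
    entries = []
    for block in re.split(r"\n\s*\n", log_text.strip()):
        entry = {}
        for line in block.splitlines():
            # Split on the first ":" only, so time values stay intact.
            key, sep, value = line.partition(":")
            if sep:
                entry[key.strip()] = value.strip()
        if entry:
            entries.append(entry)
    return entries


def check_condition_events(log_text):
    """Return the 100A entries whose event specific codes are watched."""
    return [
        entry for entry in parse_entries(log_text)
        if entry.get("Event type") == "100A"
        and entry.get("Event specific codes") in WATCHED_CODES
    ]


if __name__ == "__main__":
    sample = """Date/Time: 11/28/06 1:02:54 PM
Sequence number: 1538
Event type: 100A
Event specific codes: b/88/1
Component type: Drive
Component location: Tray 85, Slot 6
"""
    for event in check_condition_events(sample):
        print(event["Component location"], event["Event specific codes"])
```

Grouping the matches by Component location gives a quick list of drive slots to compare against the interposer firmware reported for each drive.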
Internal Contributor/submitter: Curtis.Decotis@sun.com
Internal Eng Business Unit Group: NWS (Network Storage)
Internal Eng Responsible Engineer: Veerappan.Nachiappan@Sun.COM
Internal Services Knowledge Engineer: karen.edwards@sun.com
Internal Escalation ID: 1-20683373, 43535125, 43536769, 65255955
Internal Resolution Patches: 124945-01, 124955-01
Internal Sun Alert Kasp Legacy ID: 102748

Internal Sun Alert & FAB Admin Info
Critical Category: Data Loss, Availability ==> Pervasive
Significant Change Date: 2006-12-11, 2007-01-16
Avoidance: Patch
Responsible Manager: Veerappan.Nachiappan@Sun.COM
Original Admin Info:
[WF 16-Jan-2006, karened: i verified the CR was closed and updated before releasing as resolved]
[WF 08-Dec-2006, karened: created - was submitted as FAB]

Internal SA-FAB Eng Submission

-------- Original Message --------
Subject: Draft Sun Alert: 500GB SATA Drives in Sun StorageTek[TM] 6140 and 6540 Arrays,Can Have the Wrong Interposer Card Firmware
Date: Fri, 08 Dec 2006 12:52:47 -0500
From: Curtis DeCotis
To: sunalert-submit@sun.com, nws-review@sun.com, pts-storage-sunalerts@sun.com

I have spoken to Karen Edwards about this issue, and we agree that it is a Sun Alert.

Synopsis: 500GB SATA Drives in Sun StorageTek[TM] 6140 and 6540 Arrays Can Have the Wrong Interposer Card Firmware

Category: [X] Availability [ ] Diagnosis [ ] HA-Failure [X] Pervasive (reported by four or more external customers)
Product: StorageTek 6140 and StorageTek 6540
BugID: 6502044
Avoidance: [ ] Workaround [ ] Binary [ ] T-Patch [ ] Patch [ ] Upgrade [ ] FCO [ ] HW [ X] None (Preliminary)
State: [X ] Preliminary [ ] Workaround [ ] Resolved

1. Impact:
Customers can experience high number of SATA drive failures, possibly resulting in data loss.

2. Contributing Factors:
-Sun StorageTek 6140 Arrays with X-Option XTA-ST1CF-500G7K / 500GB 7200RPM SATA Drive
-Sun StorageTek 6540 Arrays with X-Option XTA-ST1CF-500G7K / 500GB 7200RPM SATA Drive
-X-Option XTA-ST1CF-500G7K supplied as Part 390-0247 / Hitachi HDS725050KLA360 without interposer firmware LP1145

3. Symptoms:
Customers will experience large drive failures due to communication failures between the Interposer, Tray IO Module, and Array RAID Controller. Failures include normal drive read or write IO, failures during drive replacement, or during hot sparing of Virtual Disk data to one of the allocated Global Hotspares in the array.

To identify possible drives, Common Array Manager(CAM) will show the interposer firmware as LP1131B-K2A0AJ0A. Users can check this by reviewing the Array->Physical->Drives page for the Browser Interface, and the following using sscs(1M) Command Line Interface:

    sscs list -a array disk t51d10,t51d11

    Tray: 51 Disk: t51d10
    Capacity: 465.761 GB
    Type: SATA
    Speed (RPM): 7200
    Status: Optimal
    State: Enabled
    Role: Data
    Virtual Disk: 6
    Firmware: LP1131b-K2AOAJ0A <---BAD!!!
    Serial number: KRVN65ZAJGMAAF
    WWN: 20:00:00:A0:B8:25:CB:34

    Tray: 51 Disk: t51d11
    Capacity: 465.761 GB
    Type: SATA
    Speed (RPM): 7200
    Status: Optimal
    State: Enabled
    Role: Data
    Virtual Disk: 6
    Firmware: LP1145-K2AOAJ0A <---GOOD!!
    Serial number: KRVP65ZAJPT04F
    WWN: 20:00:00:A0:B8:25:9F:DE

The first string in the Firmware field before the "-" is the interposer firmware. Drives with LP1131b for interposer firmware meet the criteria for this Sun Alert.

4. Relief/Workaround:
No Workaround is available at this time.

5. Resolution:
Should observe the above symptoms, please contact Sun Services for help in correcting this issue, until a final resolution can be created. A final Resolution is pending completion.

6. Internal Section:
Escalation IDs: 1-20683373, 43535125, 43536769, 65255955
Pending Patches:
Resolution Patches:
FIN:
FCO:
Submitter: Curtis.Decotis@sun.com
Responsible Engineer:
Responsible Manager: Veerappan.Nachiappan@Sun.COM

PTS/Engineering organization:
[ ] SSG WGS (Workgroup Systems)
[ ] SSG NSN (Netra Systems and Networking)
[ ] SSG ES (Enterprise Systems)
[ ] SSG SW (Platform Software)
[ ] SSG PNP (Processor)
[ ] NSG (Network Systems Group)
[X ] NWS (Network Storage)
[ ] OP/N1 RPE (Operating Platforms/N1 Revenue Product Engin.)
[ ] JPSE (Java Platform Sustaining Engineering)
[ ] JWSSE (Java Web Services Sustaining Engineering)
[ ] USG (User Software Group)
[ ] SSG HS (Horizontal Systems - T2000/Ontario)

Distribution: [ ] Public SunSolve [ X] Contract SunSolve

Comments:
The RSL's for these drive types have been purged. PTS/TSC Backline Escalations should be filed if the customer has the criteria and symptoms addressed above. PTS/Backline Engineers have access to a firmware download utility and interposer firmware for updating customer systems.

The firmware upgrade *will* require an outage, as the drive update process is an offline process. The upgrade will also cause one or both of the Array RAID Controllers to panic, but will result in the appropriate firmware update. This expectation must be set by PTS/Backline or escalating Front Line engineers and Field Support.
Additional Symptoms:
The Major Event Log can show event type 100A(Drive CHECK CONDITION) with event specific codes: b/88/1 and 4/88/0 for SATA drives:

Example:

    Date/Time: 11/28/06 1:02:54 PM
    Sequence number: 1538
    Event type: 100A
    Event category: Error
    Priority: Informational
    Description: Drive returned CHECK CONDITION
    Event specific codes: b/88/1
    Component type: Drive
    Component location: Tray 85, Slot 6
    Logged by: Controller in slot A

PTS Reviewer (approved by): James.Evans@sun.com

1) This process requires a complete outage of data to the array.
2) No activity can take place on the array.
3) Use the csmservice -s command as defined in the README. This can and will cause a controller panic, so you should choose "option 6" to upgrade, then verify the change to the drive firmware using "option 1".
4) Check status of both controllers in CAM. Online any failed controller using the Service Advisor in CAM.
5) Repeat for all drives with the AJOA firmware.
6) After ensuring that all drives have been updated, and controllers are online and optimal, reset each controller:
   -CAM->Array->Physical Devices->Controllers
   -Click "Reset Controller" button for controller A
   -Wait 2 minutes
   -Click "Reset Controller" button for controller B
7) Collect supportData via the Service Advisor, and forward it to Sun Services.

Product_uuid
8ac7dca5-a8bd-11da-85b4-080020a9ed93|Sun StorageTek 6140 Array
e35cfcfc-a31a-11da-85b4-080020a9ed93|Sun StorageTek 6540 Array

References
SUNPATCH:124945-01
SUNPATCH:124955-01

Attachments
This solution has no attachment