Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Solution Type: Technical Instruction Sure Solution

1002936.1: How to Check for Linux Platform Disk Errors and Online/Offline Status
PreviouslyPublishedAs 204039

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - Sun x86 Systems

Description

This document describes how to identify whether a disk in a Linux operating environment is online or offline, or has reported errors. This document does not detail how to recover data or replace storage devices, and it does not discuss service processor embedded Linux.

Symptoms:
- disk errors

Steps to Follow

NOTE: Always perform disk management commands as root or another UID 0 user.

IDENTIFYING DRIVE AVAILABILITY:

Execute the following command to identify the available disk configuration:

    # /bin/more /proc/scsi/scsi

Output will vary depending on platform model and configuration, but will be similar to the following:

    Attached devices:
    Host: scsi0 Channel: 00 Id: 00 Lun: 00
      Vendor: AMI      Model: Virtual CDROM    Rev: 1.00
      Type:   CD-ROM                           ANSI SCSI revision: 02
    Host: scsi1 Channel: 00 Id: 00 Lun: 00
      Vendor: AMI      Model: Virtual Floppy   Rev: 1.00
      Type:   Direct-Access                    ANSI SCSI revision: 02
    Host: scsi2 Channel: 00 Id: 02 Lun: 00
      Vendor: SEAGATE  Model: ST973401LSUN72G  Rev: 0556
      Type:   Direct-Access                    ANSI SCSI revision: 03
    Host: scsi2 Channel: 00 Id: 03 Lun: 00
      Vendor: SEAGATE  Model: ST973401LSUN72G  Rev: 0556
      Type:   Direct-Access                    ANSI SCSI revision: 03

The output above details all devices that are currently available to the platform for use.

Now execute the following command to identify the available partitions:

    # /sbin/fdisk -l

Output will vary depending on platform model and configuration, but will be similar to the following:

    Disk /dev/sdb: 73.4 GB, 73407865856 bytes
    255 heads, 63 sectors/track, 8924 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1   *           1         131     1052226   83  Linux
    /dev/sdb2             132        8402    66436807+  83  Linux
    /dev/sdb3            8403        8924     4192965   82  Linux swap

    Disk /dev/sdc: 73.4 GB, 73407865856 bytes
    255 heads, 63 sectors/track, 8924 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/sdc1   *           1        8924    71681998+  83  Linux

The above commands are dynamic and reflect the currently available disks. Devices that have failed in such a way that they are now offline do not appear in this output. If one or more expected disks are missing from the output, those disks are offline and unable to respond to the SCSI and fdisk probes.

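The check for a missing disk can be scripted. The following is a minimal sketch, not part of the original procedure: it assumes the /proc/scsi/scsi layout shown in the sample above (a "Host:" line followed by "Vendor:" and "Type:" lines), and the function names list_scsi_devices and scsi_device_present are hypothetical helpers introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original document).
# Both read /proc/scsi/scsi-style text on standard input.

# Print one line per attached device: host channel id lun type
list_scsi_devices() {
    awk '
        /^[ \t]*Host:/ { host = $2; chan = $4; id = $6; lun = $8 }
        /Type:/ {
            # The type value sits between "Type:" and "ANSI"
            type = $0
            sub(/^.*Type:[ \t]*/, "", type)
            sub(/[ \t]*ANSI.*$/, "", type)
            print host, chan, id, lun, type
        }
    '
}

# Succeed (exit 0) if the given device is listed.
# $1 = host (e.g. scsi2), $2 = channel, $3 = id, $4 = lun
scsi_device_present() {
    list_scsi_devices | grep -q "^$1 $2 $3 $4 "
}

# Example use on a live system (run as root):
#   scsi_device_present scsi2 00 03 00 < /proc/scsi/scsi \
#       || echo "scsi2 channel 00 id 03 lun 00 is not listed"
```

The values passed to scsi_device_present must match the zero-padded form printed in /proc/scsi/scsi, because the comparison is textual.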
IDENTIFYING ERRORS FROM MESSAGES:

Execute the following commands to extract disk-related errors from the system messages files:

    # /bin/grep SCSI /var/log/messages*
    # /bin/grep 'fs error' /var/log/messages*

Output will vary depending on platform model and configuration, but will be similar to the following:

    Dec 12 12:30:00 x4100c kernel: SCSI device sdb: 143374738 512-byte hdwr sectors (73408 MB)
    Dec 12 12:30:00 x4100c kernel: SCSI device sdb: drive cache: write through
    Dec 12 12:30:01 x4100c kernel: SCSI device sdc: 143374738 512-byte hdwr sectors (73408 MB)
    Dec 12 12:30:01 x4100c kernel: SCSI device sdc: drive cache: write through
    Dec 12 13:35:00 x4100c kernel: SCSI error : <2 0 3 0> return code = 0x10000
    Dec 12 13:35:00 x4100c kernel: EXT2-fs error (device sdc1): read_inode_bitmap: Cannot read inode bitmap - block_group = 422, inode_bitmap = 13828097

This output can be divided into two sections: expected output and exception output.

EXPECTED OUTPUT:

    Dec 12 12:30:00 x4100c kernel: SCSI device sdb: 143374738 512-byte hdwr sectors (73408 MB)
    Dec 12 12:30:00 x4100c kernel: SCSI device sdb: drive cache: write through
    Dec 12 12:30:01 x4100c kernel: SCSI device sdc: 143374738 512-byte hdwr sectors (73408 MB)
    Dec 12 12:30:01 x4100c kernel: SCSI device sdc: drive cache: write through

These messages are output by the platform's hardware discovery during boot. They are not errors and can be ignored as such, but they are useful because they identify the disks that were available at boot time.

EXCEPTION OUTPUT:

    Dec 12 13:35:00 x4100c kernel: SCSI error : <2 0 3 0> return code = 0x10000
    Dec 12 13:35:00 x4100c kernel: EXT2-fs error (device sdc1): read_inode_bitmap: Cannot read inode bitmap - block_group = 422, inode_bitmap = 13828097

These messages are errors. They are caused by a failing disk component or by a disk that has failed completely.

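Separating the exception lines from the expected boot-time lines can also be scripted. The sketch below is illustrative only: the function names disk_error_lines and scsi_error_targets are hypothetical, and it assumes the message formats shown in the sample above. It further assumes that the four numbers inside the angle brackets are host, channel, id and lun, which is consistent with the sample (host scsi2, id 03 is the disk seen as sdc) but should be confirmed for your kernel version.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original document).
# Both read syslog-style text on standard input.

# Keep only the exception lines: SCSI errors and filesystem errors.
disk_error_lines() {
    grep -E 'SCSI error|fs error'
}

# Print each distinct "<host channel id lun>" tuple that reported a
# SCSI error (tuple meaning is an assumption; see the text above).
scsi_error_targets() {
    sed -n 's/.*SCSI error[ ]*: <\([0-9 ]*\)>.*/\1/p' | sort -u
}

# Example use on a live system (run as root):
#   cat /var/log/messages* | disk_error_lines
#   cat /var/log/messages* | scsi_error_targets
```

A tuple printed by scsi_error_targets can then be compared with the Host, Channel, Id and Lun values in /proc/scsi/scsi to find the affected device.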
We use the keyword SCSI because all storage devices in a modern Linux platform, including IDE/PATA, FC-AL, SAS, SATA, SCSI, and USB, emulate SCSI in order to be represented as storage devices. Therefore, most error messages reported in the system messages file are prefixed with the word SCSI.

This document contains normalized content and is managed by the Domain Lead(s) of the respective domains. To notify content owners of a knowledge gap contained in this document, and/or prior to updating this document, please contact the domain engineers that are managing this document via the "Document Feedback" alias(es) listed below:

tsc-emea-x64@sun.com
Anthony McNamara, x64 Global Domain Lead

x64, normalized, linux, RAID

Previously Published As 91488

Change History
Date: 2010-06-07
User Name: brian.jackson@oracle.com
Action: Currency check
Comment: Please review this article and document any changes you made here...

Attachments
This solution has no attachment