Document Audience: | INTERNAL |
Document ID: | I0939-1 |
Title: | VxDMP I/O fail back during Microcode upgrade on SE9900 systems can result in data corruption. |
Copyright Notice: | Copyright © 2005 Sun Microsystems, Inc. All Rights Reserved |
Update Date: | 2005-04-14 |
---------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------
FIELD INFORMATION NOTICE
(For Authorized Distribution by SunService)
FIN #: I0939-1
Synopsis: VxDMP I/O fail back during Microcode upgrade on SE9900 systems can result in data corruption.
Create Date: Feb/28/03
SunAlert: No
Top FIN/FCO Report: Yes
Products Reference: Sun StorEdge 99x0 Arrays
Product Category: Storage / Service
Product Affected:
Systems Affected:
-----------------
Mkt_ID Platform Model Description Serial Number
------ -------- ----- ----------- -------------
- ANYSYS - System Platform Independent -
X-Options Affected:
-------------------
Mkt_ID Platform Model Description Serial Number
------ -------- ----- ----------- -------------
- SE9910 ALL Sun StorEdge 9910 Array -
- SE9960 ALL Sun StorEdge 9960 Array -
- SE9970 ALL Sun StorEdge 9970 Array -
- SE9980 ALL Sun StorEdge 9980 Array -
Parts Affected:
Part Number Description Model
----------- ----------- -----
- - -
References:
DOC: SE99xx Maintenance manual.
Veritas Volume Manager Administrators Guide.
URL: http://pts-americas.west/nws/products/T99x0/documentation.html,
http://www.veritas.com
Issue Description:
When performing an online Microcode upgrade of an SE9900 array using the
"Alternate SCSI Path" mode, if the host uses VxDMP for multipath
failover, it is possible to inadvertently shut down all paths from the
host to the array, causing an outage and possibly data corruption or
loss.
VxDMP by default only checks the health of an offlined path every 300
seconds. Therefore, if the support engineer relies on the auto
failover capabilities of VxDMP to offline and online multiple paths to
the host, it is possible for the Microcode upgrade process to:
1. Down the paths from cluster 1
2. Perform the necessary updates to cluster 1
3. Bring cluster 1 back up
4. Then, before VxDMP has recognized that cluster 1 is back online, down
cluster 2 (which VxDMP believes is the only good path).
This would cause the host to lose all access to the array. This can be
seen in the /var/adm/messages file of the host:
Sept 20th 19:40:43 HDLM: [ID 936769 kern.info] 1A down dmp path0 offline
Sept 20th 19:42:37 HDLM: [ID 936770 kern.info] 2A down dmp path1 offline
Sept 20th 19:44:37 HDLM: [ID 936769 kern.info] 1A up dmp path0 online
The VxDMP health check period for an offlined path (300 seconds by
default) is set in, and can be verified by inspecting, the following file:
/etc/rcS.d/S25vxvm-sysboot
# By default, the restore daemon will check the health of
# only disabled paths with a polling interval of 300sec.
restore_daemon_opts="interval=300 policy=check_disabled"
# Uncomment the following line to turn on checking for all
# the paths on the system with polling interval of 300sec.
# restore_daemon_opts="interval=300 policy=check_all"
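The setting above can be read back from the boot script without opening
it in an editor. The following is a minimal sketch only: the function
names show_restore_opts and restore_interval are hypothetical, and the
default path is the one quoted in this notice. It reads the file and
changes nothing.

```shell
#!/bin/sh
# Hypothetical helpers (not part of VxVM): report the restore daemon
# options that are currently active in the VxVM boot script.

# Print the uncommented restore_daemon_opts line(s) from the given file.
# With no argument, the path named in this notice is used.
show_restore_opts() {
    file="${1:-/etc/rcS.d/S25vxvm-sysboot}"
    # Lines starting with '#' are comments and are deliberately not matched.
    grep '^ *restore_daemon_opts=' "$file"
}

# Print only the polling interval, in seconds, from the active setting.
# If several active lines exist, the last one is reported.
restore_interval() {
    show_restore_opts "$1" |
        sed -n 's/.*interval=\([0-9][0-9]*\).*/\1/p' |
        tail -1
}
```

Running "restore_interval" with no argument on a host with the default
configuration shown above should report the polling interval. A path
that has come back may stay disabled in VxDMP for up to that many
seconds unless it is re-enabled by hand.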
Implementation:
---
| | MANDATORY (Fully Proactive)
---
---
| X | CONTROLLED PROACTIVE (per Sun Geo Plan)
---
---
| | REACTIVE (As Required)
---
Corrective Action:
The following recommendation is provided as a guideline for authorized
Sun Services Field Representatives who may encounter the above
mentioned problem.
The proper corrective action to avoid this type of outage is to ensure
availability of multiple paths prior to the Microcode upgrade and to
manually fail over paths throughout the upgrade process.
To verify availability of all paths prior to the upgrade, execute the
"vxdmpadm listctlr all" command from the host. The result should look
similar to this:
root[sh]@test# vxdmpadm listctlr all
CTLR-NAME ENCLR-TYPE STATE ENCLR-NAME
=====================================================
c0 Disk ENABLED Disk
c16 Disk ENABLED Disk
c17 Disk ENABLED Disk
This indicates that both paths (c16 and c17) are in the ENABLED state.
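The same check can be made less error-prone by filtering the listing for
any controller that is not ENABLED. This is a sketch only: the function
name list_not_enabled is hypothetical, and it assumes the four-column
layout shown above (controller name first, STATE third).

```shell
#!/bin/sh
# Hypothetical helper: reads "vxdmpadm listctlr all" output on stdin,
# prints "<controller> <state>" for every controller whose STATE column
# is not ENABLED, and returns non-zero if any such controller is found.
list_not_enabled() {
    awk '
        $1 == "CTLR-NAME" || /^=/ { next }        # skip header and rule
        NF >= 3 && $3 != "ENABLED" { print $1, $3; bad = 1 }
        END { exit bad + 0 }
    '
}
```

A typical use would be "vxdmpadm listctlr all | list_not_enabled". No
output and a zero exit status mean that every listed controller is
ENABLED.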
In addition, you can verify multiple paths to a particular disk by
executing "vxdisk list" followed by the device name:
root[sh]@test# vxdisk list c16t1d30s2
Device: c16t1d30s2
devicetag: c16t1d30
type: sliced
hostid: test
disk: name=test2dg31 id=1037654212.4042.test
group: name=test2dg id=1037658550.4056.test
flags: online ready private autoconfig autoimport imported
pubpaths: block=/dev/vx/dmp/c16t1d30s4 char=/dev/vx/rdmp/c16t1d30s4
privpaths: block=/dev/vx/dmp/c16t1d30s3 char=/dev/vx/rdmp/c16t1d30s3
version: 2.2
iosize: min=512 (bytes) max=2048 (blocks)
public: slice=4 offset=0 len=4397056
private: slice=3 offset=1 len=2047
update: time=1041990512 seqno=0.8
headers: 0 248
configs: count=1 len=1486
logs: count=1 len=225
Defined regions:
config priv 000017-000247[000231]: copy=01 offset=000000 disabled
config priv 000249-001503[001255]: copy=01 offset=000231 disabled
log priv 001504-001728[000225]: copy=01 offset=000000 disabled
Multipathing information:
numpaths: 2
c16t1d30s2 state=enabled
c17t1d30s2 state=enabled
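For hosts with many LUNs, the "Multipathing information" section can be
counted rather than read by eye. This is a sketch only: the function
name enabled_path_count is hypothetical, and it assumes the output
layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads "vxdisk list <device>" output on stdin and
# prints how many paths in the "Multipathing information" section are
# reported as state=enabled.
enabled_path_count() {
    awk '
        /Multipathing information:/ { in_mp = 1; next }  # section start
        in_mp && /state=enabled/    { n++ }              # count enabled
        END { print n + 0 }                              # 0 if none seen
    '
}
```

For example, "vxdisk list c16t1d30s2 | enabled_path_count" should print
a value of at least 2 before the upgrade begins. A lower value means the
disk has no alternate path to fail over to.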
When performing the online Microcode upgrade using the "Alternate SCSI
Path" method, you will be prompted to discontinue I/O through a
particular path with the message:
"Switch the SCSI channel path that is connected to the cluster-1, to
an alternate path. Then select OK."
Before clicking "OK" you must disable the specified path from the host
by entering "vxdmpadm disable ctlr=" followed by the controller name.
Verify that the path is down by once again running "vxdmpadm listctlr
all", which should produce output similar to:
root[sh]@test# vxdmpadm listctlr all
CTLR-NAME ENCLR-TYPE STATE ENCLR-NAME
=====================================================
c0 Disk ENABLED Disk
c16 Disk DISABLED Disk
c17 Disk ENABLED Disk
*NOTE: Make sure you bring down the correct path, or when you click "OK"
on the SVP you will bring down the host!
The Microcode installation process will then ask you to stop I/O on the
next path (cluster-2). Before continuing with this process you must
bring the first path back online by running "vxdmpadm enable ctlr="
followed by the controller name, then run "vxdmpadm listctlr all" and
ensure that the controller once again has the state of "ENABLED". You
can then proceed to disable the second controller using the same method
("vxdmpadm disable ctlr=" followed by the controller name) and verify
it is disabled before clicking "OK" on the SVP. After the microcode
upgrade on cluster-2 has completed, you must re-enable that path with
the vxdmpadm command as shown above.
Make sure to verify that when you have completed the Microcode upgrade,
all paths are once again enabled and accessible.
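The disable/verify and enable/verify steps described above can be
wrapped so that each state change is confirmed before you return to the
SVP. This is a sketch only: the function names and the VXDMPADM
variable are hypothetical, and only the vxdmpadm subcommands named in
this notice are used. Which controller corresponds to which cluster is
site-specific and must be confirmed on the host.

```shell
#!/bin/sh
# Hypothetical wrappers around the vxdmpadm commands named in this notice.
# VXDMPADM may be overridden, for example to point at a full path.
VXDMPADM="${VXDMPADM:-vxdmpadm}"

# Print the STATE column for controller $1 from "vxdmpadm listctlr all".
ctlr_state() {
    "$VXDMPADM" listctlr all | awk -v c="$1" '$1 == c { print $3 }'
}

# Disable controller $1 and succeed only if it then reports DISABLED.
disable_and_verify() {
    "$VXDMPADM" disable ctlr="$1" || return 1
    [ "$(ctlr_state "$1")" = "DISABLED" ]
}

# Enable controller $1 and succeed only if it then reports ENABLED.
enable_and_verify() {
    "$VXDMPADM" enable ctlr="$1" || return 1
    [ "$(ctlr_state "$1")" = "ENABLED" ]
}
```

Using the controller names from the example output (c16 and c17), one
possible order is: "disable_and_verify c16" before clicking "OK" for
cluster-1, "enable_and_verify c16" followed by "disable_and_verify c17"
before clicking "OK" for cluster-2, and "enable_and_verify c17" once the
upgrade has completed. If any step returns non-zero, do not click "OK"
on the SVP.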
For more information regarding the Microcode upgrade process, see
the Micrc-FC section of the relevant maintenance manual.
For more information regarding Veritas VxDMP commands, see the
Veritas Volume Manager Administrator's Guide, section 3, page 95.
Comments:
None.
============================================================================
Implementation Footnote:
i) In case of MANDATORY FINs, Sun Services will attempt to contact
all affected customers to recommend implementation of the FIN.
ii) For CONTROLLED PROACTIVE FINs, Sun Services mission critical
support teams will recommend implementation of the FIN (to their
respective accounts), at the convenience of the customer.
iii) For REACTIVE FINs, Sun Services will implement the FIN as the
need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network
browser as follows:
SunWeb Access:
--------------
* Access the top level URL of http://sdpsweb.central/FIN_FCO/
* From there, select the appropriate link to query or browse the FIN and
FCO Homepage collections.
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Central/
* From there, select the appropriate link to browse the FIN or FCO index.
Internet Access:
----------------
* Access the top level URL of https://spe.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@Sun.COM
--------------------------------------------------------------------------