Topics:
Identifying I/O Failure
Disk Failure Types
Resolving Permanent Disk Failure
Resolving Temporary Disk Failure
Resolving Intermittent Disk Failure
Disk Failure Handling
Follow the path:
The OS detects an error and informs vxconfigd. Is the volume redundant? If no, go to 1. If yes, go to 2.
1) Display error messages, detach the disk from the disk group, and change the volume's kernel state.
2) Is the private region accessible? If no, go to 3. If yes, go to 4.
3) Mark the disk as FAILED, detach the disk, mark the affected plex with NODEVICE, and relocate redundant volumes.
4) Mark the disk as FAILING, mark the affected plex with IOFAIL, and relocate subdisks.
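The path above can be sketched as a small shell function. This only models the decision flow described in these notes; it does not call VxVM, and the function name and its yes/no arguments are hypothetical.

```shell
#!/bin/sh
# Sketch of the failure-handling decision path (illustrative only).
# Usage: failure_action <volume_redundant: yes|no> <private_region_accessible: yes|no>
failure_action() {
    redundant=$1
    private_ok=$2
    if [ "$redundant" = "no" ]; then
        # Path 1: nonredundant volume
        echo "display errors; detach disk from disk group; change volume kernel state"
    elif [ "$private_ok" = "no" ]; then
        # Path 3: redundant volume, private region not accessible
        echo "mark disk FAILED; detach disk; mark plex NODEVICE; relocate redundant volumes"
    else
        # Path 4: redundant volume, private region still accessible
        echo "mark disk FAILING; mark plex IOFAIL; relocate subdisks"
    fi
}
```

For example, `failure_action yes no` prints the actions of path 3.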
Permanent Disk Failure: Volume States After the Failure
"vxprint -htg diskgroup" will list the device as NODEVICE.
Permanent Disk Failure: Resolving
1) Replace the disk
2) Have VxVM scan the devices: vxdctl enable
3) Initialize the new drive: vxdisksetup -i device_name
4) Attach the disk media name (datadg02) to the new drive:
vxdg -g diskgroup -k adddisk datadg02=device_name
5) Recover the redundant volumes: vxrecover
6) Start any nonredundant volumes: vxvol -g diskgroup -f start volume_name
7) Restore data of any nonredundant volumes from backup.
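Steps 2 through 6 can be collected into a dry-run helper that prints the commands in order so they can be reviewed before anything is run. This is a minimal sketch: the function name is hypothetical, and the disk group, device, and volume names used in the example call are placeholders, not values from a real system.

```shell
#!/bin/sh
# Dry run: prints the commands for steps 2-6 instead of executing them.
# Usage: permanent_recovery_plan <diskgroup> <disk_media_name> <device_name> [nonredundant_volume ...]
permanent_recovery_plan() {
    [ $# -ge 3 ] || { echo "usage: permanent_recovery_plan dg dm device [volume ...]" >&2; return 1; }
    dg=$1; dm=$2; dev=$3
    shift 3
    echo "vxdctl enable"                     # step 2: rescan devices
    echo "vxdisksetup -i $dev"               # step 3: initialize the new drive
    echo "vxdg -g $dg -k adddisk $dm=$dev"   # step 4: attach the disk media name
    echo "vxrecover"                         # step 5: recover redundant volumes
    for vol in "$@"; do
        echo "vxvol -g $dg -f start $vol"    # step 6: start each nonredundant volume
    done
}
```

A call such as `permanent_recovery_plan datadg datadg02 c1t2d0 datavol` prints one command per line. Steps 1 and 7 (replacing the hardware and restoring from backup) remain manual.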
Temporary Disk Failure: Volume States After Reattaching Disk
"vxprint -htg diskgroup" will list the device as DISABLED IOFAIL.
Temporary Disk Failure: Resolving
1) Fix the failure
2) Ensure that the OS recognizes the device
3) Force VxVM to reread all drives: vxdctl enable
4) Reattach the disk media name to the disk access name: vxreattach
5) Recover the redundant volumes: vxrecover
6) Start any nonredundant volumes: vxvol -g diskgroup -f start volume_name
7) Check data for consistency, for example:
fsck /dev/vx/rdsk/diskgroup/volume_name
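Steps 3 through 7 can be wrapped in a script that stops at the first failing command. The sketch below assumes a hypothetical `run` helper and a `DRY_RUN` variable (neither is part of VxVM); with `DRY_RUN=1`, the default here, each command is printed rather than executed.

```shell
#!/bin/sh
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: temporary_recovery <diskgroup> <nonredundant_volume>
# The && chain stops the sequence as soon as one command fails.
temporary_recovery() {
    dg=$1; vol=$2
    run vxdctl enable &&
    run vxreattach &&
    run vxrecover &&
    run vxvol -g "$dg" -f start "$vol" &&
    run fsck "/dev/vx/rdsk/$dg/$vol"
}
```

Steps 1 and 2 (fixing the fault and confirming that the OS sees the device) still have to be done before calling `temporary_recovery`.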
Intermittent Disk Failure: Resolving
1) If any volumes on the failing disk are not redundant, attempt to mirror those volumes:
- If you can mirror the volumes, continue with the procedure for redundant volumes.
- If you cannot mirror the volume, prepare for backup and restore.
2) If the volume is redundant:
- Prevent read I/O from accessing the failing disk by changing the volume read policy.
- Remove the failing disk.
- Set the volume read policy back to the original policy.
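The read-policy change for a redundant volume can be sketched with `vxvol rdpol`. The helper below is a dry run that only prints commands. Its name and arguments are hypothetical, and it assumes that the original policy was `round` or `select` and that you already know which plex is on a healthy disk.

```shell
#!/bin/sh
# Dry run: prints the read-policy commands around the disk removal.
# Usage: read_policy_plan <diskgroup> <volume> <healthy_plex> <original_policy>
# original_policy is assumed to be "round" or "select".
read_policy_plan() {
    [ $# -eq 4 ] || { echo "usage: read_policy_plan dg volume healthy_plex original_policy" >&2; return 1; }
    dg=$1; vol=$2; plex=$3; orig=$4
    # Direct reads to the plex that does not use the failing disk.
    echo "vxvol -g $dg rdpol prefer $vol $plex"
    echo "# remove the failing disk, then:"
    # Restore the policy that was in effect before.
    echo "vxvol -g $dg rdpol $orig $vol"
}
```

If the original policy was `prefer`, the restore command would also need the originally preferred plex name, which this sketch does not handle.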
Forced Removal
To forcibly remove a disk and not evacuate the data:
1) Use the vxdiskadm option, “Remove a disk for replacement.” VxVM handles the drive as if it has already failed.
2) Use the vxdiskadm option, “Replace a failed or removed disk.”
CLI:
vxdg -k -g diskgroup rmdisk [disk_media_name]
vxdisksetup -if [newdisk]
vxdg -k -g diskgroup adddisk [disk_media_name]=[newdisk]
Identifying a Degraded Plex of a RAID-5 Volume
"vxprint -htg diskgroup" will list the device as NODEVICE and the subdisk as NDEV.
The following commands will also indicate degradation:
vxprint -l volume_name
vxinfo -p -g diskgroup
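To pick out the affected records from a long listing, the vxprint output can be filtered for the two status strings named above. This is a minimal sketch with a hypothetical function name. It matches plain substrings, so it assumes those strings do not appear inside object names.

```shell
#!/bin/sh
# Reads vxprint-style output on stdin and prints only the lines
# that contain NODEVICE or NDEV.
degraded_records() {
    grep -E 'NODEVICE|NDEV'
}
```

Usage: `vxprint -htg diskgroup | degraded_records`. The function returns a nonzero status when no line matches, which can serve as a quick check in a script.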