This uses FlashSnap, which requires a separate license.
Veritas Storage Foundation with FlashSnap provides three types of PITC solutions:
* Volume-Level PITC Solutions:
- Full-Sized Instant Volume Snapshots
- Space-Optimized Instant Volume Snapshots
* Filesystem-Level Solution:
- Storage Checkpoints
Preparing to Create a Full-Sized Instant Volume Snapshot: CLI
Enable FastResync:
# vxsnap -g diskgroup [-b] prepare origvol
Allocate the storage using ONE of these methods:
1) Add a mirror to use as a full-sized instant snapshot:
# vxsnap -g diskgroup addmir volume
2) Use an existing ACTIVE plex in the volume
3) Create an empty volume for use as the snapshot volume:
# LEN=`vxprint -g diskgroup -F%len volume`
# DCONAME=`vxprint -g diskgroup -F%sco_name volume`
# RSZ=`vxprint -g diskgroup -F%regionsz $DCONAME`
# vxassist -g diskgroup make snap-origvol $LEN
# vxsnap -g diskgroup prepare snap-origvol regionsize=$RSZ
Creating and Managing Full-Sized Instant Volume Snapshots: CLI
Create the snapshot volume using ONE of these methods:
1) Break off an existing plex to create the new snapshot
# vxsnap -g diskgroup make source=origvol/newvol=snap-origvol/plex=plexname
2) Specify an empty volume to be used as the snapshot:
# vxsnap -g diskgroup make source=origvol/snapvol=snap-origvol
Update:
# vxsnap -g diskgroup refresh snap-origvol source=origvol
# vxsnap -g diskgroup reattach snap-origvol source=origvol
# vxsnap -g diskgroup restore origvol source=snap-origvol
# vxsnap -g diskgroup dis snap-origvol
Remove:
# vxedit -g diskgroup -r rm snap-origvol
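To tie these together, here is a rough end-to-end sketch; the disk group (datadg), volume (datavol), snapshot (snap-datavol), and plex (datavol-02) names are made up, and the attribute syntax can differ slightly between VxVM releases:
# vxsnap -g datadg prepare datavol
# vxsnap -g datadg addmir datavol
# vxsnap -g datadg make source=datavol/newvol=snap-datavol/plex=datavol-02
# vxsnap -g datadg refresh snap-datavol source=datavol
# vxsnap -g datadg dis snap-datavol
# vxedit -g datadg -r rm snap-datavol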
Note: There are a few more items that I have not transcribed, because they require additional licensing and are useful only in specific setups.

Monday, July 25, 2005
Performance Monitoring
Performance Analysis Process
1) Understand each application workload and your performance objectives for that workload
2) Identify all components of the data transfer model of your storage architecture, that is, the complete I/O path of your data from application to disk
3) Determine the theoretical performance characteristics of each hardware component in your architecture
4) Use performance monitoring and workload-generation tools to measure the performance of each component in your configuration.
Tools:
* vxstat
* vxtrace
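As a rough sketch of how these are typically run (options vary by platform and VxVM release; datadg and datavol are made-up names):
# vxstat -g datadg -i 5 -c 10 (volume statistics, 10 samples at 5-second intervals)
# vxstat -g datadg -d (per-disk statistics)
# vxtrace -g datadg datavol (trace I/O events on a volume)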
Note: I was lazy on this one. There is too much system-specific info to put in. Besides the VxVM tools, everything else is more about general system performance.
Volume Maintenance
Topics:
Changing the Volume Layout
Managing Volume Tasks
Analyzing Configurations with Storage Expert
Changing the Volume Layout
Online relayout: Change the volume layout or layout characteristics while the volume is online.
By using online relayout, you can change the layout of an entire volume or a specific plex. Use online relayout to change the column or plex layout to or from:
* Concatenated
* Striped
* RAID-5
* Striped mirrored
* Concatenated mirrored
Online Relayout Notes
* You can reverse online relayout at any time
* Some layout transformations can cause a slight increase or decrease in the volume length due to subdisk alignment policies. If volume length increases during relayout, VxVM resizes the file system using vxresize.
* Relayout does not change log plexes
* You cannot:
- Create a snapshot during relayout
- Change the number of mirrors during relayout
- Perform multiple relayouts at the same time
- Perform relayout on a volume with a sparse plex
Changing the Layout: VEA
Highlight a volume and select Actions -> Change Layout
Changing the Layout: CLI
# vxassist relayout
- Used for nonlayered relayout operations
- Used for changing layout characteristics, such as stripe width and number of columns
# vxassist convert
- Changes nonlayered volumes to layered volumes, and vice versa
Note: vxassist relayout cannot create a nonlayered mirrored volume in a single step. The command always creates a layered mirrored volume even if you specify a nonlayered mirrored layout. Use vxassist convert to convert the resulting layered volume into a nonlayered volume.
# vxassist -g diskgroup relayout volume|plex layout=layout ncol=[+|-]n stripeunit=size
To change to a striped layout:
# vxassist -g datadg relayout datavol layout=stripe ncol=2
To add a column to striped volume datavol:
# vxassist -g datadg relayout datavol ncol=+1
To remove a column from datavol:
# vxassist -g datadg relayout datavol ncol=-1
To change stripe unit size and number of columns:
# vxassist -g datadg relayout datavol stripeunit=32k ncol=5
To change mirrored layouts to RAID-5, specify the plex to be converted (instead of the volume):
# vxassist -g datadg relayout datavol01-01 layout=raid5 stripeunit=32k ncol=3
To convert the striped mirrored volume datavol to a layered stripe-mirror layout:
# vxassist -g datadg convert datavol layout=stripe-mirror
Managing Volume Tasks: CLI
Use the vxtask command to:
- Display task information
- Pause, continue, and abort tasks
- Modify the progress rate of a task
The vxrelayout command can be used to display the status of, reverse, or start a relayout operation:
# vxrelayout -g diskgroup status|reverse|start volume
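A hedged sketch of the vxtask usage implied above; the task ID (163) is hypothetical, and the slow attribute may not be tunable on every release:
# vxtask list (show running tasks and their task IDs)
# vxtask monitor 163 (watch the progress of task 163)
# vxtask pause 163
# vxtask resume 163
# vxtask abort 163
# vxtask set slow=100 163 (throttle the task's I/O rate)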
What is Storage Expert?
Veritas Storage Expert (VxSE) is a CLI utility that provides volume configuration analysis.
Storage Expert:
* Analyzes configurations based on a set of “rules” or VxVM “best practices”
* Produces a report of results in ASCII format
* Provides recommendations, but does not launch any administrative operations
Running Storage Expert Rules
* VxVM and VEA must be installed
* Rules are located in /opt/VRTS/vxse/vxvm
* Syntax:
rule_name options info|list|check|run
* In the syntax:
- info
Displays rule description
- list
Displays attributes of rule
- check
Displays default values
- run
Runs the rule
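For example, a rule is typically exercised like this (vxse_redundancy is just one of the bundled rules; rule names and attributes vary by release, so treat this as a sketch):
# cd /opt/VRTS/vxse/vxvm
# ./vxse_redundancy info (describe what the rule checks)
# ./vxse_redundancy check (show the default attribute values)
# ./vxse_redundancy -g datadg run (run the rule against disk group datadg)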
Troubleshooting the Boot Process
Topics:
Operating System Boot Processes
Troubleshooting the Boot Process
Recovering the Boot Disk Group
Files Used in the Boot Process
* /etc/system (Sun only)
Contains VxVM entries
* /etc/vfstab (Sun), /etc/fstab (HP-UX and Linux)
Maps mount points to devices
* /etc/vx/volboot
Contains disk ownership data
* /etc/vx/licenses/lic, /etc/vx/elm
Contains license files
* /var/vxvm/tempdb
Stores data about diskgroups
* /etc/vx/reconfig.d/state.d/install-db
Indicates VxVM is not initialized
* /VXVM#.#.#-UPGRADE/.start_runed
Indicates that the VxVM upgrade is not complete
Troubleshooting: The Boot Device Cannot be Opened
Possible causes:
* Boot disk is not powered on
* Boot disk has failed
* SCSI bus is not terminated
* Controller failure has occurred
* Disk is failing and locking the bus
To resolve:
* Check SCSI bus connections:
- On Sun, use probe-scsi-all
- On Linux, use non-fast or verbose boot in the BIOS
* Boot from an alternate boot disk
Troubleshooting: Startup Scripts Exit Without Initialization
Possible causes:
One of the following files is present:
* /etc/vx/reconfig.d/state.d/install-db
This file indicates that VxVM software packages have been added, but VxVM has not been initialized with vxinstall. Therefore, vxconfigd is not started.
* /VXVM#.#.#-UPGRADE/.start_runed
This file indicates that a VxVM upgrade has been started but not completed. Therefore, vxconfigd is not started.
Troubleshooting: Conflicting Host ID in volboot
The volboot file contains the host ID that was on the system when you installed VxVM.
If you manually edit this file, VxVM does not function.
* To change the host ID recorded in the volboot file:
vxdctl hostid new_hostname
* To re-create the volboot file:
vxdctl init [hostname]
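Before changing anything, it helps to compare what is recorded with the current hostname (vxdctl list prints the volboot contents):
# vxdctl list
# hostname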
Troubleshooting: License Problems (keys corrupted, missing, or expired)
Save /etc/vx/licenses/lic/* to a backup device. If the license files are removed or corrupted, you can copy the files back.
License problems can occur if:
* The /etc/vx/licenses/lic files become corrupted
* An evaluation license was installed and not updated to a full license.
To resolve license issues:
* vxlicinst (installs a new license)
* vxiod set 10 (starts the I/O daemons)
* vxconfigd (starts the configuration daemon)
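Before installing a new key, it can help to see what VxVM currently has; vxlicrep (from the licensing package) reports the installed keys and their status:
# vxlicrep
Then proceed with vxlicinst, vxiod, and vxconfigd as listed above.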
Troubleshooting: Missing /var/vxvm/tempdb (missing, misnamed, or corrupted)
This directory stores configuration information about imported diskgroups. The contents are recreated after a reboot. If this directory is missing, misnamed, or corrupted, vxconfigd does not start.
To remove and recreate this directory:
# vxconfigd -k -x cleartempdir
Troubleshooting: Debugging with vxconfigd
Running vxconfigd in debug mode:
# vxconfigd -k -m enable -x debug_level
* debug_level = 0 – No debugging (default)
* debug_level = 9 – Highest debug level
Some debugging options:
* -x log
Logs all console output to the /var/vxvm/vxconfigd.log file
* -x logfile=name
Use the specified log file instead
* -x syslog
Direct all console output through the syslog() interface
* -x timestamp
Attach a timestamp to all messages
* -x tracefile=name
Log all possible tracing information in the given file.
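Putting several of these options together, a typical debug invocation might look like this (a sketch; adjust the level and log destination as needed):
# vxconfigd -k -m enable -x 9 -x log -x timestamp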
Troubleshooting: Invalid or Missing /etc/system File (Sun only)
The /etc/system file is used in the kernel initialization and /sbin/init phases of the boot process.
This is a standard Sun system file to which VxVM adds entries that:
* Specify drivers to be loaded
* Specify root encapsulation
If the file or these entries are missing, you encounter problems in the boot process.
When booting from an alternate system file, do not go beyond maintenance mode. Boot with the alternate system file, fix the VxVM problem, and then reboot with the original system file.
ok> boot -a
When prompted for the name of the system file, specify /etc/hosts instead. You will see many errors, but the boot gets far enough for you to fix the original /etc/system file.
Temporarily Importing the Boot Diskgroup
Through a temporary import, you can bring the boot diskgroup to a working system and repair it there:
1) Obtain the diskgroup ID (dgid) of the boot diskgroup:
# vxdisk -s list
2) On the importing host, import and temporarily rename the diskgroup:
# vxdg -tC -n tmpdg import dgid
3) Fix and replace the files and volumes as necessary.
4) Deport the diskgroup back to the original host:
# vxdg -h orig_hostname deport tmpdg
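A hedged walk-through of the same steps with made-up names (original-host, dgid 1020973019.1026.original-host):
# vxdisk -s list (note the dgid of the boot disk group)
# vxdg -tC -n tmpdg import 1020973019.1026.original-host
# vxrecover -g tmpdg -s (start the volumes so they can be repaired)
# vxdg -h original-host deport tmpdg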
Encapsulation and Rootability
Topics:
Placing the Boot Disk Under VxVM Control
Creating an Alternate Boot Disk
Removing the Boot Disk from VxVM Control
Upgrading to a New VxVM Version
What is Encapsulation?
Encapsulation is a method of placing a disk under VxVM control in which the data that exists on the disk is preserved. Encapsulation converts existing partitions into volumes, which provides continued access to the data on the disk after a reboot. After a disk has been encapsulated, the disk is handled in the same way as an initialized disk.
Requirements:
One free partition (for public and private region)
s2 slice that represents the full disk
2048 sectors free at beginning or end of disk for the private region
What is Rootability?
Rootability, or root encapsulation, is the process of placing the root file system, swap device, and other file systems on the boot disk under VxVM control. VxVM converts existing partitions of the boot disk into VxVM volumes. The system can then mount the standard boot disk file systems (that is, /, /usr, and so on) from volumes instead of disk partitions.
Requirements are the same as for data disk encapsulation, but the private region can be created from swap space.
Why Encapsulate the Boot Disk?
You should encapsulate the boot disk only if you plan to mirror the boot disk.
Benefits of mirroring the boot disk:
1) Enables high availability
2) Fixes bad blocks automatically (for reads)
3) Improves performance (ed. I don’t buy this point)
There is no benefit to boot disk encapsulation for its own sake. You should not encapsulate the boot disk if you do not plan to mirror the boot disk.
Limitations of Boot Disk Encapsulation
Encapsulating the boot disk adds steps to OS upgrades.
A system cannot boot from a boot disk that spans multiple devices
You should never expand or change the layout of boot volumes. No volume associated with an encapsulated boot disk (rootvol, usr, var, opt, swapvol, and so on) should be expanded or shrunk, because these volumes map to a physical underlying partition on the disk and must be contiguous.
If you attempt to expand these volumes, the system can become unbootable if it becomes necessary to revert back to slices in order to boot the system. Expanding these volumes can also prevent a successful OS upgrade, and a fresh install can be required.
Note: regarding Solaris, the upgrade_start script may fail.
Solaris File System Requirements
For root, usr, var, and opt volumes:
1) Use UFS file systems (VxFS is not available until later in the boot process)
2) Use contiguous disk space. (Volumes cannot use striped, RAID-5, concatenated mirrored, or striped mirrored layouts)
3) Do not use dirty region logging on the system volumes, with the exception that DRL can be used for the opt and var volumes.
For swap volumes:
1) The first swap volume must be contiguous, and, therefore, cannot use striped or layered layouts.
2) Other swap volumes can be noncontiguous and can use any layout. However, there is an implied 2 GB limit of usable swap space per device for 32-bit operating systems.
Before Encapsulating the Boot Disk
Plan your rootability configuration. bootdg is a system-wide reserved disk group name that is an alias for the disk group that contains the volumes used to boot the system. When you place the boot disk under VxVM control, VxVM sets bootdg to the appropriate disk group. You should never attempt to change the assigned value of bootdg; doing so may render your system unbootable. An example configuration is to place the boot disk into a disk group named sysdg and add at least two more disks to the disk group: one for a boot disk mirror and one as a spare disk. VxVM then sets bootdg to sysdg.
For Solaris, enable boot disk aliases: eeprom "use-nvramrc?=true"
Record the layout of the partitions on the unencapsulated boot disk to save for future use.
Encapsulating the Boot Disk
vxdiskadm:
“Encapsulate one or more disks”
Follow the prompts by specifying:
1) Name of the device to add
2) Name of the disk group to which the disk will be added
3) Sliced disk format (The boot disk cannot be a CDS disk)
vxencap:
/etc/vx/bin/vxencap -g diskgroup accessname
/etc/init.d/vxvm-reconfig accessname
After Boot Disk Encapsulation
You can view operating system-specific files to better understand the encapsulation process.
Solaris:
1) VTOC (prtvtoc device)
2) /etc/system
3) /etc/vfstab
Linux:
/etc/fstab
Alternate Boot Disk: Requirements
An alternate boot disk is a mirror of the entire boot disk. It preserves the boot block in case the primary boot disk fails
Creating an alternate boot disk requires that:
1) The boot disk be encapsulated by VxVM
2) Another disk be available with enough space to contain all of the boot disk partitions
3) All disks be in the boot disk group
The root mirror places the private region at the beginning of the disk. The remaining partitions are placed after the private region.
Creating an Alternate Boot Disk
VEA:
1) Highlight the boot disk, and select Actions -> Mirror Disk
2) Specify the target disk to use as the alternate boot disk.
vxdiskadm:
“Mirror volumes on a disk”
CLI:
To mirror the root volume only:
vxrootmir alternate_disk
To mirror all other unmirrored, concatenated volumes on the boot disk to the alternate disk:
vxmirror -g diskgroup boot_disk alternate_disk
To mirror other volumes to the boot disk or other disks:
vxassist -g diskgroup mirror homevol alternate_disk
On Solaris, to set up system boot information on a VxVM disk:
vxbootsetup
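For example, assuming the boot disk group is sysdg, the boot disk is sysdg01, and the target disk has already been added to the group as sysdg02 (all names hypothetical):
vxrootmir sysdg02 (mirror rootvol to sysdg02)
vxmirror -g sysdg sysdg01 sysdg02 (mirror the remaining concatenated volumes)
vxbootsetup sysdg02 (write Solaris boot information on the mirror)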
Booting from an Alternate Mirror (Solaris)
1) Set the eeprom variable use-nvramrc? to true:
ok> setenv use-nvramrc? true
ok> reset
This variable must be set to true to enable the use of alternate boot disks.
2) Check for available boot disk aliases:
ok> devalias
Output displays the name of the boot disk and available mirrors.
3) Boot from an available boot disk alias:
ok> boot vx-diskname
Unencapsulating a Boot Disk
To unencapsulate a boot disk, use vxunroot
Requirements: Remove all but one plex of rootvol, swapvol, usr, var, opt, and home.
Use vxunroot when you need to:
* Boot from physical system partitions
* Change the size or location of the private region on the boot disk.
* Upgrade both the OS and VxVM
Do not use vxunroot if you are only upgrading VxVM packages, including the VEA package.
The vxunroot Command
1) Ensure that the boot disk volumes only have one plex each:
vxprint -ht rootvol swapvol usr var
2) If boot disk volumes have more than one plex each, remove the unnecessary plexes:
vxplex -g diskgroup -o rm dis plex_name
3) Run the vxunroot utility:
vxunroot
Notes on Upgrading Storage Foundation
* Determine what you are upgrading: Storage Foundation, VxVM only, both VxVM and the OS, or the OS only.
* Follow documentation for Storage Foundation and the OS
* Install appropriate patches
* A license is not required to upgrade VxVM only
* Your existing VxVM configuration is retained
* Upgrading VxVM does not upgrade existing disk group or file system versions. You may need to manually upgrade each after a VxVM upgrade.
* Get the latest upgrade information from the support.veritas.com website
* Back up data before upgrading (Note: copy /kernel/drv/sd.conf to a safe location)
Upgrading Storage Foundation
1) Unmount any mounted VxFS file systems
2) Reboot the system to single-user mode
3) When the system comes up, mount the /opt and /var filesystems
4) Mount the Veritas cdrom
5) Invoke the common installer, run the install command:
cd /cdrom/cdrom0
./installer
6) Answer the prompts appropriately
Upgrading VxVM Only
Methods:
* VxVM installation script (installvm)
* Manual package upgrade
* VxVM upgrade scripts (Solaris only)
- upgrade_start
- upgrade_finish
Note: on Sun, the upgrade_finish script changes /etc/vfstab to point to /dev/vx/bootdg/… even if bootdg doesn’t exist. Remember to change it back by hand before reboot. Unless you like booting off of CD in order to change the vfstab by hand.
Upgrading VxVM Only: installvm
* Invoke the installvm script and follow the instructions when prompted
* If you are performing a multihost installation, you can avoid copying packages to each system. For example, to ensure that packages are not copied remotely when using the NFS mountable filesystem $NFS_FS:
# cd /cdrom/CD_NAME
# cp -r * $NFS_FS
# cd volume_manager
# ./installvm -pkgpath $NFS_FS/volume_manager/pkgs -patchpath $NFS_FS/volume_manager/patches
* This copies the files to an NFS mounted file system that is connected to all of the systems on which you want to install the software.
Upgrading VxVM Only: Manual Package Upgrade
1) Bring the system to single-user mode
2) Stop the vxconfigd and vxiod daemons:
# vxdctl stop
# vxiod -f set 0
3) Remove the VMSA software package VRTSvmsa (optional)
4) Add the new VxVM packages using OS-specific package installation commands
5) Perform a reconfiguration reboot (for example, on Sun: reboot -- -r)
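A sketch of steps 3 through 5 on Solaris; package names other than VRTSvxvm and the CD path vary by release, so check the release notes first:
# pkgrm VRTSvmsa (optional: remove the old VMSA GUI)
# pkgadd -d /cdrom/cdrom0/volume_manager/pkgs VRTSvlic VRTSvxvm VRTSvmman VRTSvmdoc
# reboot -- -r (reconfiguration reboot)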
Scripts Used in Upgrades: Sun only
The upgrade_start and upgrade_finish scripts preserve your VxVM configuration
To check for potential problems before an upgrade, run:
# upgrade_start -check
Note: on Sun: save off a copy of your /etc/vfstab and /kernel/drv/sd.conf. The upgrade_finish will screw both up. Your /etc/vfstab will point to bootdg even if you don’t use that diskgroup name. Also, your sd.conf will be messed up if you use SAN and you’ll not see all of your disks. The vfstab can be corrected by hand, but you’ll need to copy the sd.conf back to your system to correct the “fix”.
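A minimal safeguard before running the upgrade scripts (backup file names are arbitrary):
# cp -p /etc/vfstab /etc/vfstab.pre-vxvm-upgrade
# cp -p /kernel/drv/sd.conf /kernel/drv/sd.conf.pre-vxvm-upgrade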
Upgrading VxVM Only: Upgrade Scripts: Sun Only
1) Mount the Veritas cdrom
2) Run upgrade_start -check
3) Run upgrade_start
4) Reboot to single-user mode
5) Mount /opt if it is not part of the root filesystem
6) Remove the VxVM package and other related VxVM packages with pkgrm
7) Reboot the system to multiuser mode
8) Verify that /opt is mounted, and then install the new VxVM packages with pkgadd
9) Run the upgrade_finish script
Upgrading Solaris Only
To prepare:
1) Detach any boot disk mirrors
2) Check alignment of boot disk volumes
3) Ensure that /opt is not a symbolic link
To upgrade:
1) Bring the system to single-user mode
2) Mount the Veritas cdrom
3) Run upgrade_start -check
4) Run upgrade_start
5) Reboot to single-user mode
6) Upgrade the OS
7) Reboot to single-user mode
8) Mount the Veritas cdrom
9) Run upgrade_finish
10) Reboot to multiuser mode
Upgrading VxVM and Solaris
To prepare:
1) Install license keys if needed
2) Detach any boot disk mirrors
3) Check alignment of boot disk volumes
4) Ensure that /opt is not a symbolic link
To remove old version:
1) Bring system to single-user mode
2) Mount the Veritas cdrom
3) Run upgrade_start -check
4) Run upgrade_start
5) Reboot to single-user mode
6) Remove VxVM packages
To install new version:
1) Reboot to single-user mode
2) Upgrade OS
3) Reboot to single-user mode
4) Mount Veritas cdrom
5) Add new licensing and VxVM packages
6) Run upgrade_finish
7) Reboot to multiuser mode
8) Add additional packages
After Upgrading
1) Confirm that key VxVM processes (vxconfigd, vxnotify, vxcache, vxrelocd, vxconfigbackupd, and vxesd) are running:
# ps -ef | grep vx
2) Verify the existence of the boot disk's volumes:
# vxprint -ht
Upgrading VxFS
1) Unmount any mounted Veritas file systems
2) Remove old VxFS packages
3) Comment out VxFS filesystems in /etc/vfstab, then reboot
4) Upgrade the OS if necessary for VxFS version compatibility.
5) Add the new VxFS packages
6) Undo any changes made to /etc/vfstab
7) Reboot
Friday, July 15, 2005
Sunday, July 10, 2005
Jobs and NC
Looks like I'll be starting the first of August down there.
I'm getting too old for this job hopping stuff.
Thursday, July 07, 2005
Plex Problems and Solutions
Topics:
Displaying State Information for VxVM Objects
Interpreting Plex States
Interpreting Volume States
Interpreting Kernel States
Resolving Plex Problems
Analyzing Plex Problems
Identifying Plex Problems
To identify and solve plex problems, use the following information:
- Plex states
- Volume states
- Plex kernel states
- Volume kernel states
- Object condition flags
Commands to display plex, volume, and kernel states:
vxprint -g diskgroup -ht [volume_name]
vxinfo -p -g diskgroup [volume_name]
Plex States and Condition Flags
EMPTY: indicates that you have not yet defined which plex has the good data (CLEAN), and which plex does not have the good data (STALE).
CLEAN: is normal and indicates that the plex has a copy of the data that represents the volume. CLEAN also means that the volume is not started and is not currently able to handle I/O (by the admin’s control).
ACTIVE: is the same as CLEAN, but the volume is or was started, and the volume is or was able to perform I/O.
SNAPDONE: is the same as ACTIVE or CLEAN, but is a plex that has been synchronized with the volume as a result of a “vxassist snapstart” operation. After a reboot or a manual start of the volume, a plex in the SNAPDONE state is removed along with its subdisks.
STALE: indicates that VxVM has reason to believe that the data in the plex is not synchronized with the data in the CLEAN plexes. This state is usually caused by taking the plex offline or by a disk failure.
SNAPATT: indicates that the object is a snapshot that is currently being synchronized but does not yet have a complete copy of the data.
OFFLINE: indicates that the administrator has issued the “vxmend off” command on the plex. When the admin brings the plex back online using the “vxmend on” command, the plex changes to the STALE state.
TEMP: the TEMP state flags (TEMP, TEMPRM, TEMPRMSD) usually indicate that the data was never a copy of the volume’s data, and you should not use these plexes. These temporary states indicate that the plex is currently involved in a synchronization operation with the volume.
NODEVICE: indicates that the disk drive below the plex has failed.
REMOVED: has the same meaning as NODEVICE, but the system admin has requested that the device appear as failed.
IOFAIL: is similar to NODEVICE, but it indicates that an unrecoverable failure occurred on the device, and VxVM has not yet verified whether the disk is actually bad. Note: I/O to both the public and the private regions must fail to change the state from IOFAIL to NODEVICE.
RECOVER: is set on a plex when two conditions are met:
1) A failed disk has been fixed (by using vxreattach or the vxdiskadm option, “Replace a failed or removed disk”).
2) The plex was in the ACTIVE state prior to the failure.
Volume States
EMPTY, CLEAN, and ACTIVE: have the same meanings as they do for plexes.
NEEDSYNC: is the same as SYNC, but the internal read thread has not been started. This state exists so that volumes that use the same disk are not synchronized at the same time, and head thrashing is avoided.
SYNC: indicates that the plexes are involved in read-writeback or RAID-5 parity synchronization:
- Each time that a read occurs from a plex, it is written back to all the other plexes that are in the ACTIVE state.
- An internal read thread is started to read the entire volume (or, after a system crash, only the dirty regions if dirty region logging (DRL) is being used), forcing the data to be synchronized completely. On a RAID-5 volume, the presence of a RAID-5 log speeds up a SYNC operation.
NODEVICE: indicates that none of the plexes have currently accessible disk devices underneath the volume.
Kernel States
Kernel states represent VxVM’s ability to transfer I/O to the volume or plex.
ENABLED: The object can transfer both system I/O and user I/O
DETACHED: The object can transfer system I/O, but not user I/O (maintenance mode)
DISABLED: No I/O can be transferred.
Solving Plex Problems
Commands used to fix plex problems:
vxrecover
vxvol init
vxvol -f start
vxmend fix
vxmend off|on
The vxrecover Command
vxrecover -g diskgroup -s [volume_name]
- Recovers and resynchronizes all plexes in a started volume.
- Runs “vxvol start” and “vxplex att” commands (and sometimes “vxvol resync”)
- Works in normal situations
- Resynchronizes all volumes that need recovery if a volume name is not included.
Initializing a Volume’s Plexes
vxvol -g diskgroup init init_type volume_name [plexes]
init_type:
zero: sets all plexes to a value of 0, which means that all bytes are null
active: sets all plexes to active and enables the volume and its plexes
clean: If you know that one of the plexes has the correct data, you can select that particular plex to represent the data of the volume. In this case, all other plexes will copy their content from the clean plex when the volume is started.
enable: use this option to temporarily enable the volume so that data can be loaded onto it to make the plexes consistent.
The “vxvol start” Command
vxvol -g diskgroup -f start volume_name
- This command ignores problems with the volume and starts the volume
- Use this command only on nonredundant volumes. If it is used on redundant volumes, data can be corrupted, unless all mirrors have the same data.
The vxmend Command
vxmend -g diskgroup fix stale|clean|active|empty plex
vxmend fix stale
vxmend -g diskgroup fix stale plex
- This command changes a CLEAN or ACTIVE (RECOVER) state to STALE
- The volume that the plex is associated with must be in DISABLED mode.
- Use this command as an intermediate step to the final destination for the plex state.
vxmend fix clean
vxmend -g diskgroup fix clean plex
- This command changes a STALE plex to CLEAN
- Only run this command if:
1) The associated volume is in the DISABLED state
2) There is no other plex that has a state of CLEAN
3) All of the plexes are in the STALE or OFFLINE states.
- After you change the state of a plex to clean, recover the volume by using:
vxrecover -s
vxmend fix active
vxmend -g diskgroup fix active plex
- This command changes a STALE plex to ACTIVE
- The volume that the plex is associated with must be in DISABLED mode
When you run “vxvol start”:
ACTIVE plexes are synchronized (SYNC) together
RECOVER plexes are set to STALE and are synchronized from the ACTIVE plexes.
vxmend fix empty
vxmend -g diskgroup fix empty volume_name
- Sets all plexes and the volume to the EMPTY state
- Requires the volume to be in DISABLED mode
- Runs on the volume, not on a plex
- Returns to the same state as bottom-up creation
vxmend off|on
When analyzing plexes, you can temporarily take plexes offline while validating the data in another plex.
- To take a plex offline, use the command:
vxmend -g diskgroup off plex
- To take the plex out of the offline state, use:
vxmend -g diskgroup on plex
Fixing Layered Volumes
- For layered volumes, vxmend functions the same as with nonlayered volumes.
- When starting the volume, use either:
1) “vxrecover -s” - starts both the top-level volume and the subvolumes
2) “vxvol start” - with VxVM 4.0 and later, “vxvol start” (and “vxvol stop”) completely starts and stops layered volumes
Example: If the Good Plex Is Known
- For plex vol01-01, the disk was turned off and back on and still has data.
- Plex vol01-02 has been offline for several hours.
To recover:
1) Set all plexes to STALE (vxmend fix stale vol01-01)
2) Set the good plex to CLEAN (vxmend fix clean vol01-01)
3) Run “vxrecover -s vol01”
Example: If the Good Plex Is Not Known
The volume is disabled and not startable, and you do not know what happened. There are no CLEAN plexes.
To resolve:
1) Take all but one plex offline and set that plex to CLEAN (vxmend off vol01-02; vxmend fix clean vol01-01)
2) Run “vxrecover -s”
3) Verify data on the volume
4) Run “vxvol stop”
5) Repeat for each plex until you identify the plex with the good data
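A hedged walk-through of that loop for a two-plex volume vol01 in disk group datadg (all names made up):
# vxmend -g datadg off vol01-02 (park the plex you are not testing)
# vxmend -g datadg fix clean vol01-01
# vxrecover -g datadg -s vol01
(mount or fsck the volume and inspect the data)
# vxvol -g datadg stop vol01
# vxmend -g datadg on vol01-02 (then repeat the test with vol01-02 as the candidate)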
Displaying State Information for VxVM Objects
Interpreting Plex States
Interpreting Volume States
Interpreting Kernel States
Resolving Plex Problems
Analyzing Plex Problems
Identifying Plex Problems
To identify and solve plex problems, use the following information:
- Plex states
- Volume states
- Plex kernel states
- Volume kernel states
- Object condition flags
Commands to display plex, volume, and kernel states:
vxprint –g diskgroup –ht [volume_name]
vxinfo –p –g diskgroup [volume_name]
Plex States and Condition Flags
EMPTY: indicates that you have not yet defined which plex has the good data (CLEAN), and which plex does not have the good data (STALE).
CLEAN: is normal and indicates that the plex has a copy of the data that represents the volume. CLEAN also means that the volume is not started and is not currently able to handle I/O (by the admin’s control).
ACTIVE: is the same as CLEAN, but the colume is or was currently started, and the colume is or was able to perform I/O.
SNAPDONE: is the same as ACTIVE or CLEAN, but is a plex that has been synchronized with the volume as a result of a “vxassist snapstart” operation. After a reboot or a manual start of the volume, a plex in the SNAPDONE state is removed along with its subdisks.
STALE: indicates that VxVM has reason to believe that the data in the plex is not synchronized with the data in the CLEAN plexes. This state is usually caused by taking the plex offline or by a disk failure.
SNAPATT: indicates that the object is a snapshot that is currently being synchronized but does not yet have a complete copy of the data.
OFFLINE: indicates that the administrator has issued the “vxmend off” command on the plex. When the admin brings the plex back online using the “vxmend on” command, the plex changes to the STALE state.
TEMP: the TEMP state flags (TEMP, TEMPRM, TEMPRMSD) usually indicate that the data was never a copy of the volume’s data, and you should not use these plexes. These temporary states indicate that the plex is currently involved in a synchronization operation with the volume.
NODEVICE: indicates that the disk drive below the plex has failed.
REMOVED: has the same meaning as NODEVICE, but the system admin has requested that the device appear as failed.
IOFAIL: is similar to NODEVICE, but it indicates that an unrecoverable failure occurred on the device, and VxVM has not yet verified whether the disk is actually bad. Note: I/O to both the public and the private regions must fail to change the state from IOFAIL to NODEVICE.
RECOVER: is set on a plex when two conditions are met:
1) A failed disk has been fixed (by using vxreattach or the vxdiskadm option, “Replace a failed or removed disk”).
2) The plex was in the ACTIVE state prior to the failure.
Volume States
EMPTY, CLEAN, and ACTIVE: have the same meanings as they do for plexes.
NEEDSYNC: is the same as SYNC, but the internal read thread has not been started. This state exists so that volumes that use the same disk are not synchronized at the same time, and head thrashing is avoided.
SYNC: indicates that the plexes are involved in read-writeback or RAID-5 parity synchronization:
- Each time that a read occurs from a plex, it is written back to all the other plexes that are in the ACTIVE state.
- An internal read thread is started to read the entire volume (or, after a system crash, only the dirty regions if dirty region logging (DRL) is being used), forcing the data to be synchronized completely. On a RAID-5 volume, the presence of a RAID-5 log speeds up a SYNC operation.
NODEVICE: indicates that none of the plexes have currently accessible disk devices underneath the volume.
Kernel States
Kernel states represent VxVM’s ability to transfer I/O to the volume or plex.
ENABLED: The object can transfer both system I/O and user I/O
DETACHED: The object can transfer system I/O, but not user I/O (maintenance mode)
DISABLED: No I/O can be transferred.
Solving Plex Problems
Commands used to fix plex problems:
vxrecover
vxvol init
vxvol –f start
vxmend fix
vxmend offon
The vxrecover Command
vxrecover –g diskgroup –s [volume_name]
- Recovers and resynchronizes all plexes in a started volume.
- Runs “vxvol start” and “vxplex att” commands (and sometimes “vxvol resync”)
- Works in normal situations
- Resynchronizes all volumes that need recovery if a volume name is not included.
Initializing a Volume’s Plexes
vxvol –g diskgroup init init_type volume_name [plexes]
init_type:
zero: sets all plexes to a value of 0, which means that all bytes are null
active: sets all plexes to active and enables the volume and its plexes
clean: If you know that one of the plexes has the correct data, you can select that particular plex to represent the data of the volume. In this case, all other plexes will copy their content from the clean plex when the volume is started.
enable: use this option to temporarily enable the volume so that data can be loaded onto it to make the plexes consistent.
The “vxvol start” Command
vxvol –g diskgroup –f start volume_name
- This command ignores problems with the volume and starts the volume
- Only use this command on nonredundant volumes. If used on nonredundant volumes, data can be corrupted, unless all mirrors have the same data.
The vxmend Command
vxmend –g diskgroup fix stalecleanactiveempty plex
vxmend fix stale
vxmend –f diskgroup fix stale plex
- This command changes a CLEAN or ACTIVE (RECOVER) state to STALE
- The volume that the plex is associated with must be in DISABLED mode.
- Use this command as an intermediate step to the final destination for the plex state.
vxmend fix clean
vxmend –g diskgroup fix clean plex
- This command changes a STALE plex to CLEAN
- Only run this command if:
1) the associated volume is in the DISABLED state
2) There is no other plex that has a state of clean
3) All of the plexes are in the STALE or OFFLINE states.
- After you change the state of a plex to clean, recover the volume by using:
vxrecover –s
vxmend fix active
vxmend –g diskgroup fix active plex
- This command changes a STALE plex to SCTIVE
- The volume that the plex is associated with must be in DISABLED mode
When you run “vxvol start”:
ACTIVE plexes are synchronized (SYNC) together
RECOVER plexes are set to STALE and are synchronized from the ACTIVE plexes.
vxmend fix empty
vxmend –f diskgroup fix empty volume_name
- Sets all plexes and the volume to the EMPTY state
- Requires the volume to be in DISABLED mode
- Runs on the volume, not on a plex
- Returns to the same state as bottom-up creation
vxmend offon
When analyzing plexes, you can temporarily take plexes offline while validating the data in another plex.
- To take a plex offline, use the command:
vxmend –g diskgroup off plex
- To take the plex out of the offline state, use:
vxmend –g diskgroup on plex
Fixing Layered Volumes
- For layered volumes, vxmend functions the same as with nonlayered volumes.
- When starting the volume, use either:
1) “vxrecover –s” – starts both the top-level volume and the subvolumes
2) “vxvol start” with VxVM 4.0 and later, “vxvol start” completely starts (and stops) layered volumes.
Example: If the Good Plex Is Known
- For plex vol01-01, the disk was turned off and back on and still has data.
- Plex vol01-02 has been offline for several hours.
To recover:
1) Set all plexes to STALE (vxmend fix stale vol01-01)
2) Set the good plex to CLEAN (vxmend fix clean vol01-01)
3) Run “vxrecover –s vol01”
Example: If the Good Plex Is Not Known
The volume is disabled and not startable, and you do not know what happened. There are no CLEAN plexes.
To resolve:
1) Take all but one plex offline and set that plex to CLEAN (vxmend off vol01-02; vxmend fix clean vol01-01)
2) Run “vxrecover –s”
3) Verify data on the volume
4) Run “vxvol stop”
5) Repeat for each plex until you identify the plex with the good data
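As a minimal sketch of one pass through this procedure (assuming disk group datadg, volume vol01, and plexes vol01-01 and vol01-02; names are illustrative):
vxmend –g datadg off vol01-02
vxmend –g datadg fix clean vol01-01
vxrecover –g datadg –s vol01
(mount the volume and verify the data, then unmount it)
vxvol –g datadg stop vol01
vxmend –g datadg on vol01-02
If the data is not good, repeat the pass with vol01-02 as the candidate CLEAN plex.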
Wednesday, July 06, 2005
Disk Problems and Solutions
Topics:
Identifying I/O Failure
Disk Failure Types
Resolving Permanent Disk Failure
Resolving Temporary Disk Failure
Resolving Intermittent Disk Failure
Disk Failure Handling
Follow the path:
The OS detects an error and informs vxconfigd. Is the volume redundant?
- No: Display error messages, detach the disk from the disk group, and change the volume’s kernel state.
- Yes: Is the private region accessible?
  * No: Mark the disk as FAILED, detach the disk, mark the affected plex with NODEVICE, and relocate redundant volumes.
  * Yes: Mark the disk as FAILING, mark the affected plex with IOFAIL, and relocate subdisks.
Permanent Disk Failure: Volume States After the Failure
“vxprint –htg diskgroup” will list the device as NODEVICE.
Permanent Disk Failure: Resolving
1) Replace the disk
2) Have VxVM scan the devices: vxdctl enable
3) Initialize the new drive: vxdisksetup –i device_name
4) Attach the disk media name (datadg02) to the new drive:
vxdg –g diskgroup –k adddisk datadg02=device_name
5) Recover the redundant volumes: vxrecover
6) Start any nonredundant volumes: vxvol –g diskgroup –f start volume_name
7) Restore data of any nonredundant volumes from backup.
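A minimal sketch of steps 2 through 6, assuming the failed disk media name is datadg02 in disk group datadg, the replacement device is c1t1d0, and vol02 is a nonredundant volume that lived on the disk (all names are illustrative):
vxdctl enable
vxdisksetup –i c1t1d0
vxdg –g datadg –k adddisk datadg02=c1t1d0
vxrecover –g datadg
vxvol –g datadg –f start vol02
Data on vol02 must then be restored from backup.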
Temporary Disk Failure: Volume States After Reattaching Disk
“vxprint –htg diskgroup” will list the device as DISABLED IOFAIL
Temporary Disk Failure: Resolving
1) Fix the failure
2) Ensure that the OS recognizes the device
3) Force VxVM to reread all drives: vxdctl enable
4) Reattach the disk media name to the disk access name: vxreattach
5) Recover the redundant volumes: vxrecover
6) Start any nonredundant volumes: vxvol –g diskgroup –f start volume_name
7) Check data for consistency, for example:
fsck /dev/vx/rdsk/diskgroup/volume_name
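A minimal sketch, assuming the temporarily failed device returns as c1t1d0 and its volumes live in disk group datadg (illustrative names):
vxdctl enable
vxreattach c1t1d0
vxrecover –g datadg
fsck /dev/vx/rdsk/datadg/vol01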
Intermittent Disk Failure: Resolving
1) If any volumes on the failing disk are not redundant, attempt to mirror those volumes:
- If you can mirror the volumes, continue with the procedure for redundant volumes.
- If you cannot mirror the volume, prepare for backup and restore.
2) If the volume is redundant
- Prevent read I/O from accessing the failing disk by changing the volume read policy
- Remove the failing disk
- Set the volume read policy back to the original policy
Forced Removal
To forcibly remove a disk and not evacuate the data:
1) Use the vxdiskadm option, “Remove a disk for replacement.” VxVM handles the drive as if it has already failed.
2) Use the vxdiskadm option, “Replace a failed or removed disk.”
CLI:
vxdg –g diskgroup –k rmdisk disk_name
vxdisksetup –if new_device
vxdg –g diskgroup –k adddisk disk_name=new_device
(disk_name is the disk media name, for example datadg02; new_device is the replacement device)
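For example, a sketch of forcibly moving the disk media name datadg02 onto a new device c2t2d0 and then recovering (illustrative names):
vxdg –g datadg –k rmdisk datadg02
vxdisksetup –if c2t2d0
vxdg –g datadg –k adddisk datadg02=c2t2d0
vxrecover –g datadg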
Identifying a Degraded Plex of a RAID-5 Volume
“vxprint –htg diskgroup” will list the device as NODEVICE and the subdisk as NDEV.
The following commands will also indicate degradation:
vxprint –l volume_name
vxinfo –p –g diskgroup
Monday, July 04, 2005
Managing Devices Within the VxVM Architecture
Topics:
Managing Components in the VxVM Architecture
Discovering Disk Devices
Administering the Device Discovery Layer
Dynamic Multipathing
Preventing Multipathing for a Device
Managing DMP
Controlling Automatic Restore Processes
VxVM Daemons
vxconfigd – The VxVM configuration daemon maintains disk and group configurations, communicates configuration changes to the kernel, and modifies configuration information stored on disks. When a system is booted, the command “vxdctl enable” is automatically executed to start vxconfigd. VxVM reads the /etc/vx/volboot file to determine disk ownership and automatically imports disk groups owned by the host.
vxiod – The VxVM I/O daemon provides extended I/O operations without blocking calling processes. Several vxiod daemons are usually started at boot time, and they continue to run at all times.
vxrelocd – is the hot-relocation daemon that monitors events that affect data redundancy.
VxVM Configuration Database
- Contains all disk, volume, plex, and subdisk configuration records
- Is stored in the private region of a VxVM disk
- Is replicated to maintain a copy on multiple disks in a disk group
- Is updated by the vxconfigd process
Displaying VxVM Configuration Database Information
vxdg list diskgroup
Displaying Disk Header Information
vxdisk –g diskgroup list disk_name
VxVM Disk Types and Formats
- auto:cdsdisk
- auto:simple
- auto:sliced
- auto:none
simple – Public and private regions are contiguous on the same partition
sliced – Public and private regions are on separate partitions.
nopriv – No private region.
VxVM Configuration Daemon
vxconfigd:
- Maintains the configuration database
- Synchronizes changes between multiple requests, based on a database transaction model:
* All utilities make changes through vxconfigd
* Utilities identify resources needed at the start of the transaction.
* Transactions are serialized, as needed.
* Changes are reflected in all copies immediately
- Does not interfere with access to data on disk
- Must be running for changes to be made to the configuration database.
If vxconfigd is not running, VxVM operates, but configuration changes are not allowed and queries of the database are not possible.
- vxconfigd reads the kernel log to determine current states of VxVM components and updates the configuration database.
- Kernel logs are updated even if vxconfigd is not running. For example, upon startup, vxconfigd reads the kernel log and determines that a volume needs to be resynchronized.
- vxconfigd modes:
enabled – normal operating state
disabled – Most operations not allowed
booted – Part of the normal system startup while acquiring the boot disk group
The vxdctl Command
Use vxdctl to control vxconfigd
vxdctl mode – Displays vxconfigd status
vxdctl enable – Enables vxconfigd
vxdctl disable – Disables vxconfigd
vxdctl stop – Stops vxconfigd
vxdctl –k stop – Sends a kill -9
vxconfigd – Starts vxconfigd
vxdctl license – Checks licensing
vxdctl support – Displays version information
The volboot File
/etc/vx/volboot contains:
- The host ID (this is really the hostname) that is used by VxVM to establish ownership of physical disks
- The values of defaultdg and bootdg if these values were set by the user.
Caution: Do not edit volboot, or its checksum is invalidated.
To display the contents of volboot:
vxdctl list
To change the host ID in volboot:
vxdctl hostid new_hostid
vxdctl enable
To re-create volboot:
vxdctl init [new_hostid]
Device Discovery Layer (DDL)
Device discovery is the process of locating and identifying disks attached to a host
Prior to VxVM 3.2, device discovery occurred at boot time. With VxVM 3.2 and later, device discovery occurs automatically whenever you add a new disk array.
Adding Disk Array Support
To add support for a new type of disk array, add vendor-supplied libraries.
Then scan for new devices:
vxdctl enable
This causes vxconfigd to scan all disk devices, update the device list, and reconfigure DMP.
Partial Device Discovery
Discover newly added devices previously unknown to VxVM:
vxdisk scandisks new
Discover fabric devices:
vxdisk scandisks fabric
Scan for the specific devices:
vxdisk scandisks device=c1t1d0,c2t2d0
Scan for all devices except those that are listed:
vxdisk scandisks !device=c1t1d0,c2t2d0
Scan for devices that are connected to logical or physical controllers:
vxdisk scandisks ctlr=c1,c2
Discover devices that are connected to the specified physical controller:
vxdisk scandisks pctlr=/pci@1f,4000/scsi@3/
Administering DDL
To add/remove/list support for disk arrays:
vxddladm listsupport
vxddladm excludearray libname=library
vxddladm excludearray vid=ACME pid=X1
vxddladm includearray libname=library
vxddladm includearray vid=ACME pid=X1
vxddladm listexclude
To add/remove/list support for JBODs:
vxddladm listjbod
vxddladm addjbod vid=vendor_ID pid=prod_ID
vxddladm rmjbod vid=vendor_ID pid=prod_ID
To add a foreign device:
vxddladm addforeign blockdir=path chardir=path
Dynamic Multipathing (DMP)
Dynamic multipathing is a method that VxVM uses to manage two or more hardware paths directing I/O to a single drive. VxVM arbitrarily selects one of the two names and creates a single device entry, and then transfers data across both paths to spread the I/O.
VxVM detects multipath systems by using the Universal World-Wide-Device Identifiers (WWD IDs) and manages multipath targets, such as disk arrays, which define policies for using more than one path.
Types of Multiported Arrays
A multiported disk array is an array that can be connected to host systems through multiple paths. The two basic types of multiported disk arrays are:
1) active/active disk arrays
2) active/passive disk arrays
Preventing DMP for a Device
If an array cannot support DMP, you can prevent multipathing for the device by using vxdiskadm:
“Prevent multipathing/Suppress devices from VxVM’s view”
Warning: If you do not prevent DMP for unsupported arrays:
- Commands like “vxdisk list” show duplicate sets of disks as ONLINE, even though only one path is used for I/O.
- Disk failures can be represented incorrectly.
- The option “Suppress all paths through a controller from VxVM’s view” continues to allow the I/O to use both paths internally. After a reboot, “vxdisk list” does not show the suppressed disks.
- “Prevent multipathing of all disks on a controller by VxVM” does not allow the I/O to use internal multipathing. The “vxdisk list” command shows all disks as ONLINE. This option has no effect on arrays that are performing dynamic multipathing or that do not support VxVM DMP.
Listing Controllers
vxdmpadm listctlr all
Displaying Paths
vxdmpadm getsubpaths ctlr=controller_name
To display paths connected to a LUN:
vxdmpadm getsubpaths dmpnodename=node_name
Displaying DMP Nodes
vxdmpadm getdmpnode nodename=c3t2d1
Disabling I/O to a controller
VEA:
Select Actions -> Disable/Enable
CLI:
To disable I/O to a particular controller:
vxdmpadm disable ctlr=ctlr_name
To disable I/O to a particular enclosure:
vxdmpadm disable enclosure=enc0
To reenable I/O to a particular controller:
vxdmpadm enable ctlr=ctlr_name
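Putting these together, a minimal sketch of taking controller c2 out of service for maintenance and bringing it back (the controller name is illustrative):
vxdmpadm getsubpaths ctlr=c2
vxdmpadm disable ctlr=c2
(perform the maintenance)
vxdmpadm enable ctlr=c2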
Displaying I/O Statistics for Paths
Enable the gathering of statistics:
vxdmpadm iostat start [memory=size]
Reset the I/O counters to zero:
vxdmpadm iostat reset
Display the accumulated statistics for all paths:
vxdmpadm iostat show all
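A minimal sketch of a short measurement session (the memory size is illustrative):
vxdmpadm iostat start memory=4096
vxdmpadm iostat reset
(run the workload of interest)
vxdmpadm iostat show all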
Setting I/O Policies and Path Attributes
To change the I/O policy for balancing the I/O load across multiple paths to a disk array or enclosure:
vxdmpadm setattr enclosure enc_name iopolicy=policy
Policies:
adaptive – automatically determines the paths that have the least delay
balanced – (default) takes the track cache into consideration when balancing I/O across paths.
minimumq – sends I/O on paths that have the minimum number of I/O requests in the queue.
priority – assigns the path with the highest load carrying capacity as the priority path.
round-robin – sets a simple round-robin policy for I/O
singleactive – channels I/O through the single active path
To set path attributes for a disk array or enclosure:
vxdmpadm setattr path path_name pathtype=type
Type:
active – changes a standby path to active
nomanual – restores the original primary or secondary attributes of a path
nopreferred – restores the normal priority of the path
preferred [priority=N] – specifies a preferred path and optionally assigns a priority to it.
primary – assigns a primary path for an Active/Passive disk array
secondary – assigns a secondary path for an Active/Passive disk array
standby – marks a path as not available for normal I/O scheduling.
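For example, a sketch that sets round-robin balancing on enclosure enc0 and marks one of its paths as a standby (the path name c2t2d0 is illustrative):
vxdmpadm setattr enclosure enc0 iopolicy=round-robin
vxdmpadm setattr path c2t2d0 pathtype=standby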
Managing Enclosures
CLI:
To display attributes of all enclosures:
vxdmpadm listenclosure all
To change the name of an enclosure:
vxdmpadm setattr enclosure orig_name name=new_name
VEA:
Highlight an enclosure, and select Actions -> Rename Enclosure.
Controlling the Restore Daemon
The DMP restore daemon is an internal process that monitors DMP paths. To check status:
vxdmpadm stat restored
To change daemon properties:
Stop the DMP restore daemon:
vxdmpadm stop restore
Restart the daemon with new attributes:
vxdmpadm start restore interval=400 policy=check_all
Friday, July 01, 2005
Recovery Essentials
Topics:
Maintaining Data Consistency
Hot Relocation
Managing Spare Disks
Replacing a Disk
Unrelocating a Disk
Recovering a Volume
Protecting the VxVM Configuration
Accessing the Technical Support Website
Atomic-copy resynchronization involves the sequential writing of all blocks of a volume to a plex.
This type of resynchronization is used in:
Adding a new plex (mirror)
Reattaching a detached plex (mirror) to a volume
Online reconfiguration operations:
- Moving a plex
- Copying a plex
- Creating a snapshot
- Moving a subdisk
Read-writeback resynchronization is used for volumes that were fully mirrored prior to a system failure.
In this type of resynchronization:
- Mirrors marked ACTIVE remain ACTIVE, and volume is placed in the SYNC state
- An internal read thread is started. Blocks are read from the plex specified in the read policy, and the data is written to the other plexes.
- Upon completion, the SYNC flag is turned off
Impact of Resynchronization
Resynchronization takes time and impacts performance.
To minimize this performance impact, VxVM provides the following solutions:
- Dirty region logging (DRL) for mirrored volumes
- RAID-5 logging for RAID-5 volumes
- FastResync for mirrored and snapshot volumes
Dirty Region Logging
For mirrored volumes with logging enabled, DRL speeds plex resynchronization. Only regions marked dirty in the log (regions with writes in progress at the time of the crash) need to be resynchronized after a crash.
VxVM selects an appropriate log size based on volume size
If you resize a volume, the log size does not change. To resize the log, you must delete the log and add it back after resizing the volume.
RAID-5 Logging
- For RAID-5 volumes, logging helps to prevent data corruption during recovery
- RAID-5 logging records changes to data and parity on a persistent device (log disk) before committing the changes to the RAID-5 volume.
- Logs are associated with a RAID-5 volume by being attached as log plexes.
Hot Relocation
Hot relocation is a feature of VxVM that enables a system to automatically react to I/O failures on redundant (mirrored or RAID-5) VxVM objects and restore redundancy and access to those objects. VxVM detects I/O failures on objects and relocates the affected subdisks. The subdisks are relocated to disks designated as spare disks or to free space within the disk group. VxVM then reconstructs the objects that existed before the failure and makes them redundant and accessible again.
How is Space Selected?
- Hot relocation attempts to move all subdisks from a failing drive to a single spare destination disk
- If no disks have been designated as spares, VxVM uses any available free space in the disk group in which the failure occurs.
- If there is not enough spare disk space, a combination of spare disk space and free space is used.
- Free space that you exclude from hot relocation is not used.
Managing Spare Disks
VEA:
Actions -> Set Disk Usage
vxdiskadm:
- “Mark a disk as a spare for a disk group”
- “Turn off the spare flag on a disk”
- “Exclude a disk from hot-relocation use”
- “Make a disk available for hot-relocation use”
CLI:
To designate a disk as a spare:
vxedit –g diskgroup set spare=on|off diskname
To exclude/include a disk for hot relocation:
vxedit –g diskgroup set nohotuse=on|off diskname
To force hot relocation to only use spare disks:
Add spare=only to /etc/default/vxassist
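As a minimal sketch (assuming disks datadg03 and datadg04 in disk group datadg; names are illustrative), designate one disk as a spare and exclude another from hot relocation:
vxedit –g datadg set spare=on datadg03
vxedit –g datadg set nohotuse=on datadg04
Setting the same attribute to off reverses either change.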
Disk Replacement Tasks
1) Replace the failed/failing disk
2) Logical Replacement
- Replace the disk in VxVM
- Start disabled volumes
- Resynchronize mirrors
- Resynchronize RAID-5 parity
Adding a New Disk
1) Connect the new disk
2) Get the OS to recognize the disk:
Sun:
devfsadm
prtvtoc /dev/dsk/device_name
HP-UX:
ioscan –fC disk
insf –e
3) Get VxVM to recognize that a failed disk is now working again:
vxdctl enable
4) Verify that VxVM recognizes the disk:
vxdisk list
Note: In VEA, use the Actions -> Rescan option to run disk setup commands appropriate for the OS. This option ensures that VxVM recognizes newly attached hardware.
Unrelocating a Disk
VEA:
Select the disk to be unrelocated
Select Actions -> Undo Hot Relocation
vxdiskadm:
“Unrelocate subdisks back to a disk”
CLI:
vxunreloc [-f] [-g diskgroup] [-t tasktag] [-n disk_name] orig_disk_name
- orig_disk_name = Original disk before relocation
- -n disk_name = Unrelocates to a disk other than the original
- -f = Forces unrelocation if exact offsets are not possible
Viewing Relocated Subdisks: CLI
When a subdisk is hot-relocated, its original disk media name is stored in the sd_orig_dmname field of the subdisk record. You can search this field to find all the subdisks that originated from a failed disk by using the vxprint command:
vxprint –g diskgroup –se ‘sd_orig_dmname=”disk_name”’
For example, to display all the subdisks that were hot-relocated from datadg01 within the datadg disk group:
vxprint –g datadg –se ‘sd_orig_dmname=”datadg01”’
Recovering a Volume
VEA:
Select the volume to be recovered
Select Actions -> Recover Volume
CLI:
vxreattach [-bcr] [device_tag]
- Reattaches disks to a disk group if a disk has had a transient failure, such as when a drive is turned off and then turned back on, or if Volume Manager starts with some disk drivers unloaded or unloadable.
- -r attempts to recover stale plexes using vxrecover
vxrecover [-bnpsvV] [-g diskgroup] [volume_name|disk_name]
e.g., vxrecover –b –g datadg datavol
Note: the vxrecover command only works on a started volume. A started volume displays an ENABLED state in vxprint –ht.
Note: use the –s argument to start a disabled volume
Configuration Backup and Restore
Backup:
vxconfigbackup [diskgroup]
Precommit:
vxconfigrestore –p diskgroup
Commit:
vxconfigrestore –c diskgroup
By default, VxVM configuration data is automatically backed up to the files:
/etc/vx/cbr/bk/diskgroup.dgid/dgid.dginfo
/etc/vx/cbr/bk/diskgroup.dgid/dgid.diskinfo
/etc/vx/cbr/bk/diskgroup.dgid/dgid.binconfig
/etc/vx/cbr/bk/diskgroup.dgid/dgid.cfgrec
Configuration data from a backup enables you to reinstall the private region headers of VxVM disks in a disk group, re-create a corrupted disk group configuration, or re-create a disk group and the VxVM objects within it.
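A minimal sketch of backing up and later restoring the configuration of disk group datadg (the disk group name is illustrative):
vxconfigbackup datadg
(later, after the configuration is lost or corrupted)
vxconfigrestore –p datadg
(inspect the precommitted configuration, then make it permanent)
vxconfigrestore –c datadg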
And that is the end of the “Fundamentals” book.
Point-in-Time Copies: Standard
Topics:
What is a Point-In-Time Copy?
Traditional Volume Snapshots
File System Snapshots
A point-in-time copy (PITC) enables you to capture an image of data at a selected instant for use in applications, such as backups, decision support, reporting, and development testing.
Physical vs. Logical PITCs
Physical PITCs –
The physical PITC is a physically distinct copy of the data usually produced by breaking off a mirror of the storage container.
Advantages:
Complete copy of the primary data
Fully synchronized
Disadvantages:
Requires the same amount of storage space as the original
Requires time for synchronization of data
Logical PITCs –
A logical PITC tracks and maintains only the modified blocks and keeps a reference to the original data, so it remains dependent on the primary copy of the data.
Advantages:
Available for use instantaneously
Disadvantages:
Dependent on the original.
Performance Issues with Physical PITCs
The primary impact for physical PITCs is the initial synchronization. This is especially important when large amounts of data need to be copied.
After this full synchronization is complete, there is very little, if any performance impact on the original volume or the PITC because they are separate objects.
Performance Issues with Logical PITCs
The logical PITC is connected to the primary data. Therefore, the I/O of a logical PITC is subject to the rate of change of the original data. The overall impact of the PITC is dependent on the read-to-write ratio of an application and the mixing of the I/O operations.
Note: Both the primary data and the logical PITC become faster as more data is copied out from the primary, because the PITC slowly becomes a complete physical copy over time.
Life Cycle of Point-in-Time Copies
1) Make PITC (Assign Resources) – vxassist snapstart & vxassist snapshot
2) Use PITC (Testing, Backup, etc.)
3) Update PITC (update PITC with new data from the primary or repopulate the primary from the PITC) – vxassist [-o resyncfromreplica] snapback
4) Destroy PITC (Release Resources) – vxassist remove
Traditional Volume Snapshots
The traditional type of volume snapshot that was originally provided in VxVM is the third-mirror break-off type.
When you create a traditional volume snapshot, you create a temporary mirror of an existing volume. After the contents of the third mirror (or snapshot plex) are synchronized from the original plexes of the volume, the snapshot plex can be detached as a snapshot volume for use in backup or decision support applications.
Creating and Managing Traditional Volume Snapshots
Create:
vxassist –g diskgroup [-b] snapstart origvol
(vxassist –g diskgroup snapwait origvol - use this command to force a wait for the snapshot mirror to finish synchronizing)
vxassist –g diskgroup snapshot [origvol] [snapvol]
Reassociate:
vxassist –g diskgroup snapback snapvol
or
vxassist –g diskgroup –o resyncfromreplica snapback snapvol
Dissociate:
vxassist –g diskgroup snapclear snapvol
Destroy:
vxassist –g diskgroup remove volume snapvol
Snapabort:
To remove a snapshot mirror that has not been detached and moved to a snapshot volume, you use the vxassist snapabort option.
vxassist –g diskgroup snapabort origvol
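Putting the life cycle together, a minimal sketch for a volume vol01 in disk group datadg with a snapshot volume named snapvol01 (names are illustrative):
vxassist –g datadg –b snapstart vol01
vxassist –g datadg snapwait vol01
vxassist –g datadg snapshot vol01 snapvol01
(use snapvol01 for backup or testing)
vxassist –g datadg snapback snapvol01
or, if the snapshot is no longer needed:
vxassist –g datadg snapclear snapvol01
vxassist –g datadg remove volume snapvol01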
Displaying Traditional Volume Snapshot Information
vxprint –g diskgroup –ht (or vxprint –htg diskgroup)
Creating And Managing File System Snapshots
Create:
mount –F vxfs –o snapof=origfs[,snapsize=size] destination snap_mount_point
Refresh:
mount –F vxfs –o remount,snapof=origfs[,snapsize=size] destination snap_mount_point
Remove:
umount snap_mount_point
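For example, a minimal sketch that snapshots a file system mounted at /data onto a dedicated snapshot volume and removes it afterward (the device and mount point names are illustrative):
mount –F vxfs –o snapof=/data /dev/vx/dsk/datadg/snapvol /datasnap
(back up from /datasnap while /data stays online)
umount /datasnap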
Using a File System Snapshot
After creating a snapshot file system, you can back up the file system from the snapshot while the snapped file system remains online.
To back up a snapshot:
vxdump [options] [snap_mount_point]
To back up the snapshot to tape:
vxdump –cf /dev/rmt/0 /snapmount
To restore the file system from tape:
vxrestore –vx /mount_point