  • Reconnect hdisk0, close the casing, and turn the key to normal mode.
  • Power on NodeF, then verify that the rootvg logical volumes are no longer stale (lsvg -l rootvg).
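A quick way to confirm this from the command line is to filter the lsvg output for stale copies; when all copies are current, every logical volume shows a state of syncd:

   # Show only rootvg logical volumes that still have stale copies
   lsvg -l rootvg | grep -i stale
   # No output means all logical volume copies are synchronized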
6.2.4.2  7135 Disk Failure
Perform the following steps in the event of a disk failure:
  • Check, by way of the verification commands, that all the nodes in the cluster are up and running.
  • Optional: Prune the error log on NodeF (errclear 0).
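If you want to keep a record before pruning, the detailed error report can be saved first (the output file name here is only an example):

   # Save the full error report, then remove all entries from the log
   errpt -a > /tmp/errpt.before
   errclear 0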
  • Monitor cluster logfiles on NodeT if HACMP has been customized to 
monitor 7135 disk failures.
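Assuming the default HACMP log locations (these can vary between releases), the logs can be followed in real time while the failure is induced:

   # High-level cluster events
   tail -f /usr/adm/cluster.log
   # Verbose event script output
   tail -f /tmp/hacmp.out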
  • Mark a shared disk failed through SMIT (smit raidiant, then RAIDiant Disk Array Manager -> Change/Show Drive Status -> select the appropriate hdisk -> select the appropriate physical disk -> F4 to select a Drive Status of 83 Fail Drive), or, if the disk is hot pluggable, remove the disk.
  • The amber light on the front of the 7135 comes on, and the failure can also be seen in SMIT (smit raidiant, then RAIDiant Disk Array Manager -> List all SCSI RAID Arrays).
  • Verify that all sharedvg file systems and paging spaces are accessible (df and lsps -a).
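Both commands come straight from the step above; run them on the node that currently owns the shared volume group:

   # All shared file systems should be mounted with the expected free space
   df
   # All paging spaces should be active
   lsps -a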
  • If using RAID5 with Hot Spare, verify that reconstruction to the Hot Spare has completed, then un-mark the failed disk or plug it back in. If using RAID1, sync the volume group (syncvg -v NodeFvg).
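For the RAID1 case, a minimal command sequence, using the volume group name from this scenario, is:

   # Synchronize all stale physical partitions in the volume group
   syncvg -v NodeFvg
   # Confirm that no logical volume still reports stale copies
   lsvg -l NodeFvg | grep -i stale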
  • If using RAID5 without Hot Spare, mark the failed disk Optimal (smit raidiant, then RAIDiant Disk Array Manager -> Change/Show Drive Status -> select the appropriate hdisk -> select the appropriate physical disk -> F4 to select a Drive Status of 84 Replace and Reconstruct Drive).
  • Verify that the reconstruction has completed (smit raidiant, then RAIDiant Disk Array Manager -> List all SCSI RAID Arrays).
  • Verify that all sharedvg file systems and paging spaces are accessible (df and lsps -a) and that the partitions are not stale (lsvg -l sharedvg). Also verify that the amber light on the 7135 has turned off.
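The stale-partition check also lends itself to a small ksh sketch:

   # Report whether sharedvg has fully resynchronized
   if lsvg -l sharedvg | grep -qi stale
   then
       echo "sharedvg still has stale partitions"
   else
       echo "sharedvg is fully synchronized"
   fi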
6.2.4.3  Mirrored 7133 Disk Failure
  • Check, by way of the verification commands, that all the nodes in the cluster are up and running.
  • Optional: Prune the error log on NodeF (errclear 0).