  • Reconnect hdisk0, close the casing, and turn the key to normal mode.
  • Power on NodeF, then verify that the rootvg logical volumes are no longer stale (lsvg -l rootvg).
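A quick way to confirm this from the command line is to filter the lsvg output for stale copies; when all copies are current, every logical volume shows a state of syncd:

   # Show only rootvg logical volumes that still have stale copies
   lsvg -l rootvg | grep -i stale
   # No output means all logical volume copies are synchronized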
6.2.4.2  7135 Disk Failure
Perform the following steps in the event of a disk failure:
  • Check, by way of the verification commands, that all the nodes in the cluster are up and running.
  • Optional: Prune the error log on NodeF (errclear 0).
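If you want to keep a record before pruning, the detailed error report can be saved first (the output file name here is only an example):

   # Save the full error report, then remove all entries from the log
   errpt -a > /tmp/errpt.before
   errclear 0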
  • Monitor cluster logfiles on NodeT if HACMP has been customized to 
monitor 7135 disk failures.
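Assuming the default HACMP log locations (these can vary between releases), the logs can be followed in real time while the failure is induced:

   # High-level cluster events
   tail -f /usr/adm/cluster.log
   # Verbose event script output
   tail -f /tmp/hacmp.out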
  • Mark a shared disk failed through SMIT (smit raidiant, then RAIDiant Disk Array Manager -> Change/Show Drive Status -> select the appropriate hdisk -> select the appropriate physical disk -> F4 to select a Drive Status of 83 Fail Drive), or, if the disk is hot pluggable, remove the disk.
  • The amber light on the front of the 7135 comes on, and the failure can also be seen in SMIT (smit raidiant, then RAIDiant Disk Array Manager -> List all SCSI RAID Arrays).
  • Verify that all sharedvg file systems and paging spaces are accessible (df and lsps -a).
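Both commands come straight from the step above; run them on the node that currently owns the shared volume group:

   # All shared file systems should be mounted with the expected free space
   df
   # All paging spaces should be active
   lsps -a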
  • If using RAID5 with Hot Spare, verify that reconstruction to the Hot Spare has completed, then un-mark the failed disk or plug it back in. If using RAID1, sync the volume group (syncvg -v NodeFvg).
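For the RAID1 case, a minimal command sequence, using the volume group name from this scenario, is:

   # Synchronize all stale physical partitions in the volume group
   syncvg -v NodeFvg
   # Confirm that no logical volume still reports stale copies
   lsvg -l NodeFvg | grep -i stale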
  • If using RAID5 without Hot Spare, mark the failed disk Optimal (smit raidiant, then RAIDiant Disk Array Manager -> Change/Show Drive Status -> select the appropriate hdisk -> select the appropriate physical disk -> F4 to select a Drive Status of 84 Replace and Reconstruct Drive).
  • Verify that the reconstruction has completed (smit raidiant, then RAIDiant Disk Array Manager -> List all SCSI RAID Arrays).
  • Verify that all sharedvg file systems and paging spaces are accessible (df and lsps -a) and that the partitions are not stale (lsvg -l sharedvg). Also verify that the amber light on the 7135 has turned off.
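The stale-partition check also lends itself to a small ksh sketch:

   # Report whether sharedvg has fully resynchronized
   if lsvg -l sharedvg | grep -qi stale
   then
       echo "sharedvg still has stale partitions"
   else
       echo "sharedvg is fully synchronized"
   fi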
6.2.4.3  Mirrored 7133 Disk Failure
  • Check, by way of the verification commands, that all the nodes in the cluster are up and running.
  • Optional: Prune the error log on NodeF (errclear 0).