IBM SG24-5131-00 User Manual

© Copyright IBM Corp. 1999
Chapter 7.  Cluster Troubleshooting
Typically, a functioning HACMP cluster requires minimal intervention. If a 
problem occurs, however, diagnostic and recovery skills are essential. Thus, 
troubleshooting requires that you identify the problem quickly and apply your 
understanding of the HACMP for AIX software to restore the cluster to full 
operation.
In general, troubleshooting an HACMP cluster involves:
  • Becoming aware that a problem exists
  • Determining the source of the problem
  • Correcting the problem
You typically become aware of a problem through system messages on the 
console, through end users reporting slow or unavailable services, or 
through monitoring of your cluster. When an HACMP for AIX 
script or daemon generates a message, the message is written to the system 
console and to one or more cluster log files. Messages written to the system 
console may scroll off the screen before you notice them, so the log files are 
the more dependable record. The following paragraphs give an overview of the 
log files to consult when troubleshooting a cluster, as well as some 
information on specific cluster states you may find there.
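Because console messages can scroll away, one practical way to notice a problem is to scan a log file for events that started but never completed. The following is only a minimal sketch in Python, not part of the HACMP for AIX software. It assumes that event lines in /var/adm/cluster.log (described in Section 7.1) contain the markers "EVENT START:" and "EVENT COMPLETED:" followed by the event name; verify this against the log format on your own cluster. The function names unfinished_events and scan_log are hypothetical.

```python
import re
from collections import Counter

# ASSUMPTION: event lines contain "EVENT START: <name> ..." and
# "EVENT COMPLETED: <name> ...". Check this against your own log files.
EVENT_RE = re.compile(r"EVENT (START|COMPLETED): (\S+)")


def unfinished_events(lines):
    """Return the names of events that started but have no matching completion."""
    open_events = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # not an event start/completion line
        kind, name = match.groups()
        if kind == "START":
            open_events[name] += 1
        elif open_events[name] > 0:
            open_events[name] -= 1
    return sorted(name for name, count in open_events.items() if count > 0)


def scan_log(path="/var/adm/cluster.log"):
    """Scan a log file (hypothetical helper; the default path is taken from Table 21)."""
    with open(path, errors="replace") as log:
        return unfinished_events(log)
```

Calling scan_log() on a node returns a sorted list of event names with no completion line. An empty list means that every started event found in the file was also reported as completed. Matching is by event name only, so arguments such as node names are ignored.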
7.1  Cluster Log Files
HACMP for AIX scripts, daemons, and utilities write messages to the 
following log files:
Table 21.  HACMP Log Files 
Log File Name          Description
---------------------  ------------------------------------------------------
/var/adm/cluster.log   Contains time-stamped, formatted messages generated by
                       HACMP for AIX scripts and daemons. In this log file,
                       there is one line written for the start of each event,
                       and one line written for the completion.

/tmp/hacmp.out         Contains time-stamped, formatted messages generated by
                       the HACMP for AIX scripts. In verbose mode, this log
                       file contains a line-by-line record of each command
                       executed in the scripts, including the values of the
                       arguments passed to the commands. By default, the
                       HACMP for AIX software writes verbose information to
                       this log file; however, you can change this default.
                       Verbose mode is recommended.