IBM SG24-5131-00 User Manual


IBM Certification Study Guide AIX HACMP

• Verify that failover has occurred (netstat -i and ping for networks,
lsvg -o and opening a test file for volume groups, and ps -U <appuid> for
application processes).

• Power cycle NodeF. If HACMP is not configured to start from /etc/inittab
(on restart), start HACMP on NodeF (smit clstart). NodeF will take back
its cascading Resource Groups.

• Verify that re-integration has occurred (netstat -i and ping for
networks, lsvg -o and opening a test file for volume groups, and
ps -U <appuid> for application processes).
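
The verification commands named in the steps above can be gathered into a small script. The following is a minimal sketch and not part of HACMP: the check and verify_cluster functions and the PEER, TESTFILE, and APPUID variables are hypothetical names that must be adapted to your cluster. It only reports whether each command exited successfully; reading the actual output of netstat -i and lsvg -o is still left to the operator.

```shell
#!/bin/sh
# Sketch only: PEER, TESTFILE and APPUID are placeholders, not values
# taken from this guide. Set them for your own cluster.
PEER=${PEER:-nodet_svc}                   # hypothetical address to ping
TESTFILE=${TESTFILE:-/sharedfs/testfile}  # hypothetical file on a shared volume group
APPUID=${APPUID:-appuser}                 # hypothetical owner of the application

# check LABEL COMMAND [ARGS...]
# Run the command quietly and print OK or FAILED followed by the label.
check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK      $label"
    else
        echo "FAILED  $label"
        return 1
    fi
}

# Run every check from the list above; return non-zero if any failed.
verify_cluster() {
    rc=0
    check "interfaces (netstat -i)"    netstat -i          || rc=1
    check "ping $PEER"                 ping -c 1 "$PEER"   || rc=1
    check "varied-on VGs (lsvg -o)"    lsvg -o             || rc=1
    check "read test file $TESTFILE"   cat "$TESTFILE"     || rc=1
    check "processes of $APPUID"       ps -U "$APPUID"     || rc=1
    return $rc
}
```

After sourcing the file, running verify_cluster once after the failover and again after re-integration gives a quick pass/fail summary of both states.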

6.2.2.3 TCP/IP Subsystem Failure

• Check, by way of the verification commands, that all the nodes in the
cluster are up and running.

• Optional: Prune the error log on NodeF (errclear 0).

• Monitor the cluster log files on NodeT.

• On NodeF, stop the TCP/IP subsystem (sh /etc/tcp.clean) or crash the
subsystem by increasing the size of the sb_max and thewall parameters to
large values (no -o sb_max=10000; no -o thewall=10000) and ping NodeT.

Note that you should record the values for sb_max and thewall prior to
modifying them, and, as an extra check, you may want to add the original
values to the end of /etc/rc.net.

• The TCP/IP subsystem failure on NodeF will cause a network failure of all
the TCP/IP networks on NodeF. Unless there has been some
customization done to promote this type of failure to a node failure, only
the network failure will occur. The presence of a non-TCP/IP network
(RS232, target mode SCSI or target mode SSA) should prevent the cluster
from triggering a node down in this situation.

• Verify that the network_down event has been run by checking the
/tmp/hacmp.out file on either node. By default, the network_down script
does nothing, but it can be customized to do whatever is appropriate for
that situation in your environment.

• On NodeF, issue the command startsrc -g tcpip. This should restart the
TCP/IP daemons, and should cause a network_up event to be triggered in
the cluster for each of your TCP/IP networks.
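
The note in this procedure about recording sb_max and thewall before changing them can be scripted. The following is a minimal sketch under stated assumptions: SAVEFILE, save_tunables, and restore_cmds are hypothetical names, and the parsing relies on no -o NAME printing a line of the form "NAME = VALUE".

```shell
#!/bin/sh
# Sketch only: SAVEFILE is a hypothetical location, not one used by HACMP.
SAVEFILE=${SAVEFILE:-/tmp/no_tunables.saved}

# save_tunables: record the current sb_max and thewall settings.
# Assumes "no -o NAME" prints a line of the form "NAME = VALUE".
save_tunables() {
    { no -o sb_max; no -o thewall; } > "$SAVEFILE"
}

# restore_cmds: read "NAME = VALUE" lines on stdin and print the
# matching "no -o NAME=VALUE" commands. Any other line is ignored.
restore_cmds() {
    awk '$2 == "=" && NF == 3 { printf "no -o %s=%s\n", $1, $3 }'
}
```

Run save_tunables on NodeF before the test. Afterwards, restore_cmds < "$SAVEFILE" prints the commands for review; piping that output to sh applies them, and appending it to the end of /etc/rc.net corresponds to the extra check suggested in the note.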

6.2.3 Network Failure

• Check, by way of the verification commands, that all the nodes in the
cluster are up and running.

• Optional: Prune the error log on NodeF (

errclear 0