IBM SG24-5131-00 User Manual
IBM Certification Study Guide AIX HACMP
• Verify that failover has occurred (netstat -i and ping for networks,
lsvg -o and vi of a test file for volume groups, and ps -U <appuid> for
application processes).
• Power cycle NodeF. If HACMP is not configured to start from /etc/inittab
(on restart), start HACMP on NodeF (smit clstart). NodeF will take back
its cascading Resource Groups.
• Verify that re-integration has occurred (netstat -i and ping for
networks, lsvg -o and vi of a test file for volume groups, and
ps -U <appuid> for application processes).
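
The verification commands above can be collected into a small script so that
the same checks run after failover and again after re-integration. This is a
minimal sketch, not part of HACMP: the function names check and verify_node,
and their arguments, are hypothetical. It assumes that each command's exit
status is a usable (if coarse) signal, and it replaces the interactive vi of
a test file with a non-interactive read test.

```shell
# Hypothetical helper: run one command quietly and report OK or FAIL.
check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK   $label"
    else
        echo "FAIL $label"
        return 1
    fi
}

# Hypothetical wrapper around the checks named in the text.
#   $1 = peer node to ping, $2 = application user ID,
#   $3 = test file on a shared volume group (stand-in for "vi of a test file")
verify_node() {
    peer=$1
    appuid=$2
    testfile=$3
    rc=0
    check "netstat -i" netstat -i || rc=1
    check "ping $peer" ping -c 1 "$peer" || rc=1
    check "lsvg -o" lsvg -o || rc=1
    check "read $testfile" test -r "$testfile" || rc=1
    check "ps -U $appuid" ps -U "$appuid" || rc=1
    return $rc
}
```

A call such as verify_node NodeT <appuid> <test file> prints one line per
check and returns non-zero if any check fails. The exit status does not show
which adapters or volume groups are active, so the command output should
still be inspected by hand.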
6.2.2.3 TCP/IP Subsystem Failure
• Check, by way of the verification commands, that all the Nodes in the
cluster are up and running.
• Optional: Prune the error log on NodeF (errclear 0).
• Monitor the cluster log files on NodeT.
• On NodeF, stop the TCP/IP subsystem (sh /etc/tcp.clean) or crash the
subsystem by increasing the size of the sb_max and thewall parameters to
large values (no -o sb_max=10000; no -o thewall=10000) and ping NodeT.
Note that you should record the values for sb_max and thewall prior to
modifying them, and, as an extra check, you may want to add the original
values to the end of /etc/rc.net.
• The TCP/IP subsystem failure on NodeF will cause a network failure of all
the TCP/IP networks on NodeF. Unless there has been some
customization done to promote this type of failure to a node failure, only
the network failure will occur. The presence of a non-TCP/IP network
(RS232, target mode SCSI or target mode SSA) should prevent the cluster
from triggering a node down in this situation.
• Verify that the network_down event has been run by checking the
/tmp/hacmp.out file on either node. By default, the network_down script
does nothing, but it can be customized to do whatever is appropriate for
that situation in your environment.
• On NodeF, issue the command startsrc -g tcpip. This should restart the
TCP/IP daemons, and should cause a network_up event to be triggered in
the cluster for each of your TCP/IP networks.
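
Two of the steps above lend themselves to a short script: recording sb_max
and thewall before changing them, and checking the log for the network_down
and network_up events. The following is a sketch under stated assumptions.
The function names save_tunables and count_event are hypothetical, and the
parsing assumes that no -o <name> prints a line of the form
"<name> = <value>"; confirm this on your AIX level before relying on it.

```shell
# Hypothetical helper: print the "no -o" commands that would restore the
# current values of sb_max and thewall.
# Assumption: "no -o <name>" prints "<name> = <value>".
save_tunables() {
    for name in sb_max thewall; do
        value=$(no -o "$name" | awk -F' = ' '{ print $2 }')
        echo "no -o $name=$value"
    done
}

# Hypothetical helper: count the lines in a log file that mention an event.
#   $1 = event name (for example network_down), $2 = log file
count_event() {
    grep -c "$1" "$2"
}
```

For example, save_tunables > /tmp/no.orig (an arbitrary file name) run
before the test keeps the original settings in a form that can be replayed
with sh /tmp/no.orig or appended to /etc/rc.net. After the test,
count_event network_down /tmp/hacmp.out and
count_event network_up /tmp/hacmp.out should each report a non-zero count.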
6.2.3 Network Failure
• Check, by way of the verification commands, that all the Nodes in the
cluster are up and running.
• Optional: Prune the error log on NodeF (errclear 0).