Q-Logic IB6054601-00 D User Guide
3 – Using InfiniPath MPI
may be desirable to run multiple MPI processes and multiple OpenMP threads per
node.
The number of OpenMP threads is typically controlled by the
OMP_NUM_THREADS environment variable in the .mpirunrc file. This may be
used to adjust the split between MPI processes and OpenMP threads. Usually the
number of MPI processes (per node) times the number of OpenMP threads will be
set to match the number of CPUs per node. An example case would be a node with
4 CPUs, running 1 MPI process and 4 OpenMP threads. In this case,
OMP_NUM_THREADS is set to 4. OMP_NUM_THREADS is on a per-node basis.
See the
for information on setting environment variables.
The MPI_THREAD_SERIALIZED and MPI_THREAD_MULTIPLE models are not
yet supported.
NOTE:
If there are more threads than CPUs, then both MPI and OpenMP
performance can be significantly degraded due to over-subscription of
the CPUs.
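The arithmetic behind this split can be sketched in C. The helper names below (omp_threads_per_process, is_oversubscribed) are hypothetical and are not part of InfiniPath or OpenMP. They only illustrate the rule that MPI processes per node times OpenMP threads should match the CPUs per node, and the over-subscription condition described in the note above.

```c
/* Hypothetical helper (not an InfiniPath or OpenMP API): suggest a value
 * for OMP_NUM_THREADS so that
 *   MPI processes per node * OpenMP threads per process == CPUs per node.
 * Returns -1 for invalid input. */
int omp_threads_per_process(int cpus_per_node, int mpi_procs_per_node)
{
    int threads;

    if (cpus_per_node <= 0 || mpi_procs_per_node <= 0)
        return -1;

    /* Integer division: any remainder CPUs are left idle. */
    threads = cpus_per_node / mpi_procs_per_node;

    /* More processes than CPUs: each process still needs one thread,
     * although the node is then over-subscribed by processes alone. */
    return threads < 1 ? 1 : threads;
}

/* Hypothetical helper: returns 1 when the total number of threads on a
 * node exceeds its CPU count (over-subscription), otherwise 0. */
int is_oversubscribed(int cpus_per_node, int mpi_procs_per_node,
                      int omp_num_threads)
{
    return mpi_procs_per_node * omp_num_threads > cpus_per_node;
}
```

For the 4-CPU node in the example above, omp_threads_per_process(4, 1) yields 4, the value to assign to OMP_NUM_THREADS. Running 2 MPI processes with 4 threads each on that node would be flagged by is_oversubscribed(4, 2, 4).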
3.11 Debugging MPI Programs
Debugging parallel programs is substantially more difficult than debugging serial
programs. Thoroughly debugging the serial parts of your code before parallelizing
is good programming practice.
3.11.1 MPI Errors
Almost all MPI routines (except MPI_Wtime and MPI_Wtick) return an error code,
either as the function return value in C functions or as the last argument in a Fortran
subroutine call. Before the value is returned, the current MPI error handler is called.
By default, this error handler aborts the MPI job. Therefore, to get information
about MPI exceptions in your code, replace the default handler with
MPI_ERRORS_RETURN or with your own error handler. See the man page for
MPI_Errhandler_set for details.
NOTE:
MPI does not guarantee that an MPI program can continue past an error.
See the standard MPI documentation referenced in
for details on the
MPI error codes.
3.11.2 Using Debuggers
The InfiniPath software supports the use of multiple debuggers, including
pathdb, gdb, and the system call tracing utility strace. These debuggers let you set
breakpoints in a running program, and examine and set its variables.
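One common idiom for attaching a debugger to a running MPI process, which is not described in this guide and is shown here only as a sketch, is to have the process print its PID and wait until a flag is changed from inside the debugger. The function name wait_for_debugger and its parameters are hypothetical. The timeout is an added safeguard so that a job does not hang if nobody attaches.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: print the host name and PID, then poll once per
 * second until a debugger sets *attached to non-zero or max_seconds
 * have elapsed. Returns 1 if released by the flag, 0 on timeout. */
int wait_for_debugger(volatile int *attached, int max_seconds)
{
    char host[256];
    int waited;

    if (gethostname(host, sizeof host) != 0)
        snprintf(host, sizeof host, "unknown-host");
    host[sizeof host - 1] = '\0';   /* guarantee termination */

    fprintf(stderr, "PID %ld on %s is waiting for a debugger\n",
            (long)getpid(), host);

    for (waited = 0; waited < max_seconds && !*attached; waited++)
        sleep(1);

    return *attached != 0;
}
```

With this in place, one could attach using gdb -p followed by the printed PID on the named host, select the wait_for_debugger frame, run set var *attached = 1, and then set breakpoints and continue.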