Scali MPI Connect Release 4.4 Users Guide 
Appendix B Troubleshooting
This appendix offers initial suggestions for what to do when something goes wrong with 
applications running together with SMC. When problems occur, first check the list of common 
errors and their solutions; an updated list of SMC-related Frequently Asked Questions 
(FAQ) is posted in the Support section of the Scali website (http://www.scali.com). If you 
are unable to find a solution to the problem(s) there, please read this chapter before contacting 
support@scali.com.
Problems and fixes reported to Scali will eventually be included in the appropriate sections of 
this manual. Please send relevant remarks by e-mail to support@scali.com.
Many problems originate from using the wrong application code, from stopped daemons that 
Scali MPI Connect relies on, or from an incomplete specification of network drivers. Some 
typical problems and their solutions are described below. Troubleshooting the DAT 
functionality is described in C-11.
B-1 When things do not work - troubleshooting
This section is intended to serve as a starting point to help with software and hardware 
debugging. The main focus is on locating and repairing faulty hardware and software setup, 
but it can also be helpful in getting started after installing a new system. For a description 
of the Scali Manage GUI, see the Scali System Guide.
 
B-1.1 Why does my program not start to run?
Problem: "mpimon: command not found".
Solution: Include /opt/scali/bin in the PATH environment variable.
Problem: mpimon can’t find mpisubmon.
Solution: Set MPI_HOME=/opt/scali or use the -execpath option.
Problem: The application has problems loading libraries (libsca*).
Solution: Update LD_LIBRARY_PATH to include /opt/scali/lib.
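The three environment fixes above can be collected in a shell start-up file. The following is a minimal sketch for a Bourne-compatible shell, assuming SMC is installed under the /opt/scali prefix named above; adjust the prefix if your installation differs.

```shell
# Assumption: SMC is installed under /opt/scali, as in the items above.
MPI_HOME=/opt/scali
export MPI_HOME

# Put the SMC binaries (mpimon and friends) first on the search path.
PATH="$MPI_HOME/bin${PATH:+:$PATH}"
export PATH

# Let the dynamic loader find the SMC libraries (libsca*).
LD_LIBRARY_PATH="$MPI_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH
```

On a node where SMC is installed under that prefix, `command -v mpimon` should then report a path below /opt/scali/bin.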
Problem: Incompatible MPI versions.
mpid, mpimon, mpisubmon and the libraries all have version variables that are checked at 
start-up. To ensure that these are correct, try the following:
1. Set the environment variable MPI_HOME correctly.
2. Restart mpid, in case a new version of ScaMPI has been installed without restarting 
mpid.
3. Reinstall SMC, in case a new version of SMC was not cleanly installed on all nodes.
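Step 1 can be sanity-checked before restarting or reinstalling anything. The sketch below assumes that mpimon lives in the bin directory below MPI_HOME, as the /opt/scali paths in this section suggest; the function name check_mpi_home is hypothetical and not part of SMC.

```shell
# Hypothetical helper: succeed only if MPI_HOME is set and contains an
# executable bin/mpimon (assumed layout: $MPI_HOME/bin/mpimon).
check_mpi_home() {
    if [ -z "${MPI_HOME:-}" ]; then
        echo "MPI_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$MPI_HOME/bin/mpimon" ]; then
        echo "no executable mpimon below $MPI_HOME/bin" >&2
        return 1
    fi
    return 0
}
```

Running `check_mpi_home` on each node gives a quick indication of whether MPI_HOME points at an SMC installation; it does not compare the version variables themselves.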
Problem: Set working directory failed.
Solution: SMC assumes that the file structure is homogeneous across nodes. If you start 
mpimon from a directory that is not available on all nodes, you must set 
SCAMPI_WORKING_DIRECTORY to point to a directory that is available on all nodes.
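A directory can be checked on every node before it is used as the working directory. The following is a minimal sketch; it assumes password-less ssh access to the nodes, and the function name dir_on_all_nodes, the RSH override and the choice of /tmp are illustrative assumptions rather than part of SMC.

```shell
# Succeed only if directory $1 exists on every node named after it.
# Assumption: nodes are reachable with ssh; set RSH to use another
# remote shell command.
dir_on_all_nodes() {
    dir=$1
    shift
    for node in "$@"; do
        ${RSH:-ssh} "$node" test -d "$dir" || {
            echo "$dir is missing on $node" >&2
            return 1
        }
    done
    return 0
}

# Hypothetical choice: /tmp normally exists on every node; substitute a
# directory that suits your cluster.
SCAMPI_WORKING_DIRECTORY=/tmp
export SCAMPI_WORKING_DIRECTORY
```

For example, `dir_on_all_nodes "$SCAMPI_WORKING_DIRECTORY" node1 node2` (with your own node names in place of node1 and node2) reports the first node on which the directory is missing.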
Problem: ScaMPI uses the wrong interface for TCP/IP on a frontend with more than one 
interface.
Solution: Set SCAMPI_NODENAME to the hostname of the correct interface.
Problem: MPI_Wtime gives strange values.
Solution: SMC uses a hardware-supported high precision timer for MPI_Wtime. This timer 
can be disabled by setting SCAMPI_DISABLE_HPT=1.
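To try this for a single run without changing the environment of the login shell, the variable can be set for one command only. The wrapper below is a minimal sketch; the function name without_hpt is hypothetical and not part of SMC.

```shell
# Run one command with the high precision timer disabled, leaving the
# caller's own environment untouched.
without_hpt() {
    env SCAMPI_DISABLE_HPT=1 "$@"
}
```

Prefix the usual start command with it, that is `without_hpt mpimon` followed by your normal mpimon arguments. If MPI_Wtime then returns sensible values, the hardware timer is the likely cause.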