Compiling for MPI.


Forum Vet
Please read the INSTALL file carefully, in particular the section about OpenIB. You need to set ARMCI_NETWORK and specify the location of the IB libraries.
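
As a rough sketch of what that section asks for, the settings below would be added to the existing exports before rebuilding. The variable names are the ones I recall from the NWChem build documentation (check them against your INSTALL file), and every path is an assumption about where the InfiniBand headers and libraries live on your cluster:

```shell
# Build ARMCI on top of OpenIB instead of the default TCP/IP sockets.
export ARMCI_NETWORK=OPENIB

# ASSUMED locations: adjust to wherever the IB stack is installed on your system.
export IB_HOME=/usr
export IB_INCLUDE=$IB_HOME/include/infiniband
export IB_LIB=$IB_HOME/lib64

# ASSUMED library list: the verbs and umad libraries plus pthreads.
export IB_LIB_NAME="-libumad -libverbs -lpthread"
```

ARMCI_NETWORK is a build-time setting, so the change only takes effect after NWChem is recompiled with these variables in the environment.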

Bert


Quote:Davis68 Dec 19th 8:48 pm
I have been working with NWChem on a workstation for a few months, but now need to ramp up the size of the simulations I am doing. I have been trying for a while now, without success, to get NWChem 6 working with MPI. Any counsel to resolve this will be appreciated.
The environment variables I set for compiling (the combination with the greatest success rate so far) are:
export NWCHEM_TARGET=LINUX64
export NWCHEM_TOP=~/nwchem/nwchem-6.0/
export NWCHEM_MODULES=all
export LARGE_FILES=TRUE
export LIB_DEFINES="-DDFLT_TOT_MEM=16777216"

export CC=gcc
export FC=gfortran

export USE_MPI=y
export USE_MPIF=y
export MPI_LOC=/usr/local/mvapich2-1.6-gcc
export MPI_LIB=$MPI_LOC/lib
export MPI_INCLUDE=$MPI_LOC/include
export LIBMPI="-lmpich"

The compilation is successful with gcc/gfortran, although switching everything to the corresponding Intel compilers and modules consistently errors out. The cluster is running Scientific Linux over IB with either MVAPICH or OpenMPI, with gcc/gfortran v4.4.5 and GNU Make v3.81.
The output when I run is:
[davis68@taub302 uo2-work]$ mpiexec ~/bin/nwchem lda-147.nw 
ARMCI configured for 2 cluster nodes. Network protocol is 'TCP/IP Sockets'.
-10012:armci_AcceptSockAll:timeout waiting for connection: 0
(rank:-10012 hostname:taub448 pid:24214):ARMCI DASSERT fail. sockets.c:armci_AcceptSockAll():635 cond:0
12:Child process terminated prematurely, status=: 256
(rank:12 hostname:taub448 pid:24188):ARMCI DASSERT fail. signaltrap.c:SigChldHandler():167 cond:0
ARMCI master: wait for child process (server) failed:: No child processes
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 12
-10000:armci_AcceptSockAll:timeout waiting for connection: 0
(rank:-10000 hostname:taub302 pid:21006):ARMCI DASSERT fail. sockets.c:armci_AcceptSockAll():635 cond:0
0:Child process terminated prematurely, status=: 256
(rank:0 hostname:taub302 pid:20981):ARMCI DASSERT fail. signaltrap.c:SigChldHandler():167 cond:0
ARMCI master: wait for child process (server) failed:: No child processes
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0

There is a long wait after the first line, "ARMCI configured for 2 cluster nodes...", before the other messages appear.
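
That first line of output also names the transport ARMCI is using. 'TCP/IP Sockets' suggests the binary was not built for InfiniBand. A minimal sketch for pulling that value out of a saved run log follows; the function name `armci_protocol` and the log file name are hypothetical:

```shell
# Print the network protocol that ARMCI reports at startup.
# Usage: armci_protocol FILE   (FILE is a saved copy of the run output)
armci_protocol() {
    # Keep only the text between the single quotes; stop at the first match.
    sed -n "s/.*Network protocol is '\([^']*\)'.*/\1/p" "$1" | head -n 1
}
```

For example, after `mpiexec ~/bin/nwchem lda-147.nw > run.log 2>&1`, running `armci_protocol run.log` prints the protocol name, which makes it easy to confirm whether a rebuild changed it.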