NWChem failing to run on IB nodes


Just Got Here
Hello,

I have tried to build NWChem using OpenMPI (1.6.5) and MVAPICH (1.9) to run on Infiniband nodes, but the application keeps crashing with this error for both MPI implementations:



0:Cannot run: improper task to host mapping!: 0



My configuration is given below:



module load mvapich/1.9/intel
module load intel/13.1

cd Nwchem-6.5.revision26243-src.2014-09-10
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES=all

export USE_MPI=y
export USE_MPIF=y
export MPI_LIB=/opt/mvapich/1.9/intel/13.1/lib
export MPI_INCLUDE=/opt/mvapich/1.9/intel/13.1/include
export LIBMPI="-L$MPI_LIB -lmpichf90 -lmpich -lopa -lmpl -libmad -libumad -libverbs -lrt -lnuma -lpthread"

export FC=ifort
export CC=icc
export ARMCI_NETWORK=OPENIB
export IB_HOME=/usr
export IB_INCLUDE=/usr/include/infiniband
export IB_LIB=/usr/lib64
export IB_LIB_NAME="-libumad -libverbs -lpthread"

export MKLROOT=/opt/intel/composer_xe_2013.5.192/mkl/lib/intel64
export BLASOPT="-Wl,--start-group $MKLROOT/libmkl_intel_ilp64.a $MKLROOT/libmkl_sequential.a $MKLROOT/libmkl_core.a -Wl,--end-group -lpthread -lm"

cd src
make nwchem_config
make -j 4



Does this have anything to do with 8-byte integers?

Any help will be greatly appreciated. Thanks in advance.

Forum Vet
The problem appears to be in your submission script,
more precisely in the mpirun/mpiexec options.
Your MPI hostfile may interleave the nodes, like this:

node0
node1
node0
node1

NWChem will not work with this kind of layout. The hostfile should instead list all entries for one node consecutively, like this:

node0
node0
node1
node1
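
If it helps, here is a rough shell sketch for checking which of the two layouts a hostfile has, and for regrouping it if needed. The function names hostfile_layout and hostfile_group are made up for this example (they are not part of NWChem or any MPI package), and the sketch assumes the host name is the first field on each line.

```shell
#!/bin/sh
# Hypothetical helpers, not part of NWChem or any MPI distribution.
# Assumption: the host name is the first whitespace-separated field of each line.

# Print "block" if all entries for each host are contiguous,
# "interleaved" if a host reappears after a different host.
hostfile_layout() {
    awk '
        NF == 0 { next }                    # skip blank lines
        $1 != prev {                        # host differs from the previous line
            if ($1 in seen) { bad = 1 }     # this host was already listed earlier
            seen[$1] = 1
            prev = $1
        }
        END { print (bad ? "interleaved" : "block") }
    ' "$1"
}

# Print the host names grouped per host, keeping hosts in order of
# first appearance. Only the first field of each line is kept.
hostfile_group() {
    awk '
        NF == 0 { next }
        !($1 in count) { order[++n] = $1 }  # remember first-appearance order
        { count[$1]++ }
        END {
            for (i = 1; i <= n; i++)
                for (j = 0; j < count[order[i]]; j++)
                    print order[i]
        }
    ' "$1"
}
```

For example, hostfile_layout hosts.txt reports the layout, and hostfile_group hosts.txt > hosts.grouped writes a regrouped copy (both file names are placeholders).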

Just Got Here
Hello Edo,

Thanks for the quick reply. My host file contains the following two nodes:

nxn25 1 parallel.nextscale.q@nxn25 UNDEFINED
nxn27 1 parallel.nextscale.q@nxn27 UNDEFINED

and I would like to run 16 MPI ranks on each node (32 MPI ranks in total). How should the host file be formatted?

Thanks in advance,
Wadud.

Forum Vet
Quote:Miahw Sep 23rd 2:24 am

Wadud
I would change the hostfile as follows (I am not 100% sure it will work with your MPI installation;
you should check with whoever installed it):

nxn25
nxn25
nxn25
nxn25
nxn25
nxn25
nxn25
nxn25
nxn25
nxn25
nxn25
nxn25
nxn25
nxn25
nxn25
nxn25
nxn27
nxn27
nxn27
nxn27
nxn27
nxn27
nxn27
nxn27
nxn27
nxn27
nxn27
nxn27
nxn27
nxn27
nxn27
nxn27
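
Rather than typing the list by hand, something like the sketch below could generate it from the two-line host file quoted earlier. The function name expand_hostfile and the file names are made up for this example. It assumes the host name is the first field of each input line and ignores the other fields.

```shell
#!/bin/sh
# Hypothetical helper: expand a host list into one line per MPI rank.
# Assumption: the host name is the first whitespace-separated field of each
# input line; the slot count, queue name and other fields are ignored.
#   $1 = input host list
#   $2 = number of ranks to place on each node
expand_hostfile() {
    awk -v n="$2" 'NF > 0 { for (i = 0; i < n; i++) print $1 }' "$1"
}

# Example use (file names are placeholders):
#   expand_hostfile scheduler_hosts.txt 16 > hosts.txt
```

With 16 ranks per node and the two nodes above, this produces the same 32 lines. How the resulting file is passed to the launcher depends on the MPI installation (for example, Open MPI's mpirun takes --hostfile and MVAPICH2's mpirun_rsh takes -hostfile), so check that against your local setup.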

