InfiniBand Install


Clicked A Few Times
Hi,

Hopefully someone can help me get NWChem running in parallel. I have an InfiniBand interconnect with QLogic hardware; therefore, I have been trying to compile with ARMCI_NETWORK=MPI-SPAWN. Below is my install script. The installation finishes fine, and I can even run in parallel within one node. However, when I try using more than one node, I get the following error in my .out file:

argument  1 = PCBM.nw
chama18.40582ipath_userinit: assign_context command failed: Network is down
chama17.41893ipath_userinit: assign_context command failed: Network is down
0:Terminate signal was sent, status=: 15
(rank:0 hostname:chama17 pid:41859):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigTermHandler():472 cond:0

Any suggestions?

#!/bin/bash

module unload openmpi-intel/1.4
module load openmpi-intel/1.6

export NWCHEM_TOP=/home/mefoste/chama/nwchem-6.1.1-src
export NWCHEM_TARGET=LINUX64

export ARMCI_NETWORK=MPI-SPAWN
export IB_HOME=/usr
export IB_INCLUDE=/usr/include
export IB_LIB=/usr/lib64
export IB_LIB_NAME="-libverbs -libumad -lpthread "
export MSG_COMMS=MPI
export TCGRSH=/usr/bin/ssh

export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/opt/openmpi-1.6-intel
export MPI_LIB=$MPI_LOC/lib
export MPI_INCLUDE=$MPI_LOC/include
export LIBMPI="-lpthread -L$MPI_LIB -lmpi_f90 -lmpi_f77 -lmpi"

export NWCHEM_MODULES="all"
export LARGE_FILES=TRUE
export USE_NOFSCHECK=TRUE
export LIB_DEFINES=-DDFLT_TOT_MEM=2147483647
 

export MKLROOT=/opt/intel-12.1/mkl
export HAS_BLAS=yes
export BLASOPT="-Wl,--start-group $MKLROOT/lib/intel64/libmkl_intel_ilp64.a $MKLROOT/lib/intel64/libmkl_sequential.a $MKLROOT/lib/intel64/libmkl_core.a -Wl,--end-group -lpthread -lm"

export FC=ifort
export CC=icc


cd $NWCHEM_TOP/src
make nwchem_config
make FC=$FC CC=$CC

Forum Vet
The error message below seems to be coming from the MPI implementation on the QLogic fabric
ipath_userinit: assign_context command failed: Network is down
By any chance, have you run some simple MPI test codes to check if there aren't issues on the network?
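For example, something along these lines (the hostnames are taken from your error output, and hello_mpi.f stands for whatever trivial MPI hello-world source you have at hand):

$ module load openmpi-intel/1.6
$ mpif90 -o hello_mpi hello_mpi.f
$ mpirun -np 2 -host chama17,chama18 ./hello_mpi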

Regards, Edo

Clicked A Few Times
I'm currently using Open MPI with GAMESS and VASP without any problems; however, when compiling those programs you do not have to set ARMCI_NETWORK.

Forum Vet
Better switch to ARMCI_NETWORK=MPI-MT
I think that your best option would be to try the ARMCI port that uses MPI threads.
In order to do that, you need an MPI implementation that supports MPI_Init_thread() with a required threading level of MPI_THREAD_MULTIPLE.
If your choice is Open MPI, MPI multithreading is not enabled by default, so you would have to recompile it with the
configure option
--enable-mpi-threads
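For instance (just a sketch; the prefix below is the one from your build script, and any other site-specific configure options your admin normally uses would have to be added back):

$ ./configure --prefix=/opt/openmpi-1.6-intel --enable-mpi-threads
$ make all install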

You can check the output of ompi_info to see if Open MPI has MPI_THREAD_MULTIPLE support
$ ompi_info | grep -i thread
         Thread support: posix (mpi: yes, progress: no)
$

The "mpi: yes" portion of the above output indicates that Open MPI was compiled with MPI_THREAD_MULTIPLE support.

Clicked A Few Times
Thanks for the suggestions. Open MPI was not installed with MPI_THREAD_MULTIPLE support. I don't have the access needed to reinstall Open MPI. I can ask my admin, but things like this seem to take a long time to get resolved because others are using the system. I also have access to MVAPICH2; could this be a better choice? I'm not familiar with MVAPICH2, nor do I know if one is better than the other.

Thanks again

Clicked A Few Times
Also, I have three versions of MVAPICH2 installed; is one better for NWChem?

mvapich2-intel-ofa
mvapich2-intel-psm
mvapich2-intel-shmem

Forum Vet
Quote:Mef362 Aug 27th 8:22 am
Thanks for the suggestions. Open MPI was not installed with MPI_THREAD_MULTIPLE support. I don't have the access needed to reinstall Open MPI. I can ask my admin, but things like this seem to take a long time to get resolved because others are using the system. I also have access to MVAPICH2; could this be a better choice? I'm not familiar with MVAPICH2, nor do I know if one is better than the other.

Thanks again


MVAPICH2 could work, but you first need to check whether it supports MPI_THREAD_MULTIPLE, e.g. with a simple MPI test program like the one at the bottom of this post.
Please keep in mind that you need to disable the MVAPICH2 CPU affinity by setting the following environment variable:
MV2_ENABLE_AFFINITY=0


$ cat mpithreadcheck.f
c     Check which MPI threading level the library actually provides.
      implicit none
      INCLUDE 'mpif.h'
      INTEGER REQUIRED, PROVIDED, IERROR
      INTEGER myrank,mysize,rc
      REQUIRED=MPI_THREAD_MULTIPLE
      CALL MPI_INIT_THREAD(REQUIRED, PROVIDED, IERROR)
      call MPI_COMM_SIZE(MPI_COMM_WORLD,mysize,ierror)
      call MPI_COMM_RANK(MPI_COMM_WORLD,myrank,ierror)
      print *, 'Hello World! I am ', myrank
      if(myrank.eq.0)then
c        Rank 0 reports whether the requested level was granted.
         write(6,*) ' IERROR ',IERROR
         if(PROVIDED.NE.MPI_THREAD_MULTIPLE) then
            write(6,*) ' MPI_THREAD_MULTIPLE not supported'
         else
            write(6,*) ' MPI_THREAD_MULTIPLE is supported'
         endif
      endif
      call MPI_FINALIZE(rc)
      end
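
To build and run the check (a sketch assuming the module naming shown above; two processes on a single node are enough for this test):

$ module load mvapich2-intel-psm
$ mpif90 -o mpithreadcheck mpithreadcheck.f
$ mpiexec -n 2 ./mpithreadcheck

If the output reports that MPI_THREAD_MULTIPLE is supported, that MVAPICH2 build should be usable with ARMCI_NETWORK=MPI-MT.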

Forum Vet
Quote:Mef362 Aug 27th 8:38 am
Also I have 3 versions of MVapich2 installed is one better for NWChem?

mvapich2-intel-ofa
mvapich2-intel-psm
mvapich2-intel-shmem


mvapich2-intel-psm seems to be the one that uses the QLogic PSM library.

Clicked A Few Times
Here is the install file I used to successfully install NWChem with MVAPICH2 and MKL:

#!/bin/bash
  
export NWCHEM_TARGET=LINUX64
 
export ARMCI_NETWORK=MPI-MT
export IB_HOME=/usr
export IB_INCLUDE=/usr/include/infiniband
export IB_LIB=/usr/lib64
export IB_LIB_NAME="-lrdmacm -libumad -libverbs -lpthread -lrt"
export MSG_COMMS=MPI
export TCGRSH=/usr/bin/ssh
export SLURM=y
 
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/opt/mvapich2-intel-psm-1.7
export MPI_LIB=$MPI_LOC/lib
export MPI_INCLUDE=$MPI_LOC/include
export LIBMPI="-lpthread -L$MPI_LIB -lmpichf90 -lmpich -lmpl -lrdmacm -libverbs"
export MV2_ENABLE_AFFINITY=0
 
 
export NWCHEM_MODULES="all"
export LARGE_FILES=TRUE
export USE_NOFSCHECK=TRUE
export LIB_DEFINES=-DDFLT_TOT_MEM=259738112 
 
export MKLROOT=/opt/intel-12.1/mkl
export HAS_BLAS=yes
export BLASOPT="-Wl,--start-group  $MKLROOT/lib/intel64/libmkl_intel_ilp64.a $MKLROOT/lib/intel64/libmkl_sequential.a $MKLROOT/lib/intel64/libmkl_core.a -Wl,--end-group -lpthread -lm"
 
export FC=mpif90
export CC=mpicc
 
make nwchem_config
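
Note: as in the Open MPI script earlier in the thread, this presumably runs from $NWCHEM_TOP/src with NWCHEM_TOP set, and the compilation itself still needs a final make, e.g.:

export NWCHEM_TOP=/home/mefoste/chama/nwchem-6.1.1-src
cd $NWCHEM_TOP/src
make nwchem_config
make FC=$FC CC=$CC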

