MPI errors on CentOS 7.3


I'm trying to get NWChem 6.6 working on our HPC grid running CentOS 7.3.

I tried both OpenMPI and MPICH 3.0 as supplied by the CentOS distribution. Both builds compile, but they die at runtime with the following errors:

OpenMPI:
2:ga_iter_lsolve: dgesv failed:Received an Error in Communication


MPI_ABORT was invoked on rank 2 in communicator MPI COMMUNICATOR 3 DUP FROM 0
with errorcode 0.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.




MPICH:
0:geom_binvr: dsyev failed:Received an Error in Communication
application called MPI_Abort(comm=0x84000004, 0) - process 0


I've run multiple MPI test suites and as far as I can tell the core OpenMPI and MPICH infrastructures are working properly within the HPC Grid environment.
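Since both MPI stacks are installed side by side, one further check is whether each NWChem binary resolves only the libraries of the stack it was built against. This is a minimal sketch, not something from the runs above: the helper name filter_mpi_libs is hypothetical, and the binary path in the usage comment assumes the usual $NWCHEM_TOP/bin/$NWCHEM_TARGET layout.

```shell
# Hypothetical helper: keep only the MPI/ScaLAPACK-related lines from
# `ldd` output read on stdin, so a mixed OpenMPI/MPICH linkage stands out.
filter_mpi_libs() {
    # `|| true` keeps the exit status at 0 when nothing matches.
    grep -E 'lib(mpi|open-rte|open-pal|scalapack)' || true
}

# Usage (path is an assumption, adjust to the actual build):
#   ldd "$NWCHEM_TOP/bin/LINUX64/nwchem" | filter_mpi_libs
```

For the OpenMPI build, every line printed should point under /usr/lib64/openmpi/lib, and for the MPICH build under /usr/lib64/mpich/lib. Any mix of the two would suggest the environment at run time differs from the one used at build time.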


OpenMPI version compiled with:
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES=all
export NWCHEM_TOP=/mnt/research/deej/src/nwchem/6.6/nwchem-6.6

export USE_NOFSCHECK=TRUE
export USE_NOIO=TRUE

export ARMCI_NETWORK=MPI_TS
export LARGE_FILES=TRUE
export MRCC_THEORY=TRUE
export LIB_DEFINES=-DDFLT_TOT_MEM=1677721600

export USE_PYTHONCONFIG=y
export PYTHONHOME=/usr
export PYTHONVERSION=2.7
export USE_PYTHON64=y
export PYTHONLIBTYPE=so

export BLASOPT="-L/usr/lib64/atlas -llapack -lf77blas -latlas"
export HAS_BLAS=y

export SCALAPACK="-L/usr/lib64/openmpi/lib -lscalapack -lmpiblacs"
export SCALAPACK_LIBS="-L/usr/lib64/openmpi/lib -lscalapack -lmpiblacs"
export USE_SCALAPACK=y

export PATH=/usr/lib64/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/usr/lib64/openmpi/lib/:$LD_LIBRARY_PATH
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib64/openmpi
export MPI_LIB=/usr/lib64/openmpi/lib
export MPI_INCLUDE=/usr/include/openmpi-x86_64
export LIBMPI="-pthread -m64 -I/usr/lib64/openmpi/lib -Wl,-rpath -Wl,/usr/lib64/openmpi/lib -Wl,--enable-new-dtags -L/usr/lib64/openmpi/lib -lmpi_usempi -lmpi_mpifh -lmpi"

make FC=mpif90 CC=mpicc nwchem_config NWCHEM_MODULES="all python"
make FC=mpif90 CC=mpicc >& make.log



MPICH version compiled with:

export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES=all
export NWCHEM_TOP=/mnt/research/deej/src/nwchem/6.6/nwchem-6.6

export USE_NOFSCHECK=TRUE
export USE_NOIO=TRUE

export ARMCI_NETWORK=MPI_TS
export LARGE_FILES=TRUE
export MRCC_THEORY=TRUE
export LIB_DEFINES=-DDFLT_TOT_MEM=1677721600

export USE_PYTHONCONFIG=y
export PYTHONHOME=/usr
export PYTHONVERSION=2.7
export USE_PYTHON64=y
export PYTHONLIBTYPE=so

export BLASOPT="-L/usr/lib64/atlas -llapack -lf77blas -latlas"
export HAS_BLAS=y

export SCALAPACK="-L/usr/lib64/mpich/lib -lscalapack -lmpiblacs"
export SCALAPACK_LIBS="-L/usr/lib64/mpich/lib -lscalapack -lmpiblacs"
export USE_SCALAPACK=y

export PATH=/usr/lib64/mpich/bin:$PATH
export LD_LIBRARY_PATH=/usr/lib64/mpich/lib/:$LD_LIBRARY_PATH
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib64/mpich
export MPI_LIB=/usr/lib64/mpich/lib
export MPI_INCLUDE=/usr/include/mpich-x86_64
export LIBMPI="-lmpichf90 -lmpich -lopa -lmpl -lrt -lpthread"

make nwchem_config NWCHEM_MODULES="all python"
make FC=/usr/lib64/mpich/bin/mpif90 >& make.log


Any help would be greatly appreciated!