Problem building NWChem version 6.5 on IB cluster with MKL & IntelMPI


Forum Vet
Could you upload the file
$NWCHEM_TOP/src/tools/build/config.log
to a website where I can access it.

Could you also send the output of the command

ls -l /u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64

Thanks, Edo

Clicked A Few Times
config.log
Hello Edo,

Here is the file you requested (NWCHEM_TOP/src/tools/build/config.log):

https://ucla.box.com/s/6u0pqd2kthzirfbkdq8vp0buq7v7xztr

and here is the output of the command:

ls -l /u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64
https://ucla.box.com/s/0pf93r20mvhilllez34yxpwvw8j08w98

The output of the (failed) compilation attempt is also here:

 https://ucla.box.com/s/c7tfj46o89cvd9wqcdfz262a5hfe51bo

and the script to build is:

https://ucla.box.com/s/28kxt6tn6ktx8fl2va9umvk68xhxjw8a

Please let me know if you need any other info.

Again the goal would be to create a distributable executable (no xHost or any other host related optimization flags), and to make sure that the includes for the MKL are found.

Grazie

Raffaella.

Clicked A Few Times
openmp
Hello again,

I see that OpenMP is also switched on by default. However, this could end up being a mess, since it means setting up a host file for parallel runs. Is there a way to switch OpenMP off? Does every part of the code use OpenMP?

Thanks,

Raffaella.

Forum Vet
Raffaella
I do not see any problem after inspecting your compilation logs.
Could you be more precise about what you exactly mean by
"... A first compilation attempt indicated that the include location for the MKLs could not be found"?
Does it mean that you get an NWChem binary, but that when you try to run it, it fails to find the MKL libraries?
If my analysis somewhat describes what you have experienced, please define
export LD_LIBRARY_PATH=/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64:$LD_LIBRARY_PATH
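
A minimal sketch of the same idea for a modulefile or run script, assuming you want to avoid adding the directory twice. The function name prepend_ld_path is hypothetical and not part of NWChem or the Intel tools.

```shell
# Hypothetical helper (not part of NWChem): prepend a directory to
# LD_LIBRARY_PATH only if it is not already listed there.
prepend_ld_path() {
    _dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$_dir:"*) ;;   # already present, nothing to do
        *) export LD_LIBRARY_PATH="$_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
}

# Example with the MKL directory from this thread:
# prepend_ld_path /u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64
# On Linux, ldd lists any shared libraries the binary cannot resolve:
# ldd "$NWCHEM_TOP/bin/LINUX64/nwchem" | grep "not found"
```

If the ldd check prints nothing, the run-time loader can find every shared library the binary needs, and the problem lies elsewhere.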

Forum Vet
Simply add the following to the scripts you will use to run NWChem jobs and OpenMP will never kick in
export OMP_NUM_THREADS=1
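
As a sketch, a run-script fragment along these lines keeps every MPI rank single-threaded. The mpirun line is an assumed launch command (rank count and input file are placeholders), and setting MKL_NUM_THREADS is an extra precaution rather than something requested in this thread.

```shell
# Run-script fragment: keep every MPI rank single-threaded.
export OMP_NUM_THREADS=1
# MKL_NUM_THREADS is MKL's own thread-count variable. With the sequential
# MKL layer (-lmkl_sequential) used in the build scripts in this thread it
# should be redundant, but it is harmless to set.
export MKL_NUM_THREADS=1

# Assumed launch line; adjust the rank count and input file for your job.
# mpirun -np 16 nwchem input.nw
```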
Quote:Rdauria May 20th 12:19 pm
Hello again,

I see that OpenMP is also switched on by default. However, this could end up being a mess, since it means setting up a host file for parallel runs. Is there a way to switch OpenMP off? Does every part of the code use OpenMP?

Thanks,

Raffaella.

Clicked A Few Times
executable not generated
Hello,

The compilation ended prematurely because of undefined references. The LD_LIBRARY_PATH is defined as you suggested in the modulefile.

Looking at the files I sent you, I noticed that the LAPACK libraries were not being found (this is from the log file):

configure: WARNING: LAPACK library not found, using internal LAPACK

I have therefore tried to add to my script the lines:

export LAPACK_LIBS="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export LAPACK_CPPFLAGS="-DMKL_ILP64 -I$MKLROOT/include"

but LAPACK is still not being found (note that MKL's LAPACK libraries no longer have the word lapack in their names, unless I use libmkl_lapack95_ilp64.a).

Any suggestions?

Thanks,

Raffaella.

Forum Vet
Sorry, it took me a second try to read the large compilation log.
Analysis done in a bit ...
Later, Edo

Forum Vet
I can see that the undefined references are of the kind "ygemm_".
This kind of failure is not directly related to the MKL detection in the tools autoconf.
Instead, it indicates that the source you are using was once processed with the command
make 64_to_32

The other issue I can see from your log is that you are using 64-bit integers for MKL (both BLAS and ScaLAPACK),
but you have not defined SCALAPACK_SIZE=8.

I suggest you do the following:

1) set the following three env. variables

export SCALAPACK_SIZE=8
export BLAS_SIZE=8
export USE_64TO32=y

(the second export is not strictly necessary, since 8 is the default value)
2) recompile the tools by executing the following commands

cd $NWCHEM_TOP/src/tools
rm -rf build install
make FC=ifort

3) relink by executing the following commands

cd $NWCHEM_TOP/src
make FC=ifort link
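
The integer-size mismatch described above can be caught before building. The following is a minimal sketch, assuming the MKL link lines live in BLASOPT and SCALAPACK as in the scripts in this thread; check_int_sizes is a hypothetical helper, not an NWChem tool.

```shell
# Hypothetical pre-build check (not part of NWChem): if the MKL link line
# uses the ILP64 (64-bit integer) interface, BLAS_SIZE and SCALAPACK_SIZE
# should both be 8.
check_int_sizes() {
    _rc=0
    case "${BLASOPT:-} ${SCALAPACK:-}" in
        *ilp64*)
            # BLAS_SIZE is treated as 8 when unset, per the post above.
            if [ "${BLAS_SIZE:-8}" != "8" ]; then
                echo "BLAS_SIZE should be 8 for ilp64 libraries" >&2
                _rc=1
            fi
            # SCALAPACK_SIZE must be set explicitly.
            if [ "${SCALAPACK_SIZE:-}" != "8" ]; then
                echo "SCALAPACK_SIZE should be 8 for ilp64 libraries" >&2
                _rc=1
            fi
            ;;
    esac
    return $_rc
}

# Usage, after exporting the build variables:
# check_int_sizes || echo "fix the integer sizes before running make"
```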

Clicked A Few Times
ld: cannot find -l64to32
Hi Edo,

I followed your instructions but at the step:

make FC=ifort link

I got:

ld: cannot find -l64to32
make: *** [link] Error 1

(see more details below).

I think I will re-unpack the nwchem source and start from scratch. But should I not use the 64-bit integers?

Thanks,

Raffaella.

nwchem.F(463): (col. 11) remark: vectorization support: unaligned access used inside loop body
nwchem.F(463): (col. 11) remark: loop was not vectorized: vectorization possible but seems inefficient
ifort -i8 -align -vec-report6 -fimf-arch-consistency=true -O2 -g -fp-model source  -Wl,--export-dynamic  -L/u/local/downloads/nwchem/6.5/rev26243//lib/LINUX64 -L/u/local/downloads/nwchem/6.5/rev26243//src/tools/install/lib  -o /u/local/downloads/nwchem/6.5/rev26243//bin/LINUX64/nwchem nwchem.o stubs.o -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lnwxc -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -lnwpython -ldrdy -lvscf -lqmmm -lqmd -letrans -lpspw -ltce -lbq -lcons -lperfm -ldntmc -lccca -lnwcutil -lga -larmci -lpeigs -lperfm -lcons -lbq -lnwcutil /usr/lib64/python2.6/config/libpython2.6.so -L/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm   -l64to32 -L/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm  -llapack  -lblas   -L/u/local/compilers/intel/impi/5.0.0.028/intel64/lib/release -L/u/local/compilers/intel/impi/5.0.0.028/intel64/lib -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread   -libumad -libverbs -lpthread  -lnwcutil  -lpthread -lutil -ldl -lz  
ld: cannot find -l64to32
make: *** [link] Error 1

Forum Vet
One more step is needed (sorry for missing it earlier, my bad)

1) compile 64to32blas

cd $NWCHEM_TOP/src/64to32blas
make FC=ifort

2) try again to relink

cd $NWCHEM_TOP/src
make FC=ifort link
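
Putting the steps from the last few posts together, the whole sequence can be sketched as one function. This is an illustration only: rebuild_and_relink and the MAKE_CMD override are hypothetical names, added so the steps can be previewed (for example with MAKE_CMD="echo make") before running a real build.

```shell
# Sketch of the full sequence suggested in this thread. MAKE_CMD is a
# hypothetical override so the make steps can be previewed without building.
# Note: the rm -rf in step 1 is always executed.
rebuild_and_relink() {
    make_cmd="${MAKE_CMD:-make}"
    : "${NWCHEM_TOP:?NWCHEM_TOP must be set}"
    export SCALAPACK_SIZE=8 BLAS_SIZE=8 USE_64TO32=y

    # 1) Rebuild the tools directory from scratch
    ( cd "$NWCHEM_TOP/src/tools" && rm -rf build install && $make_cmd FC=ifort ) || return 1
    # 2) Build the library behind -l64to32
    ( cd "$NWCHEM_TOP/src/64to32blas" && $make_cmd FC=ifort ) || return 1
    # 3) Relink the nwchem binary
    ( cd "$NWCHEM_TOP/src" && $make_cmd FC=ifort link ) || return 1
}

# Usage:
# rebuild_and_relink
```

Each step runs in a subshell so a failure leaves the caller's working directory unchanged, and the function stops at the first step that fails.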

Clicked A Few Times
Hello Edo,

I followed the steps you suggested and was able to produce an executable (I am not sure whether it will run everywhere on the cluster, as the compiler flag -xHost was turned on). However, when I tried to run some of the examples I hit this problem (which we have also been encountering with the version currently installed here, and which is what prompted us to move to the latest version):

0:Segmentation Violation error, status=: 11
(rank:0 hostname:n2180 pid:20922):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 0:: Inappropriate ioctl for device
application called MPI_Abort(comm=0x84000001, 11) - process 0


I think I will start afresh with a newly unpacked copy of the source. In the meantime, any guidance on this error, which seems to be associated with using the Intel compiler (from version 13 and up), would be much appreciated.

Thanks,

Raffaella.

Forum Vet
FOPTIMIZE=-O3
If you do not want -xhost to be used, please compile by using the command

make FC=ifort FOPTIMIZE=-O3

Clicked A Few Times
USE_OPENMP=no
Hello Edo,

To switch off OpenMP, should I define the following environment variable?

USE_OPENMP=no

Or would you suggest leaving OpenMP on?

Are we supposed to run nwchem using OpenMP within a node and MPI across different nodes?

Or should I just define OMP_NUM_THREADS=1 for parallel runs?

Thanks,

Raffaella.

Clicked A Few Times
compiling nwchem-6.5 MKL Composer XE 2013 SP1
Hi there,

I am also trying to compile nwchem-6.5 on an Intel Xeon InfiniBand cluster with the Intel Composer XE 2013 SP1 compilers. I would be interested in learning the final set of environment variables you used, so that I can compare them with mine.

Best regards,

Alejandro

Clicked A Few Times
To Alejandro
Hi Alejandro,

Sorry for answering only now. Here is the script I used:

#!/bin/bash

. /u/local/Modules/default/init/modules.sh
module load intel/14.cs
module load intelmpi/5.0.0

#export NWCHEM_TOP=/u/local/downloads/nwchem/6.5/rev26243/
export NWCHEM_TOP=/u/local/downloads/nwchem/6.5-rev26243
export NWCHEM_TARGET=LINUX64

export ARMCI_NETWORK=OPENIB
export IB_HOME=/usr
export IB_INCLUDE=/usr/include/infiniband/
export IB_LIB=/usr/lib64
export IB_LIB_NAME="-libumad -libverbs -lpthread"

export USE_MPI=Y
export USE_MPIF=Y
export USE_MPIF4=Y
export MPI_LOC=/u/local/compilers/intel/impi/5.0.0.028
export MPI_INCLUDE="-I/u/local/compilers/intel/impi/5.0.0.028/intel64/include"
export MPI_LIB="/u/local/compilers/intel/impi/5.0.0.028/intel64/lib/release -L/u/local/compilers/intel/impi/5.0.0.028/intel64/lib"
export LIBMPI="-lmpifort -lmpi -lmpigi -ldl -lrt -lpthread"


export NWCHEM_MODULES="all python"
export LARGE_FILES=TRUE
export USE_NOFSCHECK=TRUE
#export LIB_DEFINES=-DDFLT_TOT_MEM=16777216

export PYTHONHOME=/usr
export PYTHONVERSION=2.6
export USE_PYTHON64=y
export PYTHONLIBTYPE=so

sed -i 's/libpython$(PYTHONVERSION).a/libpython$(PYTHONVERSION).$(PYTHONLIBTYPE)/g' $NWCHEM_TOP/src/config/makefile.h


export HAS_BLAS=yes
export USE_SCALAPACK=y
export MKLLIB=/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64
export MKLINC=/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/include
export BLASOPT="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export LAPACK_LIBS="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
#export LAPACK_CPPFLAGS="-DMKL_ILP64 -I$MKLINC"
export SCALAPACK="-L$MKLLIB -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"
#export SCALAPACK_CPPFLAGS="-DMKL_ILP64 -I$MKLINC"

export SCALAPACK_SIZE=8
export BLAS_SIZE=8
export USE_64TO32=y

export FC=ifort
export CC=icc

echo "cd $NWCHEM_TOP/src"
cd $NWCHEM_TOP/src

echo "BEGIN --- make realclean "
make realclean
echo "END --- make realclean "

echo "BEGIN --- make nwchem_config "
make nwchem_config 
echo "END --- make nwchem_config "

echo "BEGIN --- make"
make CC=icc FC=ifort FOPTIMIZE=-O3 -j4 
echo "END --- make "

cd $NWCHEM_TOP/src/util
make CC=icc FC=ifort FOPTIMIZE=-O3 version
make CC=icc FC=ifort FOPTIMIZE=-O3 
cd $NWCHEM_TOP/src
make CC=icc FC=ifort FOPTIMIZE=-O3  link


Please note that if your cluster contains nodes with slightly different CPUs (with different levels of SSE, for example), you will need to manually remove the -xHost flag.

In $NWCHEM_TOP/src/config/makefile.h, I took the Intel compiler -xHost flag off the following lines:

src/config/makefile.h:        FOPTIMIZE = -O3 -xHost
src/config/makefile.h:        FOPTIMIZE += -xHost
src/config/makefile.h:        COPTIONS   +=   -xHOST -ftz
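
The manual edit described above could also be scripted. This is a rough sketch, assuming -xHost and -xHOST only appear as plain flags on lines like the three listed; strip_xhost is a hypothetical helper, and it is worth diffing the result against the .bak copy before building.

```shell
# Hypothetical helper: remove -xHost / -xHOST (any letter case) from a
# makefile fragment, keeping a .bak copy of the original next to it.
strip_xhost() {
    sed -i.bak 's/[[:space:]]*-x[Hh][Oo][Ss][Tt]//g' "$1"
}

# Usage:
# strip_xhost "$NWCHEM_TOP/src/config/makefile.h"
# diff "$NWCHEM_TOP/src/config/makefile.h.bak" "$NWCHEM_TOP/src/config/makefile.h"
# grep -n -i 'xhost' "$NWCHEM_TOP/src/config/makefile.h"   # should print nothing
```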


Good luck!

Raffaella.

Just Got Here
Hello Raffaella, Thanks for sharing the build script. Here is what I have based on yours.

#!/bin/bash

module load intel/compiler/64/15.0.0.090
module load intel/mpi/64/5.0.1.035
module load intel/mkl/64/11.2


export NWCHEM_TOP=/apps/nwchem/offline/6.5-26243
export NWCHEM_TARGET=LINUX64

export ARMCI_NETWORK=OPENIB
export IB_HOME=/usr
export IB_INCLUDE=/usr/include/infiniband/
export IB_LIB=/usr/lib64
export IB_LIB_NAME="-libumad -libverbs -lpthread"

export USE_MPI=Y
export USE_MPIF=Y
export USE_MPIF4=Y
export MPI_LOC=$I_MPI_ROOT
export MPI_INCLUDE="-I$I_MPI_ROOT/include"
export MPI_LIB="-L$I_MPI_ROOT/lib64"
export LIBMPI="-lmpifort -lmpi -lmpigi -ldl -lrt -lpthread"

export NWCHEM_MODULES="all python"
export LARGE_FILES=TRUE
export USE_NOFSCHECK=TRUE

export PYTHONHOME=/usr
export PYTHONVERSION=2.6
export USE_PYTHON64=y
export PYTHONLIBTYPE=so

sed -i 's/libpython$(PYTHONVERSION).a/libpython$(PYTHONVERSION).$(PYTHONLIBTYPE)/g' $NWCHEM_TOP/src/config/makefile.h

export HAS_BLAS=yes
export USE_SCALAPACK=y
export MKLLIB=$MKLROOT/lib/intel64
export MKLINC=$MKLROOT/include
export BLASOPT="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export LAPACK_LIBS="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export SCALAPACK="-L$MKLLIB -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"

export SCALAPACK_SIZE=8
export BLAS_SIZE=8
export USE_64TO32=y

export FC=ifort
export CC=icc

echo "cd $NWCHEM_TOP/src"
cd $NWCHEM_TOP/src

echo "BEGIN --- make realclean "
make realclean
echo "END --- make realclean "

echo "BEGIN --- make nwchem_config "
make nwchem_config
echo "END --- make nwchem_config "

echo "BEGIN --- make"
make CC=icc FC=ifort FOPTIMIZE=-O3 -j4
echo "END --- make "

cd $NWCHEM_TOP/src/util
make CC=icc FC=ifort FOPTIMIZE=-O3 version
make CC=icc FC=ifort FOPTIMIZE=-O3
cd $NWCHEM_TOP/src
make CC=icc FC=ifort FOPTIMIZE=-O3  link


But I am running into this error.

ld: cannot find -lccsd
make: *** [link] Error 1


Any pointers on how to get around this would be very helpful.

Thank you
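
One way to narrow down a "cannot find -l..." error is to check which libraries were actually produced. The link line earlier in this thread searches $NWCHEM_TOP/lib/LINUX64, so a sketch like the following lists the names that are absent there. missing_libs is a hypothetical helper, and the guess that a missing library points to an earlier failed compile step for that module is an assumption, not something confirmed in this thread.

```shell
# Hypothetical helper: print each library name that has neither
# lib<name>.a nor lib<name>.so in the given directory.
missing_libs() {
    libdir="$1"
    shift
    for name in "$@"; do
        if [ ! -e "$libdir/lib$name.a" ] && [ ! -e "$libdir/lib$name.so" ]; then
            echo "$name"
        fi
    done
}

# Usage with a few of the names from the link line in this thread:
# missing_libs "$NWCHEM_TOP/lib/$NWCHEM_TARGET" nwctask ccsd mcscf selci mp2
```

If ccsd shows up in the output, searching the make log for the first error under the ccsd directory would be the next step.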

Clicked A Few Times
Thanks for your message! I will certainly try some of the options you used.

The main problem I am finding is that BLAS/LAPACK are not found, even though all the environment variables are set correctly. Perhaps that is why you defined the variables MKLLIB and MKLINC.

Best regards,

AD

Just Got Here
Quote:Edoapra May 21st 11:28 am
If you do not want -xhost to be used, please compile by using the command

make FC=ifort FOPTIMIZE=-O3


Even better is to use FOPTIMIZE="-O3 -axAVX" (plus any other options you need), as that will use SSE and AVX where present. AVX2 can also be used in a similar way if required.

Just Got Here
I am also compiling on Intel, with MKL, although using a derivative of OpenMPI.

The compilation works, but when running an example I get:

mpirun -np 2 nwchem ccsdt_polar_small.nw
argument  1 = ccsdt_polar_small.nw
MA fatal error: MA_sizeof: invalid datatype: 343597384693
MA fatal error: MA_sizeof: invalid datatype: 343597384693
...
other failure messages
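
A quick arithmetic check on the datatype value in the message above may help. The sketch below only splits the number into 32-bit halves; reading the result as a sign of a 4-byte versus 8-byte integer mismatch is an assumption, not a confirmed diagnosis.

```shell
# Split the reported "datatype" value into its upper and lower 32-bit halves.
value=343597384693
low=$(( value & 0xFFFFFFFF ))
high=$(( value >> 32 ))
echo "low 32 bits:  $low"    # prints 1013
echo "high 32 bits: $high"   # prints 80
```

The low half is a small number of the size one would expect for a type code, while the high half is nonzero. That is the pattern you would see if a 4-byte integer were passed where an 8-byte integer is expected, so it seems worth double-checking that the integer size used by the compiled code matches the one the tools were built with.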

I am pretty sure this is due to an issue with the MKL components, but I am not sure what I am doing wrong to get this. I am forcing 4-byte integers and the 32-bit interfaces to go with them using the following

Just Got Here
Intel MKL, Intel MPI, Intel Compilers, IB build
Has anyone got a working script for Intel MKL, Intel MPI, Intel Compilers, IB build of NWChem 6.5?

Thank you.

Just Got Here
Intel MKL, Intel MPI, Intel Compilers, IB build
(rank:0 hostname:node-as-agpu-001 pid:11185):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/armci.c:ARMCI_Error():208 cond:0
  iter_orthog: failed to converge, error =   0.199219241745242     
 iter_orthog: failed to converge                   0
  iter_orthog: failed to converge, error =   0.199219241745242     
  iter_orthog: failed to converge, error =   0.199219241745242     
 iter_orthog: failed to converge                   0
  iter_orthog: failed to converge, error =   0.199219241745242     
 iter_orthog: failed to converge                   0


Is the error because of a bad input file or a bad NWChem build?

Appreciate any pointers on this.

Thanks.

Forum Vet
Quote:Roshan Sep 10th 9:10 am
Has anyone got a working script for Intel MKL, Intel MPI, Intel Compilers, IB build of NWChem 6.5?

Thank you.

Could you post the versions of MKL, Intel compilers and Intel MPI you are using?

Just Got Here
Intel MKL, Intel MPI, Intel Compilers, IB build
I have tried with

module load intel/compiler/64/14.0/2013_sp1.3.174
module load intel-mpi/64/4.1.3/049
module load intel/mkl/64/11.1/2013_sp1.3.174


and

module load intel/compiler/64/15.0.0.090
module load intel/mpi/64/5.0.1.035
module load intel/mkl/64/11.2


Build script

export NWCHEM_TOP=/apps/nwchem/offline/6.5-intel-impi
export NWCHEM_TARGET=LINUX64

export NWCHEM_LONG_PATHS=Y

# USE_NOIO can be set to avoid NWChem 6.5 doing I/O for the ddscf, mp2 and ccsd modules (it automatically sets USE_NOFSCHECK, too).
# It is strongly recommended on large clusters or supercomputers or any computer lacking any fast and large local filesystem. 
export USE_NOIO=TRUE

# LIB_DEFINES can be set to pass additional defines to the C preprocessor (for both Fortran and C), e.g. 
# Note: -DDFLT_TOT_MEM sets the default dynamic memory available for NWChem to run, where the units are in doubles.
export LIB_DEFINES='-DDFLT_TOT_MEM=16777216'

export ARMCI_NETWORK=OPENIB
export IB_HOME=/usr
export IB_INCLUDE=/usr/include
export IB_LIB=/usr/lib64
export IB_LIB_NAME="-libumad -libverbs -lpthread"

export USE_MPI=Y
export USE_MPIF=Y
export USE_MPIF4=Y
export MPI_LOC=$I_MPI_ROOT
export MPI_INCLUDE="-I$I_MPI_ROOT/include64"
export MPI_LIB="-L$I_MPI_ROOT/lib64"
export LIBMPI="-lmpiif -lmpi -ldl -lrt -lpthread"

export NWCHEM_MPIF_WRAP="mpiifort"
export NWCHEM_MPIC_WRAP="mpiicc"
export NWCHEM_MPICXX_WRAP="mpiicpc"

export NWCHEM_MODULES="all"
export LARGE_FILES=TRUE
export USE_NOFSCHECK=TRUE

export HAS_BLAS=yes
export USE_SCALAPACK=y
export MKLLIB=$MKLROOT/lib/intel64
export MKLINC=$MKLROOT/include
export BLASOPT="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export LAPACK_LIB="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export BLAS_LIB="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export SCALAPACK="-L$MKLLIB -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"
export SCALAPACK_LIB="-L$MKLLIB -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"

export SCALAPACK_SIZE=8
export BLAS_SIZE=8

export FC=ifort
export F77=ifort
export CC=icc
export CXX=icpc
export AR=xiar


Forum Vet
Roshan
I was not able to spot anything wrong in your settings.
The only potential problem might be the use of icc as C compiler.

Anyhow, is the input showing this failure a complex one?

