Hello everyone,
I'd like to test NWChem's performance on very large models at the DFT level, with a view to replacing a proprietary package. Thanks to an agreement with my University, I have access to the Cluster edition of the Intel 2017 compiler suite (with MKL and MPI).
After a successful make, I ran a test job optimizing a water molecule and got a strange error, apparently related to libpthread, during the first gradient step. Could you please help me figure out what I'm doing wrong?
Below is the script I used to compile NWChem. I also tried without USE_OPENMP=y (and consequently with different BLAS, LAPACK and SCALAPACK link lines; see the sketch after the build script) and got exactly the same error.
source /opt/intel/parallel_studio_xe_2017.2.050/bin/psxevars.sh intel64
source /opt/intel/impi/2017.2.174/bin64/mpivars.sh intel64
source /opt/intel/bin/compilervars.sh intel64
source /opt/intel/mkl/bin/mklvars.sh intel64
export NWCHEM_TOP=/opt/nwchem-6.6_OMP
export NWCHEM_MODULES=pnnl
export NWCHEM_TARGET=LINUX64
export NWCHEM_LONG_PATHS=y
export PYTHONHOME=/usr
export PYTHONVERSION=2.7
export PYTHONLIBTYPE=so
export USE_PYTHON64=y
export USE_NOFSCHECK=y
export TCGRSH=/usr/bin/ssh
export LARGE_FILES=y
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
# MPI settings below generated by tools/guess_mpidefs
export MPI_LOC="/opt/intel/impi/2017.2.174/intel64"
export MPI_INCLUDE="/opt/intel/compilers_and_libraries_2017.2.174/linux/mpi/intel64/include/gfortran/5.1.0 -I/opt/intel/compilers_and_libraries_2017.2.174/linux/mpi/intel64/include"
export MPI_LIB="/opt/intel/compilers_and_libraries_2017.2.174/linux/mpi/intel64/lib/release_mt -L/opt/intel/compilers_and_libraries_2017.2.174/linux/mpi/intel64/lib"
export LIBMPI="-lmpifort -lmpi -lmpigi -ldl -lrt -lpthread"
export USE_OPENMP=y
# MKL link lines copied from the Intel MKL Link Line Advisor
export MKLLIB="${MKLROOT}/lib/intel64"
export MKLINC="${MKLROOT}/include"
export HAS_BLAS=y
export BLAS_SIZE=8
export BLASOPT="-L${MKLROOT}/lib/intel64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm -ldl"
export LAPACK_SIZE=8
export LAPACK_LIB="$BLASOPT"
export LAPACK_LIBS="$BLASOPT"
export LAPACKOPT="$BLASOPT"
export USE_SCALAPACK=y
export SCALAPACK_SIZE=8
export SCALAPACK="-L${MKLROOT}/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_ilp64 -liomp5 -lpthread -lm -ldl"
export SCALAPACK_LIB="$SCALAPACK"
export SCALAPACK_LIBS="$SCALAPACK"
export CC=icc
export FC=ifort
export USE_64TO32=y
cd $NWCHEM_TOP/src
make realclean
make nwchem_config
make 64_to_32
make CC=icc FC=ifort FOPTIMIZE=-O3
cd $NWCHEM_TOP/src/tools
make CC=icc FC=ifort FOPTIMIZE=-O3 version
make CC=icc FC=ifort FOPTIMIZE=-O3
cd $NWCHEM_TOP/src
make CC=icc FC=ifort FOPTIMIZE=-O3 link
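For reference, this is a minimal sketch of how the non-OpenMP attempt differed from the script above: USE_OPENMP left unset and the sequential MKL layer used in place of the threaded one, following the Intel MKL Link Line Advisor. The link lines below are an approximation of that second build, not a verbatim copy of it.
# Non-OpenMP variant (sketch): no USE_OPENMP, sequential MKL instead of
# -lmkl_intel_thread / -liomp5 in the BLAS/LAPACK/ScaLAPACK link lines
unset USE_OPENMP
export BLASOPT="-L${MKLROOT}/lib/intel64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl"
export LAPACK_LIB="$BLASOPT"
export LAPACK_LIBS="$BLASOPT"
export LAPACKOPT="$BLASOPT"
export SCALAPACK="-L${MKLROOT}/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_ilp64 -lpthread -lm -ldl"
export SCALAPACK_LIB="$SCALAPACK"
export SCALAPACK_LIBS="$SCALAPACK"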
Here are my test job and the error from the DFT Gradient Module. The command was "export OMP_NUM_THREADS=4; mpirun -np 1 /opt/nwchem-6.6_OMP/bin/LINUX64/nwchem H2O_RI-TPSS+SVP.nw >& H2O_RI-TPSS+SVP.out"
============================== echo of input deck ==============================
echo
start H2O_RI-TPSS+SVP
memory 2048 mb
title "H2O opt"
charge 0
geometry units angstroms print xyz autosym
O 0.00000 0.00000 -0.40725
H 0.75882 0.00000 0.20362
H -0.75882 0.00000 0.20362
end
basis spherical
* library def2-SVP
end
basis "cd basis" spherical
* library "Weigend Coulomb Fitting"
end
dft
iterations 300
xc slater xtpss03 pw91lda ctpss03
decomp
mult 1
convergence energy 1e-8
grid xfine becke treutler
mulliken
end
driver
gmax 1e-3
grms 5e-7
xmax 1e-6
xrms 5e-7
eprec 1e-9
maxiter 300
end
task dft optimize
task dft freq numerical
================================================================================
...
NWChem DFT Gradient Module
--------------------------
H2O opt
charge = 0.00
wavefunction = closed shell
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
nwchem 00000000031444F1 tbk_trace_stack_i Unknown Unknown
nwchem 000000000314262B tbk_string_stack_ Unknown Unknown
nwchem 00000000030ED494 Unknown Unknown Unknown
nwchem 00000000030ED2A6 tbk_stack_trace Unknown Unknown
nwchem 0000000003082F99 for__issue_diagno Unknown Unknown
nwchem 000000000308A456 for__signal_handl Unknown Unknown
libpthread-2.24.s 00002B81668E55B0 Unknown Unknown Unknown
nwchem 0000000000EDF43E hfderi_ Unknown Unknown
nwchem 0000000000EDC4A9 hf2dold_ Unknown Unknown
nwchem 0000000000ED9C5B hf2d_ Unknown Unknown
nwchem 0000000000E5D79B intd_2e3c_ Unknown Unknown
nwchem 0000000000810714 dftg_cdfit_gen_ Unknown Unknown
nwchem 000000000080F4A1 dftg_cdfit_ Unknown Unknown
nwchem 000000000080B7C0 dft_gradients_ Unknown Unknown
nwchem 00000000006B9CAA grad_dft_ Unknown Unknown
nwchem 00000000006B9779 dft_energy_gradie Unknown Unknown
nwchem 000000000051523B task_gradient_doi Unknown Unknown
nwchem 0000000000514883 task_gradient_ Unknown Unknown
nwchem 0000000000618825 driver_ Unknown Unknown
nwchem 0000000000516C3B task_optimize_ Unknown Unknown
nwchem 00000000005064E8 task_ Unknown Unknown
nwchem 00000000004FDAB5 MAIN__ Unknown Unknown
nwchem 00000000004FD5AE main Unknown Unknown
libc-2.24.so 00002B81684E3401 __libc_start_main Unknown Unknown
nwchem 00000000004FD4AA _start Unknown Unknown
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 6444 RUNNING AT calcVM
= EXIT CODE: 1
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
Intel(R) MPI Library troubleshooting guide:
https://software.intel.com/node/561764
===================================================================================
Finally, here is the output of "ldd /opt/nwchem-6.6_OMP/bin/LINUX64/nwchem":
linux-vdso.so.1 (0x00007ffef4dfb000)
libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00002afa48999000)
libmkl_scalapack_ilp64.so => /opt/intel/compilers_and_libraries_2017.2.174/linux/mkl/lib/intel64_lin/libmkl_scalapack_ilp64.so (0x00002afa48dda000)
libmkl_intel_ilp64.so => /opt/intel/compilers_and_libraries_2017.2.174/linux/mkl/lib/intel64_lin/libmkl_intel_ilp64.so (0x00002afa496c2000)
libmkl_intel_thread.so => /opt/intel/compilers_and_libraries_2017.2.174/linux/mkl/lib/intel64_lin/libmkl_intel_thread.so (0x00002afa4a04c000)
libmkl_core.so => /opt/intel/compilers_and_libraries_2017.2.174/linux/mkl/lib/intel64_lin/libmkl_core.so (0x00002afa4bade000)
libmkl_blacs_intelmpi_ilp64.so => /opt/intel/compilers_and_libraries_2017.2.174/linux/mkl/lib/intel64_lin/libmkl_blacs_intelmpi_ilp64.so (0x00002afa4d5d1000)
libiomp5.so => /opt/intel/compilers_and_libraries_2017.2.174/linux/compiler/lib/intel64_lin/libiomp5.so (0x00002afa4d83c000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002afa4dbde000)
libm.so.6 => /lib64/libm.so.6 (0x00002afa4ddfe000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002afa4e107000)
libmpifort.so.12 => /opt/intel/compilers_and_libraries_2017.2.174/linux/mpi/intel64/lib/libmpifort.so.12 (0x00002afa4e30b000)
libmpi.so.12 => /opt/intel/compilers_and_libraries_2017.2.174/linux/mpi/intel64/lib/libmpi.so.12 (0x00002afa4e6b4000)
librt.so.1 => /lib64/librt.so.1 (0x00002afa4f3c4000)
libutil.so.1 => /lib64/libutil.so.1 (0x00002afa4f5cc000)
libc.so.6 => /lib64/libc.so.6 (0x00002afa4f7d1000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002afa4fb97000)
/lib64/ld-linux-x86-64.so.2 (0x00005587a6644000)
Thank you very much for your help!!!