Compilation problem using Intel 2013 compiler


Clicked A Few Times
Greetings,

I am experiencing a problem compiling the newest version of NWChem 6.3 from July 19.
The cluster is an HP Linux cluster with HP ProLiant BL280c G6 blade servers, each with two quad-core 2.8 GHz Intel Xeon X5560 "Nehalem EP" processors sharing 24 GiB of system memory, with a 40-gigabit QDR InfiniBand (IB) interconnect.

The build completes, but when I run a test calculation it aborts with this error message:


0:Segmentation Violation error, status=: 11
2:Segmentation Violation error, status=: 11
(rank:4 hostname:node0605 pid:4117):AR:Last System Error Message from Task 6:: Inappropriate ioctl for device
(rank:13 hostname:node0606 pid:20884):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
(rank:11 hostname:node0606 pid:20882):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
(rank:15 hostname:node0606 pid:20886):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
LLLast System Error Message from Task 12:ast System Error Message from Task 15:LLast System Error Message from Task 13:::ast System Error Message from Task 10: :Iast System Error Message from Task 11::  :nappropriate ioctl for deviceIInappropriate ioctl for device
nappropriate ioctl for device
LInappropriate ioctl for device
ast System Error Message from Task 14:: Inappropriate ioctl for device
Last System Error Message from Task 9:: Inappropriate ioctl for device
Inappropriate ioctl for device
(rank:8 hostname:node0606 pid:20879):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 8:: Inappropriate ioctl for device
MPI Application rank 13 exited before MPI_Finalize() with status 11
MPI Application rank 4 exited before MPI_Finalize() with status 11

  

Below are the compilation options I used:

module load intel/2013.5
module load mkl/11.0.5.192
module load pmpi/8.1.0/intel

export IB_INCLUDE=/usr/include/infiniband
export IB_LIB=/usr/lib64
export IB_LIB_NAME="-libverbs"

export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y

export MPI_LOC=/opt/platform_mpi-8.01.00.00-20101215r
export MPI_LIB="$MPI_LOC/lib/linux_amd64"
export MPI_INCLUDE="$MPI_LOC/include/64"
export LIBMPI="-lpcmpio -lpcmpi -ldl"

export LARGE_FILES="TRUE"

export MKLROOT="/soft/intel/x86_64/2013/composer_xe_2013/composer_xe_2013.5.192/mkl"
export MKL_INCLUDE="$MKLROOT/include/intel64/ilp64"
export BLAS_LIB="-L$MKLROOT/lib/intel64 $MKLROOT/lib/intel64/libmkl_blas95_ilp64.a $MKLROOT/lib/intel64/libmkl_lapack95_ilp64.a -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lpthread -lm"
export BLASOPT="$BLAS_LIB"
export BLAS_SIZE=8
export SCALAPACK_SIZE=8
export SCALAPACK="-L$MKLROOT/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"
export SCALAPACK_LIB="$SCALAPACK"
export USE_SCALAPACK=y

export CFLAGS=" -i8 -I$MKLROOT/include"

export FC=ifort

cd $NWCHEM_TOP/src
make 2>&1 | tee make.log


Here is the output of ldd $NWCHEM_TOP/bin/LINUX64/nwchem:

       linux-vdso.so.1 =>  (0x00007fff8b3ff000)
libmkl_scalapack_ilp64.so => /soft/intel/mkl/10.2.1.017/lib/em64t/libmkl_scalapack_ilp64.so (0x00002ae1dd6b6000)
libmkl_intel_ilp64.so => /soft/intel/mkl/10.2.1.017/lib/em64t/libmkl_intel_ilp64.so (0x00002ae1dddea000)
libmkl_sequential.so => /soft/intel/mkl/10.2.1.017/lib/em64t/libmkl_sequential.so (0x00002ae1de180000)
libmkl_core.so => /soft/intel/mkl/10.2.1.017/lib/em64t/libmkl_core.so (0x00002ae1de880000)
libmkl_blacs_intelmpi_ilp64.so => /soft/intel/mkl/10.2.1.017/lib/em64t/libmkl_blacs_intelmpi_ilp64.so (0x00002ae1deaad000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002ae1dec58000)
libm.so.6 => /lib64/libm.so.6 (0x00002ae1dee75000)
libmpio.so.1 => /opt/platform_mpi-8.01.00.00-20101215r/lib/linux_amd64/libmpio.so.1 (0x00002ae1df0fa000)
libmpi.so.1 => /opt/platform_mpi-8.01.00.00-20101215r/lib/linux_amd64/libmpi.so.1 (0x00002ae1df23b000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002ae1df529000)
libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x00002ae1df72e000)
libutil.so.1 => /lib64/libutil.so.1 (0x00002ae1df93b000)
libc.so.6 => /lib64/libc.so.6 (0x00002ae1dfb3e000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002ae1dfed2000)
/lib64/ld-linux-x86-64.so.2 (0x00002ae1dd494000)

Thanks!

Clicked A Few Times
Correction
Sorry, an older MKL library module was still loaded when I ran the test above. With it unloaded, the error message looks like this:

3:Segmentation Violation error, status=: 11
(rank:0 hostname:node0900 pid:25872):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 0: status=: 11
14:Segmentation Violation error, status=: 11
13:Segmentation Violation error, status=: 11
11:Segmentation Violation error, status=: 11
(rank:15 hostname:node0901 pid:22267):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
(rank:12 hostname:node0901 pid:22264):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
(rank:14 hostname:node0901 pid:22266):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
(rank:10 hostname:node0901 pid:22262):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
(rank:11 hostname:node0901 pid:22263):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
(rank:9 hostname:node0901 pid:22261):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
(rank:13 hostname:node0901 pid:22265):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 10:: LIast System Error Message from Task 12:nappropriate ioctl for device:
LLLast System Error Message from Task 14:L: ast System Error Message from Task 15: ast System Error Message from Task 13:I:Iast System Error Message from Task 11:nappropriate ioctl for devicenappropriate ioctl for device :
L
IIast System Error Message from Task 9:nappropriate ioctl for devicenappropriate ioctl for device

IInappropriate ioctl for devicenappropriate ioctl for device

(rank:8 hostname:node0901 pid:22260):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 8:: Inappropriate ioctl for device
MPI Application rank 12 exited before MPI_Finalize() with status 11
forrtl: error (78): process killed (SIGTERM)


Output from ldd $NWCHEM_TOP/bin/LINUX64/nwchem

       linux-vdso.so.1 =>  (0x00007ffff6fff000)
libmkl_scalapack_ilp64.so => /soft/intel/x86_64/2013/composer_xe_2013/composer_xe_2013.5.192/mkl/lib/intel64/libmkl_scalapack_ilp64.so (0x00002b75d9bf3000)
libmkl_intel_ilp64.so => /soft/intel/x86_64/2013/composer_xe_2013/composer_xe_2013.5.192/mkl/lib/intel64/libmkl_intel_ilp64.so (0x00002b75da4bf000)
libmkl_sequential.so => /soft/intel/x86_64/2013/composer_xe_2013/composer_xe_2013.5.192/mkl/lib/intel64/libmkl_sequential.so (0x00002b75dabd7000)
libmkl_core.so => /soft/intel/x86_64/2013/composer_xe_2013/composer_xe_2013.5.192/mkl/lib/intel64/libmkl_core.so (0x00002b75db286000)
libmkl_blacs_intelmpi_ilp64.so => /soft/intel/x86_64/2013/composer_xe_2013/composer_xe_2013.5.192/mkl/lib/intel64/libmkl_blacs_intelmpi_ilp64.so (0x00002b75dc4eb000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b75dc76d000)
libm.so.6 => /lib64/libm.so.6 (0x00002b75dc98b000)
libmpio.so.1 => /opt/platform_mpi-8.01.00.00-20101215r/lib/linux_amd64/libmpio.so.1 (0x00002b75dcc0f000)
libmpi.so.1 => /opt/platform_mpi-8.01.00.00-20101215r/lib/linux_amd64/libmpi.so.1 (0x00002b75dcd50000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002b75dd03f000)
libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x00002b75dd243000)
libutil.so.1 => /lib64/libutil.so.1 (0x00002b75dd450000)
libc.so.6 => /lib64/libc.so.6 (0x00002b75dd654000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002b75dd9e7000)
/lib64/ld-linux-x86-64.so.2 (0x00002b75d99d1000)
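
A quick way to confirm that every MKL library in a listing like the one above resolves under the intended MKLROOT is to filter the ldd output. The helper below is only a sketch: the function name mkl_outside_root is made up for illustration, and it assumes the usual "name => path (address)" layout that ldd prints.

```shell
#!/bin/sh
# Hypothetical helper: read ldd output on stdin and print every libmkl_*
# library whose resolved path does not start with the given root directory.
mkl_outside_root() {
  root=$1
  awk -v root="$root" '
    $1 ~ /^libmkl_/ && $2 == "=>" {
      # $3 is the resolved path; report it if it lies outside root
      if (index($3, root) != 1) print $1 " -> " $3
    }'
}

# Usage (paths as in this thread):
#   ldd "$NWCHEM_TOP/bin/LINUX64/nwchem" | mkl_outside_root "$MKLROOT"
```

No output means all MKL libraries come from under the given root; any line printed points to a library picked up from somewhere else, such as a stale module.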

Forum Vet
Sam99,
I have a couple of questions/suggestions:
1) When does the segmentation violation occur? In other words, do you get any NWChem output at all before the failure?

2) Most of your settings seem OK to me. The only one that might cause a problem is the CFLAGS setting.
I think that you should unset it and recompile.
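
In shell terms, that suggestion amounts to clearing CFLAGS before rebuilding. A minimal sketch, assuming a POSIX shell and the same $NWCHEM_TOP layout used earlier in the thread; the helper name report_cflags is invented here, and the rebuild commands are left commented out because they need the actual source tree:

```shell
#!/bin/sh
# Hypothetical helper: show what CFLAGS currently holds so it can be
# compared before and after clearing it.
report_cflags() {
  if [ -n "${CFLAGS+set}" ]; then
    echo "CFLAGS is set to: '$CFLAGS'"
  else
    echo "CFLAGS is unset"
  fi
}

report_cflags
unset CFLAGS      # remove the CFLAGS setting from the environment
report_cflags

# Then rebuild from the source directory, as in the original post:
#   cd "$NWCHEM_TOP/src"
#   make 2>&1 | tee make.log
```

Objects already compiled with the old flags may not be rebuilt automatically, so whether the tree needs cleaning first is worth checking.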

Clicked A Few Times
Quote:Edoapra Sep 20th 11:29 am
Sam99,
I have a couple of questions/suggestions:
1) When does the segmentation violation occur? In other words, do you get any NWChem output at all before the failure?

2) Most of your settings seem OK to me. The only one that might cause a problem is the CFLAGS setting.
I think that you should unset it and recompile.


Thanks for a quick reply.

1) I do. The output reaches this point:

Screening Tolerance Information
-------------------------------
Density screening/tol_rho: 1.00D-10
AO Gaussian exp screening on grid/accAOfunc: 14
CD Gaussian exp screening on grid/accCDfunc: 20
XC Gaussian exp screening on grid/accXCfunc: 20
Schwarz screening/accCoul: 1.00D-08

and then I get the error message.

2) I'll try that.

Clicked A Few Times
Hi Edo,

I tried recompiling without the CFLAGS setting and got the same error.

Forum Vet
ARMCI_NETWORK=OPENIB
Did you set ARMCI_NETWORK=OPENIB?
Could you please post the first lines of
$NWCHEM_TOP/src/tools/build/config.log?
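
A minimal sketch of gathering that information, assuming a POSIX shell. The helper name show_ga_configure is made up for illustration; it prints the first lines of a config.log and reports whether --with-openib appears among them.

```shell
#!/bin/sh
# Hypothetical helper: print the first lines of a GA config.log and say
# whether the recorded configure command includes --with-openib.
show_ga_configure() {
  log=$1
  head -n 10 "$log"
  if head -n 10 "$log" | grep -q -e '--with-openib'; then
    echo "openib: requested"
  else
    echo "openib: not requested"
  fi
}

# Usage:
#   export ARMCI_NETWORK=OPENIB
#   show_ga_configure "$NWCHEM_TOP/src/tools/build/config.log"
```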

Clicked A Few Times
I did.
Sure, here they are:

$ ../ga-5-2/configure --prefix=/path to nwchem/src/tools/install --with-tcgmsg --with-mpi=-I/opt/platform_mpi-8.01.00.00-20101215r/include/64 -L/opt/platform_mpi-8.01.00.00-20101215r/lib/linux_amd64 -lpcmpio -lpcmpi -ldl /opt/platform_mpi-8.01.00.00-20101215r --enable-peigs --enable-underscoring --disable-mpi-tests --with-scalapack8=-L/soft/intel/x86_64/2013/composer_xe_2013/composer_xe_2013.5.192/mkl/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_ilp64 -lpthread -lm --without-lapack --with-blas8=-L/soft/intel/x86_64/2013/composer_xe_2013/composer_xe_2013.5.192/mkl/lib/intel64 /soft/intel/x86_64/2013/composer_xe_2013/composer_xe_2013.5.192/mkl/lib/intel64/libmkl_blas95_ilp64.a /soft/intel/x86_64/2013/composer_xe_2013/composer_xe_2013.5.192/mkl/lib/intel64/libmkl_lapack95_ilp64.a -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lpthread -lm --with-openib=/usr/include/infiniband /usr/lib64 -libverbs CC=cc F77=ifort

Forum Vet
Debugger
Sam99,
Could you try to track down the problem using a debugger (e.g. gdb)?
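
One way to do this without attaching to a live MPI job is to enable core dumps and load the resulting core file into gdb. The sketch below assumes core files are written to the working directory (their naming is system-dependent) and uses a made-up helper name, backtrace_from_core; gdb's -batch and -ex options are standard.

```shell
#!/bin/sh
# Hypothetical helper: print a backtrace from a core file, or explain
# what is missing. $1 = executable, $2 = core file.
backtrace_from_core() {
  exe=$1
  core=$2
  if [ ! -f "$core" ]; then
    echo "no core file at $core"
    return 1
  fi
  # -batch exits after running the commands; "bt" prints the call stack
  gdb -batch -ex bt "$exe" "$core"
}

# Usage:
#   ulimit -c unlimited            # allow core dumps in the job's shell
#   ... rerun the failing calculation ...
#   backtrace_from_core "$NWCHEM_TOP/bin/LINUX64/nwchem" core.<pid>
```

The backtrace is more informative if the binary was built with debug symbols.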

