Segmentation Violation error, status=: 11


Description of cluster: Cray CS300-LC cluster with 2640 Intel Ivy Bridge processor cores and 15,600 Intel Xeon Phi cores. Each node has 64 GB of memory (for a total of 8 TB) and an FDR InfiniBand interconnect (56 Gb/s).

Compilation script:

export NWCHEM_TARGET=LINUX64
export NWCHEM_TOP=/work/mmh568/software/nwchem-6.5
export ARMCI_NETWORK=OPENIB
export ARMCI_OPENIB_DEVICE=mlx4_0
export ARMCI_DEFAULT_SHMMAX_UBOUND=65536
export USE_MPI=y
export NWCHEM_MODULES=all\ python
export USE_MPIF=y
export USE_MPIF4=y
export MPI_HOME=/usr/local/intel-2015//impi/5.0.1.035/intel64/
export MPI_INCLUDE="$MPI_HOME"/include
export MPI_LIB="$MPI_HOME"/lib
export LIBMPI="-lmpi -lmpigf -lmpigi -lrt -lpthread"
export MKLROOT=/usr/local/intel-2015/composer_xe_2015.0.090/mkl/
export SCALAPACK_LIB=" -mkl -openmp -lmkl_scalapack_ilp64 -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"
export SCALAPACK="$SCALAPACK_LIB"
export LAPACK_LIB="-mkl -openmp -lpthread -lm"
export BLAS_LIB="$LAPACK_LIB"
export BLASOPT="$LAPACK_LIB"
export USE_SCALAPACK=y
export SCALAPACK_SIZE=8
export BLAS_SIZE=8
export LAPACK_SIZE=8
export PYTHONHOME=/usr
export PYTHONVERSION=2.6
export PYTHONLIBTYPE=so
export USE_PYTHON64=y
export USE_CPPRESERVE=y
export USE_NOFSCHECK=y
export USE_OPENMP=1
export USE_OFFLOAD=1
cd $NWCHEM_TOP/src
patch -p0 < Xlmpoles_ifort15.patch
make nwchem_config
make FC=ifort CC=icc AR=xiar -j 16 |tee /work/mmh568/software/nwchem_buil.log

Shared memory information:
cat /proc/sys/kernel/shmmax
68719476736
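
As a sanity check on the shared memory settings, the sketch below converts the kernel limit shown above to megabytes and compares it with the ARMCI_DEFAULT_SHMMAX=8192 used at submission. It assumes ARMCI_DEFAULT_SHMMAX is interpreted in megabytes; the helper name bytes_to_mb is made up for illustration.

```shell
#!/bin/sh
# Convert a byte count (e.g. the contents of /proc/sys/kernel/shmmax) to whole MB.
# bytes_to_mb is a hypothetical helper, not part of NWChem or ARMCI.
bytes_to_mb() {
    echo $(( $1 / 1048576 ))
}

# Value reported by the kernel in this post; on a live node one could use:
#   shmmax_bytes=$(cat /proc/sys/kernel/shmmax)
shmmax_bytes=68719476736
shmmax_mb=$(bytes_to_mb "$shmmax_bytes")

# Assumption: ARMCI_DEFAULT_SHMMAX is given in megabytes.
requested_mb=${ARMCI_DEFAULT_SHMMAX:-8192}

if [ "$requested_mb" -le "$shmmax_mb" ]; then
    echo "ok: requested ${requested_mb} MB <= kernel limit ${shmmax_mb} MB"
else
    echo "too large: requested ${requested_mb} MB > kernel limit ${shmmax_mb} MB"
fi
```

With the numbers in this post the kernel limit works out to 65536 MB, which matches the ARMCI_DEFAULT_SHMMAX_UBOUND=65536 set at build time.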

Submission:
OMP_NUM_THREADS=4
MIC_USE_2MB_BUFFER=16K
ARMCI_OPENIB_DEVICE=mlx4_0
NWC_RANKS_PER_DEVICE=0
ARMCI_DEFAULT_SHMMAX=8192

mpirun -np 16 /home/mmh568/software/nwchem-6.5/bin/LINUX64/nwchem UO2-Water.nw >& UO2-Water.out
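
If the settings above are written as plain assignments in the job script, they may not reach the nwchem processes. A minimal sketch of the same submission with the variables exported, assuming a POSIX shell job script and that the launcher passes the exported environment on to the ranks (worth checking for the local MPI setup):

```shell
#!/bin/sh
# Export the runtime settings so they are part of the environment
# that mpirun inherits (values copied from the submission above).
export OMP_NUM_THREADS=4
export MIC_USE_2MB_BUFFER=16K
export ARMCI_OPENIB_DEVICE=mlx4_0
export NWC_RANKS_PER_DEVICE=0
export ARMCI_DEFAULT_SHMMAX=8192

NWCHEM_BIN=/home/mmh568/software/nwchem-6.5/bin/LINUX64/nwchem

# Only launch when both the launcher and the binary are present;
# otherwise report and skip, so the script can be dry-run anywhere.
if command -v mpirun >/dev/null 2>&1 && [ -x "$NWCHEM_BIN" ]; then
    # "> file 2>&1" is the POSIX spelling of the ">&" redirection used above.
    mpirun -np 16 "$NWCHEM_BIN" UO2-Water.nw > UO2-Water.out 2>&1
else
    echo "mpirun or $NWCHEM_BIN not found; skipping launch" >&2
fi
```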


Input file: I ran the same input file on Carver at NERSC without any error, but when I ran it on our university HPC cluster I got the error below. The file is available at this link:

https://drive.google.com/file/d/0BwN-PJpifw3fSnlyVE8zekN5ak0/view?usp=sharing

Error message:

Energy Calculation



         ============ Grassmann lmbfgs iteration ============
    >>>  ITERATION STARTED AT Sun Feb  1 15:50:04 2015  <<<
iter. Energy DeltaE DeltaRho
------------------------------------------------------
0:Segmentation Violation error, status=: 11
1:Segmentation Violation error, status=: 11
(rank:1 hostname:shadow-0046.hpc.msstate.edu pid:4399):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
8:Segmentation Violation error, status=: 11
8:Segmentation Violation error, status=: 11
(rank:8 hostname:shadow-0010.hpc.msstate.edu pid:4444):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
2:Segmentation Violation error, status=: 11
(rank:2 hostname:shadow-0046.hpc.msstate.edu pid:4400):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
9:Segmentation Violation error, status=: 11
9:Segmentation Violation error, status=: 11
(rank:9 hostname:shadow-0010.hpc.msstate.edu pid:4445):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
3:Segmentation Violation error, status=: 11
3:Segmentation Violation error, status=: 11
(rank:3 hostname:shadow-0046.hpc.msstate.edu pid:4401):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
10:Segmentation Violation error, status=: 11
10:Segmentation Violation error, status=: 11
10:Segmentation Violation error, status=: 11
(rank:10 hostname:shadow-0010.hpc.msstate.edu pid:4446):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
4:Segmentation Violation error, status=: 11
(rank:4 hostname:shadow-0046.hpc.msstate.edu pid:4402):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
11:Segmentation Violation error, status=: 11
11:Segmentation Violation error, status=: 11
11:Segmentation Violation error, status=: 11
(rank:11 hostname:shadow-0010.hpc.msstate.edu pid:4447):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
5:Segmentation Violation error, status=: 11
(rank:5 hostname:shadow-0046.hpc.msstate.edu pid:4403):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
12:Segmentation Violation error, status=: 11
6:Segmentation Violation error, status=: 11
(rank:6 hostname:shadow-0046.hpc.msstate.edu pid:4404):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
7:Segmentation Violation error, status=: 11
7:Segmentation Violation error, status=: 11
(rank:7 hostname:shadow-0046.hpc.msstate.edu pid:4405):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 1:: Numerical result out of range
Last System Error Message from Task 2:: Numerical result out of range
13:Segmentation Violation error, status=: 11
(rank:13 hostname:shadow-0010.hpc.msstate.edu pid:4449):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 3:: Numerical result out of range
14:Segmentation Violation error, status=: 11
(rank:14 hostname:shadow-0010.hpc.msstate.edu pid:4450):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 4:: Numerical result out of range
15:Segmentation Violation error, status=: 11
15:Segmentation Violation error, status=: 11
(rank:15 hostname:shadow-0010.hpc.msstate.edu pid:4451):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 5:: Numerical result out of range
Last System Error Message from Task 8:: Numerical result out of range
Last System Error Message from Task 7:: Numerical result out of range
Last System Error Message from Task 11:: Numerical result out of range
Last System Error Message from Task 13:: Numerical result out of range
Last System Error Message from Task 14:: Numerical result out of range
Last System Error Message from Task 15:: Numerical result out of range
12:Segmentation Violation error, status=: 11
9:Segmentation Violation error, status=: 11
14:Segmentation Violation error, status=: 11
(rank:0 hostname:shadow-0046.hpc.msstate.edu pid:4398):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
5:Segmentation Violation error, status=: 11
6:Segmentation Violation error, status=: 11
Last System Error Message from Task 0:: Numerical result out of range
0:Segmentation Violation error, status=: 11
1:Segmentation Violation error, status=: 11
2:Segmentation Violation error, status=: 11
4:Segmentation Violation error, status=: 11
11:Segmentation Violation error, status=: 11
12:Segmentation Violation error, status=: 11
(rank:12 hostname:shadow-0010.hpc.msstate.edu pid:4448):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
13:Segmentation Violation error, status=: 11
Last System Error Message from Task 12:: Numerical result out of range
application called MPI_Abort(comm=0x84000001, 11) - process 6
application called MPI_Abort(comm=0x84000001, 11) - process 7
application called MPI_Abort(comm=0x84000001, 11) - process 14
application called MPI_Abort(comm=0x84000001, 11) - process 10
application called MPI_Abort(comm=0x84000001, 11) - process 9
application called MPI_Abort(comm=0x84000001, 11) - process 3
application called MPI_Abort(comm=0x84000001, 11) - process 5
application called MPI_Abort(comm=0x84000001, 11) - process 15
application called MPI_Abort(comm=0x84000001, 11) - process 13
application called MPI_Abort(comm=0x84000001, 11) - process 11
application called MPI_Abort(comm=0x84000001, 11) - process 12
application called MPI_Abort(comm=0x84000001, 11) - process 8
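
Because the messages from different ranks are interleaved, it can help to tally them per rank. The sketch below counts the "Segmentation Violation error" lines for each rank in an output file; the function name count_segv_by_rank is made up for illustration, and it assumes each such line starts with the rank number followed by a colon, as in the output above.

```shell
#!/bin/sh
# Print "rank count" pairs, one per line, sorted by rank number.
# count_segv_by_rank is a hypothetical helper, not an NWChem tool.
count_segv_by_rank() {
    awk -F: '/Segmentation Violation error/ && $1 ~ /^[0-9]+$/ { n[$1]++ }
             END { for (r in n) print r, n[r] }' "$1" | sort -n
}

# Run on the job output only if it exists in the current directory.
if [ -f UO2-Water.out ]; then
    count_segv_by_rank UO2-Water.out
fi
```

If every rank from 0 to 15 shows up, the failure hits all processes on both nodes rather than a single node or rank.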


Could an admin please help me fix this problem?

