CentOS 5.7, RHEL 5.2: NWChem 6.1 64-bit builds OK but crashes at run time with a segmentation violation


Looks like the 6.0 version was compiled over TCP/IP sockets and is run using "parallel.x". 6.1 seems to be a serial version somehow.

Some more info is needed:

1. What kind of platform are you compiling on?

2. We need more of the output to understand where it fails. Also, search for nproc in the output: how many procs is it using?
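
A minimal sketch of the nproc search, assuming the job output was saved to a file. The helper name show_nproc and the file name 1.out are hypothetical, not from the original post:

```shell
# Hypothetical helper: print every line mentioning "nproc" in an
# NWChem output file, prefixed with its line number.
# The file name is passed in by the caller; 1.out below is an assumption.
show_nproc() {
    grep -i -n "nproc" "$1"
}

# Example call (assumed output file name):
# show_nproc 1.out
```

If nothing is printed, the job most likely stopped before NWChem wrote its job information block, which is itself useful to report.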

Bert


[QUOTE=Fiu chemistry Apr 25th 3:48 pm]Hi, I am new to the forum, though I use NWChem 6.0 quite intensively. The new DFT features in 6.1 are of interest, but jobs always crash with a Segmentation Violation error:

"0:Segmentation Violation error, status=: 11
(rank:0 hostname:lasso.bw02.fiu.edu pid:8132):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
rank 0 in job 2 lasso.bw02.fiu.edu_38648 caused collective abort of all ranks
exit status of rank 0: return code 11"

I have tried mpich2, openmpi, and mvapich2, but the job always ends with this error (c2h4 test; NWChem memory was set in the range 256 MB to 1 GB on an 8-CPU node with 8 GB of RAM). The compilation script is below (both mpich2 and NWChem 6.1 were compiled with the Intel compilers, v. 12):


export LARGE_FILES=TRUE
echo LARGE_FILES=$LARGE_FILES
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES=all
export ENABLE_COMPONENT=yes
export TCGRSH=/usr/bin/ssh

export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y

export MPI_HOME=$HOME/mpich2
export MPI_LOC=$MPI_HOME
export MPI_LIB=$MPI_LOC/lib
export MPI_INCLUDE=$MPI_LOC/include
export LIBMPI=" -L/${MPI_LIB} -lmpich -lopa -lmpl -lpthread -lrt"

make nwchem_config
make



There is a difference in job log files between 6.1 and 6.0.
6.0 job log starts with "ARMCI configured for 2 cluster nodes. Network protocol is 'TCP/IP Sockets'." followed by "argument 1 = 1.nw"
6.1 job log starts with " argument 1 = 1.nw"
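
The difference between the two logs can be checked mechanically. This is only a sketch: the helper name has_armci_banner and the log file names in the example are hypothetical, and the search string is taken from the 6.0 log line quoted above.

```shell
# Hypothetical helper: report whether a job log contains the ARMCI
# startup banner that the 6.0 log shows.
# Log file names are supplied by the caller and are assumptions here.
has_armci_banner() {
    if grep -q "ARMCI configured for" "$1"; then
        echo "$1: ARMCI banner present"
    else
        echo "$1: ARMCI banner absent"
    fi
}

# Example calls (assumed log file names):
# has_armci_banner job-6.0.log
# has_armci_banner job-6.1.log
```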

Need help,
regards