int2e_test_mem: int2e_file_open failed 0


Clicked A Few Times
I was trying to run NWChem on a C60 molecule on an MPI parallel machine and got this error:

 
      Non-variational initial energy
      ------------------------------

 Total energy =   -2232.919515
 1-e energy   =  -18521.140946
 2-e energy   =    8476.113031
 HOMO         =      -0.189597
 LUMO         =      -0.143266
 
   Time after variat. SCF:     51.1
   Time prior to 1st pass:     51.1
 ------------------------------------------------------------------------
 int2e_test_mem: int2e_file_open failed        0
 ------------------------------------------------------------------------
 ------------------------------------------------------------------------
  current input line : 
    69: task dft
 ------------------------------------------------------------------------




Any idea what is wrong?

Forum Vet
Unable to open integral file on disk
This error means that NWChem was unable to open an integral file in the scratch area specified in the input (or whatever the default is; you can find this scratch space early in the output). It could be that not all processors can write to that disk space, or that there is not enough space left to write even a small file.

Alternatively, use direct.
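
As a minimal sketch of the two options in the input deck (the path below is only a placeholder; use whatever disk area every process can actually write to):

 scratch_dir /local/scratch

 dft
   direct
 end
 task dft

With direct, the two-electron integrals are recomputed on the fly instead of being written to an integral file on disk.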

Clicked A Few Times
Thanks, that worked for a single molecule. Now I am attempting to run a larger cluster, and the job fails with:

 
      Screening Tolerance Information
      -------------------------------
          Density screening/tol_rho: 1.00D-10
          AO Gaussian exp screening on grid/accAOfunc:  14
          CD Gaussian exp screening on grid/accCDfunc:  20
          XC Gaussian exp screening on grid/accXCfunc:  20
          Schwarz screening/accCoul: 1.00D-08

 ------------------------------------------------------------------------
 dft_scf:failed duplicate     -991
 ------------------------------------------------------------------------
 ------------------------------------------------------------------------
  current input line : 
   798: task dft
 ------------------------------------------------------------------------
 ------------------------------------------------------------------------
 ------------------------------------------------------------------------



I didn't find any similar problem online and would appreciate some help.

Thanks

Forum Vet
I agree our error messages are a little cryptic sometimes. This error means the calculation ran out of global memory. I don't know what your memory settings are or how much memory you have per core, but if you are using the defaults you can see in the output what the code is using. The global memory block should be increased. If you're maxed out on memory usage, the heap block can be 100 mb or less.
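
As a sketch only (the numbers are placeholders; adjust them to what you actually have per core), the memory directive would look something like:

 memory heap 100 mb stack 600 mb global 1800 mb

which keeps the heap small and hands most of the per-core allocation to the global block.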

Bert

Clicked A Few Times
Hi Bert,

I suspected as much. After reading some of the other posts, I started experimenting with the memory settings. The machine I intend to run the jobs on has up to 120 GB of memory per node, with 32 cores per node. As I understand it, the memory setting specifies memory per core, which comes to 3.75 GB/core. Even with the default 25/25/50 heap/stack/global partitioning, that is 1.875 GB/core of global memory.
The problem then appears to be setting ARMCI_DEFAULT_SHMMAX. That would be 32 x 1.875 GB = 60000 MB, but I understand that the current limit is 8192. Is there any way to increase that, or have I got something wrong?
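
In input terms, I read that default split as something like the following (just my interpretation of the 25/25/50 partitioning at 3.75 GB/core, not something I have verified):

 memory heap 937 mb stack 937 mb global 1875 mb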

Thanks again,
Bob

Forum Vet
ARMCI_DEFAULT_SHMMAX
This is something we are addressing for our next release. If you are experienced enough to play with C code, the hard maximum is set in

src/tools/armci/src/memory/shmem.c around line 326

Do make sure that your system can handle large shared memory segments.

 $ sysctl kernel.shmmax
 kernel.shmmax = 4294967296

This value should be larger than ARMCI_DEFAULT_SHMMAX.

