GA memory error, large (RI-)MP2 job


Two specific questions. First, how strict is the ~2,800 basis function 'limit' alluded to in the manual? Second, if the RHF calculation converges and prints the vectors but then crashes during the analysis of the solution (multipoles), can I still use the vector file for the MP2 job, or will it be incomplete and therefore fail to be read correctly? I would have tried it myself already, but it's an expensive job and we have only limited time on the cluster.

Any guidance would be appreciated.

Abbreviated input file below. The direct algorithm is necessary. Total basis functions: 3,488. We can do DFT (BLYP and B3LYP) on a cluster this size using Orca; here we are just experimenting with an explicitly correlated method.

start job_name

charge -1

geometry
 *coordinates, 316 atoms*
end

basis spherical
  all_not_hydrogen library aug-cc-pvdz
  h library cc-pvdz
end

scf
  rhf
  thresh 1.0e-10
  direct
end

mp2
 tight
 freeze atomic
end

# We could just as well do direct_rimp2
task direct_mp2 energy
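
For a sense of scale, here is a rough sketch of how large one dense, double-precision N x N array is for this basis. This is my own arithmetic, not something NWChem prints; the helper name `dense_matrix_bytes` is made up for illustration, and it assumes 8 bytes per element and no linear dependencies removed from the 3,488 functions.

```python
def dense_matrix_bytes(n_basis, bytes_per_element=8):
    """Bytes needed for one dense n_basis x n_basis array (hypothetical helper)."""
    return n_basis * n_basis * bytes_per_element


n_basis = 3488  # total basis functions quoted in this post
size_bytes = dense_matrix_bytes(n_basis)
size_mib = size_bytes / 1024**2

print(f"One {n_basis} x {n_basis} double-precision array: "
      f"{size_bytes} bytes (about {size_mib:.1f} MiB)")
```

A single array of this kind is modest, so I would guess the trouble comes from larger distributed arrays or from how the memory is registered, not from one AO-sized matrix.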



Abbreviated PBS submission script. It requests 10 nodes with 12 processes per node (ppn); each node provides 48 GB of memory.

# PBS header stuff

# Common memory settings; these have worked well with everything I've run so far
ulimit -s unlimited
export ARMCI_DEFAULT_SHMMAX=4096
unset MA_USE_ARMCI_MEM
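
To make the memory budget explicit, here is a minimal sketch of the per-process share implied by the layout above (10 nodes, 12 ppn, 48 GB per node). The function name and the `reserve_fraction` parameter are my own illustration, not anything from NWChem or PBS, and 1 GB is taken as 1024 MB.

```python
def per_process_memory_mb(node_memory_gb, procs_per_node, reserve_fraction=0.0):
    """Memory share per process in MB, optionally holding back a fraction
    of each node for the OS and other overhead (hypothetical helper)."""
    usable_mb = node_memory_gb * 1024 * (1.0 - reserve_fraction)
    return usable_mb / procs_per_node


nodes = 10
procs_per_node = 12
node_memory_gb = 48

total_procs = nodes * procs_per_node
share_mb = per_process_memory_mb(node_memory_gb, procs_per_node)

print(f"{total_procs} processes, {share_mb:.0f} MB each with nothing reserved")
```

With nothing held back this works out to 4096 MB per process, the same number I pass to ARMCI_DEFAULT_SHMMAX (which, as far as I understand, is given in MB). Any realistic reserve for the OS pushes the share below that.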


The RHF converges successfully, and the job then crashes in the multipole analysis, before the MP2 module begins.

*RHF data*
*Vectors*
*Mulliken analysis*

       Multipole analysis of the density wrt the origin
       ------------------------------------------------

     L   x y z        total         open         nuclear
     -   - - -        -----         ----         -------
     0   0 0 0     -1.000000      0.000000   1059.000000

     1   1 0 0    -30.877247      0.000000      0.000000
     1   0 1 0     13.037500      0.000000      0.000000
     1   0 0 1    -26.055544      0.000000      0.000000

     2   2 0 0   -432.118567      0.000000  68489.384441
     2   1 1 0     92.050461      0.000000    104.555336
     2   1 0 1    -13.323659      0.000000    -22.521644
     2   0 2 0   -718.291153      0.000000  71458.928432
     2   0 1 1     19.661159      0.000000     18.213106
     2   0 0 2   -574.530181      0.000000  56599.014560

(rank:72 hostname:n0477.ten.osc.edu pid:25477):ARMCI DASSERT fail. ../../ga-5-1/armci/src/devices/openib/openib.c:armci_server_register_region():1124 cond:(memhdl->memhndl!=((void *)0))
(rank:36 hostname:n0480.ten.osc.edu pid:24452):ARMCI DASSERT fail. ../../ga-5-1/armci/src/devices/openib/openib.c:armci_server_register_region():1124 cond:(memhdl->memhndl!=((void *)0))
(rank:60 hostname:n0478.ten.osc.edu pid:302):ARMCI DASSERT fail. ../../ga-5-1/armci/src/devices/openib/openib.c:armci_server_register_region():1124 cond:(memhdl->memhndl!=((void *)0))
(rank:0 hostname:n0491.ten.osc.edu pid:30001):ARMCI DASSERT fail. ../../ga-5-1/armci/src/devices/openib/openib.c:armci_server_register_region():1124 cond:(memhdl->memhndl!=((void *)0))
(rank:108 hostname:n0364.ten.osc.edu pid:22340):ARMCI DASSERT fail. ../../ga-5-1/armci/src/devices/openib/openib.c:armci_server_register_region():1124 cond:(memhdl->memhndl!=((void *)0))
(rank:84 hostname:n0376.ten.osc.edu pid:4740):ARMCI DASSERT fail. ../../ga-5-1/armci/src/devices/openib/openib.c:armci_server_register_region():1124 cond:(memhdl->memhndl!=((void *)0))
(rank:12 hostname:n0489.ten.osc.edu pid:2876):ARMCI DASSERT fail. ../../ga-5-1/armci/src/devices/openib/openib.c:armci_server_register_region():1124 cond:(memhdl->memhndl!=((void *)0))
(rank:48 hostname:n0479.ten.osc.edu pid:10756):ARMCI DASSERT fail. ../../ga-5-1/armci/src/devices/openib/openib.c:armci_server_register_region():1124 cond:(memhdl->memhndl!=((void *)0))
(rank:96 hostname:n0374.ten.osc.edu pid:9019):ARMCI DASSERT fail. ../../ga-5-1/armci/src/devices/openib/openib.c:armci_server_register_region():1124 cond:(memhdl->memhndl!=((void *)0))
(rank:24 hostname:n0482.ten.osc.edu pid:4205):ARMCI DASSERT fail. ../../ga-5-1/armci/src/devices/openib/openib.c:armci_server_register_region():1124 cond:(memhdl->memhndl!=((void *)0))