Hi,
I retried the same calculation with a significantly higher allocation of memory:
memory total 127000 mb stack 31750 mb heap 31750 mb global 63500 mb
The job got a little further but unfortunately still failed, this time with a different error message. The following lines appear after the DFT energy calculation:
stpr_wrt_fd_from_sq: overwrite of existing file:
./3DPP_TT_T_trans_ResRaman.hess
stpr_wrt_fd_dipole: overwrite of existing file
./3DPP_TT_T_trans_ResRaman.fd_ddipole
HESSIAN: the one electron contributions are done in 384.6s
HESSIAN: 2-el 1st deriv. term done in 15857.4s
HESSIAN: 2-el 2nd deriv. term done in 2744.5s
stpr_wrt_fd_from_sq: overwrite of existing file:
./3DPP_TT_T_trans_ResRaman.hess
stpr_wrt_fd_dipole: overwrite of existing file
./3DPP_TT_T_trans_ResRaman.fd_ddipole
HESSIAN: the two electron contributions are done in 19384.2s
0:CreateSharedRegion:kr_malloc failed KB=: 20528
(rank:0 hostname:cx1-11-4-3.cx1.hpc.ic.ac.uk pid:22026):ARMCI DASSERT fail. ../../ga-5-1/armci/src/memory/shmem.c:Create_Shared_Region():1188 cond:0
Last System Error Message from Task 0:: Inappropriate ioctl for device
[cli_0]: aborting job:
application called MPI_Abort(comm=0x84000001, 20528) - process 0
rank 0 in job 1 cx1-11-4-3.cx1.hpc.ic.ac.uk_59624 caused collective abort of all ranks
exit status of rank 0: return code 48
Job terminated normally
I would be very grateful for any advice.