I got the same error as Davide on a system with the following settings:
16 cores with 32 GB memory per node
kernel.shmmax = 1073741824
ARMCI_DEFAULT_SHMMAX=8092
I even tried using 100 nodes.
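One thing that may be worth checking with these settings is whether the ARMCI request fits within the kernel's per-segment limit. The sketch below assumes that kernel.shmmax is expressed in bytes and ARMCI_DEFAULT_SHMMAX in megabytes; the function name is just illustrative. I can't say whether this is related to the error, it only compares the two numbers.

```python
# Assumption: kernel.shmmax is in bytes, ARMCI_DEFAULT_SHMMAX is in megabytes.
MIB = 1024 * 1024


def armci_fits_kernel_limit(kernel_shmmax_bytes: int, armci_shmmax_mb: int) -> bool:
    """Return True if the ARMCI setting does not exceed the kernel segment limit."""
    return armci_shmmax_mb * MIB <= kernel_shmmax_bytes


if __name__ == "__main__":
    # Values from the settings listed above.
    kernel_shmmax = 1073741824
    armci_default_shmmax = 8092

    limit_mb = kernel_shmmax // MIB
    fits = armci_fits_kernel_limit(kernel_shmmax, armci_default_shmmax)
    print(f"kernel.shmmax        = {limit_mb} MB")
    print(f"ARMCI_DEFAULT_SHMMAX = {armci_default_shmmax} MB")
    print("fits within kernel limit" if fits else "exceeds kernel limit")
```

Under those unit assumptions, 1073741824 bytes is 1024 MB, so a setting of 8092 would be larger than what the kernel allows for a single segment on this system.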
I tried on a different cluster with the following settings:
8 cores with 12 GB memory per node
kernel.shmmax = 68719476736
ARMCI_DEFAULT_SHMMAX=8092
and got the following error:
Entering Davidson iterations
Restricted singlet excited states
Iter NTrls NConv DeltaV DeltaE Time
---- ------ ------ --------- --------- ---------
1 20 0 0.21E+00 0.10+100 354.2
2 60 2 0.81E-01 0.82E-02 654.3
3 94 0 0.77E-01 0.66E-02 604.7
4 134 0 0.39E-01 0.90E-02 763.2
5 174 0 0.89E-01 0.42E-02 746.7
6 214 0 0.90E-01 0.10E-01 760.4
7 254 0 0.86E-01 0.11E-01 776.5
0:Terminate signal was sent, status=: 15
(rank:0 hostname:rs538 pid:32755):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigTermHandler():472 cond:0
I also get many of the following errors in my Slurm output file:
1510
142 rep failed on Work product ndim
3 dims 37 1510
1510
136 rep failed on Work product ndim
3 dims 37 1510
1510
76 rep failed on Work product ndim
3 dims 37 1510
1510
140 rep failed on Work product ndim
3 dims 37 1510
1510
Last System Error Message from Task 68:: Numerical result out of range
Last System Error Message from Task 71:: Numerical result out of range
Last System Error Message from Task 66:: Numerical result out of range
Last System Error Message from Task 69:: Invalid argument
Last System Error Message from Task 67:: Numerical result out of range
Last System Error Message from Task 70:: Numerical result out of range
Last System Error Message from Task 64:: Numerical result out of range
Last System Error Message from Task 80:: Numerical result out of range
.
.
.
Last System Error Message from Task 106:: Numerical result out of range