Running CCSD(T) using TCE: ARMCI DASSERT fail


Dear Community members,


I have a problem running CCSD(T) with TCE: the job fails with the error message "ARMCI DASSERT fail".

Does TCE require a large stack memory?

For this job, the total number of basis functions is 779, with 68 closed-shell orbitals.

I requested 16 cores per node on Cascade. ARMCI_DEFAULT_SHMMAX was left at its default value of 32768. Can I increase it, given that it is usually set larger than or equal to 16 * global memory?

By the way, I also tried using fewer cores, but the job still failed with the same error message.
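For scale, here is a rough sketch of what the `memory` directive in the input below (stack 1400 mb, heap 200 mb, global 5400 mb) adds up to on one node. It assumes the directive applies per process and that ARMCI_DEFAULT_SHMMAX is given in MB; the function name `per_node_memory_mb` is made up for this illustration.

```python
def per_node_memory_mb(stack_mb, heap_mb, global_mb, procs_per_node):
    """Return (total MB, global-only MB) requested on a single node."""
    per_proc = stack_mb + heap_mb + global_mb
    return per_proc * procs_per_node, global_mb * procs_per_node


SHMMAX_MB = 32768  # ARMCI_DEFAULT_SHMMAX from the post; MB unit is an assumption

# Compare the request at 16 processes per node with smaller counts.
for procs in (16, 8, 4):
    total, global_only = per_node_memory_mb(1400, 200, 5400, procs)
    fits = global_only <= SHMMAX_MB
    print(f"{procs:2d} procs/node: total {total} MB, "
          f"global {global_only} MB, within SHMMAX: {fits}")
```

Under these assumptions, 16 processes per node request 86400 MB of global memory on each node, which is well above 32768.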

I have attached the input. Is there something wrong in the TCE part of it?

The input is

echo
start mo3o9.2.1-butanol.complex.ccsdt.ad.1
permanent_dir /***/***/***
memory stack 1400 mb heap 200 mb global 5400 mb noverify
title "mo3o9.2.1-butanol CCSD(T)/aug-cc-pVDZ(-PP)//B3LYP/cc-pVDZ(-PP)"
charge 0

geometry
MO -0.802370 -1.738459 1.555090
MO -2.541395 -0.551759 -1.178296
MO 0.951724 0.121925 -1.023389
O -0.557496 -3.362931 1.979747
O -0.982939 -0.855387 3.000000
O -2.324056 -1.576357 0.392038
O 0.631711 -1.131656 0.549941
O -3.264883 0.954993 -0.819329
O -3.489093 -1.347071 -2.335903
O 1.971362 -0.765834 -2.035464
O -0.790121 -0.264029 -1.769960
O 1.241220 1.748985 -1.432570
O 2.550832 0.514215 0.491306
H 2.694536 -0.281418 1.030107
C 3.798448 1.258902 0.275815
H 3.486959 2.143605 -0.279818
H 4.152005 1.562882 1.266547
C 4.849222 0.457101 -0.482208
H 5.694721 1.136119 -0.654486
H 4.457456 0.186550 -1.468308
C -0.340949 2.135280 1.793503
C -0.138749 3.602669 1.423384
H 0.588408 1.687066 2.148675
H -1.095913 2.016641 2.578199
H 0.294292 4.102907 2.300043
H 0.599969 3.669617 0.616696
O -0.742189 1.322960 0.651054
H -1.525963 1.735999 0.246872
C 5.344338 -0.798266 0.253307
H 4.512797 -1.500433 0.400808
H 5.707286 -0.520087 1.252020
C -1.428529 4.332205 1.014848
H -1.877868 3.850492 0.136341
H -2.164648 4.249174 1.824870
C 6.460092 -1.523019 -0.513704
H 7.328446 -0.870210 -0.655117
H 6.794337 -2.414566 0.025126
H 6.113628 -1.839313 -1.502965
C -1.188895 5.813294 0.687507
H -2.120904 6.310635 0.401852
H -0.775295 6.346562 1.550928
H -0.483152 5.923975 -0.142964
end

basis spherical
H library aug-cc-pVDZ

C library aug-cc-pVDZ

O library aug-cc-pVDZ

Mo library aug-cc-pVDZ-PP
end

ecp
# ECP28MDF
Mo nelec 28
Mo S
2 10.097000 180.076853
2 4.375670 24.715920
Mo P
2 9.126564 41.227678
2 8.863223 82.452670
2 4.044948 6.345092
2 3.866657 12.458423
Mo D
2 7.535754 19.308744
2 7.278976 28.977674
2 2.763205 3.189516
2 2.772085 4.700169
Mo F
2 6.306633 -7.178888
2 6.356448 -9.745978
end

scf
rhf
singlet
maxiter 100
thresh 1.0e-8
tol2e 1.0e-8
end

tce
freeze core 31
tilesize 18
2eorb
2emet 13
split 2
diis 5
ccsd(t)
maxiter 60
thresh 1e-6
lshift 0.3
end

task tce energy

As for the error, the output file always ends with something like this:



...................


Global array virtual files algorithm will be used

Parallel file system coherency ......... OK

Integral file          = ./mo3o9.2.1-butanol.complex.ccsdt.ad.1.aoints.000
Record size in doubles = 65536 No. of integs per rec = 32766
Max. records in memory = 2759 Max. records in file = 30427
No. of bits per label = 16 No. of bits per value = 64


#quartets = 8.903D+08 #integrals = 2.787D+10 #direct =  0.0% #cached =100.0%


File balance: exchanges= 6119 moved= 77032 time= 0.9


Fock matrix recomputed
1-e file size = 556516
1-e file name = ./mo3o9.2.1-butanol.
Cpu & wall time / sec 22.0 22.4
4-electron integrals stored in orbital form

v2    file size   =      40625293636
4-index algorithm nr. 13 is used
imaxsize = 30
imaxsize ichop = 0
(rank:208 hostname:g295 pid:15781):ARMCI DASSERT fail. ../../ga-5-2/armci/src/devices/openib/openib.c:armci_server_register_region():1144 cond:(memhdl->memhndl!=((void *)0))
(rank:384 hostname:g374 pid:18705):ARMCI DASSERT fail. ../../ga-5-2/armci/src/devices/openib/openib.c:armci_server_register_region():1144 cond:(memhdl->memhndl!=((void *)0))
(rank:240 hostname:g297 pid:18816):ARMCI DASSERT fail. ../../ga-5-2/armci/src/devices/openib/openib.c:armci_server_register_region():1144 cond:(memhdl->memhndl!=((void *)0))
(rank:128 hostname:g281 pid:18699):ARMCI DASSERT fail. ../../ga-5-2/armci/src/devices/openib/openib.c:armci_server_register_region():1144 cond:(memhdl->memhndl!=((void *)0))
(rank:0 hostname:g271 pid:23598):ARMCI DASSERT fail. ../../ga-5-2/armci/src/devices/openib/openib.c:armci_server_register_region():1144 cond:(memhdl->memhndl!=((void *)0))
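As a rough size check on the numbers in this log, the sketch below converts the reported v2 file size to bytes and compares an even per-rank share with the global memory request. It assumes the v2 figure counts 8-byte double-precision words (the log line does not state a unit) and that the file is spread evenly over the ranks; the rank count of 385 is only a lower bound, taken from rank 384 appearing in the log.

```python
V2_WORDS = 40625293636     # "v2 file size" from the log; unit assumed to be doubles
BYTES_PER_WORD = 8         # size of one double-precision word
GLOBAL_MB_PER_PROC = 5400  # "global 5400 mb" from the memory directive
MIN_RANKS = 385            # rank 384 appears in the log and ranks start at 0


def v2_share_mb(n_ranks):
    """MB of the v2 file each rank would hold under an even split."""
    return V2_WORDS * BYTES_PER_WORD / n_ranks / 1024**2


v2_gib = V2_WORDS * BYTES_PER_WORD / 1024**3
print(f"v2 total: {v2_gib:.1f} GiB")
print(f"even share at {MIN_RANKS} ranks: {v2_share_mb(MIN_RANKS):.0f} MB "
      f"of {GLOBAL_MB_PER_PROC} MB global per rank")
```

If those assumptions hold, the integral file alone would fit in the aggregate global memory, although TCE also keeps amplitudes and intermediates there, so this is only a lower bound on what the run needs.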




Thanks


Zongtang

Sorry for splitting my problem across several posts; the website would not let me submit it as one.

COMEX
If you are using the latest version, perhaps you can give COMEX a try? It seems to be a very new feature, so make sure your build is sound by running the tests.
 setenv ARMCI_NETWORK OFA
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Dr. O. Baris Malcioglu,
University of Liege,
Bât. B5 Physique de la matière condensée
allée du 6 Août 17
4000 Liège 1
Belgique

