createfile: failed ga_create size=*********


Clicked A Few Times
Hi,

I'm running nwchem with input file [1]. The application ran successfully until it failed with the error below.

Does anyone have an idea what is wrong?


Global array virtual files algorithm will be used

Parallel file system coherency ......... OK

Integral file          = ./cytosine_oh.aoints.0
Record size in doubles = 65536 No. of integs per rec = 32766
Max. records in memory = 0 Max. records in file = 47733
No. of bits per label = 16 No. of bits per value = 64


#quartets = 2.379D+07 #integrals = 6.384D+08 #direct =  0.0% #cached =100.0%


File balance: exchanges= 4 moved= 327 time= 11.7


Fock matrix recomputed
1-e file size = 134162
1-e file name = ./cytosine_oh.f1
Cpu & wall time / sec 93.5 247.6

tce_ao2e: fast2e=1
half-transformed integrals in memory

2-e (intermediate) file size =     10492351200
2-e (intermediate) file name = ./cytosine_oh.v2i
available GA memory 209115096 bytes
------------------------------------------------------------------------
createfile: failed ga_create size=*********
------------------------------------------------------------------------
available GA memory 209156568 bytes
available GA memory 209156576 bytes
createfile: failed ga_create size=*********
available GA memory 209192864 bytes
createfile: failed ga_create size=*********
------------------------------------------------------------------------
------------------------------------------------------------------------
createfile: failed ga_create size=*********
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
149: task tce energy
------------------------------------------------------------------------
current input line :
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
For more information see the NWChem manual at http://nwchemgit.github.io/index.php/NWChem_Documentation


For further details see manual section:                                                                                                                              

------------------------------------------------------------------------
current input line :
------------------------------------------------------------------------
current input line :
0:
0:
0:
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
For more information see the NWChem manual at http://nwchemgit.github.io/index.php/NWChem_Documentation
For more information see the NWChem manual at http://nwchemgit.github.io/index.php/NWChem_Documentation

------------------------------------------------------------------------


For further details see manual section:                                                                                                                              


For further details see manual section:                                                                                                                              

------------------------------------------------------------------------
------------------------------------------------------------------------
For more information see the NWChem manual at http://nwchemgit.github.io/index.php/NWChem_Documentation


For further details see manual section:                                                                                                                              

0:0:createfile: failed ga_create size=:: 1902416608
1:1:createfile: failed ga_create size=:: 1902416608
(rank:0 hostname:cs-nsl17 pid:17269):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/armci.c:ARMCI_Error():208 cond:0
(rank:1 hostname:cs-nsl17 pid:17270):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/armci.c:ARMCI_Error():208 cond:0
3:3:createfile: failed ga_create size=:: 1902416608
2:2:createfile: failed ga_create size=:: 1902416608
(rank:2 hostname:cs-nsl17 pid:17271):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/armci.c:ARMCI_Error():208 cond:0
(rank:3 hostname:cs-nsl17 pid:17272):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/armci.c:ARMCI_Error():208 cond:0
root@cs-nsl17:~/nwchem/nwchem-6.1.1-src/src# cat time.txt
Command exited with non-zero status 224
3220.75user 720.50system 4:07:14elapsed 26%CPU (0avgtext+0avgdata 324880maxresident)k
1315859616inputs+40717200outputs (411major+91638minor)pagefaults 0swaps

Clicked A Few Times
This is my input file

echo

start cytosine_oh

geometry units bohr noautosym
    C   -0.88585950    -2.67531900    -0.80252415
C 1.83250395 -1.94806964 -0.92232063
C 2.46979608 0.59626734 -0.06738289
N 0.81506782 2.35228727 0.44314268
C -1.72632547 1.87745746 -0.06014199
N -2.37724606 -0.45255919 -1.16224647
O -3.38053327 3.44647379 0.33508870
N 4.94688010 1.13240987 0.21358662
H -1.35612541 -4.08910710 -2.24485177
H 3.28179767 -3.38084721 -1.21836650
H -4.28198798 -0.71685946 -1.16156268
H 5.41251821 2.86592369 0.88241272
H 6.28781773 -0.21710736 0.06809531
O -1.36453915 -3.96811090 1.54700866
H -1.18842914 -2.71714154 2.88370944
end

basis spherical
H S
    33.8650140              0.0060680
5.0947880 0.0453160
1.1587860 0.2028460
0.3258400 0.5037090
H S
     0.1027410              1.0000000
H S
     0.0324000              1.0000000
H P
     1.1588000              0.1884400
0.3258000 0.8824200
H P
     0.1027000              0.1178000
0.0324000 0.0042000
#BASIS SET: (10s,6p,4d) -> [5s,3p,2d]
C S
  5240.6353000              0.0009370
782.2048000 0.0072280
178.3508300 0.0363440
50.8159420 0.1306000
16.8235620 0.3189310
C S
     6.1757760              0.4387420
2.4180490 0.2149740
C S
     0.5119000              1.0000000
C S
     0.1565900              1.0000000
C S
     0.0479000              1.0000000
C P
    18.8418000              0.0138870
4.1592400 0.0862790
1.2067100 0.2887440
0.3855400 0.4994110
C P
     0.1219400              1.0000000
C P
     0.0385680              1.0000000
C D
     1.2067000              0.2628500
0.3855000 0.8043000
C D
     0.1219000              0.6535000
0.0386000 0.8636000
#BASIS SET: (10s,6p,4d) -> [5s,3p,2d]
N S
  8104.0716000              0.0008020
1216.0215000 0.0061740
277.2342800 0.0312330
76.9040230 0.1151980
25.8744190 0.2969510
N S
     9.3467670              0.4473490
3.5797940 0.2450030
N S
     0.7396100              1.0000000
N S
     0.2226170              1.0000000
N S
     0.0670060              1.0000000
N P
    26.8689870              0.0144780
5.9912270 0.0911560
1.7508420 0.2974200
0.5605110 0.4937960
N P
     0.1759480              1.0000000
N P
     0.0552310              1.0000000
N D
     1.7508000              0.2247700
0.5605000 0.6595600
N D
     0.1795900              0.8713600
0.0552000 0.7042200
#BASIS SET: (10s,6p,4d) -> [5s,3p,2d]
O S
 10662.2850000              0.0007990
1599.7097000 0.0061530
364.7252600 0.0311570
103.6517900 0.1155960
33.9058050 0.3015520
O S
    12.2874690              0.4448700
4.7568050 0.2431720
O S
     1.0042710              1.0000000
O S
     0.3006860              1.0000000
O S
     0.0900300              1.0000000
O P
    34.8564630              0.0156480
7.8431310 0.0981970
2.3062490 0.3077680
0.7231640 0.4924700
O P
     0.2148820              1.0000000
O P
     0.0638500              1.0000000
O D
     2.3062000              0.2027000
0.7232000 0.5791000
O D
     0.2149000              0.7854500
0.0639000 0.5338700
end

scf
thresh 1e-10
tol2e 1e-10
doublet
uhf
end

tce
freeze atomic
tilesize 40
thresh 1e-4
ccsd
nroots 1
end

task tce energy

Forum Vet
Looking at your input deck, you don't specify the "memory" directive. As a result you are using the default memory settings, which are quite small (I think only 400 mb; you can check this in the first part of your output). You should try increasing the memory. I don't know how many processors you are using in total, or how many you are running on one node.

For some information on using the memory keyword, see http://nwchemgit.github.io/Special_AWCforum/st/id648/Is_there_a_systematic_way_of_... .
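
To get a feel for how far off the run was, the two numbers printed in the log above can be compared directly. This is only a rough sketch: the log does not say what unit the ga_create size is in, so both readings (bytes, or 8-byte doubles) are shown as assumptions, and the variable names are made up for illustration.

```python
# Values copied from the log in the original post.
available_ga_bytes = 209_115_096   # "available GA memory 209115096 bytes"
requested_size = 1_902_416_608     # "failed ga_create size=:: 1902416608"

# Assumption 1: the requested size is already in bytes.
ratio_if_bytes = requested_size / available_ga_bytes

# Assumption 2: the requested size counts 8-byte doubles instead.
ratio_if_doubles = requested_size * 8 / available_ga_bytes

print(f"request / available (bytes assumed):   {ratio_if_bytes:.1f}x")
print(f"request / available (doubles assumed): {ratio_if_doubles:.1f}x")
```

Under either reading the request is several times larger than the global memory that was available to each process in that run.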

Note that this is not a small, 20-minute, single-processor calculation.

Bert


[QUOTE=Dhaminah Nov 17th 7:36 am]This is my input file


Clicked A Few Times
Thanks Bert,

I'm running the application on a single-node machine. The machine has an Intel quad-core Xeon processor and 4 GB of RAM. Basically I'm running across the 4 cores.

What would be a sufficient memory configuration? Currently I have the following configuration:

memory stack 1300 mb heap 100 mb global 2000 mb noverify

Forum Vet
Please carefully read the posting I pointed to. The memory keyword is per core, so what you specify, times 4, is what you are trying to allocate. You cannot and should not over-allocate memory or use virtual memory!

You have 4 cores and ONLY 4 Gbyte, so at most you can allocate probably 850 mb per core, leaving space for the OS. So "memory heap 100 mb stack 250 mb global 500 mb" is probably the largest that could work.
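
The per-core arithmetic can be checked with a few lines of Python. This is a minimal sketch, assuming that heap, stack and global simply add up to the per-process total (consistent with the 850 mb figure above) and that 4 GB means 4096 MB; the helper name node_total_mb is hypothetical.

```python
def node_total_mb(heap_mb, stack_mb, global_mb, cores):
    """Total memory requested on the node: per-process sum times process count."""
    return (heap_mb + stack_mb + global_mb) * cores

RAM_MB = 4 * 1024   # assumption: 4 GB of RAM taken as 4096 MB
CORES = 4           # one process per core, as in this thread

# "memory stack 1300 mb heap 100 mb global 2000 mb" from the question
attempted = node_total_mb(heap_mb=100, stack_mb=1300, global_mb=2000, cores=CORES)

# "memory heap 100 mb stack 250 mb global 500 mb" suggested above
suggested = node_total_mb(heap_mb=100, stack_mb=250, global_mb=500, cores=CORES)

for label, total in (("attempted", attempted), ("suggested", suggested)):
    verdict = "fits in" if total < RAM_MB else "exceeds"
    print(f"{label}: {total} MB requested, {verdict} {RAM_MB} MB of RAM")
```

With these numbers the attempted setting asks for 13600 MB on a 4096 MB machine, while the suggested one asks for 3400 MB and leaves some room for the OS.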

Make your tilesize much smaller, maybe 5, to reduce the local memory needs.

I want to stress this again: this calculation is too big for such a small machine, and will take much longer than 20 minutes!

Bert

Quote:Dhaminah Nov 18th 3:39 am

Clicked A Few Times
I tried several setups for the memory requirements. I understand that I have very limited resources, which might make the application run for a long time (hours), yet the application fails at the same point mentioned earlier.

I set the tilesize to 5, and sometimes smaller, with no luck.

Any suggestions?! I'm out of ideas!!

Thanks

Forum Vet
Suggestion: don't do a TCE calculation, or do a much smaller molecule, like H2O, NH3 or CO2.

Or, run QA/tests/cytosine_ccsd on 4 processors with the memory set to 800 or 900 mb. I just tested this one: it does run with such limited resources, and it runs for a good 15 minutes or longer.

Bert

[QUOTE=Dhaminah Nov 19th 7:36 pm]

Clicked A Few Times
Much appreciated ...

