TCE error ival=4 problem


  • Guest -
Dear NWChem users and Developers!

As discussed in another topic, I am trying to run CCSD(T) calculations in the TCE module to obtain the electric dipole moment and polarizability. I have now finished the test calculations and turned to the actual problem. For these systems I want to use the following input:

tce
freeze atomic or core
ccsd(t)
io ga
2eorb
2emet 3
tilesize 15
end

set tce:lineresp T
set tce:afreq 0.000
set tce:respaxis T T T

task tce energy

for which I use the 6-311G** basis set. For higher-symmetry structures (C2v, D2h) the calculation converges normally. For structures with the same number of atoms but lower symmetry (C2 or Cs), where symmetry saves less work and more integrals have to be calculated explicitly, the calculations produce the error

0: error ival=4

This error always occurs in one of the first steps of the CCSD Lambda iterations, that is, after the CCSD routine has finished.
I am aware that these calculations are computationally more expensive, but even after doubling the number of processors the error keeps appearing. So far I have used
memory stack 800 mb heap 100 mb global 1000 mb
and increased it to
memory stack 2320 mb heap 180 mb global 2000 mb
but this did not solve the problem.

I would be grateful for any suggestions on how to solve this issue.

Thanks in advance and all the best

Sven

Forum Vet
Sven,

Can you please post the full input deck so we can try it and see what's wrong? Also, are you really using "freeze atomic or core" as an input line? That is not valid input and would cause problems.

Bert

  • Guest -
Hi Bert!

No, I only mentioned "core or atomic" because I was not sure whether there is a difference. I tried both options, but only one at a time. Right now I am using 80 processors with, as far as I know, 4.5 GB per processor; if that is an important parameter, I can ask the admin of the cluster. The full input is

start Si11_II_CC

memory stack 2320 mb heap 180 mb global 2000 mb

geometry units angstroms
Si -3.13917621 1.04569897 0.00000000
Si -0.48608866 -2.05930346 0.00000000
Si -1.44054750 -0.17461559 -1.23279420
Si -0.91486947 2.10257468 0.00000000
Si 0.81574414 1.05614863 -1.29483629
Si 2.74618502 1.01873330 0.00000000
Si 0.55884417 -1.29069123 -2.07231568
Si 1.92586771 -1.28938713 0.00000000
symmetry cs
end

basis spherical
 si library 6-311g**
end

tce
freeze atomic
ccsd(t)
io ga
2eorb
2emet 3
tilesize 15
end

set tce:lineresp T
set tce:afreq 0.000
set tce:respaxis T T T

task tce energy

All the best

Sven

Forum Vet
How much memory do you actually have per node, and how many processors per node?

Bert

  • Guest -
As I was not sure about the old settings on the cluster, I reran the calculation with

memory stack 800 mb heap 100 mb global 1000 mb

on 20 nodes with 4 cores each and 8 GB per node. The error still keeps appearing.

Sven
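
For readers comparing these numbers: the NWChem memory directive is applied per process, so the demand on a node is the per-process total times the number of processes on that node. The sketch below just redoes that arithmetic for the two settings mentioned in this thread, assuming 4 processes per 8 GB node as reported. The function name node_demand_mb is made up for illustration, and the result says nothing about what the OS or ARMCI need on top.

```shell
#!/bin/sh
# Per-node memory requested by an NWChem "memory" directive.
# Assumption: the directive applies to each process, so a node running
# several processes needs the per-process total that many times over.

# node_demand_mb STACK HEAP GLOBAL PROCS_PER_NODE -> prints MB per node
# (hypothetical helper, not part of NWChem)
node_demand_mb() {
    echo $(( ($1 + $2 + $3) * $4 ))
}

small=$(node_demand_mb 800 100 1000 4)     # first setting from the thread
large=$(node_demand_mb 2320 180 2000 4)    # increased setting from the thread
node_mb=$((8 * 1024))                      # 8 GB per node, as reported

echo "small setting: ${small} MB of ${node_mb} MB per node"
echo "large setting: ${large} MB of ${node_mb} MB per node"
```

With 4 processes per node, the smaller setting comes close to filling an 8 GB node, and the larger one would not fit on such a node at all, which is why the memory per node and processes per node matter here.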

Forum Vet
Hmmm, I have been running it, and so far it's working past this point. I took out the "io ga"; I'll put it back to see if this is the issue.

Bert

  • Guest -
Hmm, this is strange; maybe it is a compilation problem. Unfortunately, I do not know how the program was compiled or which libraries were used.
The rest of the error message is

CCSD Lambda iterations
---------------------------------------------
Iter Residuum Cpu Wall
---------------------------------------------
1 65.6301924841714 407.4 417.5
0: error ival=4
(rank:0 hostname:u4n075 pid:24296):ARMCI DASSERT fail. openib.c:armci_call_data_server():2010 cond:(pdscr->status==IBV_WC_SUCCESS)
4: error ival=4
(rank:4 hostname:u4n053 pid:29430):ARMCI DASSERT fail. openib.c:armci_call_data_server():2010 cond:(pdscr->status==IBV_WC_SUCCESS)

Please let me know if you are able to run the calculation with io ga. If so, I think I will have to contact the admin of the cluster.

Sven
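
When passing an error like this on to a cluster admin, it can help to list which hosts reported the ARMCI assertion. The following is only a sketch: it writes the lines quoted above into a temporary sample file (the file name is hypothetical) and extracts the hostname field. On a real job you would point grep at your own output file instead.

```shell
#!/bin/sh
# Sample lines copied from the error output in this thread; in practice
# you would use your own job output file (the name below is hypothetical).
log=armci_sample.log
cat > "$log" <<'EOF'
0: error ival=4
(rank:0 hostname:u4n075 pid:24296):ARMCI DASSERT fail. openib.c:armci_call_data_server():2010 cond:(pdscr->status==IBV_WC_SUCCESS)
4: error ival=4
(rank:4 hostname:u4n053 pid:29430):ARMCI DASSERT fail. openib.c:armci_call_data_server():2010 cond:(pdscr->status==IBV_WC_SUCCESS)
EOF

# Keep only the ARMCI assertion lines, pull out the hostname field,
# and collapse the unique host names onto one line.
failed_hosts=$(grep 'ARMCI DASSERT fail' "$log" \
    | sed 's/.*hostname:\([^ ]*\) .*/\1/' \
    | sort -u | paste -sd ' ' -)

echo "hosts reporting ARMCI failures: $failed_hosts"
rm -f "$log"
```

If the same few hosts show up in every failed run, that would point toward a node or interconnect problem rather than the input deck, but that is a guess to be checked with the admin.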

Forum Vet
Sven,

I was able to run it successfully with your input deck.

What kind of hardware are you running on?

Do you set ARMCI_DEFAULT_SHMMAX in your job script or environment?

Do you set MA_USE_ARMCI_MEM in your job script or environment?

Bert


Forum Vet
Looks like you're running on an InfiniBand cluster. Setting the following in your job script may help:

setenv ARMCI_DEFAULT_SHMMAX 2048
unsetenv MA_USE_ARMCI_MEM

If the output you posted came from a run on 80 processors, then it is running slowly. On our cluster it ran at 40 seconds a step for the Lambda iterations...

Bert
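
The two lines suggested above use csh/tcsh syntax. If your job script is written for sh or bash instead, the equivalent would look roughly like the sketch below. The variable names and the value are taken from the post above; the two echo lines are only an added sanity check.

```shell
#!/bin/sh
# Bourne-shell equivalents of the csh lines suggested in the thread.
export ARMCI_DEFAULT_SHMMAX=2048   # same value as in the csh version
unset MA_USE_ARMCI_MEM             # make sure the variable is not set

# Optional check before launching the job: print what the job will see.
echo "ARMCI_DEFAULT_SHMMAX=${ARMCI_DEFAULT_SHMMAX}"
echo "MA_USE_ARMCI_MEM=${MA_USE_ARMCI_MEM-<unset>}"
```

Depending on your MPI launcher, you may also need to make sure these settings reach the processes on all nodes, not just the shell that runs the job script.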



  • Guest -
Thanks for now!

I'll give it a try. About the speed: I know there were some issues compiling NWChem v6.0, and it took the admin quite a while. Maybe it is a general compilation problem of the program on the cluster. Would it be OK if our admin contacts you if the error keeps appearing?

Thanks

Sven

Forum Vet
Of course. Anybody who needs help can contact us.

Bert

