MK-CCSD on Kogence isn't as much faster as expected


Gets Around
Hello NWChem users,

I have the following input:

title "cubane CCSD/cc-pVDZ hessian"

memory stack 500 mb heap 500 mb global 22000 mb

geometry
 symmetry C2v
 H                     2.05734129    -1.32981120     0.02496974
 C                     1.09520984    -0.79607173    -0.03872782
 H                     2.05734129     1.32981120     0.02496974
 C                     1.09520984     0.79607173    -0.03872782
 H                     0.00000000    -1.59258557    -2.05035614
 C                     0.00000000    -1.20794408    -1.02589691
 H                     0.00000000     1.59258557    -2.05035614
 C                     0.00000000     1.20794408    -1.02589691
 H                     0.00000000    -1.45028854     1.97989592
 C                     0.00000000    -0.78106969     1.10677261
 H                     0.00000000     1.45028854     1.97989592
 C                     0.00000000     0.78106969     1.10677261
 H                    -2.05734129    -1.32981120     0.02496974
 C                    -1.09520984    -0.79607173    -0.03872782
 H                    -2.05734129     1.32981120     0.02496974
 C                    -1.09520984     0.79607173    -0.03872782
end

basis spherical
 H library cc-pVDZ
 C library cc-pVDZ
end

scf
 direct
end

tce
 mkccsd
 2emet 1
 freeze atomic
end

mrccdata
 root 1
 nref 2
 22222222222222222222222222220
 22222222222222222222222222202
end

task tce freq


I ran it on my personal 4-core computer and
got the following timings:

Symmetry of references

Ref.   1 sym:a   
Ref.   2 sym:a   
MR MkCCSD, version 1.0

Heff
=============================================
    0    1 -664.87322151    0.09099694
    0    2    0.09099694 -664.79730579

Eigenvalues (real and imaginary)
=============================================
 -664.933860013217    0.00000000
 -664.736667291187    0.00000000

Left eigenvectors
=============================================
    1   -0.83216055   -0.55453478
    2    0.55453478   -0.83216055

Right eigenvectors
=============================================
VR    1   -0.83216055   -0.55453478
VR    2    0.55453478   -0.83216055
Target root:    1

MkCC iter. #   1      -664.9338600132171      -307.3328942731508      -664.9338600132171
ddot R:  0.049763349351  1.692918082150
Iter cpu           236.1          315.6   1


I then ran it on a Kogence 128-core instance
https://kogence.com/app/jobs/files/list/-632%5ETransition_state_of_Cubane_and_azo-Cubane_t....
and got:
Symmetry of references

Ref.   1 sym:a   
Ref.   2 sym:a   
MR MkCCSD, version 1.0

Heff
=============================================
    0    1 -664.87322151    0.09099694
    0    2    0.09099694 -664.79730579

Eigenvalues (real and imaginary)
=============================================
 -664.933860013248    0.00000000
 -664.736667291216    0.00000000

Left eigenvectors
=============================================
    1   -0.83216055   -0.55453478
    2    0.55453478   -0.83216055

Right eigenvectors
=============================================
VR    1   -0.83216055   -0.55453478
VR    2    0.55453478   -0.83216055
Target root:    1

MkCC iter. #   1      -664.9338600132476      -307.3328942731813      -664.9338600132476
ddot R:  0.043047295224  1.692940929883
Iter cpu             3.7           64.3   1


The 128-core instance isn't 32 times faster; it is only about 5 times faster.

Why does this happen?

Best, Vladimir.

Forum Regular
Nothing scales perfectly. Your calculation isn't big enough to expect reasonable scaling to 128 cores.
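
To put a rough number on this, here is a minimal Python sketch of Amdahl's law. It assumes the simplest possible model, in which a fixed fraction of the work stays serial and communication overhead is ignored. The function names and the serial fractions in the loop are illustrative choices, not values measured from NWChem.

```python
def amdahl_speedup(serial_fraction, cores):
    """Ideal speedup over one core when `serial_fraction` of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def relative_speedup(serial_fraction, cores_small, cores_large):
    """Speedup expected when moving from `cores_small` to `cores_large` cores."""
    return (amdahl_speedup(serial_fraction, cores_large)
            / amdahl_speedup(serial_fraction, cores_small))


if __name__ == "__main__":
    # Illustrative serial fractions (assumptions, not measurements).
    for s in (0.0, 0.01, 0.05, 0.10):
        ratio = relative_speedup(s, 4, 128)
        print(f"serial fraction {s:4.2f}: 4 -> 128 cores gives {ratio:5.2f}x")
```

In this model a serial fraction of only about 5% is already enough to cut the 4-to-128-core gain from 32x to roughly 5x, which is the same order as what was reported above.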

Clicked A Few Times
Re: MK-CCSD on Kogence isn't as much faster as expected
Vladimir,

I suggest you try running the same problem on 8, 16 and 32 threads on Kogence and see how the computational time scales. My guess is that you will see performance improve linearly at first and sub-linearly later.

As Sean said, most computational algorithms do not scale linearly. Scaling is also problem dependent.
You also have to understand what is taking the time. If you run a very small problem on a 128-core computer, you should not expect much benefit, because the parallel overhead becomes expensive and the problem is too small to compensate for it.
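
A small Python sketch for tabulating such a scaling study is below. It is only a helper under stated assumptions: `scaling_table` is a hypothetical name, the second number on the "Iter cpu" line is assumed to be wall time per iteration, and the only data filled in are the two runs already posted in this thread (which were on different machines, so the comparison is rough). Timings for 8, 16 and 32 cores would be added to the dictionary once they are measured.

```python
def scaling_table(wall_times):
    """Return (cores, wall_time, speedup, efficiency) rows.

    `wall_times` maps core count -> wall time per iteration.
    Speedup and efficiency are relative to the smallest core count given.
    """
    base_cores = min(wall_times)
    base_time = wall_times[base_cores]
    rows = []
    for cores in sorted(wall_times):
        speedup = base_time / wall_times[cores]
        ideal = cores / base_cores          # perfect linear scaling
        rows.append((cores, wall_times[cores], speedup, speedup / ideal))
    return rows


if __name__ == "__main__":
    # Two data points from this thread; wall-time column is an assumption.
    measured = {4: 315.6, 128: 64.3}
    print(f"{'cores':>5} {'wall':>8} {'speedup':>8} {'efficiency':>10}")
    for cores, wall, speedup, eff in scaling_table(measured):
        print(f"{cores:>5} {wall:>8.1f} {speedup:>8.2f} {eff:>10.2f}")
```

An efficiency that stays near 1 at 8 and 16 cores and then drops would point to the problem size, rather than the installation, as the limit.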

Gets Around
In reply to JenniferCarter (Oct 9th, 11:13 pm):

I apologize for not formulating my question accurately.
I would like to draw attention to the fact that it is only the multireference CC iterations that do not scale linearly.

I want to ask which settings I should choose for the MRCC method to achieve the highest performance on 64-128 cores.

http://nwchemgit.github.io/index.php/Release66:TCE#State-Specific_Multireference_Coupled_Clu...

P.S. to Jennifer:
NWChem has many calculation methods and many parameters for tuning their performance. The default settings may not be a good fit.
Could you describe in more detail the topology of your computer, in particular the network through which data is exchanged between processors, and the options NWChem was compiled with to match it?

Best, Vladimir.

