Apparent bug in MP3 portion of MP4 calculation


Gets Around
echo

start molecule

title "CN"
charge 0

geometry units angstroms print xyz
 symmetry c1
 C  -0.61234453     0.00000000    -0.00000000
 N   0.52670530     0.00000000    -0.00000000
end

basis spherical
 * library cc-pVTZ
end

scf
 #uhf
 doublet
end

tce
 freeze atomic
 #mp2
 #mp3
 mp4
 #ccsd(t)
end

task tce energy


Rather to my surprise, the mp4 choice here is by far the costliest in RAM and CPU time. All tests ran with 2 CPU cores on a machine with 16 GB physical memory.

Method        GA max memory, megabytes   Wall clock time, seconds
MP2/ROHF      210                        5.0
MP3/ROHF      210                        6.7
MP4/ROHF      1361                       125.4
CCSD(T)/ROHF  194                        14.5
MP2/UHF       210                        4.9
MP3/UHF       210                        7.9
MP4/UHF       768                        26.4
CCSD(T)/UHF   194                        14.6
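
For anyone who wants to rerun the eight method/reference combinations, here is a minimal Python sketch that generates one input deck per combination. It only toggles keywords that already appear in the input above (`uhf`, `mp2`, `mp3`, `mp4`, `ccsd(t)`). The file names and the helper names `make_input` and `all_inputs` are my own invention, and I am assuming the ROHF rows correspond to leaving `uhf` commented out, as in the posted input.

```python
# Sketch: build one NWChem input per (method, reference) pair.
# Keywords are copied from the posted input; file names are hypothetical.

TEMPLATE = """echo

start molecule

title "CN"
charge 0

geometry units angstroms print xyz
 symmetry c1
 C  -0.61234453     0.00000000    -0.00000000
 N   0.52670530     0.00000000    -0.00000000
end

basis spherical
 * library cc-pVTZ
end

scf
 {uhf_line}
 doublet
end

tce
 freeze atomic
 {method}
end

task tce energy
"""

METHODS = ["mp2", "mp3", "mp4", "ccsd(t)"]
REFERENCES = ["rohf", "uhf"]


def make_input(method, reference):
    """Return the input text for one method/reference combination."""
    if method not in METHODS or reference not in REFERENCES:
        raise ValueError(f"unsupported combination: {method}/{reference}")
    # Assumption: ROHF is what you get with 'uhf' commented out, as posted.
    uhf_line = "uhf" if reference == "uhf" else "#uhf"
    return TEMPLATE.format(uhf_line=uhf_line, method=method)


def all_inputs():
    """Map a (hypothetical) file name to the input text for every combination."""
    decks = {}
    for reference in REFERENCES:
        for method in METHODS:
            tag = method.replace("(", "_").replace(")", "")
            decks[f"cn_{tag}_{reference}.nw"] = make_input(method, reference)
    return decks


if __name__ == "__main__":
    for name in all_inputs():
        print(name)
```

Each value in the returned dictionary can be written to its file name and run separately. Since every deck uses `start molecule`, run them in separate directories so the runtime files do not collide.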


It is not too surprising that the more frequently used CCSD(T) method appears to be better optimized. It is surprising just how much RAM, especially, the MP4 method takes. It makes it challenging to do even 3 heavy atoms at the triple zeta level on a workstation. Why run MP4 at all when CCSD(T) is more accurate and better optimized? I need it for reproducing others' work, and to implement composite thermochemical methods that incorporate MP4 steps.

EDIT: Is there a bug in the MP3 portion of the MP4 calculation? If I run MP3 by itself, the MP3 iterations are much faster than the MP3 portion of an MP4 calculation. The initial residuum is also considerably higher in the MP4 job. These excerpts are from the log of the triple zeta ROHF calculation, single-CPU run.

MP2 and MP3 steps of MP3 calculation:
 MBPT(2) iterations
 --------------------------------------------------------
 Iter          Residuum       Correlation     Cpu    Wall
 --------------------------------------------------------
    1   0.0689206701564  -0.3204021136745     0.0     0.0
    2   0.0378756234900  -0.3104112192009     0.0     0.0
    3   0.0316630900929  -0.3143286709281     0.0     0.0
    4   0.0289327875332  -0.3120686928720     0.0     0.0
    5   0.0270555227660  -0.3137666102899     0.0     0.0
 MICROCYCLE DIIS UPDATE:                    5                    5
    6   0.0000649608831  -0.3129966718043     0.0     0.0
    7   0.0000282990332  -0.3130017756141     0.0     0.0
    8   0.0000192080241  -0.3130016111436     0.0     0.0
    9   0.0000138044951  -0.3130017293396     0.0     0.0
   10   0.0000102865536  -0.3130016631116     0.0     0.0
 MICROCYCLE DIIS UPDATE:                   10                    5
   11   0.0000001100377  -0.3130016902778     0.0     0.0
   12   0.0000000500976  -0.3130016841266     0.0     0.0
 --------------------------------------------------------
 Iterations converged
 MBPT(2) correlation energy / hartree =        -0.313001684126637
 MBPT(2) total energy / hartree       =       -92.534571481517744

 MBPT(3) iterations
 --------------------------------------------------------
 Iter          Residuum       Correlation     Cpu    Wall
 --------------------------------------------------------
    1   0.1886264605800   0.0000000000000     0.1     0.1
    2   0.0330324992094   0.0023896036508     0.1     0.1
    3   0.0286129350116   0.0038524264758     0.1     0.1
    4   0.0326935969874   0.0025105584886     0.1     0.1
    5   0.0383018337784   0.0037702129485     0.1     0.1
 MICROCYCLE DIIS UPDATE:                    5                    5
    6   0.0002619247770   0.0031556789756     0.1     0.1
    7   0.0001527083672   0.0031643922847     0.1     0.1
    8   0.0001326724482   0.0031585938569     0.1     0.1
    9   0.0001224034744   0.0031637762660     0.1     0.1
   10   0.0001149708289   0.0031588265427     0.1     0.1
 MICROCYCLE DIIS UPDATE:                   10                    5
   11   0.0000004758086   0.0031612687709     0.1     0.1
   12   0.0000002241956   0.0031612568480     0.1     0.1
   13   0.0000002603418   0.0031612656238     0.1     0.1
   14   0.0000003050224   0.0031612580448     0.1     0.1
   15   0.0000003573232   0.0031612649036     0.1     0.1
 MICROCYCLE DIIS UPDATE:                   15                    5
   16   0.0000000018320   0.0031612615990     0.1     0.1
 --------------------------------------------------------
 Iterations converged
 MBPT(3) correlation energy / hartree =         0.003161261598981
 MBPT(3) total energy / hartree       =       -92.531410219918769


MP2 and MP3 steps of MP4 calculation, same system:
 MBPT(2) iterations
 --------------------------------------------------------
 Iter          Residuum       Correlation     Cpu    Wall
 --------------------------------------------------------
    1   0.0689206701564  -0.3204021136745     0.0     0.0
    2   0.0378756234900  -0.3104112192009     0.0     0.0
    3   0.0316630900929  -0.3143286709281     0.0     0.0
    4   0.0289327875332  -0.3120686928720     0.0     0.0
    5   0.0270555227660  -0.3137666102899     0.0     0.0
 MICROCYCLE DIIS UPDATE:                    5                    5
    6   0.0000649608831  -0.3129966718043     0.0     0.0
    7   0.0000282990332  -0.3130017756141     0.0     0.0
    8   0.0000192080241  -0.3130016111436     0.0     0.0
    9   0.0000138044951  -0.3130017293396     0.0     0.0
   10   0.0000102865536  -0.3130016631116     0.0     0.0
 MICROCYCLE DIIS UPDATE:                   10                    5
   11   0.0000001100377  -0.3130016902778     0.0     0.0
   12   0.0000000500976  -0.3130016841266     0.0     0.0
 --------------------------------------------------------
 Iterations converged
 MBPT(2) correlation energy / hartree =        -0.313001684126637
 MBPT(2) total energy / hartree       =       -92.534571481517744

 MBPT(3) iterations
 --------------------------------------------------------
 Iter          Residuum       Correlation     Cpu    Wall
 --------------------------------------------------------
    1   0.8387396286641   0.0000000000000     7.7     7.8
    2   0.0378480408875   0.0023896036265     7.7     7.8
    3   0.0278027039909   0.0037728361587     7.7     7.8
    4   0.0323941245749   0.0024673134418     7.7     7.7
    5   0.0377463575074   0.0036942361183     7.7     7.7
 MICROCYCLE DIIS UPDATE:                    5                    5
    6   0.0004768057635   0.0030932877697     7.7     7.7
    7   0.0002341277283   0.0031081837819     7.7     7.7
    8   0.0002188813889   0.0030970036896     7.7     7.7
    9   0.0002076640543   0.0031072000626     7.7     7.7
   10   0.0001976053509   0.0030976477284     7.7     7.7
 MICROCYCLE DIIS UPDATE:                   10                    5
   11   0.0000009204401   0.0031022871190     7.7     7.7
   12   0.0000002259750   0.0031022873689     7.7     7.7
   13   0.0000001624172   0.0031022948900     7.7     7.7
   14   0.0000001545262   0.0031022880045     7.7     7.7
   15   0.0000001799059   0.0031022943861     7.7     7.7
 MICROCYCLE DIIS UPDATE:                   15                    5
   16   0.0000000019473   0.0031022912791     7.7     7.7
 --------------------------------------------------------
 Iterations converged
 MBPT(3) correlation energy / hartree =         0.003102291279072
 MBPT(3) total energy / hartree       =       -92.531469190122465


In the second excerpt from the MP4 job, note that the MP3 iterations take ~70 times as long as in the stand-alone MP3 job. In fact, the MP3 iterations of the MP4 job are each considerably slower than the MP4 iterations. That can't be right.
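
To put numbers on that comparison, here is a rough Python sketch that pulls the iteration rows out of log text and compares the wall time per iteration and the initial residuum. The regular expression is only my guess at the line layout based on the excerpts above, and the two sample strings are abbreviated copies of the MBPT(3) tables (iterations 1, 2 and 16). The log reports times to 0.1 s, so the ratio is approximate.

```python
import re

# Assumed layout of an iteration row: Iter, Residuum, Correlation, Cpu, Wall.
ITER_LINE = re.compile(
    r"^\s*(\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$"
)


def parse_iterations(log_text):
    """Return a list of (iter, residuum, correlation, cpu, wall) tuples."""
    rows = []
    for line in log_text.splitlines():
        match = ITER_LINE.match(line)
        if match:  # header, dashed and DIIS lines do not match
            rows.append((
                int(match.group(1)),
                float(match.group(2)),
                float(match.group(3)),
                float(match.group(4)),
                float(match.group(5)),
            ))
    return rows


def mean_wall(rows):
    """Average wall time per iteration, in seconds."""
    return sum(row[4] for row in rows) / len(rows)


# Abbreviated MBPT(3) rows from the stand-alone MP3 job.
MP3_ALONE = """
    1   0.1886264605800   0.0000000000000     0.1     0.1
    2   0.0330324992094   0.0023896036508     0.1     0.1
 MICROCYCLE DIIS UPDATE:                    5                    5
   16   0.0000000018320   0.0031612615990     0.1     0.1
"""

# Abbreviated MBPT(3) rows from the MP4 job.
MP3_IN_MP4 = """
    1   0.8387396286641   0.0000000000000     7.7     7.8
    2   0.0378480408875   0.0023896036265     7.7     7.8
 MICROCYCLE DIIS UPDATE:                    5                    5
   16   0.0000000019473   0.0031022912791     7.7     7.7
"""

alone = parse_iterations(MP3_ALONE)
in_mp4 = parse_iterations(MP3_IN_MP4)

slowdown = mean_wall(in_mp4) / mean_wall(alone)
residuum_ratio = in_mp4[0][1] / alone[0][1]

print(f"wall time per iteration, MP4 job / MP3 job: {slowdown:.0f}x")
print(f"initial residuum, MP4 job / MP3 job: {residuum_ratio:.1f}x")
```

Feeding the full log files to `parse_iterations` would mix the MBPT(2), MBPT(3) and MBPT(4) tables together, so split the text at the "MBPT(n) iterations" headings first.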

Forum Vet
Running the input provided here seems to confirm Dr. Mernst's findings; furthermore, the MBPT(4) step runs faster than the MBPT(3) step in the MP4 calculation.

Gets Around
I do not have a doctorate. I am not even sure that it is a bug, to be honest. I don't know if the commonly cited asymptotic complexity of MP4 is valid for the more flexible MBPT4, but I would guess that MBPT4 is more demanding. I haven't examined the MBPT4 code in detail so for all I know the bulk of the work is by necessity in the MBPT3 iterations, though it leads to some confusing results.

The only MP4 issue that is clearly a bug is that it worked with 2eorb in NWChem 6.0 but has not worked with 2eorb since NWChem 6.1: http://nwchemgit.github.io/Special_AWCforum/st/id1755/bug%3A_2eorb_used_to_work_wi...

Gets Around
It's not a bug in the traditional sense. It is a defect in how Møller-Plesset methods were implemented in TCE. For example, MP4 requires the storage of quadruples amplitudes. This is not necessary for canonical methods, but TCE was designed to be fully general, and such an approach may be required for nonorthogonal basis sets.

https://github.com/jeffhammond/nwchem/blob/master/QA/tests/tce_h2o/mp4sdq_h2o.nw demonstrates the "mbpt4sdq(t)" method, which is equivalent in cost to CCSD(T). That should be the case when RHF orbitals are used, and it may also hold for UHF and/or ROHF, but I have not verified this. You can compare to GAMESS to know for sure.
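
To give a feel for why storing quadruples amplitudes hurts, here is a back-of-the-envelope Python sketch for the CN example in this thread. It counts unique spin-conserving excitation amplitudes by rank and converts them to megabytes at 8 bytes each. The orbital counts are my assumptions: 60 spherical cc-pVTZ functions (30 each for C and N), 7 alpha and 6 beta electrons for the doublet, and 2 frozen core orbitals. It says nothing about how TCE actually tiles or stores the amplitudes, so treat the output as an order-of-magnitude estimate only.

```python
from math import comb


def count_amplitudes(rank, occ_a, occ_b, vir_a, vir_b):
    """Count unique spin-conserving excitations of the given rank.

    k alpha electrons go to alpha virtuals and (rank - k) beta electrons
    go to beta virtuals; antisymmetry is handled by counting combinations.
    """
    total = 0
    for k in range(rank + 1):
        total += (
            comb(occ_a, k) * comb(vir_a, k)
            * comb(occ_b, rank - k) * comb(vir_b, rank - k)
        )
    return total


# Assumed orbital counts for CN / cc-pVTZ (spherical), doublet, frozen core.
N_BASIS = 60
N_ALPHA, N_BETA = 7, 6
N_FROZEN = 2

occ_a = N_ALPHA - N_FROZEN   # correlated alpha occupied orbitals
occ_b = N_BETA - N_FROZEN    # correlated beta occupied orbitals
vir_a = N_BASIS - N_ALPHA    # alpha virtual orbitals
vir_b = N_BASIS - N_BETA     # beta virtual orbitals

for rank, label in [(1, "singles"), (2, "doubles"), (3, "triples"), (4, "quadruples")]:
    n = count_amplitudes(rank, occ_a, occ_b, vir_a, vir_b)
    megabytes = n * 8 / 1e6  # 8-byte double per amplitude
    print(f"{label:10s} {n:12d} amplitudes  ~{megabytes:10.1f} MB")
```

The point is the growth with excitation rank: each step up multiplies the count by a large factor, which is why a method that keeps quadruples in memory behaves so differently from one that only keeps doubles.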

Forum Vet
The calculation using NWChem 6.6 finishes very fast.

Forum Vet
Even in a serial run, GAMESS finishes faster with a lot of worked out coupled cluster energies together, none of which is close to those obtained by NWChem. I think the total energy may not be very meaningful; what is important is a quantity derived from it that can be compared with an experimental result.

Gets Around
GAMESS and NWChem should agree. I think your input files are not equivalent.

Forum Vet
I have changed the default convergence limit of GAMESS to 10^-10 and only carried out ccsd(t), but I still get the same results.

Gets Around
Ok, you appear to not have spent much time thinking about how to use NWChem. CCSD(T) and CCSDT are different methods. If you want to compare NWChem CCSD(T) against GAMESS CCSD(T), you need to run CCSD(T), not CCSDT.

Forum Vet
Of course, ccsd(t) and full ccsdt are different, and what I have compared are the ccsd(t) results of GAMESS and NWChem, respectively. GAMESS cannot do ccsdt.
Why did I give the CCSDT result? It was for comparing the two methods within NWChem.

Gets Around
Then why are you talking about CCSD(T) and CCSDT w.r.t. being iterative or not?

Please just put complete input and output files for *both* codes on https://gist.github.com/ so we can resolve this.

Forum Vet
Quote: Jhammond, Nov 30th 7:03 am
Then why are you talking about CCSD(T) and CCSDT w.r.t. being iterative or not?

Please just put complete input and output files for *both* codes on https://gist.github.com/ so we can resolve this.


I think the iterative CCSDT is a benchmark in some cases, and both programs are correct.

Sorry, the website above is not accessible to me.

Gets Around
When you say things like "GAMESS finishes faster with a lot of worked out coupled cluster energies together, none of which is close to those obtained by NWChem", you give the impression that GAMESS is both faster and more accurate, which is not even wrong if you are comparing CCSD(T) to CCSDT.

This thread is far from its original purpose. Let's not comment further unless something useful is being said.

Forum Vet
I declare that no such impression was intended or implied.

