Solved: NWChem 6.3 running 2-5 times slower than 6.1.1


Edo,
I started from fresh sources and compared the patched and unpatched versions (linked against OpenBLAS or ACML).

It worked.
I get:
  patched:   10,8,6,4,3 SCF steps
  unpatched: 10,8,7,7,7,5 SCF steps

I also confirmed that unpatched 6.3, in addition to doing more SCF cycles per optimisation step, runs one extra set of cycles (iterations 1-7, then 1-7 again) during the second DFT geometry optimisation step:
(egrep "d= 0|@" test.out)

Unpatched version:
@    2    -230.10089843 -4.4D-04  0.00174  0.00035  0.00946  0.02027     30.0
 d= 0,ls=0.0,diis     1   -230.1009079200 -4.33D+02  1.34D-04  2.13D-05    29.3
 d= 0,ls=0.0,diis     2   -230.1009116988 -3.78D-06  3.31D-05  6.41D-06    30.1
 d= 0,ls=0.0,diis     3   -230.1009113295  3.69D-07  2.19D-05  8.30D-06    30.9
 d= 0,ls=0.0,diis     4   -230.1009124163 -1.09D-06  7.07D-06  5.27D-07    31.7
 d= 0,ls=0.0,diis     5   -230.1009124888 -7.24D-08  1.69D-06  8.46D-08    32.4
 d= 0,ls=0.0,diis     6   -230.1009124997 -1.10D-08  4.71D-07  7.33D-10    33.2
 d= 0,ls=0.0,diis     7   -230.1009124998 -6.67D-11  3.56D-07  2.42D-10    34.0
 d= 0,ls=0.0,diis     1   -230.1009125958 -4.33D+02  1.55D-05  2.78D-07    34.9
 d= 0,ls=0.0,diis     2   -230.1009126449 -4.91D-08  3.61D-06  8.82D-08    35.7
 d= 0,ls=0.0,diis     3   -230.1009126412  3.69D-09  2.42D-06  1.06D-07    36.5
 d= 0,ls=0.0,diis     4   -230.1009126548 -1.36D-08  7.88D-07  6.60D-09    37.3
 d= 0,ls=0.0,diis     5   -230.1009126557 -8.82D-10  2.04D-07  1.32D-09    38.0
 d= 0,ls=0.0,diis     6   -230.1009126558 -1.67D-10  3.35D-08  7.94D-11    38.8
 d= 0,ls=0.0,diis     7   -230.1009126558  1.36D-12  1.38D-08  7.66D-11    39.6
@    3    -230.10091266 -1.4D-05  0.00021  0.00004  0.00193  0.00737     44.0


Patched version:
@    2    -230.10090574 -2.5D-04  0.00126  0.00030  0.00679  0.01605     29.3
 d= 0,ls=0.0,diis     1   -230.1009101932 -4.33D+02  1.13D-04  1.13D-05    28.7
 d= 0,ls=0.0,diis     2   -230.1009123213 -2.13D-06  2.11D-05  2.38D-06    29.4
 d= 0,ls=0.0,diis     3   -230.1009122054  1.16D-07  1.34D-05  2.98D-06    30.2
 d= 0,ls=0.0,diis     4   -230.1009125831 -3.78D-07  4.50D-06  2.33D-07    31.0
@    3    -230.10091258 -6.8D-06  0.00029  0.00006  0.00150  0.00369     35.4
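
For anyone who wants to compare runs without eyeballing the grep output, here is a minimal Python sketch that counts the SCF runs printed after each "@" step line. It assumes only the line layout shown in the listings here ("@" followed by a step number, and "d= 0,ls=..." lines whose counter restarts at 1 for each new SCF run). The function name and regular expressions are my own, not anything from NWChem.

```python
import re

# '@' followed by a step number marks a geometry optimisation step line.
STEP_RE = re.compile(r"^@\s+(\d+)\s")
# 'd= 0,ls=...,diis  N' is one SCF iteration; N is the iteration counter.
ITER_RE = re.compile(r"^\s*d=\s*0,ls=\S+\s+(\d+)\s")


def scf_runs_after_each_step(lines):
    """Map each '@' step number to the lengths of the SCF runs printed after it.

    A new SCF run is assumed to start whenever the iteration counter is 1.
    """
    runs = {}
    current = None
    for line in lines:
        m = STEP_RE.match(line)
        if m:
            current = int(m.group(1))
            runs.setdefault(current, [])
            continue
        m = ITER_RE.match(line)
        if m and current is not None:
            if int(m.group(1)) == 1:
                runs[current].append(1)   # counter reset: a new SCF run begins
            elif runs[current]:
                runs[current][-1] += 1    # one more iteration in the current run
    return runs


# The patched excerpt from this post, copied verbatim.
sample = """\
@    2    -230.10090574 -2.5D-04  0.00126  0.00030  0.00679  0.01605     29.3
 d= 0,ls=0.0,diis     1   -230.1009101932 -4.33D+02  1.13D-04  1.13D-05    28.7
 d= 0,ls=0.0,diis     2   -230.1009123213 -2.13D-06  2.11D-05  2.38D-06    29.4
 d= 0,ls=0.0,diis     3   -230.1009122054  1.16D-07  1.34D-05  2.98D-06    30.2
 d= 0,ls=0.0,diis     4   -230.1009125831 -3.78D-07  4.50D-06  2.33D-07    31.0
@    3    -230.10091258 -6.8D-06  0.00029  0.00006  0.00150  0.00369     35.4
""".splitlines()

print(scf_runs_after_each_step(sample))  # {2: [4], 3: []}
```

Fed the unpatched excerpt instead, the same function reports two runs of 7 iterations after step 2, which is the extra set of cycles described above. To run it on a whole output file, pass `open("test.out")` in place of `sample`.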


This is reproducible: I've tested it on five different machines, with Intel or AMD CPUs, linked against ACML or OpenBLAS.

Anyway, the patch makes the performance of 6.3 comparable to that of 6.1.1.
Thanks again!