Hi all,
I've been running some calculations with NWChem (version 6.5) on a local multi-node cluster, and I get weird results and a lot of abnormal terminations... The following simple calculation should converge in 5 iterations.
echo
title "rhf_benchmark"
start rhf_benchmark
geometry noautoz noautosym nocenter
C 9.760800 10.312800 24.029200
C 9.415700 10.321500 21.238400
C 7.491800 8.241800 17.193500
C 6.891000 7.281600 17.961300
C 7.051900 7.269700 19.373900
C 7.425500 7.327000 22.254400
C 7.619300 7.343300 23.659900
C 8.399500 8.315500 24.236000
C 8.986300 9.289800 23.447000
C 8.810100 9.306000 22.037500
C 8.019000 8.278600 21.441200
C 10.330700 11.293700 23.276500
C 7.843600 8.264600 19.973200
C 10.175200 11.309900 21.877100
C 9.766800 11.289400 18.983400
C 9.602000 11.243900 17.573900
C 8.868700 10.252200 16.975600
C 8.269400 9.249700 17.759600
C 8.440100 9.254100 19.174100
C 9.222000 10.292300 19.789600
H 9.862800 10.328000 25.153100
H 7.394600 8.271100 16.123000
H 6.293300 6.495600 17.495200
H 6.579100 6.495600 20.017600
H 6.799000 6.571400 21.803400
H 7.116800 6.506400 24.335900
H 8.575300 8.314400 25.344800
H 10.866400 12.060200 23.881900
H 10.665200 12.125100 21.268700
H 10.390700 12.060200 19.472800
H 10.080400 12.027700 16.899900
H 8.742200 10.144000 15.911200
end
basis
* library "6-31+g*"
end
scf
direct
maxiter 500
vectors input atomic output rhf.movecs
end
task scf energy
When I run it on a single node, I get proper convergence (also checked on my workstation and on another HPC cluster I have access to):
iter energy gnorm gmax time
----- ------------------- --------- --------- --------
1 -764.1592324706 1.52D+00 2.03D-01 21.7
2 -764.3730890559 4.21D-01 4.49D-02 32.8
3 -764.3914294187 3.79D-02 7.22D-03 64.8
4 -764.3916250719 7.35D-04 1.09D-04 117.7
5 -764.3916251806 5.31D-06 9.45D-07 191.6
When I try to use multiple nodes, I get convergence problems. For this simple case it hardly matters, but larger, more demanding jobs that combine SCF and DFT steps simply crash... Here is an example of the erratic behaviour:
iter energy gnorm gmax time
----- ------------------- --------- --------- --------
1 -764.1592324706 7.03D+01 2.95D+01 8.0
Setting level-shift to 334.61 to force positive preconditioner
2 -768.1858396399 5.41D+01 3.37D+01 77.6
Setting level-shift to 9.48 to force positive preconditioner
3 -778.0312416259 3.86D+01 1.39D+01 113.7
Setting level-shift to 8.03 to force positive preconditioner
4 -779.2666815470 5.25D+01 5.02D+01 123.3
5 -751.0850942711 2.43D+01 1.27D+01 209.3
Setting level-shift to 26.03 to force positive preconditioner
6 -763.7538899332 4.15D+00 3.25D+00 218.6
7 -764.3283623955 1.10D+00 3.23D-01 227.5
8 -764.3753745707 2.56D+00 5.05D-01 231.9
Setting level-shift to 114.24 to force positive preconditioner
9 -764.3854876192 4.25D-01 7.98D-02 261.4
ga_iter_lsolve: convergence stagnant ... aborting solve
Disabled NR: increased maxiter to 510
10 -764.3906628061 8.58D-02 1.62D-02 273.5
11 -764.3913533709 1.06D-01 2.77D-02 277.7
12 -764.3915627027 3.27D-02 6.39D-03 281.8
13 -764.3916095335 2.10D-02 5.43D-03 285.9
14 -764.3916196286 9.96D-03 2.44D-03 290.0
15 -764.3916228030 2.79D-03 5.27D-04 294.2
16 -764.3916233437 2.11D-03 5.17D-04 298.2
17 -764.3916235550 5.07D-04 1.08D-04 302.3
18 -764.3916235980 4.34D-04 6.34D-05 310.1
19 -764.3916236203 2.29D-04 2.04D-05 319.1
20 -764.3916236235 8.51D-05 8.99D-06 323.1
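From what I understand, the repeated level-shift and "convergence stagnant" messages suggest the parallel Fock build is giving inconsistent results, rather than this being an intrinsically hard SCF case. In case it is relevant, this is the scf block I was considering as a next test — purely a guess on my side (tighter integral screening and a tighter convergence threshold), not something I have verified to help:

scf
 direct
 maxiter 500
 # tighter screening of the two-electron integrals than the default
 tol2e 1e-12
 # tighter convergence threshold on the orbital gradient
 thresh 1e-7
 vectors input atomic output rhf.movecs
end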
I'm fairly convinced this has to do with the way my cluster admins built NWChem. I contacted them, but they couldn't really help me... The build script they used is here:
https://github.com/UCL-RITS/rcps-buildscripts/blob/master/nwchem-6.5_install
This is the submission script I am using:
#!/bin/bash -l
#$ -S /bin/bash
#$ -l h_rt=01:00:00
#$ -l mem=1G
#$ -N nw64
#$ -pe mpi 64
#$ -wd <... directory ...>
module load python/2.7.9
module load nwchem/6.5-r26243/intel-2015-update2
module list
mpirun -np $NSLOTS -machinefile $TMPDIR/machines nwchem rhf.nw
We managed to fix the issue with the machinefile, but the calculations still do not behave as they should...
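One more thing I have been wondering about (purely an assumption on my part — I have not confirmed that our build actually uses ARMCI over shared memory): whether the ARMCI shared-memory limit on the nodes plays a role. If it does, something like this in the submission script, before the mpirun line, might be worth testing:

# hypothetical tweak, not verified for our build:
# raise the ARMCI shared-memory segment limit (value in MB)
export ARMCI_DEFAULT_SHMMAX=2048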
Any suggestions would be greatly appreciated!
Best regards,
Orestis