I ran a test job that another NWChem user has identified as sensitive to BLAS choice:
http://verahill.blogspot.com/2012/09/my-own-personal-benchmarks-for-nwchem.html
http://verahill.blogspot.com.au/2012/09/new-compute-node-using-amd-fx-8150.html
Title "Test 1"
Start biphenyl_cation_twisted-1
echo
charge 1
geometry autosym units angstrom
C 0.00000 -3.54034 0.00000
C -1.20296 -2.84049 -0.216000
C -1.20944 -1.46171 -0.206253
C 0.00000 -0.721866 0.00000
C 1.20944 -1.46171 0.206253
C 1.20296 -2.84049 0.216000
C 0.00000 0.721866 0.00000
C 1.20944 1.46171 -0.206253
C 1.20296 2.84049 -0.216000
C -1.20944 1.46171 0.206253
C 0.00000 3.54034 0.00000
C -1.20296 2.84049 0.216000
H 0.00000 -4.62590 0.00000
H -2.12200 -3.38761 -0.395378
H -2.13673 -0.938003 -0.401924
H 2.12200 -3.38761 0.395378
H 2.12200 3.38761 -0.395378
H -2.13673 0.938003 0.401924
H 0.00000 4.62590 0.00000
H -2.12200 3.38761 0.395378
H 2.13673 0.938003 -0.401924
H 2.13673 -0.938003 0.401924
end
nwpw
simulation_cell
lattice_vectors
2.000000e+01 0.000000e+00 0.000000e+00
0.000000e+00 2.000000e+01 0.000000e+00
0.000000e+00 0.000000e+00 2.000000e+01
end
mult 2
np_dimensions -1 -1
tolerances 1e-7 1e-7
end
driver
default
end
task pspw optimize
The job completed in 1640 seconds when NWChem was compiled against the Accelerate framework and in 1670 seconds when using OpenBLAS 0.28. Both runs used 4 processor cores on an Intel i7-4770HQ CPU @ 2.20GHz, running OS X Yosemite, with NWChem built using the GNU compilers. The mp2_si2h6 QA job showed a smaller performance difference. If there is an even more BLAS-heavy job I could try for comparison, I would like to run it.
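To put the two timings in perspective, here is a minimal sketch of the relative difference. The function and variable names are my own for illustration; only the 1640 s and 1670 s figures come from the runs above.

```python
def relative_slowdown(baseline_s, other_s):
    """Fractional extra wall time of `other_s` relative to `baseline_s`."""
    return (other_s - baseline_s) / baseline_s

# Wall times reported above (seconds, 4 cores)
accelerate_s = 1640.0
openblas_s = 1670.0

slowdown = relative_slowdown(accelerate_s, openblas_s)
print(f"OpenBLAS build took {slowdown:.1%} longer than the Accelerate build")
```

That is a gap of under 2% from a single run of each build, so it may well be within run-to-run noise.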
The wiki unfortunately gives bad advice for BLASOPT on Yosemite, which results not just in a slow build but (IME) in a build failure: http://nwchemgit.github.io/index.php/Compiling_NWChem