Slow frequency calculation


Gets Around
System: NWChem 6.3, 64-bit CentOS 6.5, 32 GB memory, dual quad-core Xeon node
WebMO 14 job queue manager

Input file:

scratch_dir /scratch/webmo-4656/3420
title "Pyr-6311ppG2d2p-B3LYP-OptFreq-NW"
charge 0
geometry
C 0.00000000 0.00000000 0.00000000
N 0.00000000 1.13688500 -0.69228100
C 0.00000000 2.27377000 0.00000000
C 0.00000000 2.32857900 1.38780800
C 0.00000000 1.13688500 2.09612400
C 0.00000000 -0.05480900 1.38780800
H 0.00000000 -1.01099400 1.89475400
H 0.00000000 1.13688500 3.17952800
H 0.00000000 3.28476400 1.89475400
H 0.00000000 3.18852600 -0.58437300
H 0.00000000 -0.91475600 -0.58437300
end
basis noprint
 * library 6-311++G(2d,2p)
end
dft
 XC b3lyp
 mult 1
end
task dft optimize
task dft freq



Executing command: mpirun -machinefile /scratch/webmo-4656/johnkeller/3420/nodes -np 8 /usr/local/nwchem/bin/nwchem input.inp

This job finished correctly but took 1 h 35 min, while G09 took 10 min for the same calculation. Is this speed difference normal? Can anyone see problems in the input file or the machine setup that would explain the slow execution by NWChem?

Gets Around
NWChem may be doing a little extra work if you use default tolerances in both programs: https://getd.libs.uga.edu/pdfs/papas_brian_n_200605_phd.pdf

The DFT accuracy comparison there shows that NWChem's default settings tend toward somewhat higher accuracy than Gaussian's for DFT.
Some speedups may be possible, but I don't think you are doing anything particularly wrong; slow results like these are not surprising with NWChem. Gaussian runs very efficiently on small computers, at the cost of poor parallel scaling.

NWChem's price and openness can't be beat, but I would not expect it to beat Gaussian on execution speed except for large calculations on large parallel machines. I use it on small machines nonetheless because its capabilities are rich, I don't need to tell other people to buy software licenses to reproduce my results, and I find its input syntax far more sane than that of Gaussian or GAMESS. If you already have Gaussian licenses then you needn't worry about the expense, and Gaussian is the 800-pound gorilla of published computational chemistry anyway, so you get reproducibility (of a sort) "for free."

Gets Around
I partially retract my previous remarks: this job ran on a quad-core Intel i7 laptop in 24 minutes when I used all 4 cores with NWChem 6.5. It ran in 26 minutes when I used 2 cores.
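
To put a number on those two laptop timings, here is a minimal Python sketch of the arithmetic. The function names are my own for illustration; the only inputs are the 26-minute (2 cores) and 24-minute (4 cores) wall times quoted above.

```python
def relative_speedup(t_ref_min, t_new_min):
    """Speedup of the new run relative to the reference run."""
    return t_ref_min / t_new_min


def relative_efficiency(t_ref_min, n_ref, t_new_min, n_new):
    """Observed speedup divided by the ideal speedup (n_new / n_ref)."""
    return relative_speedup(t_ref_min, t_new_min) / (n_new / n_ref)


# Wall times quoted in this post: 26 min on 2 cores, 24 min on 4 cores.
speedup = relative_speedup(26, 24)
eff = relative_efficiency(26, 2, 24, 4)

print(f"speedup 2 -> 4 cores: {speedup:.2f}x (ideal 2.00x)")
print(f"relative parallel efficiency: {eff:.0%}")
```

Doubling the core count bought only about an 8% speedup, or roughly 54% efficiency relative to the 2-core run.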

I still expect Gaussian to be faster than NWChem on small jobs like this, but your result seems excessively slow. You may have reached negative returns on parallelism by using 8 cores. The parallel scaling is clearly poor: going from 2 to 4 cores saved only 2 minutes on my laptop. What sort of speed do you get with 2 or 4 cores?
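
One way to run that comparison is a small timing harness such as the sketch below. It assumes the nwchem path and the input.inp name from the command line in the original post; the function names, the log file naming, and the optional machinefile argument are hypothetical choices of mine, and the mpirun flags may need adjusting for your MPI installation.

```python
import os
import subprocess
import time

# Executable path taken from the command line in the original post.
NWCHEM = "/usr/local/nwchem/bin/nwchem"


def build_command(n_cores, input_file="input.inp", machinefile=None):
    """Build the mpirun command for a given number of MPI processes."""
    cmd = ["mpirun"]
    if machinefile is not None:
        cmd += ["-machinefile", machinefile]
    cmd += ["-np", str(n_cores), NWCHEM, input_file]
    return cmd


def benchmark(core_counts, runner=subprocess.run, log_dir=".",
              input_file="input.inp", machinefile=None):
    """Run the job once per core count and return wall times in seconds."""
    timings = {}
    for n in core_counts:
        cmd = build_command(n, input_file, machinefile)
        log_path = os.path.join(log_dir, f"nwchem_np{n}.out")
        start = time.perf_counter()
        # NWChem writes its results to stdout, so keep them in a log file.
        with open(log_path, "w") as log:
            runner(cmd, stdout=log, stderr=subprocess.STDOUT, check=True)
        timings[n] = time.perf_counter() - start
    return timings
```

Calling `benchmark([2, 4, 8])` from the directory that holds input.inp would run the job three times and return a dictionary of wall times keyed by core count.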

EDIT: Also, given the poor parallel scaling, an older Xeon system might take longer to complete a single calculation than a recent fast laptop. What model name is reported for the CPUs in /proc/cpuinfo?
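
If it helps, the model name can be pulled out of /proc/cpuinfo with a few lines of Python. This is only a sketch; the function names are hypothetical, and it relies on the usual Linux layout of "model name : ..." lines, one per logical processor.

```python
def cpu_model_names(cpuinfo_text):
    """Count how many logical processors report each 'model name'."""
    counts = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            name = value.strip()
            counts[name] = counts.get(name, 0) + 1
    return counts


def read_cpu_models(path="/proc/cpuinfo"):
    """Read the file at `path` and summarize the CPU models it lists."""
    with open(path) as fh:
        return cpu_model_names(fh.read())
```

Running `print(read_cpu_models())` on the compute node shows each model string with the number of logical processors reporting it. Note that this count includes hyperthreads, so it can exceed the number of physical cores.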

