I agree that your runs are not using the CUDA code. If the CUDA code were being used, you would see the following lines in your output:
Using CUDA CCSD(T) code
Using 1 device per node
Instead, this is what you are getting:
Using plain CCSD(T) code
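If you want to check several output files quickly, a small shell helper along these lines may be handy. This is only a sketch: the function name `ccsd_t_backend` is made up for illustration, and it assumes the two messages appear in the output exactly as quoted above.

```shell
# Hypothetical helper (not part of NWChem): report which CCSD(T) code path
# an output file shows. Assumes the messages appear exactly as quoted above.
ccsd_t_backend() {
    out_file="$1"
    # -F matches the string literally, so the parentheses are not treated as regex syntax
    if grep -qF 'Using CUDA CCSD(T) code' "$out_file"; then
        echo cuda
    elif grep -qF 'Using plain CCSD(T) code' "$out_file"; then
        echo plain
    else
        echo unknown
    fi
}
```

For example, `ccsd_t_backend my_run.out` (the file name is a placeholder) prints `plain` for the output you posted.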
My suggestion is to recompile the CUDA code. The quickest way is the following:
export TCE_CUDA=y
cd $NWCHEM_TOP/src/tce/ccsd_t
touch `egrep -l TCE_CUDA *`
make
cd ..
touch `egrep -l TCE_CUDA *`
make
cd ..
make link
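The `touch` lines above update the timestamp of every file in the directory that mentions TCE_CUDA, so that `make` rebuilds them with the new setting. If you would like to see which files those are before touching anything, something like the following should work. It is a sketch only: `list_tce_cuda_files` is a made-up name, and it uses `grep -l` where the commands above use `egrep -l`, which gives the same result for this fixed string.

```shell
# Hypothetical helper (not part of NWChem): list, without modifying them,
# the files in a directory whose contents mention TCE_CUDA.
list_tce_cuda_files() {
    dir="$1"
    # Run in a subshell so the caller's working directory is unchanged.
    # -l prints only the names of matching files; -s suppresses error
    # messages for entries that cannot be read as files (e.g. subdirectories).
    ( cd "$dir" && grep -ls TCE_CUDA -- * )
}
```

For example, `list_tce_cuda_files "$NWCHEM_TOP/src/tce/ccsd_t"` prints the files that the first `touch` command would update.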