Re: Beta version of NWChem 6.5 available for testing


li2h2_tce_ccsd
I built the 6.5 beta with gfortran 4.8.2 and default compiler settings on 64-bit Ubuntu 14.04, using OpenBLAS 0.2.11 as the BLAS library.

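The li2h2_tce_ccsd test completes in serial execution. The test evaluation fails, but I think that is the fault of the evaluation procedure rather than of the actual output. Ignoring the first 6 normal modes, the results are in good agreement with the reference data. The only differences I can see are a 0.02 cm-1 shift in the frequency of mode 8, an overall sign flip of mode 11 (the sign of a normal-mode vector is arbitrary), and some negative zeros: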

Reference:
                   7           8           9          10          11          12
 
 Frequency        489.60      573.34      865.88      937.51     1072.22     1146.66
 
           1     0.10486     0.00000     0.00000     0.00000     0.65864    -0.69651
           2     0.00000     0.00000     0.64073    -0.65864     0.00000     0.00000
           3     0.00000     0.65862     0.00000     0.00000     0.00000     0.00000
           4     0.00000     0.00000    -0.11088     0.00000    -0.09461     0.00000
           5     0.26398     0.00000     0.00000     0.09461     0.00000     0.03974
           6     0.00000    -0.09463     0.00000     0.00000     0.00000     0.00000
           7     0.00000     0.00000     0.11088     0.00000    -0.09461     0.00000
           8    -0.26398     0.00000     0.00000     0.09461     0.00000    -0.03974
           9     0.00000    -0.09463     0.00000     0.00000     0.00000     0.00000
          10    -0.10486     0.00000     0.00000     0.00000     0.65864     0.69651
          11     0.00000     0.00000    -0.64073    -0.65864     0.00000     0.00000
          12     0.00000     0.65862     0.00000     0.00000     0.00000     0.00000


Computed by 6.5 beta:
                    7           8           9          10          11          12
 
 Frequency        489.60      573.32      865.88      937.51     1072.22     1146.66
 
           1     0.10486     0.00000    -0.00000    -0.00000    -0.65864    -0.69651
           2    -0.00000     0.00000     0.64073    -0.65864     0.00000     0.00000
           3     0.00000     0.65862     0.00000     0.00000     0.00000     0.00000
           4     0.00000     0.00000    -0.11088     0.00000     0.09461     0.00000
           5     0.26398     0.00000     0.00000     0.09461    -0.00000     0.03974
           6     0.00000    -0.09463     0.00000     0.00000     0.00000     0.00000
           7    -0.00000     0.00000     0.11088    -0.00000     0.09461    -0.00000
           8    -0.26398     0.00000     0.00000     0.09461    -0.00000    -0.03974
           9     0.00000    -0.09463     0.00000     0.00000     0.00000     0.00000
          10    -0.10486     0.00000    -0.00000     0.00000    -0.65864     0.69651
          11     0.00000     0.00000    -0.64073    -0.65864    -0.00000     0.00000
          12     0.00000     0.65862     0.00000     0.00000     0.00000     0.00000
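
To illustrate why I think the evaluation is too strict, here is a minimal Python sketch of a comparison that treats a mode and its negative as the same mode. This is not the NWChem QA script; the function name modes_match and both tolerances are my own assumptions. The data are modes 8 and 11 copied from the two tables above.

```python
import numpy as np

def modes_match(ref_freq, new_freq, ref_modes, new_modes,
                freq_tol=0.05, vec_tol=1e-4):
    """Compare normal modes column by column, ignoring the overall sign
    of each mode vector. Tolerances are illustrative assumptions."""
    ref_freq = np.asarray(ref_freq, dtype=float)
    new_freq = np.asarray(new_freq, dtype=float)
    ref_modes = np.asarray(ref_modes, dtype=float)
    new_modes = np.asarray(new_modes, dtype=float)
    if ref_modes.shape != new_modes.shape or ref_freq.shape != new_freq.shape:
        return False
    # Frequencies must agree within an absolute tolerance.
    if not np.allclose(ref_freq, new_freq, atol=freq_tol, rtol=0.0):
        return False
    for k in range(ref_modes.shape[1]):
        r, n = ref_modes[:, k], new_modes[:, k]
        same = np.allclose(r, n, atol=vec_tol, rtol=0.0)
        flipped = np.allclose(r, -n, atol=vec_tol, rtol=0.0)
        if not (same or flipped):
            return False
    return True

# Modes 8 and 11 from the tables above (one column per mode).
ref_freq = [573.34, 1072.22]
new_freq = [573.32, 1072.22]
ref_modes = np.array([
    [0.0, 0.0, 0.65862, 0.0, 0.0, -0.09463,
     0.0, 0.0, -0.09463, 0.0, 0.0, 0.65862],
    [0.65864, 0.0, 0.0, -0.09461, 0.0, 0.0,
     -0.09461, 0.0, 0.0, 0.65864, 0.0, 0.0],
]).T
new_modes = np.array([
    [0.0, 0.0, 0.65862, 0.0, 0.0, -0.09463,
     0.0, 0.0, -0.09463, 0.0, 0.0, 0.65862],
    [-0.65864, 0.0, 0.0, 0.09461, -0.0, 0.0,
     0.09461, -0.0, 0.0, -0.65864, -0.0, 0.0],
]).T

result = modes_match(ref_freq, new_freq, ref_modes, new_modes)
print(result)
```

A plain element-by-element comparison of the same arrays fails on mode 11 alone, which is the behaviour I suspect in the test evaluation. Note that this sketch does not handle degenerate modes, where the vectors can mix within the degenerate subspace.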


This test stalls partway through when run in parallel. I am running on a single machine using 2 or 4 cores, with ARMCI_NETWORK=SOCKETS. The CPUs remain pinned at 100%, but the job stops making progress after about half of the CCSD energy calculations are complete. I first noticed this problem after the introduction of commit 25937 to tce_energy.F. If I revert that one file to the prior revision but otherwise leave the 6.5 beta code as-is, the li2h2_tce_ccsd test completes normally in parallel. Strangely, none of the other TCE tests I have run stall in parallel. I don't know what is special about the interaction between this job and commit 25937.