QA tests failing

7:38:56 PM PDT - Tue, Aug 12th 2014

(I had to split my post in two because otherwise the forum gave an error)

You will see a bunch of output like this, with the most severe failures at the bottom:

[{'basic_status': 'failed',
  'name': 'tce_mrcc_bwcc_subgroups',
  'reference': '/home/niels/QA/testoutputs/tce_mrcc_bwcc_subgroups.ok.out.nwparse',
  'score': (0, 3.000053538926295e-10),
  'trial': '/home/niels/QA/testoutputs/tce_mrcc_bwcc_subgroups.out.nwparse'},
 {'basic_status': 'failed',
  'name': 'pspw_md',
  'reference': '/home/niels/QA/testoutputs/pspw_md.ok.out.nwparse',
  'score': (0, 1.9999999999242846e-05),
  'trial': '/home/niels/QA/testoutputs/pspw_md.out.nwparse'},
 {'basic_status': 'failed',
  'name': 'sadsmall',
  'reference': '/home/niels/QA/testoutputs/sadsmall.ok.out.nwparse',
  'score': (0, 0.00010000000000001674),
  'trial': '/home/niels/QA/testoutputs/sadsmall.out.nwparse'},
 {'basic_status': 'failed',
  'name': 'autosym',
  'reference': '/home/niels/QA/testoutputs/autosym.ok.out.nwparse',
  'score': (0, 0.00010000000020227162),
  'trial': '/home/niels/QA/testoutputs/autosym.out.nwparse'},
 {'basic_status': 'failed',
  'name': 'ch3radical_rot',
  'reference': '/home/niels/QA/testoutputs/ch3radical_rot.ok.out.nwparse',
  'score': (0, 0.0009999999999763531),
  'trial': '/home/niels/QA/testoutputs/ch3radical_rot.out.nwparse'},
 {'basic_status': 'failed',
  'name': 'ch3radical_unrot',
  'reference': '/home/niels/QA/testoutputs/ch3radical_unrot.ok.out.nwparse',
  'score': (0, 0.0009999999999763531),
  'trial': '/home/niels/QA/testoutputs/ch3radical_unrot.out.nwparse'},
 {'basic_status': 'failed',
  'name': 'prop_ch3f',
  'reference': '/home/niels/QA/testoutputs/prop_ch3f.ok.out.nwparse',
  'score': (0, 0.0009999999999976694),
  'trial': '/home/niels/QA/testoutputs/prop_ch3f.out.nwparse'},
 {'basic_status': 'failed',
  'name': 'h2o-response',
  'reference': '/home/niels/QA/testoutputs/h2o-response.ok.out.nwparse',
  'score': (0, 0.01200000000000001),
  'trial': '/home/niels/QA/testoutputs/h2o-response.out.nwparse'}]
Total 140 passed 132 failed 8
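
The ordering in this listing is consistent with an ascending sort on the 'score' tuple (explained just below), which is why the worst failure ends up last. I have not checked how the QA script actually orders its results, so treat this as a minimal sketch of the idea only; the helper name rank_failures is hypothetical and the three entries are copied from the output above:

```python
# Three entries copied from the listing above, deliberately out of order.
results = [
    {"name": "h2o-response", "score": (0, 0.01200000000000001)},
    {"name": "pspw_md", "score": (0, 1.9999999999242846e-05)},
    {"name": "prop_ch3f", "score": (0, 0.0009999999999976694)},
]


def rank_failures(results):
    """Hypothetical helper: least severe first, most severe last.

    Python compares tuples element by element, so any gross failure
    outranks any amount of purely numeric deviation.
    """
    return sorted(results, key=lambda r: r["score"])


for entry in rank_failures(results):
    gross, deviation = entry["score"]
    print(f"{entry['name']:<15} gross={gross} deviation={deviation:.3g}")
```

With this ordering, a test with even one gross failure sorts below every test that only has numeric deviation.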

The score for each failed test is a tuple:

(num_gross_failures, total_numeric_deviation)

Gross failures are rarer and deserve closer scrutiny. They indicate, for example, that the reference and trial outputs had different numbers of numeric values on a line of output, or that an output file is entirely missing a section that belongs in the .nwparse file. The numeric deviation part of the score is simply the sum of all absolute numeric differences between the reference and trial .nwparse files.
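
To make the scoring rule concrete, here is a rough sketch of how such a tuple could be computed from two lists of lines. This is not the actual QA script: the function names, the regular expression, and the choice to count a missing line as one gross failure are all my assumptions.

```python
import re
from itertools import zip_longest

# Assumed number format: plain decimals with an optional e/E exponent.
NUMBER = re.compile(r"[-+]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][-+]?\d+)?")


def numbers_in(line):
    """Return every numeric token found in a line, as floats."""
    return [float(token) for token in NUMBER.findall(line)]


def score(ref_lines, trial_lines):
    """Return (num_gross_failures, total_numeric_deviation)."""
    gross = 0
    deviation = 0.0
    for ref, trial in zip_longest(ref_lines, trial_lines):
        if ref is None or trial is None:
            # One file has a line the other lacks (assumed to be gross).
            gross += 1
            continue
        ref_nums, trial_nums = numbers_in(ref), numbers_in(trial)
        if len(ref_nums) != len(trial_nums):
            # Different count of numeric values on the same line.
            gross += 1
            continue
        deviation += sum(abs(r - t) for r, t in zip(ref_nums, trial_nums))
    return gross, deviation


print(score(["anisotropy = 37.128"], ["anisotropy = 37.129"]))
```

For the single differing line from prop_ch3f this gives zero gross failures and a deviation of about 0.001, in line with the score reported for that test above.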

Run 'diff' on the trial and reference files to see how bad the problem really is, e.g.:

niels@bohr:~/QA$ diff /home/niels/QA/testoutputs/prop_ch3f.ok.out.nwparse /home/niels/QA/testoutputs/prop_ch3f.out.nwparse
169c169
< anisotropy = 37.128
---
> anisotropy = 37.129

I don't think I am going to worry about such a minor difference.

This appears more serious:

niels@bohr:~/QA$ diff /home/niels/QA/testoutputs/h2o-response.ok.out.nwparse /home/niels/QA/testoutputs/h2o-response.out.nwparse
4c4
< Anisotropic = 2.693
---
> Anisotropic = 2.705

That looks significant to me. But the reference file was generated with NWChem 6.1, which is a rather old release. Has the code changed since 6.1 so that the reference value needs updating? Or is there excessive numerical error in my result, due to how the code was built? Answering the latter question is particularly difficult now that there are no official binary builds to check against.
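
One way to put the two differences above on a common scale is to look at relative rather than absolute deviation. A small sketch, using only the values from the two diffs; the helper name relative_deviation is just for illustration:

```python
def relative_deviation(reference, trial):
    """Return |trial - reference| / |reference|."""
    return abs(trial - reference) / abs(reference)


# (reference, trial) pairs taken from the two diffs above.
pairs = {
    "prop_ch3f anisotropy": (37.128, 37.129),
    "h2o-response Anisotropic": (2.693, 2.705),
}

for label, (ref, trial) in pairs.items():
    print(f"{label}: {relative_deviation(ref, trial):.4%}")
```

That works out to roughly 0.003% for prop_ch3f against roughly 0.45% for h2o-response, so the second deviation is over a hundred times larger in relative terms. It still does not say which of the two values is the right one.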