Ecce: Segmentation fault when reconnecting job monitoring


After using Ecce successfully for years, I have recently started getting a recurring segmentation fault when I reconnect job monitoring after Ecce reports a monitor error. The job state switches from error to submitted, then the following appears on the terminal:

sh: line 1: 4776 Segmentation fault (core dumped) nohup ./eccejobstore -jobId 398.erbium.etsu.edu -configFile /tmp/ecce_skirkby/jobs/Cl2-trans-cyclo_DFT_geom_aug-cc-pVQZ__xAYNzI/eccejobstore.conf -pipe /tmp/ecce_skirkby/AuthPipe.cHiZr7 > /tmp/ecce_skirkby/jobs/Cl2-trans-cyclo_DFT_geom_aug-cc-pVQZ__xAYNzI/eccejobstore.log 2>&1
sh: line 1: 4796 Segmentation fault (core dumped) nohup ./eccejobstore -jobId 398.erbium.etsu.edu -configFile /tmp/ecce_skirkby/jobs/Cl2-trans-cyclo_DFT_geom_aug-cc-pVQZ__xAYNzI/eccejobstore.conf -pipe /tmp/ecce_skirkby/AuthPipe.JLi5zX -restart > /tmp/ecce_skirkby/jobs/Cl2-trans-cyclo_DFT_geom_aug-cc-pVQZ__xAYNzI/eccejobstore.log.restart_1 2>&1
sh: line 1: 4809 Segmentation fault (core dumped) nohup ./eccejobstore -jobId 398.erbium.etsu.edu -configFile /tmp/ecce_skirkby/jobs/Cl2-trans-cyclo_DFT_geom_aug-cc-pVQZ__xAYNzI/eccejobstore.conf -pipe /tmp/ecce_skirkby/AuthPipe.Zfd2zO -restart > /tmp/ecce_skirkby/jobs/Cl2-trans-cyclo_DFT_geom_aug-cc-pVQZ__xAYNzI/eccejobstore.log.restart_2 2>&1
sh: line 1: 4824 Segmentation fault (core dumped) nohup ./eccejobstore -jobId 398.erbium.etsu.edu -configFile /tmp/ecce_skirkby/jobs/Cl2-trans-cyclo_DFT_geom_aug-cc-pVQZ__xAYNzI/eccejobstore.conf -pipe /tmp/ecce_skirkby/AuthPipe.ioqekG -restart > /tmp/ecce_skirkby/jobs/Cl2-trans-cyclo_DFT_geom_aug-cc-pVQZ__xAYNzI/eccejobstore.log.restart_3 2>&1
sh: line 1: 4845 Segmentation fault (core dumped) nohup ./eccejobstore -jobId 398.erbium.etsu.edu -configFile /tmp/ecce_skirkby/jobs/Cl2-trans-cyclo_DFT_geom_aug-cc-pVQZ__xAYNzI/eccejobstore.conf -pipe /tmp/ecce_skirkby/AuthPipe.jgN5Rz -restart > /tmp/ecce_skirkby/jobs/Cl2-trans-cyclo_DFT_geom_aug-cc-pVQZ__xAYNzI/eccejobstore.log.restart_4 2>&1

and the job stays listed as submitted until it switches back to error, regardless of whether the job is queued, running, or completed. This is happening for several jobs. Apart from the job-specific details (job ID, paths, and process IDs), the terminal output is the same each time. Any insight would be appreciated.

Thanks.

