6.6 COSMO segmentation violation


Clicked A Few Times
I just upgraded to 6.6 (Ubuntu package 6.6+r27746-2), and things seemed ok until I tried a COSMO run. My input file:


start job

memory 2500 mb
scratch_dir /scratch

geometry
 N-hi                 -1.84595152    -1.71169857    -0.80468442
 H-hi                 -2.23135939    -2.13842499    -1.63435909
 H-hi                 -2.13056247    -2.19277487     0.03616278
 C-lo                 -1.99497458     1.82126368    -1.89647314
 C-lo                 -1.97147556     0.42852243    -1.94312899
 C-lo                 -1.93620542    -0.32250179    -0.75546229
 C-lo                 -1.93262749     0.35533944     0.47496152
 C-lo                 -1.95743521     1.74856253     0.51065998
 C-lo                 -1.98501983     2.49393120    -0.67116188
 H-lo                 -2.00395507     3.58461879    -0.63815448
 H-lo                 -2.02184088     2.38682430    -2.83047244
 H-lo                 -1.97716709    -0.09286832    -2.90333700
 H-lo                 -1.90489042    -0.22309453     1.40155848
 H-lo                 -1.95262993     2.25741650     1.47728966
end

basis
  N-hi  library def2-tzvp
  H-hi  library def2-tzvp
  C-lo  library def2-svp
  H-lo  library def2-svp
end

dft
  xc m06-2x
  grid fine
  iterations 50
end

cosmo
  dielec 5.6968
end

driver
  maxiter 100
  xyz strct
end

task dft optimize



The job ends in a segmentation violation error at the end of the COSMO gas phase calculation:



   convergence    iter        energy       DeltaE   RMS-Dens  Diis-err    time
 ---------------- ----- ----------------- --------- --------- ---------  ------
     COSMO gas phase
 d= 0,ls=0.0,diis     1   -287.2831281650 -5.58D+02  5.47D-03  5.30D-01   104.3
 Grid integrated density:      50.000013592439
 Requested integration accuracy:   0.10E-06
 d= 0,ls=0.0,diis     2   -287.3369826790 -5.39D-02  1.71D-03  7.43D-02   183.5
 Grid integrated density:      50.000013799568
 Requested integration accuracy:   0.10E-06
 d= 0,ls=0.0,diis     3   -287.3411124499 -4.13D-03  8.21D-04  3.53D-02   262.6
 Grid integrated density:      50.000013805174
 Requested integration accuracy:   0.10E-06
 d= 0,ls=0.0,diis     4   -287.3445806482 -3.47D-03  3.29D-04  1.26D-03   351.3
 Grid integrated density:      50.000013816685
 Requested integration accuracy:   0.10E-06
 d= 0,ls=0.0,diis     5   -287.3447274923 -1.47D-04  1.34D-04  2.06D-04   435.1
 Grid integrated density:      50.000013812989
 Requested integration accuracy:   0.10E-06
  Resetting Diis
 d= 0,ls=0.0,diis     6   -287.3447545517 -2.71D-05  2.41D-05  1.15D-05   538.5
 Grid integrated density:      50.000013813024
 Requested integration accuracy:   0.10E-06
 d= 0,ls=0.0,diis     7   -287.3447558069 -1.26D-06  8.43D-06  9.35D-07   630.1
 Grid integrated density:      50.000013815859
 Requested integration accuracy:   0.10E-06
 d= 0,ls=0.0,diis     8   -287.3447558024  4.49D-09  4.03D-06  1.16D-06   716.6
0:Segmentation Violation error, status=: 11
(rank:0 hostname:BeastOfBurden pid:20235):ARMCI DASSERT fail. ../../ga-5-4/armci/src/common/signaltrap.c:SigSegvHandler():315 cond:0
Last System Error Message from Task 0:: Numerical result out of range
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI COMMUNICATOR 4 DUP FROM 0 
with errorcode 11.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
Last System Error Message from Task 1:: Numerical result out of range
[BeastOfBurden:20233] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[BeastOfBurden:20233] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages




When I don't use COSMO, the job succeeds. I tried changing the memory settings (see the example below), but that didn't help.
As I understand it, this package already has the cosmo_meminit patch applied (if that matters).
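
To be explicit about what I mean by "memory settings": the changes were variations on the standard NWChem memory directive, for example splitting the total into its heap/stack/global components. The line below is only an illustration of that kind of change, not the exact values from my runs:

# illustrative split of the 2500 mb total; not the actual values tried
memory heap 200 mb stack 800 mb global 1500 mb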

Forum Vet
Unfortunately, I have not been able to reproduce your failure with my 6.6 builds.
Could you try running it with the "direct" keyword added to the dft block, to see if it makes any difference? How many processors are you using?
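
To be concrete, the suggestion amounts to adding a single line to the existing dft block, roughly as sketched below (rest of the input unchanged):

dft
  xc m06-2x
  grid fine
  iterations 50
  # evaluate integrals on the fly instead of storing them
  direct
end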

Clicked A Few Times
COSMO runs
Ivo,

Out of curiosity I checked your input too. It also runs fine on my workstation with build 6.6.

Your failure report indicates an issue with the ARMCI settings of your build.

Regards
Manfred

Forum Vet
Dear Drs. Ivo, Edoapra, and Manfred,

Using NWChem 6.6 on Mac OS X El Capitan 10.11.3 with mpich 3.1.4_1 installed, a parallel run of the optimization on three cores with the original, unaltered input converged at the 9th step, and the following output was obtained:

   ...
----------------------
Optimization converged
----------------------


 Step       Energy      Delta E   Gmax     Grms     Xrms     Xmax   Walltime
 ---- ---------------- -------- -------- -------- -------- -------- --------
@   9    -287.35442308 -1.5D-07  0.00006  0.00001  0.00038  0.00179   1042.2
                                      ok       ok       ok       ok

   ...

Best regards!

Forum Vet
Debian unstable OK
I have tried the input shown here on a Debian sid/unstable installation with the 6.6+r27746-2 package, and the code did not stop with the segmentation violation reported above.

Clicked A Few Times
In that case it must be my installation.

I tried running with "direct", but it gave the same result.

I also tried reinstalling the package; that didn't help either.


My machine is a workstation with two 12-core processors with hyperthreading. With NWChem 6.5 and Open MPI 1.6 it was most efficient to run with 44 processes. NWChem 6.6 needs Open MPI 1.10, and the behavior seems to have changed a bit; I'm not sure yet what the optimal setup is now. I tried anywhere from 2 to 44 processes, but I always get the failure shown above.

Clicked A Few Times
I solved the problem by updating the libgcc1 and libgfortran3 packages and their dependencies to their latest versions.

Now everything seems to work ok.


