CCSD(T) - error for scandium (quartet) energy calculation


Clicked A Few Times
Dear developers and users of NWChem,

Please give me some advice on how to solve my problem. I think it is caused by the tce section, but I am not sure.

I got an error while running the "task tce energy" job for the bare neutral scandium atom as a quartet at the CCSD(T)/aug-cc-pVTZ level of theory. I need to compute the energies for the doublet, quartet, sextet and octet spin states. For the doublet, which is supposed to be the ground state, the job runs fine. The problem appears for the other spin states.

This is my input file, with the error at the bottom:

echo
start Sc
title "Sc, CCSD(T)/aug-cc-pVTZ "
memory stack 2000 mb heap 100 mb global 2000 mb noverify
charge 0

geometry
  Sc         0.00000000     0.00000000     0.00000000
end

basis spherical
Sc library aug-cc-pvdz
end
scf
quartet ; rohf
end

tce
freeze atomic
ccsd(t)
maxiter 1100
diis 3
lshift 0.2
thresh 1.0d-5
tilesize 20
attilesize 30
io ga
2eorb
2emet 13
end

task tce energy

=======
And the error:

MICROCYCLE DIIS UPDATE:                  1080                     3
1081 0.0016079076448 -0.0009315896341 0.2 0.3 0.0
1082 0.0016095314389 -0.0009810957862 0.3 0.3 0.0
1083 0.0016279154335 -0.0010339840807 0.3 0.3 0.0
MICROCYCLE DIIS UPDATE: 1083 3
1084 0.0016006063764 -0.0009320271033 0.3 0.3 0.0
1085 0.0016014997985 -0.0009812648545 0.2 0.3 0.0
1086 0.0016196289995 -0.0010338652003 0.3 0.3 0.0
MICROCYCLE DIIS UPDATE: 1086 3
1087 0.0015934780901 -0.0009325592240 0.3 0.3 0.0
1088 0.0015950860391 -0.0009816016329 0.2 0.3 0.0
1089 0.0016132812076 -0.0010339933662 0.2 0.3 0.0
MICROCYCLE DIIS UPDATE: 1089 3
1090 0.0015862514481 -0.0009329983491 0.2 0.3 0.0
1091 0.0015871359082 -0.0009817751789 0.2 0.3 0.0
1092 0.0016050790426 -0.0010338820084 0.2 0.3 0.0
MICROCYCLE DIIS UPDATE: 1092 3
1093 0.0015791953957 -0.0009335294641 0.3 0.3 0.0
1094 0.0015807876642 -0.0009821130121 0.2 0.3 0.0
1095 0.0015987962712 -0.0010340134423 0.2 0.3 0.0
MICROCYCLE DIIS UPDATE: 1095 3
1096 0.0015720424785 -0.0009339701313 0.3 0.3 0.0
1097 0.0015729180674 -0.0009822908877 0.3 0.3 0.0
1098 0.0015906773647 -0.0010339094215 0.2 0.3 0.0
MICROCYCLE DIIS UPDATE: 1098 3
1099 0.0015650577981 -0.0009345001829 0.2 0.3 0.0
1100 0.0015666345486 -0.0009826296927 0.2 0.3 0.0
0 ga offset 0 size_xx_perproc 35mx 4
1 ga offset 35 size_xx_perproc 35mx 4
3 ga offset 105 size_xx_perproc 36mx 4
2 ga offset 70 size_xx_perproc 35mx 4
WRITE TENSOR
filename: ./Sc.t1amp.000
unit nr: 79
file size: 35
rec_mem (KB): 1
rec_size: 128
number of tasks: 1
0 ga offset 0 size_xx_perproc 3314mx 4
WRITE TENSOR
filename: ./Sc.t2amp.000
unit nr: 80
file size: 3314
rec_mem (KB): 1
rec_size: 128
number of tasks: 26
1 ga offset 3314 size_xx_perproc 3314mx 4
2 ga offset 6628 size_xx_perproc 3314mx 4
3 ga offset 9942 size_xx_perproc 3315mx 4
------------------------------------------------------------------------
ccsd_energy_loc: maxiter exceeded 1101
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
32: task tce energy
------------------------------------------------------------------------
ccsd_energy_loc: maxiter exceeded 1101
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
0:
------------------------------------------------------------------------
------------------------------------------------------------------------
This type of error is most commonly associated with calculations not reaching
convergence criteria
----------------------
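
The iteration table above ends with the residual still near 1.6e-3, far from the requested 1.0d-5, when maxiter is reached. A minimal shell sketch for pulling that trend out of a long output file is given below. It assumes the iteration lines have the six-column layout seen in this log, with the iteration number first and the residual second; the function name `ccsd_residuals` is made up for illustration.

```shell
# Print "iteration residual" for the last N CCSD iterations of an NWChem output.
# Assumption: iteration lines have six whitespace-separated fields, starting
# with an integer iteration number followed by the residual, as in the log above.
ccsd_residuals() {
    awk 'NF == 6 && $1 ~ /^[0-9]+$/ && $2 ~ /^-?[0-9]+\.[0-9]+$/ { print $1, $2 }' "$1" |
        tail -n "${2:-10}"
}
```

For example, `ccsd_residuals Sc.out 5` would list the last five iterations. A residual that barely changes over many iterations, as here, suggests the solver is stalled rather than slowly converging, so raising maxiter alone is unlikely to help.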

Forum Vet
Try this tce input section
tce
 freeze atomic
 ccsd(t)
 maxiter 1100
 lshift 0.4
 thresh 1.0d-5
 tilesize 24
 attilesize 30
 io ga
 2eorb
 2emet 13
 diis 11
end

Clicked A Few Times
Quote:Edoapra Nov 8th 3:35 pm
Try this tce input section
tce
 freeze atomic
 ccsd(t)
 maxiter 1100
 lshift 0.4
 thresh 1.0d-5
 tilesize 24
 attilesize 30
 io ga
 2eorb
 2emet 13
 diis 11
end


Dear Edoapra,

Thank you for your suggestion. I ran the job with the new tce section, and it again ends with a different but quite similar error.
Could it also be related to memory limits?
Should I increase the number of nodes or the memory size?

I also wonder whether a universal input file can be used for other metals, or whether in NWChem the input has to differ for each atomic system, such as other 3rd row transition metals, and for each spin state.


This is my input file and, once again, the error that appears:


echo
start Sc
title "Sc, CCSD(T)/aug-cc-pVTZ energy"
memory stack 2000 mb heap 100 mb global 2000 mb noverify
charge 0
geometry
  Sc         0.00000000     0.00000000     0.00000000
end
basis spherical
Sc library aug-cc-pvdz
end
scf
quartet ; rohf
end
tce
freeze atomic
ccsd(t)
maxiter 1100
lshift 0.4
thresh 1.0d-5
tilesize 24
attilesize 30
io ga
2eorb
2emet 13
diis 11
end
task tce energy




Error:

   33a   (alpha)    17a   (alpha) ---    12a   (alpha)    11a   (alpha)        0.3877664385
33a (alpha) 32a (alpha) --- 11a (alpha) 12a (alpha) 0.1577169409
33a (alpha) 32a (alpha) --- 12a (alpha) 11a (alpha) -0.1577169330
14a (alpha) 46a (alpha) --- 11a (alpha) 12a (alpha) -0.1399880233
14a (alpha) 46a (alpha) --- 12a (alpha) 11a (alpha) 0.1399880043
15a (alpha) 45a (alpha) --- 11a (alpha) 12a (alpha) 0.1399880233
15a (alpha) 45a (alpha) --- 12a (alpha) 11a (alpha) -0.1399880043
16a (alpha) 48a (alpha) --- 11a (alpha) 12a (alpha) -0.2013751515
16a (alpha) 48a (alpha) --- 12a (alpha) 11a (alpha) 0.2013751156
17a (alpha) 47a (alpha) --- 11a (alpha) 12a (alpha) 0.2013751516
17a (alpha) 47a (alpha) --- 12a (alpha) 11a (alpha) -0.2013751155
CCSD(T)
Using plain CCSD(T) code
total no. of tasks 4
total no. of tasks / no. procs 0
wl_min 328509 8.3
wl_max 373248 8.5
thresh for no. of tasks 125
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
nwchem 0000000005476B25 Unknown Unknown Unknown
nwchem 0000000005474747 Unknown Unknown Unknown
nwchem 000000000541F924 Unknown Unknown Unknown
nwchem 000000000541F736 Unknown Unknown Unknown
nwchem 00000000053BAE06 Unknown Unknown Unknown
nwchem 00000000053C11B0 Unknown Unknown Unknown
nwchem 00000000051B2300 Unknown Unknown Unknown
nwchem 0000000001CE8363 Unknown Unknown Unknown
nwchem 000000000183ED07 Unknown Unknown Unknown
nwchem 000000000183A4A5 Unknown Unknown Unknown
nwchem 0000000000420BC7 Unknown Unknown Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
nwchem 0000000005476B25 Unknown Unknown Unknown
nwchem 0000000005474747 Unknown Unknown Unknown
nwchem 000000000541F924 Unknown Unknown Unknown
nwchem 000000000541F736 Unknown Unknown Unknown
nwchem 00000000053BAE06 Unknown Unknown Unknown
nwchem 00000000053C11B0 Unknown Unknown Unknown
nwchem 00000000051B2300 Unknown Unknown Unknown
nwchem 0000000001CE8363 Unknown Unknown Unknown

(long, long similar section here)

nwchem 000000000041FE8A Unknown Unknown Unknown
nwchem 00000000004133EF Unknown Unknown Unknown
nwchem 000000000040AA80 Unknown Unknown Unknown
nwchem 000000000040A59E Unknown Unknown Unknown
nwchem 0000000005495B00 Unknown Unknown Unknown
nwchem 000000000040A487 Unknown Unknown Unknown
srun: error: nid00626: task 97: Exited with exit code 174
srun: Terminating job step 8266983.0
srun: error: nid00611: tasks 41,44-46: Exited with exit code 174
srun: error: nid00612: tasks 64,68,74,76-78,86: Exited with exit code 174
srun: error: nid00610: tasks 0-5,7,9-13,15,21-26,28-30: Exited with exit code 174
slurmstepd: error: *** STEP 8266983.0 ON nid00610 CANCELLED AT 2017-11-09T22:04:58 ***
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: error: nid00611: tasks 32-40,42-43,47-50,52-63: Killed
srun: error: nid00610: tasks 6,8,14,16-20,27,31: Killed
srun: error: nid00612: tasks 65-67,69-73,75,79-85,87-95: Killed
srun: error: nid00626: tasks 96,98-127: Killed
srun: error: nid00611: task 51: Killed
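
On the question of reusing one input for several spin states: in the inputs posted here, the spin state is set only by the multiplicity keyword in the scf block, so the variants can be generated from a single template. The sketch below assumes the template contains the line `quartet ; rohf` and a `start` line as posted; the function name `make_spin_inputs` and the output file naming are illustrative choices. It only rewrites the input and says nothing about whether each state will converge.

```shell
# Write one NWChem input per spin state from a template input file.
# Assumptions: the template holds a "start ..." line and the line
# "quartet ; rohf" as posted above; files are named <name>_<multiplicity>.nw.
make_spin_inputs() {
    template=$1
    name=$2
    for mult in doublet quartet sextet octet; do
        # A distinct start name keeps each run's restart files separate.
        sed -e "s|^start .*|start ${name}_${mult}|" \
            -e "s|^[[:space:]]*quartet[[:space:]]*;[[:space:]]*rohf[[:space:]]*\$|${mult} ; rohf|" \
            "$template" > "${name}_${mult}.nw"
    done
}
```

Calling `make_spin_inputs Sc.nw Sc` in the working directory would write `Sc_doublet.nw`, `Sc_quartet.nw`, `Sc_sextet.nw` and `Sc_octet.nw`.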

Forum Vet
Please provide more details about your run: no. of processes used, value of ARMCI_NETWORK, etc ...

Clicked A Few Times
Quote:Edoapra Nov 10th 9:26 am
Please provide more details about your run: no. of processes used, value of ARMCI_NETWORK, etc ...



Dear Edoapra,

Thank you for your help. I guess you need to see my run script, so here it is:

#!/bin/bash -l
#SBATCH -J a1_avtz
#SBATCH -C haswell
#SBATCH -p regular
#SBATCH -N 4
#SBATCH -t 05:00:00
#SBATCH -o Sc.o%j
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=nothez@mail.com
module load nwchem
# for Edison which has 24 cores per node
# srun -n 48 nwchem test1.nw >& test1.out
# for Cori which has 32 cores per node:
srun -n 128 nwchem Sc.nw >& Sc.out

Clicked A Few Times
Something went wrong when I copied the script.
Here it is again:

#!/bin/bash -l
#SBATCH -J a1_avtz
#SBATCH -C haswell
#SBATCH -p regular
#SBATCH -N 4
#SBATCH -t 05:00:00
#SBATCH -o Sc.o%j
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=nothez@mail.com
module load nwchem
srun -n 128 nwchem Sc.nw >& Sc.out

Forum Vet
use 4 processes
NWChem 6.6 contains a bug that makes the (T) part crash when the run is oversubscribed (i.e. you are using too many processors). Please use just four processes, as in the example below:
#!/bin/bash -l
#SBATCH -J a1_avtz
#SBATCH -C haswell
#SBATCH -p regular
#SBATCH -N 1
#SBATCH -t 05:00:00
#SBATCH -o Sc.o%j
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=nothez@mail.com
module load nwchem
srun -n 4 nwchem Sc.nw >& Sc.out
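
The failing 128-process log earlier in this thread prints `total no. of tasks 4` just before the crash. One possible reading, which this thread does not confirm, is that this count indicates how many processes the (T) step can keep busy. Under that assumption, the sketch below reads the count from an earlier output so it can be compared with the value passed to `srun -n`. The function name `triples_task_count` is hypothetical.

```shell
# Print the number on the first "total no. of tasks N" line of an NWChem output.
# Assumption (not confirmed in this thread): keeping the process count at or
# below this number avoids the oversubscription described above.
triples_task_count() {
    awk '$1 == "total" && $2 == "no." && $3 == "of" && $4 == "tasks" && NF == 5 { print $5; exit }' "$1"
}
```

Running `triples_task_count Sc.out` on the failing output quoted above would print the task count reported there, which can then be checked against the process count in the run script.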

Clicked A Few Times
Quote:Edoapra Nov 13th 3:44 pm
NWChem 6.6 contains a bug that makes the (T) part crash when the run is oversubscribed (i.e. you are using too many processors). Please use just four processes, as in the example below:
#!/bin/bash -l
#SBATCH -J a1_avtz
#SBATCH -C haswell
#SBATCH -p regular
#SBATCH -N 1
#SBATCH -t 05:00:00
#SBATCH -o Sc.o%j
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=nothez@mail.com
module load nwchem
srun -n 4 nwchem Sc.nw >& Sc.out


Dear Edoapra,

I have re-run the jobs with your new TCE section and new run script.
It works for the Sc quartet state, but not for any other spin state or for other transition metals.

It usually ends with an error like the following; this particular one is for the neutral Sc atom (sextet):

      Multipole analysis of the density wrt the origin
------------------------------------------------

L x y z total open nuclear
- - - - ----- ---- -------
0 0 0 0 0.000000 -5.000000 21.000000

1 1 0 0 0.000000 0.000000 0.000000
1 0 1 0 0.000000 0.000000 0.000000
1 0 0 1 -0.000000 -0.000000 0.000000

2 2 0 0 -23.929351 -20.140909 0.000000
2 1 1 0 0.000375 -0.001093 0.000000
2 1 0 1 -0.000073 0.000211 0.000000
2 0 2 0 -23.579043 -21.160950 0.000000
2 0 1 1 -0.067669 0.197042 0.000000
2 0 0 2 -23.916280 -20.178970 0.000000

------------------------------------------------------------------------
tce energy failed 0
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
32: task tce energy
------------------------------------------------------------------------
------------------------------------------------------------------------
This type of error is most commonly associated with calculations not reaching
convergence criteria
------------------------------------------------------------------------
For more information see the NWChem manual at
http://nwchemgit.github.io/index.php/NWChem_Documentation


For further details see manual section: 




[0] Received an Error in Communication: (-1) 0:tce energy failed:
Rank 0 [Tue Nov 14 15:36:48 2017] [c6-5c2s7n1] application called MPI_Abort(comm=0x84000004, -1) - process 0
forrtl: error (76): Abort trap signal
Image PC Routine Line Source
nwchem 0000000005476B25 Unknown Unknown Unknown
nwchem 0000000005474747 Unknown Unknown Unknown
nwchem 000000000541F924 Unknown Unknown Unknown
nwchem 000000000541F736 Unknown Unknown Unknown
nwchem 00000000053BAE06 Unknown Unknown Unknown
nwchem 00000000053C19A8 Unknown Unknown Unknown
nwchem 00000000051B2300 Unknown Unknown Unknown
nwchem 00000000053AC94B Unknown Unknown Unknown
nwchem 000000000549BC58 Unknown Unknown Unknown
nwchem 0000000005235132 Unknown Unknown Unknown
nwchem 00000000052081E9 Unknown Unknown Unknown
nwchem 0000000002DC41F4 Unknown Unknown Unknown
nwchem 0000000002D6DFB7 Unknown Unknown Unknown
nwchem 0000000000B14E4B Unknown Unknown Unknown
nwchem 0000000000413D31 Unknown Unknown Unknown
nwchem 000000000040AA80 Unknown Unknown Unknown
nwchem 000000000040A59E Unknown Unknown Unknown
nwchem 0000000005495B00 Unknown Unknown Unknown
nwchem 000000000040A487 Unknown Unknown Unknown
srun: error: nid12829: task 0: Aborted
srun: Terminating job step 8366663.0
slurmstepd: error: *** STEP 8366663.0 ON nid12829 CANCELLED AT 2017-11-14T15:36:48 ***
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: error: nid12829: tasks 1-3: Killed




These jobs are only single-point energy calculations on a single bare atom.
What about optimization or frequency calculations for similar but bigger systems, like MCl or MCl3?
Is it possible to use more than 4 CPUs for CCSD(T) with an ROHF reference?

Could you also provide an all-purpose TCE section and run script for OPT and FREQ jobs?

Thank you,
best wishes
Nothez

Forum Vet
SCF convergence
Nothez,
Please have a look at the output file in its entirety.
Your failures are likely due to SCF convergence problems (often encountered with transition metals).
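
A quick way to act on this advice is to search the whole output for convergence messages, and to look at what was printed just before the first failure, rather than reading only the final error. The sketch below is a plain text search and does not rely on any specific NWChem message beyond the word "failed", which appears in the error quoted above; both function names are made up for illustration.

```shell
# List every line of an output file that mentions convergence, with line numbers.
convergence_lines() {
    grep -n -i 'converg' "$1"
}

# Show the N lines (default 20) leading up to the first line containing "failed".
failure_context() {
    grep -n -m 1 -B "${2:-20}" 'failed' "$1"
}
```

For example, `convergence_lines Sc.out` followed by `failure_context Sc.out 40` gives a first view of whether the SCF step finished cleanly before the TCE step started.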

Forum Vet
Dear Edoapra
 The inputs modified as you proposed do make NWChem 6.6 converge for the quartet Sc problem
...
     
MICROCYCLE DIIS UPDATE: 66 11
67 0.0000068048084 -0.0642225070901 0.1 0.1 0.0,

and they at least converge for doublet Sc

...
 MICROCYCLE DIIS UPDATE:                  385                   11
386 0.0000084592507 -0.0624645967774 0.4 0.4 0.2


and quartet Mn

...MICROCYCLE DIIS UPDATE: 33 11
  34   0.0000085281859  -0.1366924253622     0.4     0.4     0.2,

I chose the latter two cases arbitrarily.

The basis set used here is the same as in the original input posted by the author of the topic,
i.e., aug-cc-pvdz, which differs from the aug-cc-pVTZ he stated at the beginning.


By the way, could you please help me with the link problem in the NWChem 6.8 release candidate without reinstalling GCC?

 Thanks a lot!
Very Best Regards!

Sorry! I have just seen your suggestion for my problem.
I'll try it tomorrow.

Thanks a lot again!

