Hello all,
I have followed the cuda tce install notes and am trying to run the H2O QA test, but i get the following result
argument 1 = ccsd_test.nw
============================== echo of input deck ==============================
start tce_cuda
echo
memory stack 1000 mb heap 100 mb global 500 mb verify
geometry units bohr
O 0.00000000 0.00000000 0.22138519
H 0.00000000 -1.43013023 -0.88554075
H 0.00000000 1.43013023 -0.88554075
end
basis spherical
H library cc-pVDZ
O library cc-pVDZ
end
charge 0
scf
thresh 1.0e-10
tol2e 1.0e-10
singlet
rhf
end
tce
ccsd(t)
io ga
cuda 1
tilesize 16
end
task tce energy
================================================================================
Northwest Computational Chemistry Package (NWChem) 6.8
------------------------------------------------------
Environmental Molecular Sciences Laboratory
Pacific Northwest National Laboratory
Richland, WA 99352
Copyright (c) 1994-2017
Pacific Northwest National Laboratory
Battelle Memorial Institute
NWChem is an open-source computational chemistry package
distributed under the terms of the
Educational Community License (ECL) 2.0
A copy of the license is included with this distribution
in the LICENSE.TXT file
ACKNOWLEDGMENT
--------------
This software and its documentation were developed at the
EMSL at Pacific Northwest National Laboratory, a multiprogram
national laboratory, operated for the U.S. Department of Energy
by Battelle under Contract Number DE-AC05-76RL01830. Support
for this work was provided by the Department of Energy Office
of Biological and Environmental Research, Office of Basic
Energy Sciences, and the Office of Advanced Scientific Computing.
Job information
---------------
hostname = physdep-117-5
program = /home/afik/nwchem-6.8/bin/LINUX64/nwchem
date = Tue May 29 20:29:50 2018
compiled = Tue_May_29_20:19:21_2018
source = /home/afik/nwchem-6.8
nwchem branch = 6.8
nwchem revision = v6.8-47-gdf6c956
ga revision = ga-5.6.3
use scalapack = F
input = ccsd_test.nw
prefix = tce_cuda.
data base = ./tce_cuda.db
status = startup
nproc = 8
time left = -1s
Memory information
------------------
heap = 13107194 doubles = 100.0 Mbytes
stack = 131071999 doubles = 1000.0 Mbytes
global = 65536000 doubles = 500.0 Mbytes (distinct from heap & stack)
total = 209715193 doubles = 1600.0 Mbytes
verify = yes
hardfail = no
Directory information
---------------------
0 permanent = .
0 scratch = .
NWChem Input Module
-------------------
C2V symmetry detected
------
auto-z
------
Geometry "geometry" -> ""
-------------------------
Output coordinates in a.u. (scale by 1.000000000 to convert to a.u.)
No. Tag Charge X Y Z
---- ---------------- ---------- -------------- -------------- --------------
1 O 8.0000 0.00000000 0.00000000 0.22138519
2 H 1.0000 1.43013023 0.00000000 -0.88554075
3 H 1.0000 -1.43013023 0.00000000 -0.88554075
Atomic Mass
-----------
O 15.994910
H 1.007825
Effective nuclear repulsion energy (a.u.) 9.1968845623
Nuclear Dipole moment (a.u.)
----------------------------
X Y Z
---------------- ---------------- ----------------
0.0000000000 0.0000000000 0.0000000000
Symmetry information
--------------------
Group name C2v
Group number 16
Group order 4
No. of unique centers 2
Symmetry unique atoms
1 2
Z-matrix (autoz)
--------
Units are Angstrom for bonds and degrees for angles
Type Name I J K L M Value
----------- -------- ----- ----- ----- ----- ----- ----------
1 Stretch 1 2 0.95700
2 Stretch 1 3 0.95700
3 Bend 2 1 3 104.52000
XYZ format geometry
-------------------
3
geometry
O 0.00000000 0.00000000 0.11715200
H 0.75679238 0.00000000 -0.46860802
H -0.75679238 0.00000000 -0.46860802
==============================================================================
internuclear distances
------------------------------------------------------------------------------
center one | center two | atomic units | a.u.
------------------------------------------------------------------------------
2 H | 1 O | 1.80847 | 1.80847
3 H | 1 O | 1.80847 | 1.80847
------------------------------------------------------------------------------
number of included internuclear distances: 2
==============================================================================
==============================================================================
internuclear angles
------------------------------------------------------------------------------
center 1 | center 2 | center 3 | degrees
------------------------------------------------------------------------------
2 H | 1 O | 3 H | 104.52
------------------------------------------------------------------------------
number of included internuclear angles: 1
==============================================================================
Basis "ao basis" -> "" (spherical)
-----
H (Hydrogen)
------------
Exponent Coefficients
-------------- ---------------------------------------------------------
1 S 1.30100000E+01 0.019685
1 S 1.96200000E+00 0.137977
1 S 4.44600000E-01 0.478148
2 S 1.22000000E-01 1.000000
3 P 7.27000000E-01 1.000000
O (Oxygen)
----------
Exponent Coefficients
-------------- ---------------------------------------------------------
1 S 1.17200000E+04 0.000710
1 S 1.75900000E+03 0.005470
1 S 4.00800000E+02 0.027837
1 S 1.13700000E+02 0.104800
1 S 3.70300000E+01 0.283062
1 S 1.32700000E+01 0.448719
1 S 5.02500000E+00 0.270952
1 S 1.01300000E+00 0.015458
2 S 1.17200000E+04 -0.000160
2 S 1.75900000E+03 -0.001263
2 S 4.00800000E+02 -0.006267
2 S 1.13700000E+02 -0.025716
2 S 3.70300000E+01 -0.070924
2 S 1.32700000E+01 -0.165411
2 S 5.02500000E+00 -0.116955
2 S 1.01300000E+00 0.557368
3 S 3.02300000E-01 1.000000
4 P 1.77000000E+01 0.043018
4 P 3.85400000E+00 0.228913
4 P 1.04600000E+00 0.508728
5 P 2.75300000E-01 1.000000
6 D 1.18500000E+00 1.000000
Summary of "ao basis" -> "" (spherical)
------------------------------------------------------------------------------
Tag Description Shells Functions and Types
---------------- ------------------------------ ------ ---------------------
H cc-pVDZ 3 5 2s1p
O cc-pVDZ 6 14 3s2p1d
NWChem SCF Module
-----------------
ao basis = "ao basis"
functions = 24
atoms = 3
closed shells = 5
open shells = 0
charge = 0.00
wavefunction = RHF
input vectors = atomic
output vectors = ./tce_cuda.movecs
use symmetry = T
symmetry adapt = T
Summary of "ao basis" -> "ao basis" (spherical)
------------------------------------------------------------------------------
Tag Description Shells Functions and Types
---------------- ------------------------------ ------ ---------------------
H cc-pVDZ 3 5 2s1p
O cc-pVDZ 6 14 3s2p1d
Symmetry analysis of basis
--------------------------
a1 11
a2 2
b1 7
b2 4
Forming initial guess at 0.1s
Superposition of Atomic Density Guess
-------------------------------------
Sum of atomic energies: -75.76222910
Non-variational initial energy
------------------------------
Total energy = -75.926598
1-e energy = -121.777341
2-e energy = 36.653859
HOMO = -0.469523
LUMO = 0.091436
Symmetry analysis of molecular orbitals - initial
-------------------------------------------------
Numbering of irreducible representations:
1 a1 2 a2 3 b1 4 b2
Orbital symmetries:
1 a1 2 a1 3 b1 4 a1 5 b2
6 a1 7 b1 8 b1 9 a1 10 a1
11 b2 12 b1 13 a1 14 a2 15 b2
Starting SCF solution at 0.2s
----------------------------------------------
Quadratically convergent ROHF
Convergence threshold : 1.000E-10
Maximum no. of iterations : 30
Final Fock-matrix accuracy: 1.000E-10
----------------------------------------------
#quartets = 1.953D+03 #integrals = 1.482D+04 #direct = 0.0% #cached =100.0%
Integral file = ./tce_cuda.aoints.0
Record size in doubles = 65536 No. of integs per rec = 43688
Max. records in memory = 2 Max. records in file = 52416
No. of bits per label = 8 No. of bits per value = 64
File balance: exchanges= 0 moved= 0 time= 0.0
iter energy gnorm gmax time
----- ------------------- --------- --------- --------
1 -75.9919313494 8.32D-01 3.68D-01 0.2
2 -76.0245328052 1.73D-01 7.81D-02 0.2
3 -76.0267916568 1.46D-02 6.36D-03 0.3
4 -76.0268078570 3.41D-05 1.89D-05 0.3
5 -76.0268078572 2.09D-10 1.15D-10 0.4
6 -76.0268078572 2.83D-12 1.14D-12 0.4
Final RHF results
------------------
Total SCF energy = -76.026807857177
One-electron energy = -123.154586049207
Two-electron energy = 37.930893629732
Nuclear repulsion energy = 9.196884562298
Time for solution = 0.3s
Symmetry analysis of molecular orbitals - final
-----------------------------------------------
Numbering of irreducible representations:
1 a1 2 a2 3 b1 4 b2
Orbital symmetries:
1 a1 2 a1 3 b1 4 a1 5 b2
6 a1 7 b1 8 b1 9 a1 10 a1
11 b2 12 b1 13 a1 14 a2 15 b2
Final eigenvalues
-----------------
1
1 -20.5504
2 -1.3368
3 -0.6994
4 -0.5666
5 -0.4932
6 0.1856
7 0.2563
8 0.7895
9 0.8545
10 1.1635
11 1.2004
12 1.2533
13 1.4446
14 1.4763
15 1.6748
ROHF Final Molecular Orbital Analysis
-------------------------------------
Vector 2 Occ=2.000000D+00 E=-1.336810D+00 Symmetry=a1
MO Center= 3.5D-17, -1.9D-19, -5.4D-02, r^2= 5.0D-01
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
2 0.442847 1 O s 3 0.375524 1 O s
15 0.193713 2 H s 20 0.193713 3 H s
Vector 3 Occ=2.000000D+00 E=-6.994436D-01 Symmetry=b1
MO Center= -2.7D-16, -2.3D-17, -1.1D-01, r^2= 7.7D-01
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
4 0.490008 1 O px 15 0.328055 2 H s
20 -0.328055 3 H s 7 0.221765 1 O px
Vector 4 Occ=2.000000D+00 E=-5.666047D-01 Symmetry=a1
MO Center= -5.7D-17, 3.3D-32, 1.6D-01, r^2= 6.7D-01
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
6 0.545529 1 O pz 9 0.365329 1 O pz
3 0.349885 1 O s 15 -0.206362 2 H s
20 -0.206362 3 H s 2 0.150410 1 O s
Vector 5 Occ=2.000000D+00 E=-4.931619D-01 Symmetry=b2
MO Center= 4.5D-17, 3.7D-17, 9.3D-02, r^2= 6.0D-01
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
5 0.631158 1 O py 8 0.495642 1 O py
Vector 6 Occ=0.000000D+00 E= 1.856128D-01 Symmetry=a1
MO Center= 3.7D-17, -1.6D-17, -6.1D-01, r^2= 3.0D+00
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
3 1.003062 1 O s 16 -0.829439 2 H s
21 -0.829439 3 H s 9 -0.336808 1 O pz
6 -0.190376 1 O pz
Vector 7 Occ=0.000000D+00 E= 2.562882D-01 Symmetry=b1
MO Center= -2.5D-16, 8.1D-19, -6.2D-01, r^2= 3.6D+00
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
16 1.444952 2 H s 21 -1.444952 3 H s
7 -0.671020 1 O px 4 -0.283072 1 O px
Vector 8 Occ=0.000000D+00 E= 7.895205D-01 Symmetry=b1
MO Center= 4.3D-15, -5.6D-17, -2.5D-01, r^2= 1.7D+00
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
15 0.944396 2 H s 20 -0.944396 3 H s
16 -0.685542 2 H s 21 0.685542 3 H s
7 -0.461871 1 O px 4 -0.267872 1 O px
19 -0.153045 2 H pz 24 0.153045 3 H pz
Vector 9 Occ=0.000000D+00 E= 8.545358D-01 Symmetry=a1
MO Center= -2.4D-15, 2.8D-16, -4.7D-01, r^2= 1.6D+00
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
15 0.787221 2 H s 20 0.787221 3 H s
16 -0.547465 2 H s 21 -0.547465 3 H s
6 0.329172 1 O pz 3 0.319623 1 O s
17 0.296348 2 H px 22 -0.296348 3 H px
2 -0.255712 1 O s
Vector 10 Occ=0.000000D+00 E= 1.163485D+00 Symmetry=a1
MO Center= -8.6D-16, -1.8D-17, 1.4D-01, r^2= 1.2D+00
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
9 1.279725 1 O pz 6 -0.754396 1 O pz
3 -0.750184 1 O s 15 0.547420 2 H s
20 0.547420 3 H s 19 0.250488 2 H pz
24 0.250488 3 H pz
Vector 11 Occ=0.000000D+00 E= 1.200389D+00 Symmetry=b2
MO Center= -1.6D-17, -1.9D-16, 1.1D-01, r^2= 1.1D+00
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
8 -1.025479 1 O py 5 0.967820 1 O py
Vector 12 Occ=0.000000D+00 E= 1.253280D+00 Symmetry=b1
MO Center= -8.3D-16, -3.5D-19, 1.2D-01, r^2= 1.7D+00
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
7 1.764697 1 O px 16 -0.825968 2 H s
21 0.825968 3 H s 4 -0.733899 1 O px
15 -0.379694 2 H s 20 0.379694 3 H s
17 0.302681 2 H px 22 0.302681 3 H px
19 -0.186824 2 H pz 24 0.186824 3 H pz
Vector 13 Occ=0.000000D+00 E= 1.444621D+00 Symmetry=a1
MO Center= -5.2D-17, -6.8D-17, -5.9D-02, r^2= 1.4D+00
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
9 0.739197 1 O pz 19 -0.545980 2 H pz
24 -0.545980 3 H pz 2 -0.529408 1 O s
3 0.507964 1 O s 15 0.332938 2 H s
20 0.332938 3 H s 17 -0.328827 2 H px
22 0.328827 3 H px 16 -0.209825 2 H s
Vector 14 Occ=0.000000D+00 E= 1.476304D+00 Symmetry=a2
MO Center= -4.2D-15, 7.8D-17, -4.3D-01, r^2= 1.0D+00
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
18 0.685632 2 H py 23 -0.685632 3 H py
Vector 15 Occ=0.000000D+00 E= 1.674768D+00 Symmetry=b2
MO Center= 5.5D-15, -9.4D-17, -2.9D-01, r^2= 1.2D+00
Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
----- ------------ --------------- ----- ------------ ---------------
18 0.767191 2 H py 23 0.767191 3 H py
8 -0.633433 1 O py 11 -0.160750 1 O d -1
center of mass
--------------
x = 0.00000000 y = 0.00000000 z = 0.09750368
moments of inertia (a.u.)
------------------
2.193344434586 0.000000000000 0.000000000000
0.000000000000 6.315897898335 0.000000000000
0.000000000000 0.000000000000 4.122553463750
Mulliken analysis of the total density
--------------------------------------
Atom Charge Shell Charges
----------- ------ -------------------------------------------------------
1 O 8 8.31 2.00 0.83 0.82 2.82 1.81 0.01
2 H 1 0.85 0.69 0.07 0.09
3 H 1 0.85 0.69 0.07 0.09
Multipole analysis of the density wrt the origin
------------------------------------------------
L x y z total open nuclear
- - - - ----- ---- -------
0 0 0 0 -0.000000 0.000000 10.000000
1 1 0 0 0.000000 0.000000 0.000000
1 0 1 0 -0.000000 0.000000 0.000000
1 0 0 1 -0.808895 0.000000 0.000000
2 2 0 0 -3.064509 0.000000 4.090545
2 1 1 0 0.000000 0.000000 0.000000
2 1 0 1 -0.000000 0.000000 0.000000
2 0 2 0 -5.228664 0.000000 0.000000
2 0 1 1 0.000000 0.000000 0.000000
2 0 0 2 -4.376016 0.000000 1.960456
Parallel integral file used 8 records with 0 large values
NWChem Extensible Many-Electron Theory Module
---------------------------------------------
======================================================
This portion of the program was automatically
generated by a Tensor Contraction Engine (TCE).
The development of this portion of the program
and TCE was supported by US Department of Energy,
Office of Science, Office of Basic Energy Science.
TCE is a product of Battelle and PNNL.
Please cite: S.Hirata, J.Phys.Chem.A 107, 9887 (2003).
======================================================
General Information
-------------------
Number of processors : 8
Wavefunction type : Restricted Hartree-Fock
No. of electrons : 10
Alpha electrons : 5
Beta electrons : 5
No. of orbitals : 48
Alpha orbitals : 24
Beta orbitals : 24
Alpha frozen cores : 0
Beta frozen cores : 0
Alpha frozen virtuals : 0
Beta frozen virtuals : 0
Spin multiplicity : singlet
Number of AO functions : 24
Number of AO shells : 12
Use of symmetry is : on
Symmetry adaption is : on
Schwarz screening : 0.10D-09
Correlation Information
-----------------------
Calculation type : Coupled-cluster singles & doubles w/ perturbation
Perturbative correction : (T)
Max iterations : 100
Residual threshold : 0.10D-06
T(0) DIIS level shift : 0.00D+00
L(0) DIIS level shift : 0.00D+00
T(1) DIIS level shift : 0.00D+00
L(1) DIIS level shift : 0.00D+00
T(R) DIIS level shift : 0.00D+00
T(I) DIIS level shift : 0.00D+00
CC-T/L Amplitude update : 5-th order DIIS
I/O scheme : Global Array Library
L-threshold : 0.10D-06
EOM-threshold : 0.10D-06
no EOMCCSD initial starts read in
TCE RESTART OPTIONS
READ_INT: F
WRITE_INT: F
READ_TA: F
WRITE_TA: F
READ_XA: F
WRITE_XA: F
READ_IN3: F
WRITE_IN3: F
SLICE: F
D4D5: F
Memory Information
------------------
Available GA space size is 524287424 doubles
Available MA space size is 144177388 doubles
Maximum block size supplied by input
Maximum block size 16 doubles
tile_dim = 8
Block Spin Irrep Size Offset Alpha
-------------------------------------------------
1 alpha a1 3 doubles 0 1
2 alpha b1 1 doubles 3 2
3 alpha b2 1 doubles 4 3
4 beta a1 3 doubles 5 1
5 beta b1 1 doubles 8 2
6 beta b2 1 doubles 9 3
7 alpha a1 8 doubles 10 7
8 alpha a2 2 doubles 18 8
9 alpha b1 6 doubles 20 9
10 alpha b2 3 doubles 26 10
11 beta a1 8 doubles 29 7
12 beta a2 2 doubles 37 8
13 beta b1 6 doubles 39 9
14 beta b2 3 doubles 45 10
Global array virtual files algorithm will be used
Parallel file system coherency ......... OK
#quartets = 3.081D+03 #integrals = 2.434D+04 #direct = 0.0% #cached =100.0%
Integral file = ./tce_cuda.aoints.0
Record size in doubles = 65536 No. of integs per rec = 43688
Max. records in memory = 2 Max. records in file = 52416
No. of bits per label = 8 No. of bits per value = 64
File balance: exchanges= 0 moved= 0 time= 0.0
Fock matrix recomputed
1-e file size = 190
1-e file name = ./tce_cuda.f1
Cpu & wall time / sec 0.1 0.1
tce_ao2e: fast2e=1
half-transformed integrals in memory
2-e (intermediate) file size = 734976
2-e (intermediate) file name = ./tce_cuda.v2i
Cpu & wall time / sec 0.1 0.1
tce_mo2e: fast2e=1
2-e integrals stored in memory
2-e file size = 126194
2-e file name = ./tce_cuda.v2
Cpu & wall time / sec 0.1 0.1
T1-number-of-tasks 3
t1 file size = 33
t1 file name = ./tce_cuda.t1
t1 file handle = -999
T2-number-of-boxes 54
t2 file size = 4006
t2 file name = ./tce_cuda.t2
t2 file handle = -996
CCSD iterations
-----------------------------------------------------------------
Iter Residuum Correlation Cpu Wall V2*C2
-----------------------------------------------------------------
1 0.1070456887772 -0.2039386873789 0.1 0.1 0.0
2 0.0294986807718 -0.2094619997092 0.1 0.1 0.0
3 0.0105763599957 -0.2121844898871 0.0 0.1 0.0
4 0.0042072568507 -0.2128564675719 0.1 0.1 0.0
5 0.0017479975767 -0.2131115135048 0.0 0.1 0.0
MICROCYCLE DIIS UPDATE: 5 5
6 0.0001880601508 -0.2132610073945 0.0 0.1 0.0
7 0.0000815201141 -0.2132697144617 0.0 0.1 0.0
8 0.0000350162322 -0.2132697376807 0.1 0.1 0.0
9 0.0000193757975 -0.2132697104821 0.0 0.1 0.0
10 0.0000103080435 -0.2132698116584 0.1 0.1 0.0
MICROCYCLE DIIS UPDATE: 10 5
11 0.0000014547614 -0.2132700135646 0.0 0.1 0.0
12 0.0000005664855 -0.2132699471930 0.1 0.1 0.0
13 0.0000002613564 -0.2132699568216 0.0 0.1 0.0
14 0.0000001403230 -0.2132699545939 0.0 0.1 0.0
15 0.0000000746381 -0.2132699540642 0.1 0.1 0.0
-----------------------------------------------------------------
Iterations converged
CCSD correlation energy / hartree = -0.213269954064232
CCSD total energy / hartree = -76.240077811240866
Singles contributions
Doubles contributions
CCSD(T)
Using CUDA CCSD(T) code
Warning: Please check whether you have 1 cuda devices per node
------------------------------------------------------------------------
cuda 30
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
26: task tce energy
------------------------------------------------------------------------
------------------------------------------------------------------------
There is an error in the input file
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI COMMUNICATOR 3 DUP FROM 0
with errorcode 30.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
0:cuda:Received an Error in Communication
------------------------------------------------------------------------
For more information see the NWChem manual at http://nwchemgit.github.io/index.php/NWChem_Documentation
For further details see manual section:
running on ubuntu 17.10, nvidia gt730.
Afik
|