Dynamical Nucleation Theory Monte Carlo¶
- Schenter, G. K.; Kathmann, S. M.; Garrett, B. C. J. Chem. Phys. (1999), 110, 7951.
- Crosby, L. D.; Kathmann, S. M.; Windus, T. L. J. Comput. Chem. ( 2008), submitted.
The Dynamical Nucleation Theory Monte Carlo (DNTMC) module utilizes Dynamical Nucleation Theory (DNT) to compute monomer evaporation rate constants at a given temperature. The reactant is a molecular cluster of i rigid monomers while the product is a molecular cluster with i-1 monomers plus a free monomer. A Metropolis Monte Carlo (MC) methodology is utilized to sample the configurational space of these i rigid monomers. Both homogenous and heterogenous clusters are supported.
SubGroups¶
The DNTMC module supports the use of subgroups in the MC simulations. The number of subgroups is defined in the input through a set directive:
set subgroup_number <integer number>
where the number of subgroups requested is the argument. The number of processors that each subgroup has access to is determined by Total/subgroup_number. A separate MC simulation is performed within each subgroup. To use this functionality, NWChem must be compiled with the USE_SUBGROUPS environmental variable set.
Each MC simulation starts at a different starting configuration, which
is equally spaced along the reaction coordinate. The statistical
distributions which these MC simulations produce are averaged to form
the final statistical distribution. Output from these subgroups consists
of various files whose names are of the form (*.#num
). These files
include restart files and other data files. The NWChem runtime database
(RTDB) is used as input for these subgroups and must be globally
accessible (set through the Permanent_Dir directive) to all processes.
Input Syntax¶
The input block has the following form:
DNTMC
[nspecies <integer number>]
[species <list of strings name[nspecies]]
[nmol <list of integers number[nspecies]>]
[temp <real temperature>]
[rmin <real rmin>]
[rmax <real rmax>]
[nob <integer nob>]
[mcsteps <integer number>]
[tdisp <real disp>]
[rdisp <real rot>]
[rsim || rconfig]
[mprnt <integer number>]
[convergence <real limit>]
[norestart]
[dntmc_dir <string directory>]
[print &&|| noprint]
[procrestart <integer number>]
END
Definition of Monomers¶
Geometry information is required for each unique monomer (species). See the geometry input section 6 for more information. A unique label must be given for each monomer geometry. Additionally, the noautosym and nocenter options are suggested for use with the DNTMC module to prevent NWChem from changing the input geometries. Symmetry should also not be used since cluster configurations will seldom exhibit any symmetry; although monomers themselves may exhibit symmetry.
GEOMETRY [<string name species_1>] noautosym nocenter ...
...
symmetry c1
END
GEOMETRY [<string name species_2>] noautosym nocenter ...
...
symmetry c1
END
...
The molecular cluster is defined by the number of unique monomers (nspecies). The geometry labels for each unique monomer is given in a space delimited list (species). Also required are the number of each unique monomer in the molecular cluster given as a space delimited list (nmol). These keywords are required and thus have no default values.
[nspecies <integer number>]
[species <list of strings name[nspecies]]
[nmol <list of integers number[nspecies]>]
An example is shown for a 10 monomer cluster consisting of a 50/50 mixture of water and ammonia.
DNTMC runtime options¶
Several options control the behavior of the DNTMC module. Some required options such as simulation temperature (temp), cluster radius (rmin and rmax), and maximum number of MC steps (mcsteps) are used to control the MC simulation.
[temp <real temperature>]
This required option gives the simulation temperature in which the MC simulation is run. Temperature is given in kelvin.
[rmin <real rmin>]
[rmax <real rmax>]
[nob <integer nob>]
These required options define the minimum and maximum extent of the projected reaction coordinate (The radius of a sphere centered at the center of mass). Rmin should be large enough to contain the entire molecular cluster of monomers and Rmax should be large enough to include any relevant configurational space (such as the position of the reaction bottleneck). These values are given in Angstroms.
The probability distributions obtain along this projected reaction coordinate has a minimum value of Rmin and a maximum value of Rmax. The distributions are created by chopping this range into a number of smaller sized bins. The number of bins (nob) is controlled by the option of the same name.
[mcsteps <integer number>]
[tdisp <real disp default 0.04>]
[rdisp <real rot default 0.06>]
[convergence <real limit default 0.00>]
These options define some characteristics of the MC simulations. The maximum number of MC steps (mcsteps) to take in the course of the calculation run is a required option. Once the MC simulation has performed this number of steps the calculation will end. This is a per Markov chain quantity. The maximum translational step size (tdisp) and rotational step size (rdisp) are optional inputs with defaults set at 0.04 Angstroms and 0.06 radians, respectively. The convergence keyword allows the convergence threshold to be set. The default is 0.00 which effectively turns off this checking. Once the measure of convergence goes below this threshold the calculation will end.
[rsim || rconfig]
These optional keywords allow the selection of two different MC sampling methods. rsim selects a Metropolis MC methodology which samples configurations according to a Canonical ensemble. The rconfig keyword selects a MC methodology which samples configurations according to a derivative of the Canonical ensemble with respect to the projected reaction coordinate. These keywords are optional with the default method being rconfig.
[mprnt <integer number default 10>]
[dntmc_dir <string directory default ./>]
[norestart]
These three options define some of the output and data analysis
behavior. mprnt is an option which controls how often data analysis
occurs during the simulation. Currently, every mprnt*nob
MC steps data
analysis is performed and results are output to files and/or to the log
file. Restart files are also written every mprnt number of MC steps
during the simulation. The default value is 10. The keyword dntmc_dir
allows the definition of an alternate directory to place DNTMC specific
ouputfiles. These files can be very large so be sure enough space is
available. This directory should be accessible by every process
(although not necessarily globally accessible). The default is to place
these files in the directory which NWChem is run (./). The keyword
norestart turns off the production of restart files. By default restart
files are produced every mprnt number of MC steps.
Print Control¶
The DNTMC module supports the use of PRINT and NOPRINT Keywords. The specific labels which DNTMC recognizes are included below.
Name | Print Level | Description |
“debug” | debug | Some debug information written in Output file. |
“information” | none | Some information such as energies and geometries. |
“mcdata” | low | Production of a set of files (Prefix.MCdata.#num ). These files are a concatenated list of structures, Energies, and Dipole Moments for each accepted configuration sampled in the MC run. |
“alldata” | low | Production of a set of files (Prefix.Alldata.#num ). These files include the same information as MCdata files. However, they include ALL configurations (accepted or rejected). |
“mcout” | debug - low | Production of a set of files (Prefix.MCout.#num ). These files contain a set of informative and debug information. Also included is the set of information which mirrors the Alldata files. |
“fdist” | low | Production of a file (Prefix.fdist ) which contains a concatenated list of distributions every mprnt*100 MC steps. |
“timers” | debug | Enables some timers in the code. These timers return performance statistics in the output file every time data analysis is performed. Two timers are used. One for the mcloop itself and one for the communication step. |
Selected File Formats¶
Several output files are available in the DNTMC module. This section defines the format for some of these files.
-
*.fdist
This file is a concatenated list of radial distribution functions printed out every mprnt MC steps. Each distribution is normalized (sum equal to one) with respect to the entire (all species) distribution. The error is the RMS deviation of the average at each point. Each entry is as follows:
[1] # Total Configurations
[2] Species number #
[3](R coordinate in Angstroms) (Probability) (Error)
[Repeats nob times]
[2 and 3 Repeats for each species]
[4] *** separator.
-
*.MCdata.#
This file is a concatenated list of accepted configurations. Each file corresponds to a single Markov chain. The dipole is set to zero for methods which do not produce a dipole moment with energy calculations. Rsim is either the radial extent of the cluster (r-config) or the simulation radius (r-simulation). Each entry is as follows:
[1] (Atomic label) (X Coord.) (Y Coord.) (Z Coord.)
[1 Repeats for each atom in the cluster configuration, units are in
angstroms]
[2] Ucalc = # hartree
[3] Dipole = (X) (Y) (Z) au
[4] Rsim = # Angstrom
[1 through 4 repeats for each accepted configuration]
-
*.MCout.#
This file has the same format and information content as the MCdata file except that additional output is included. This additional output includes summary statistics such as acceptance ratios, average potential energy, and average radius. The information included for accepted configurations does not include dipole moment or radius.
-
*.MCall.#
This file has the same format as the MCdata file expect that it includes information for all configurations for which an energy is determined. All accepted and rejected configurations are included in this file.
-
*.restart.#
This file contains the restart information for each subgroup. Its format is not very human readable but the basic fields are described in short here.
Random number seed Potential energy in hartrees
Sum of potential energy
Average potential energy
Sum of the squared potential energy
Squared potential energy Dipole moment in au (x) (Y) (Z)
Rmin and Rmax Rsim (Radius corresponds to r-config or r-sim methods)
Array of nspecies length, value indicates the number of each type of monomer which lies at radius Rsim from the center of mass [r-simulation sets these to zero]
Sum of Rsim Average of Rsim
Number of accepted translantional moves
Number of accepted rotational moves Number of accepted volume moves
Number of attempted moves (volume) (translational) (rotational)
Number of accepted moves (Zero)
Number of accepted moves (Zero)
Number of MC steps completed
[1] (Atom label) (X Coord.) (Y Coord.) (Z Coord.)
[1 repeats for each atom in cluster configurations, units are in angstroms]
[2] Array of nspecies length, number of configurations in bin
[3] Array of nspecies length, normalized number of configurations in each bin
[4] (Value of bin in Angstroms) (Array of nspecies length, normalized probability of bin)
[2 through 4 repeats nob times]
DNTMC Restart¶
[procrestart <integer number>]
Flag to indicate restart postprocessing. It is suggested that this postprocessing run is done utilizing only one processor.
In order to restart a DNTMC run, postprocessing is required to put
required information into the runtime database (RTDB). During a run
restart information is written to files (Prefix.restart.#num
) every
mprnt MC steps. This information must be read and deposited into the
RTDB before a restart run can be done. The number taken as an argument
is the number of files to read and must also equal the number of
subgroups the calculation utilizes. The start directive must also be set
to restart for this to work properly. All input is read as usual.
However, values from the restart files take precedence over input
values. Some keywords such as mcsteps are not defined in the restart
files. Task directives are ignored. You must have a RTDB present in your
permanent directory.
Once postprocessing is done a standard restart can be done from the RTDB by removing the procrestart keyword and including the restart directive.
Task Directives¶
The DNTMC module can be used with any level of theory which can produce energies. Gradients and Hessians are not required within this methodology. If dipole moments are available, they are also utilized. The task directive for the DNTMC module is shown below:
task <string theory> dntmc
Example¶
This example is for a molecular cluster of 10 monomers. A 50/50 mixture of water and ammonia. The energies are done at the SCF/6-31++G** level of theory.
start
# start or restart directive if a restart run
MEMORY 1000 mb
PERMANENT_DIR /home/bill
# Globally accessible directory which the
# rtdb (*.db) file will/does reside.
basis "ao basis" spherical noprint
* library 6-31++G**
end
# basis set directive for scf energies
scf
singlet
rhf
tol2e 1.0e-12
vectors input atomic
thresh 1.0e-06
maxiter 200
print none
end
# scf directive for scf energies
geometry geom1 units angstroms noautosym nocenter noprint
O 0.393676503613369 -1.743794626956820 -0.762291912129271
H -0.427227157125777 -1.279138812526320 -0.924898279781319
H 1.075463952717060 -1.095883929075060 -0.940073459864222
symmetry c1
end
# geometry of a monomer with title "geom1"
geometry geom2 units angstroms noautosym nocenter noprint
N 6.36299e-08 0.00000 -0.670378
H 0.916275 0.00000 -0.159874
H -0.458137 0.793517 -0.159874
H -0.458137 -0.793517 -0.159874
symmetry c1
end
# geometry of another monomer with title "geom2"
# other monomers may be included with different titles
set subgroups_number 8
# set directive which gives the number of subgroups
# each group runs a separate MC simulation
dntmc
# DNTMC input block
nspecies 2
# The number of unique species (number of titled geometries
# above)
species geom1 geom2
# An array of geometry titles (one for each
# nspecies/geometry)
nmol 5 5
# An array stating the number of each
# monomer/nspecies/geometry in simulation.
temp 243.0
mcsteps 1000000
rmin 3.25
rmax 12.25
mprnt 10
tdisp 0.04
rdisp 0.06
print none fdist mcdata
# this print line first sets the print-level to none
# then it states that the *.fdist and *.mcdata.(#num)
# files are to be written
rconfig
dntmc_dir /home/bill/largefile
# An accessible directory which to place the *.fdist,
# *.mcdata.(#num), and *.restart.(#num) files.
convergence 1.0D+00
end
task scf dntmc
# task directive stating that energies are to be done at the scf
#level of theory.