reacnetgenerator package#
ReacNetGenerator is an automatic reaction network generator for reactive molecular dynamics simulation.[Rb972b5c58b34-1]_.
References#
Jinzhe Zeng, Liqun Cao, Chih-Hao Chin, Haisheng Ren, John Z. H. Zhang, Tong Zhu, ReacNetGenerator: an automatic reaction network generator for reactive molecular dynamic simulations, Phys. Chem. Chem. Phys., 2020, 22 (2): 683-691, doi: 10.1039/C9CP05091D.
- class reacnetgenerator.ReacNetGenerator(**kwargs: Any)[source]#
Bases:
object
Use ReacNetGenerator for trajectory analysis.
- Parameters:
- inputfiletype: str
- The type of the input file. The following type is allowed:
dump: LAMMPS dump file, which can be outputed by using dump 1 all custom 100 dump.reaxc id type x y z. See https://lammps.sandia.gov/doc/dump.html for details.
bond: LAMMPS ReaxFF bond file. See https://lammps.sandia.gov/doc/fix_reaxc_bonds.html for details.
xyz: XYZ file, which can also be outputed by LAMMPS using dump.
- inputfilename: str or list of strs
The filename(s) of the input file, which can be either relative path or absolute path. If it is a list, the files will be read in order.
- atomname: tuple of strs
The list of the atomic names in the input file, such as (‘C’, ‘H’, ‘O’). It should match the order of that in the input file.
- runHMM: bool, optional, default: True
Process trajectory with Hidden Markov Model (HMM) or not. If the user find too many species are filtered, they can turn off this option.
- miso: int, optional, default: 0
Merge the isomers and the highest frequency is used as the representative. 0, off two available levels: 1, merge the isomers with same atoms and same bond-network but different bond levels; 2, merge the isomers with same atoms with different bond-network.
- pbc: bool, optional, default: True
Use periodic boundary conditions (PBC) or not.
- cell: (3,3) array_like or (3,) array_like or (9,) array_like, optional, default: None
The cell (box size) of the system. If None (default), the cell will be read from the input file. If the input file doesn’t have cell information, this parameter will be necessary.
- nproc: int, optional, default: None
The number of processors used for analysis. If None (default), the program will try to use all processors.
- selectatoms: str, optional, default: None
Select an element from the atomic names, such as C, and only show species with this element in the reaction network. If None (default), the network will show all elements.
- split: int, optional, default: None
Split number for the time axis. For example, if set to 10, the whole trajectroy will be divided into 10 parts and reactions of each part will be shown.
- a: (2,2) array_like, optional, default: [[0.999, 0.001], [0.001, 0.009]]
Transition matrix A of HMM parameters. It is recommended for users to choose their own parameters. See the paper for details.
- b: (2,2) array_like, optional, default: [[0.6, 0.4], [0.4, 0.6]]
Emission matrix B of HMM parameters. It is recommended for users to choose their own parameters. See the paper for details.
Examples
>>> from reacnetgenerator import ReacNetGenerator >>> rng=ReacNetGenerator(inputfiletype="dump", inputfilename="dump.ch4", atomname=['C', 'H', 'O']) >>> rng.runanddraw()
- class Status(value)[source]#
Bases:
Enum
ReacNetGenerator status.
The ReacNetGenerator consists of several modules and algorithms to process the information from the given trajectory, including:
DOWNLOAD: Download trajectory from urls
DETECT: Read bond information and detect molecules
HMM: HMM filter
MISO: Merge isomers
PATH: Indentify isomers and collect reaction paths
MATRIX: Reaction matrix generation
NETWORK: Draw reaction network
REPORT: Generate analysis report
- DETECT = 'Read bond information and detect molecules'#
- DOWNLOAD = 'Download trajectory'#
- HMM = 'HMM filter'#
- INIT = 'Init'#
- MATRIX = 'Reaction matrix generation'#
- MISO = 'Merge isomers'#
- NETWORK = 'Draw reaction network'#
- PATH = 'Indentify isomers and collect reaction paths'#
- REPORT = 'Generate analysis report'#
- run() None [source]#
Process MD trajectory, including DOWNLOAD, DETECT, HMM, PATH, and MATRIX steps.
- runanddraw(run: bool = True, draw: bool = True, report: bool = True) None [source]#
Analyze the trajectory from MD simulation.
- Parameters:
- runbool, optional, default: True
Process the trajectory or not, including DOWNLOAD, DETECT, HMM, PATH, and MATRIX steps.
- drawbool, optional, default: True
Draw the reaction network or not, i.e. NETWORK step.
- reportbool, optional, default: True
Generate the analysis report, i.e. NETWORK step.
Submodules#
reacnetgenerator.commandline module#
Command line interface for reacnetgenerator.
- reacnetgenerator.commandline.main_parser() ArgumentParser [source]#
Return main parser.
- Returns:
- argparse.ArgumentParser
reacnetgenerator cli parser
reacnetgenerator.dps module#
Connect molecule with Depth-First Search.
- reacnetgenerator.dps.dps(bonds, levels)#
Connect molecule with Depth-First Search.
- Parameters:
- bondslist of list
The bonds of molecule.
- levelslist of int
The levels of atoms.
- Returns:
- list of list
The connected atoms in each molecule.
- list of list
The connected bonds in each molecule.
- reacnetgenerator.dps.dps_reaction(reactdict)#
Find A+B->C+D reactions.
- Parameters:
- reactdictlist of dict of list
Two dictionaries of reactions.
- Returns:
- list of list of list
List of reactions. The secons axis matches the position of species (left or right).
reacnetgenerator.gui module#
reacnetgenerator.reacnetgen module#
The main module of ReacNetGenerator.
reacnetgenerator.tools module#
Useful methods to futhur process ReacNetGenerator results.
- reacnetgenerator.tools.calculate_rate(specfile: str | Path, reacfile: str | Path, cell: ndarray, timestep: float) Dict[str, float] [source]#
Calculate the rate constant of each reaction.
The rate constants are calculated by the method developed in [1]. The time interval of the trajectory is assumed to be uniform.
- Parameters:
- specfilestr
The species file.
- reacfilestr
The reactions file.
- cellnp.ndarray
The cell with the shape (3, 3). Unit: Angstrom.
- timestepfloat
The time step. Unit: femtosecond.
- Returns:
- ratesDict[str, float]
The rate of each reaction. The dict key is the reaction SMILES. The value is in unit of [(cm^3/mol)^(n-1)s^(-1)], where n is the reaction order.
References
[1]Yanze Wu, Huai Sun, Liang Wu, Joshua D. Deetz, Extracting the mechanisms and kinetic models of complex reactions from atomistic simulation data, J. Comput. Chem. 40, 16, 1586-1592.
Examples
>>> cell = np.eye(3) * 3.7601e1 # in unit Angstrom >>> timestep = 0.1 # in unit fs >>> rates = calculate_rate('methane.species', 'methane.reactionabcd', cell, timestep)
- reacnetgenerator.tools.read_reactions(reacfile: str | Path) List[Tuple[int, Counter, str]] [source]#
Read reactions from the reactions file (ends with .reaction or .reactionsabcd).
For accuracy, HMM filter should be disabled.
- Parameters:
- reacfilestr or Path
The reactions file.
- Returns:
- occsList[Tuple[int, Counter, str]]
The number of occurences of each reaction. The tuple is (occurence, counter_reactants, reaction).
- reacnetgenerator.tools.read_species(specfile: str | Path) Tuple[ndarray, Dict[str, ndarray]] [source]#
Read species from the species file (ends with .species).
For accuracy, HMM filter should be disabled.
- Parameters:
- specfilestr or Path
The species file.
- Returns:
- step_idxnp.ndarray
The index of the step.
- n_speciesDict[str, np.ndarray]
The number of species in each step. The dict key is the species SMILES.
Examples
Plot the number of methane in each step.
>>> from reacnetgenerator.tools import read_species >>> import matplotlib.pyplot as plt >>> step_idx, n_species = read_species('methane.species') >>> plt.plot(step_idx, n_species['[H]C([H])([H])[H]']) >>> plt.savefig("methane.svg")
reacnetgenerator.utils module#
Provide utils for ReacNetGenerator.
- class reacnetgenerator.utils.SCOUROPTIONS[source]#
Bases:
object
Scour (SVG optimization) options.
- enable_viewboxing = True#
- newlines = False#
- remove_descriptions = True#
- remove_descriptive_elements = True#
- remove_metadata = True#
- remove_titles = True#
- shorten_ids = True#
- strip_comments = True#
- strip_ids = True#
- strip_xml_prolog = True#
- strip_xml_space_attribute = True#
Bases:
object
Share ReacNetGenerator data with a class of the submodule.
- Parameters:
- rng: reacnetgenerator.ReacNetGenerator
The centered ReacNetGenerator class.
- usedRNGKeys: list of strs
Keys that needs to pass from ReacNetGenerator class to the submodule.
- returnedRNGKeys: list of strs
Keys that needs to pass from the submodule to ReacNetGenerator class.
- extraNoneKeys: list of strs, optional, default: None
Set keys to None, which will be used in the submodule.
Return back keys to ReacNetGenerator class.
- class reacnetgenerator.utils.WriteBuffer(f: IO, linenumber: int = 1200, sep: AnyStr | None = None)[source]#
Bases:
object
Store a buffer for writing files.
It is expensive to write to a file, so we need to make a buffer.
- Parameters:
- f: fileObject
The file object to write.
- linenumber: int, default: 1200
The number of contents to store in the buffer. The buffer will be flushed if it exceeds the set number.
- sep: str or bytes, default: None
The separator for contents. If None (default), there will be no separator.
- append(text: AnyStr) None [source]#
Append a text.
- Parameters:
- textstr or bytes
The text to be appended.
- check() None [source]#
Check if the number of stored contents exceeds.
If so, the buffer will be flushed.
- reacnetgenerator.utils.appendIfNotNone(f: WriteBuffer | ExitStack, wbytes: AnyStr | None) None [source]#
Append a line to a file if the line is not None.
- Parameters:
- fWriteBuffer
The file to write.
- wbytesstr or bytes
The line to write.
- reacnetgenerator.utils.bytestolist(x: bytes) Any [source]#
Convert a compressed line to an object.
- Parameters:
- xbytes
The compressed line.
- Returns:
- object
The decompressed object.
- reacnetgenerator.utils.checksha256(filename: str, sha256_check: str | List[str])[source]#
Check sha256 of a file is correct.
- Parameters:
- filenamestr
The filename.
- sha256_checkstr or list of strs
The sha256 to be checked.
- Returns:
- bool
Indicate whether sha256 is correct.
- reacnetgenerator.utils.compress(x: str | bytes) bytes [source]#
Compress the line.
This function reduces IO overhead to speed up the program. The functions will use lz4 to compress, since lz4 has better performance that any others.
The compressed format is size + data + size + data + …, where size is a 64-bit little-endian integer.
- Parameters:
- xstr or bytes
The line to compress.
- Returns:
- bytes
The compressed line, with a linebreak in the end.
- reacnetgenerator.utils.decompress(x: bytes, isbytes: bool = False) str | bytes [source]#
Decompress the line.
- Parameters:
- xbytes
The line to decompress.
- isbytesbool, optional, default: False
If the decompressed content is bytes. If not, the line will be decoded.
- Returns:
- str or bytes
The decompressed line.
- async reacnetgenerator.utils.download_file(urls: str | List[str], pathfilename: str, sha256: str) str [source]#
Download files from remote urls if not exists.
- Parameters:
- urls: str or list of strs
The url(s) that is available to download.
- pathfilename: str
The downloading path of the file.
- sha256: str
Sha256 of the file. If not None and match the file, the download will be skiped.
- Returns:
- pathfilename: str
The downloading path of the file.
- reacnetgenerator.utils.download_multifiles(urls: List[dict]) None [source]#
Download multiple files from dicts.
- Parameters:
- urlslist of dicts
- The information of download files. Each dict should contain the following key:
- url: str or list of strs
The url(s) that is available to download.
- pathfilename: str
The downloading path of the file.
- sha256: str, optional, default: None
Sha256 of the file. If not None and match the file, the download will be skiped.
- async reacnetgenerator.utils.gather_download_files(urls: List[dict]) None [source]#
Asynchronously download files from remote urls if not exists.
See download_multifiles function for details.
See also
- reacnetgenerator.utils.listtobytes(x: Any) bytes [source]#
Convert an object to a compressed line.
- Parameters:
- xobject
The object to convert, such as numpy.ndarray.
- Returns:
- bytes
The compressed line.
- reacnetgenerator.utils.listtostirng(l: str | list | tuple | ndarray, sep: List[str] | Tuple[str, ...]) str [source]#
Convert a list to string, that is easier to store.
- Parameters:
- lstr or array-like
The list to convert, which can contain any number of dimensions.
- seplist of strs
The seperators for each dimension.
- Returns:
- str
The converted string.
- reacnetgenerator.utils.multiopen(pool: multiprocessing.pool.Pool, func: Callable, l: IO, semaphore: multiprocessing.synchronize.Semaphore | None = None, nlines: int | None = None, unordered: bool = True, return_num: bool = False, start: int = 0, extra: Any | None = None, interval: int | None = None, bar: bool = True, desc: str | None = None, unit: str = 'it', total: int | None = None) Iterable [source]#
Return an interated object for process a file with multiple processors.
- Parameters:
- poolmultiprocessing.Pool
The pool for multiprocessing.
- funcfunction
The function to process lines.
- lFile object
The file object.
- semaphoremultiprocessing.Semaphore, optional, default: None
The semaphore to acquire. If None (default), the object will be passed without control.
- nlinesint, optional, default: None
The number of lines to pass to the function each time. If None (default), only one line will be passed to the function.
- unorderedbool, optional, default: True
Whether the process can be unordered.
- return_numbool, optional, default: False
If True, adds a counter to an iterable.
- startint, optional, default: 0
The start number of the counter.
- extraobject, optional, default: None
The extra object passed to the item.
- intervalint, optional, default: None
The interval of items that will be passed to the function. For example, if set to 10, a item will be passed once every 10 items and others will be dropped.
- barbool, optional, default: True
If True, show a tqdm bar for the iteration.
- descstr, optional, default: None
The description of the iteration shown in the bar.
- unitstr, optional, default: it
The unit of the iteration shown in the bar.
- totalint, optional, default: None
The total number of the iteration shown in the bar.
- Returns:
- object
An object that can be iterated.
- reacnetgenerator.utils.must_be_list(obj: Any | List[Any]) List[Any] [source]#
Convert a object to a list if the object is not a list.
- Parameters:
- objObject
The object to convert.
- Returns:
- obj: list
If the input object is not a list, returns a list that only contains that object. Otherwise, returns that object.
- reacnetgenerator.utils.produce(semaphore: multiprocessing.synchronize.Semaphore, plist: Iterable[Any], parameter: Any) Generator[Tuple[Any, Any], None, None] [source]#
Item producer with a semaphore.
Prevent large memory usage due to slow IO.
- Parameters:
- semaphoremultiprocessing.Semaphore
The semaphore to acquire.
- plistlist of objects
The list of items to be passed.
- parameterobject
The parameter yielded with each item.
- Yields:
- item: object
The item to be yielded.
- parameter: object
The parameter yielded with each item.
- reacnetgenerator.utils.read_compressed_block(f: BinaryIO) Generator[bytes, None, None] [source]#
Read compressed binary file, assuming the format is size + data + size + data + …
- Parameters:
- ffileObject
The file object to read.
- Yields:
- data: bytes
The compressed block.
- reacnetgenerator.utils.run_mp(nproc: int, **kwargs: Any) Iterable[Any] [source]#
Process a file with multiple processors.
- Parameters:
- nprocint
The number of processors to be used.
- **kwargsdict, optional
Other parameters can be found in the multiopen method.
- Yields:
- object
The yielded object from the multiopen method.
See also
reacnetgenerator.utils_np module#
- reacnetgenerator.utils_np.check_zero_signal(signal)#
Check if the given signal contains only zeros.
- Parameters:
- signal1D array
The signal to check.
- Returns:
- bool
True if the signal contains only zeros, False otherwise.
Notes
This method uses Cython to speed up the computation.
- Benchmark for 1,000,000 loops (1000/2000):
Cython: 1.45 s
Python: 3.67 s
- reacnetgenerator.utils_np.idx_to_signal(idx, step)#
Converts an index array to a signal array.
- Parameters:
- idxarray_like
Index array.
- stepint
Step size.
- Returns:
- signalndarray
Signal array.
Notes
This method uses Cython to speed up the computation.
- Benchmark for 1,000,000 loops (step=250000):
Cython: 18.61 s
Python: 31.30 s