IO utilities
pasted._io
XYZ format serialization and deserialization helpers.
Public API
- format_xyz(atoms, positions, charge, mult, metrics, prefix="") → str
Serialize a structure to an extended-XYZ string. The second line (the XYZ comment line) includes charge, multiplicity, and all metric values as key=value pairs.
- parse_xyz(text) → list[dict]
Parse one or more XYZ frames from text. Each frame is returned as a dict with keys atoms, positions, charge, mult, metrics, and prefix. Blank lines between frames are silently skipped.
Notes
The extended-XYZ comment line written by format_xyz() is machine-readable: all fields use = as a separator with no spaces, allowing downstream tools to extract metrics without regex.
parse_xyz() accepts files produced by any tool that writes standard XYZ (atom-count line, then free-form comment line, then N coordinate lines). Unrecognized comment-line content is stored as prefix and does not raise an error.
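As a rough illustration of the regex-free extraction mentioned above, the comment line can be split on whitespace and each token partitioned at its first = sign. This is a minimal sketch, not PASTED's own parser: the helper name parse_comment_fields and the metric key H_total are hypothetical, and it assumes no field contains internal spaces.

```python
def parse_comment_fields(comment: str) -> dict[str, str]:
    """Split a comment line into key=value fields without using regex."""
    fields: dict[str, str] = {}
    for token in comment.split():
        key, sep, value = token.partition("=")
        if sep and key:  # skip tokens that are not of the form key=value
            fields[key] = value
    return fields


# "H_total" is a made-up metric key used only for this example.
line = "sample=1 mode=gas charge=+0 mult=1 H_total=1.2345"
fields = parse_comment_fields(line)
charge = int(fields["charge"])      # int() accepts a leading "+"
h_total = float(fields["H_total"])
```

Values are kept as strings so the caller decides which fields to convert to int or float.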
- pasted._io.format_xyz(atoms: list[str], positions: list[Vec3], charge: int, mult: int, metrics: dict[str, float], prefix: str = '') → str
Serialise a structure to the extended XYZ format.
The second line (comment line) encodes prefix, charge, multiplicity, composition, and all metric values.
- Parameters:
atoms – Element symbols.
positions – Cartesian coordinates (Å), one per atom.
charge – Total system charge.
mult – Spin multiplicity 2S+1.
metrics – Dict of computed disorder metrics.
prefix – Prepended to the comment line (e.g. "sample=1 mode=gas").
- Return type:
A multi-line string (no trailing newline).
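The overall frame layout (atom count, comment line, one coordinate line per atom, no trailing newline) can be sketched as follows. This is an illustrative stand-in under stated assumptions rather than the library's implementation: assemble_frame is a hypothetical name, the comment line is passed in ready-made, and the column widths and coordinate precision are guesses.

```python
Vec3 = tuple[float, float, float]


def assemble_frame(atoms: list[str], positions: list[Vec3], comment: str) -> str:
    """Join the count line, comment line, and coordinate lines into one frame."""
    if len(atoms) != len(positions):
        raise ValueError("atoms and positions must have the same length")
    lines = [str(len(atoms)), comment]
    for symbol, (x, y, z) in zip(atoms, positions):
        # Column widths and precision are assumptions, not PASTED's layout.
        lines.append(f"{symbol:<2} {x:12.6f} {y:12.6f} {z:12.6f}")
    return "\n".join(lines)  # joined without a trailing newline


frame = assemble_frame(
    ["O", "H"],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.96)],
    "charge=+0 mult=1",
)
```

Because the string has no trailing newline, a caller that concatenates several frames has to add the separator itself.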
- pasted._io.parse_xyz(text: str) → list[tuple[list[str], list[Vec3], int, int, dict[str, float]]]
Parse a (possibly multi-frame) XYZ string — standard or extended format.
Supports both:
- Standard XYZ: atom count line, comment line, then coordinate lines. charge defaults to 0, mult to 1, metrics is empty.
- Extended XYZ (as written by PASTED): the comment line may contain charge=+0, mult=1, and KEY=VALUE metric tokens.
- Parameters:
text – Full contents of one or more XYZ frames (concatenated).
- Return type:
list of (atoms, positions, charge, mult, metrics) tuples, one per frame.
- Raises:
ValueError – When the atom-count line or a coordinate line cannot be parsed.
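The frame-walking logic described for standard XYZ can be sketched roughly as below. It is a simplified illustration, not the actual parse_xyz(): parse_frames is a hypothetical name, and it returns the raw comment string instead of extracting charge, mult, and metrics.

```python
Vec3 = tuple[float, float, float]


def parse_frames(text: str) -> list[tuple[list[str], list[Vec3], str]]:
    """Parse concatenated XYZ frames into (atoms, positions, comment) tuples."""
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():  # skip blank lines between frames
            i += 1
            continue
        try:
            n_atoms = int(lines[i].strip())
            if n_atoms < 0:
                raise ValueError("negative atom count")
        except ValueError:
            raise ValueError(
                f"Expected atom count on line {i + 1}, got {lines[i]!r}"
            ) from None
        comment = lines[i + 1] if i + 1 < len(lines) else ""
        atoms: list[str] = []
        positions: list[Vec3] = []
        for j in range(i + 2, i + 2 + n_atoms):
            parts = lines[j].split() if j < len(lines) else []
            try:
                symbol = parts[0]
                xyz = (float(parts[1]), float(parts[2]), float(parts[3]))
            except (IndexError, ValueError):
                raise ValueError(f"Cannot parse coordinate line {j + 1}") from None
            atoms.append(symbol)
            positions.append(xyz)
        frames.append((atoms, positions, comment))
        i += 2 + n_atoms  # advance past count line, comment line, and atoms
    return frames


two_frames = "1\nfirst\nH 0 0 0\n\n2\nsecond\nO 0 0 0\nH 0 0 0.96\n"
parsed = parse_frames(two_frames)
```

A truncated final frame or a malformed coordinate line surfaces as ValueError, in line with the Raises entry above.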
Note
Extended XYZ comment-line format.
The comment line written by PASTED follows this structure:
sample=N mode=M charge=+Q mult=M comp=[El1:n1,El2:n2,...] KEY1=V1 KEY2=V2 ...
comp= encodes the composition as a sorted comma-separated list of
Element:count pairs. All metric keys from
ALL_METRICS appear in order; nan is written
for any metric that could not be computed. Metric values are formatted
to 4 decimal places.
parse_xyz() extracts charge, mult, and any
KEY=FLOAT tokens from this line. Unknown keys are silently ignored,
making the format forward-compatible with future metric additions.
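To make the field order concrete, the following sketch assembles a comment line with the structure described in this note. It rests on assumptions: build_comment is a hypothetical helper, the two-entry ALL_METRICS tuple is a placeholder for the real list, and "sorted" is taken to mean alphabetical order by element symbol.

```python
import math
from collections import Counter

# Placeholder for ALL_METRICS; the real metric names are not listed here.
ALL_METRICS = ("metric_a", "metric_b")


def build_comment(atoms, charge, mult, metrics, prefix=""):
    """Assemble a comment line in the documented field order."""
    counts = Counter(atoms)
    # Alphabetical ordering of elements is an assumption.
    comp = ",".join(f"{el}:{counts[el]}" for el in sorted(counts))
    fields = [f"charge={charge:+d}", f"mult={mult}", f"comp=[{comp}]"]
    for key in ALL_METRICS:
        value = metrics.get(key, math.nan)   # missing metrics become nan
        fields.append(f"{key}={value:.4f}")  # nan formats as "nan"
    if prefix:
        fields.insert(0, prefix)
    return " ".join(fields)


comment = build_comment(["H", "O", "H"], 0, 1, {"metric_a": 0.5}, "sample=1 mode=gas")
print(comment)
# sample=1 mode=gas charge=+0 mult=1 comp=[H:2,O:1] metric_a=0.5000 metric_b=nan
```

The :+d format spec is what yields the explicit sign in charge=+0.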
High-level helpers
For most use-cases the higher-level methods on
Structure are more convenient than calling
format_xyz() and parse_xyz() directly:
| Method / function | Description |
|---|---|
| | Serialise one structure to an extended XYZ string in memory. |
| | Write or append one frame to a file. |
| | Load all frames from a file or raw string and return |
Note
Bug fix: read_xyz now raises FileNotFoundError for missing paths (v0.4.0).
Prior to v0.4.0, calling read_xyz("missing.xyz") (a string without
newlines that does not exist as a file) fell through the path-existence
check and tried to parse the path string as XYZ text, raising a confusing
ValueError: Expected atom count on line 1, got 'missing.xyz'.
The function now raises FileNotFoundError (or
IsADirectoryError for directory paths), matching the behavior of
from_xyz().
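The corrected dispatch between "raw XYZ text" and "path to a file" might look roughly like the sketch below. It is only an illustration of the behavior described in this note: load_xyz_text is a hypothetical helper, and using the presence of a newline as the sole test for raw text is an assumption drawn from the description above.

```python
from pathlib import Path


def load_xyz_text(source: str) -> str:
    """Return XYZ text from a raw string or from the file that `source` names."""
    if "\n" in source:
        return source  # multi-line input is treated as raw XYZ text
    path = Path(source)
    if path.is_dir():
        raise IsADirectoryError(f"Expected a file, got a directory: {source}")
    if not path.exists():
        # Fail here instead of handing the path string to the parser.
        raise FileNotFoundError(f"No such XYZ file: {source}")
    return path.read_text()
```

Checking the path before parsing is what replaces the confusing ValueError with an error that names the real problem.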