Metric functions#

pasted._metrics#

Disorder-metric computations for PASTED structures.

The primary entry point is compute_all_metrics(), which accepts an atom list and a position array and dispatches internally to the per-metric helpers. Lower-level helpers have varying signatures: some take pts (an (N, 3) NumPy array), others take dmat (a full N×N distance matrix), and compute_h_atom() takes only an element-symbol list. All cutoff-based metrics share the same cutoff distance threshold (Å).

Acceleration paths#

C++ single-pass path (HAS_COMBINED = True, v0.4.0+): compute_all_metrics() calls all_metrics_cpp, which builds a single FlatCellList and accumulates all pair-based metrics (RDF, graph, Steinhardt, bond-angle entropy, radial variance, local anisotropy) in one traversal. Speedup vs. the legacy path: ~1.9× at N=1000.

C++ legacy path (HAS_GRAPH = True, HAS_COMBINED = False): compute_all_metrics() calls rdf_h_cpp, graph_metrics_cpp, steinhardt_per_atom, and bond_angle_entropy_cpp independently, each building its own FlatCellList.

Pure-Python fallback (HAS_GRAPH = False): compute_h_spatial() and compute_rdf_deviation() use scipy.spatial.cKDTree (O(N·k)), but _compute_graph_ring_charge falls back to a full O(N²) pdist / squareform matrix. For N ≳ 500 this path becomes significantly slower; reinstall with a C++17 compiler to enable HAS_GRAPH = True.

Metrics catalog#

All 17 metrics returned by compute_all_metrics() (13 original + 4 adversarial metrics added in v0.4.0):

Changelog highlights#

  • v0.4.0: _combined_core single-pass C++ kernel added; all 17 metrics now computed in one FlatCellList traversal when HAS_COMBINED=True (~1.9× speedup at N=1000). Four adversarial metrics introduced in v0.4.0-alpha (bond_angle_entropy, coordination_variance, radial_variance, local_anisotropy) are now part of the stable ALL_METRICS set.

  • v0.3.10: compute_shape_anisotropy guard tightened from tr == 0 to tr < 1e-30, fixing a ZeroDivisionError on subnormal coordinate differences.

  • v0.3.9: _steinhardt_per_atom_sparse masks bonds with d < 1e-10 Å, matching the C++ threshold and fixing a C++/Python consistency bug.

  • v0.3.8: moran_I_chi clamped to 1.0 to fix sparse-graph artefact.

  • v0.3.6: Tarjan bridge-finding replaces Union-Find in compute_ring_fraction, fixing systematic undercounting of ring members.

pasted._metrics.compute_all_metrics(atoms: list[str], positions: list[Vec3], n_bins: int = 20, w_atom: float = 0.5, w_spatial: float = 0.5, cutoff: float | None = None, cov_scale: float = 1.0) dict[str, float][source]#

Compute all disorder metrics for a single structure.

The exact count is len(ALL_METRICS) (currently ALL_METRICS).

C++ path (HAS_GRAPH = True): all pair-enumeration uses a single shared adjacency list built via FlatCellList (O(N·k)). The adjacency list is constructed once and reused for all five graph/ring/charge/Moran metrics. graph_cc triangle counting uses sorted adjacency lists with binary_search for O(N·k²·log k) rather than the former O(N·k³) std::find scan. rdf_h_cpp streams distances directly into the histogram without materialising an intermediate pair_dists vector. scipy.spatial.distance.pdist / squareform are never called.

Pure-Python fallback (HAS_GRAPH = False): compute_h_spatial() and compute_rdf_deviation() use scipy.spatial.cKDTree (O(N·k)), but the five graph/ring/charge/Moran metrics are computed via _compute_graph_ring_charge(), which builds a full N×N distance matrix with pdist / squareform — an O(N²) operation that is ~100× slower than the C++ path at N=500. This fallback is intended only for environments where the C++ extension cannot be compiled.

Parameters:
  • atoms – Element symbols.

  • positions – Cartesian coordinates (Å).

  • n_bins – Histogram bins for compute_h_spatial() and compute_rdf_deviation().

  • w_atom – Weight of H_atom in H_total.

  • w_spatial – Weight of H_spatial in H_total.

  • cutoff – Distance cutoff (Å) for all local metrics. When None (the default), auto-computed as 1.5 × median(r_i + r_j) over covalent radii.

  • cov_scale – Retained for API compatibility; no longer used internally. Defaults to 1.0.

Return type:

dict with keys matching pasted._atoms.ALL_METRICS.

pasted._metrics.compute_angular_entropy(positions: list[Vec3], cutoff: float, n_bins: int = 20) float[source]#

Mean per-atom angular entropy of neighbor direction distributions.

For each atom i, the directions to its neighbors within cutoff are projected onto the unit sphere. The polar angle theta distribution is histogrammed and its Shannon entropy is computed. The result is averaged over all atoms that have at least one neighbor.

A value close to ln(n_bins) indicates a near-uniform (maximum-entropy) angular distribution – neighbors are spread evenly over the sphere. A low value indicates clustering of neighbors in certain directions, i.e. accidental local order.

This metric is intended as a diagnostic for the maxent placement mode and is not included in ALL_METRICS or in XYZ comment lines.

Uses scipy.spatial.cKDTree for O(N*k) pair enumeration instead of a full O(N^2) distance matrix.

Parameters:
  • positions – Cartesian coordinates (Å).

  • cutoff – Neighbor distance cutoff (Å).

  • n_bins – Number of histogram bins for the theta distribution (default: 20).

Returns:

Mean per-atom angular Shannon entropy. Returns 0.0 for structures with fewer than two atoms or no neighbors within cutoff.

Return type:

float

pasted._metrics.compute_charge_frustration(atoms: list[str], dmat: ndarray, cutoff: float) float[source]#

Variance of Pauling electronegativity differences across neighbor pairs.

For each neighbor pair (i, j) within cutoff, the absolute electronegativity difference abs(chi_i - chi_j) is computed. The metric is the variance of these differences over all neighbor pairs.

A high value indicates a structure where electronegativity differences are inconsistently distributed across bonds: some neighbors are well matched while others are highly mismatched. This is analogous to charge frustration in disordered materials, where local charge neutrality cannot be satisfied simultaneously at every site.

Noble gases and elements without a Pauling value use the module-level fallback of 1.0 (see pauling_electronegativity()).

Parameters:
  • atoms – Element symbols.

  • dmat – Full n x n pairwise distance matrix (Å).

  • cutoff – Distance cutoff (Å). A pair is counted as connected when d_ij <= cutoff.

Returns:

Variance of abs(delta-chi) across all neighbor pairs. Returns 0.0 when fewer than two neighbor pairs are detected (variance is undefined for a single observation).

Return type:

float

pasted._metrics.compute_graph_metrics(dmat: ndarray, cutoff: float) dict[str, float][source]#

Largest connected-component fraction and mean local clustering coefficient [0, 1].

Pure-Python fallback used when HAS_GRAPH is False. The C++ path in graph_metrics_cpp is preferred and is invoked automatically by _compute_graph_ring_charge().

Parameters:
  • dmat – Full n x n pairwise distance matrix.

  • cutoff – Adjacency distance cutoff (Å).

Return type:

dict with keys "graph_lcc" and "graph_cc".

pasted._metrics.compute_h_atom(atoms: list[str]) float[source]#

Shannon entropy of the element composition.

Range: 0 (pure single element) to ln(k) for k distinct elements.

pasted._metrics.compute_h_spatial(pts: ndarray, cutoff: float, n_bins: int) float[source]#

Shannon entropy of the pair-distance histogram within cutoff.

Only pairs with d_ij <= cutoff are included, matching the locality assumption used by all other metrics. Higher values indicate a more uniform distribution of short-range distances.

Parameters:
  • pts – Positions array of shape (n, 3).

  • cutoff – Neighbor distance cutoff (Å).

  • n_bins – Number of histogram bins over [0, cutoff].

pasted._metrics.compute_moran_I_chi(atoms: list[str], dmat: ndarray, cutoff: float) float[source]#

Moran's I spatial autocorrelation for Pauling electronegativity.

Measures whether atoms with similar electronegativity cluster spatially.

\[I = \frac{N}{W} \frac{\sum_{i \neq j} w_{ij}(\chi_i - \bar{\chi}) (\chi_j - \bar{\chi})}{\sum_i (\chi_i - \bar{\chi})^2}\]

where \(w_{ij} = 1\) when \(d_{ij} \leq\) cutoff and 0 otherwise.

Parameters:
  • atoms – Element symbols.

  • dmat – Full n x n pairwise distance matrix (Å).

  • cutoff – Distance cutoff for the step-function weight matrix (Å).

Returns:

Moran's I clamped to (-∞, 1]. Returns 0.0 when all atoms share electronegativity or no pair falls within cutoff.

  • I ~= 0 : random spatial arrangement (target for disordered structures)

  • I > 0 : same-electronegativity atoms cluster spatially

  • I < 0 : alternating high/low electronegativity (ionic-crystal-like)

Note

Binary (0/1) weights are used rather than row-standardised weights, so the raw (n/W) * numer/denom can exceed +1 when the graph is very sparse (W < n). The result is clamped to 1.0 to honour the documented upper bound.

Return type:

float

pasted._metrics.compute_rdf_deviation(pts: ndarray, cutoff: float, n_bins: int) float[source]#

RMS deviation of the empirical g*(*r) from an ideal-gas baseline.

A value of 0 indicates a perfectly random (ideal-gas-like) distribution of pair distances within cutoff. Only pairs with d_ij <= cutoff are included in the histogram, consistent with the other local metrics.

Parameters:
  • pts – Positions array of shape (n, 3).

  • cutoff – Neighbor distance cutoff (Å). The histogram range is [0, cutoff].

  • n_bins – Number of histogram bins.

pasted._metrics.compute_ring_fraction(atoms: list[str], dmat: ndarray, cutoff: float) float[source]#

Fraction of atoms that belong to at least one ring.

Builds a neighbor graph from the distance matrix and finds all bridge edges using Tarjan’s iterative DFS algorithm (O(N + E)). An atom is considered to be in a ring if and only if at least one of its incident edges is a non-bridge — that is, it participates in a cycle.

This correctly identifies all members of every cycle, including atoms reached only via tree edges that connect into a cycle. The previous Union-Find implementation only marked the two direct endpoints of each detected back-edge, which systematically undercounted ring membership (e.g. a 3-cycle was reported as 2/3 instead of 3/3, a 6-cycle as 2/6 instead of 6/6).

Note

The C++ graph_metrics_cpp path (HAS_GRAPH = True) has been updated in the same release to use the same Tarjan algorithm, so both paths now return consistent values.

Parameters:
  • atoms – Element symbols (unused; retained for API symmetry with other metrics).

  • dmat – Full n x n pairwise distance matrix (Å).

  • cutoff – Distance cutoff (Å). A pair is counted as bonded when d_ij <= cutoff.

Returns:

Fraction of atoms in at least one ring, in [0, 1]. Returns 0.0 for structures with fewer than three atoms or no cycles.

Return type:

float

pasted._metrics.compute_shape_anisotropy(pts: ndarray) float[source]#

Relative shape anisotropy from the gyration tensor.

Range: [0, 1] (0=spherical, 1=rod-like).

Returns NaN when:

  • The input contains fewer than 2 atoms.

  • Any coordinate is non-finite (NaN or Inf). In earlier versions a structure with Inf coordinates would emit a NumPy RuntimeWarning: invalid value encountered in subtract from the mean-centering step; the explicit np.isfinite guard (added in v0.4.0) suppresses that warning and returns NaN cleanly.

Returns 0.0 when all atoms are coincident (gyration tensor trace < 1e-30), guarding against ZeroDivisionError from subnormal or identical coordinates.

Implementation note#

Only the sum of eigenvalues (trace) and the sum of squared eigenvalues (Frobenius norm squared) of the 3×3 gyration tensor T are needed:

trace(T)  = λ₁ + λ₂ + λ₃
‖T‖²_F    = λ₁² + λ₂² + λ₃²

Computing these directly avoids the LAPACK eigvalsh call (~1.5× faster per call, saving ~10 ms over a 500-step optimizer run).

The near-zero guard tr < 1e-30 replaces the previous exact tr == 0 check, which failed to catch subnormal coordinate differences (e.g. two atoms at positions differing by ≈ 6e-108 Å) and raised ZeroDivisionError.

pasted._metrics.compute_steinhardt(pts: ndarray, l_values: list[int], cutoff: float) dict[str, float][source]#

Steinhardt Q_l averaged over all atoms.

Delegates to compute_steinhardt_per_atom() and returns the per-structure mean for each l.

Parameters:
  • pts – Positions array of shape (n, 3).

  • l_values – List of l values (e.g. [4, 6, 8]).

  • cutoff – Neighbor distance cutoff (Å).

Return type:

dict mapping "Q{l}" to its global average value.

pasted._metrics.compute_steinhardt_per_atom(pts: ndarray, l_values: list[int], cutoff: float) dict[str, ndarray][source]#

Per-atom Steinhardt Q_l values.

When the C++ extension pasted._ext._steinhardt_core is available (HAS_STEINHARDT = True), the computation uses a sparse neighbor list built internally by the extension, evaluating spherical harmonics only for actual neighbor pairs. This gives O(N*k) complexity (k = mean neighbor count).

The C++ accumulator uses layout (N, n_l, l_max+1) with atom index outermost (v0.3.6+), so every bond’s (l_idx, m) writes are contiguous (stride 8 B). The former (n_l, l_max+1, N) layout wrote at strides of N × 8 bytes, causing L2→L3 cache spill for N ≳ 1 000 and superlinear wall-time growth.

Since v0.3.7 the per-bond phi-trig cost is also eliminated: atan2 is replaced by a single sqrt + 2 divs to obtain cos_phi / sin_phi, and all higher orders cos(m·phi) / sin(m·phi) follow from the Chebyshev two-term recurrence (2 mults + 1 sub each) instead of l_max separate libm calls. The P_lm table is now stack-allocated (double[13][13]) rather than heap-allocated per bond.

Since v0.3.8 (optimization ④), when l_values = [4, 6, 8] a hardcoded Cartesian-polynomial fast-path is used instead of the associated-Legendre recurrence. Every real spherical harmonic S_lm(x,y,z) is a pure integer-coefficient polynomial on the unit sphere; SymPy joint CSE across l=4,6,8 yields 84 intermediates and 39 accumulation lines with no sqrt, atan2, or std::pow. Measured speedup: 1.4–1.6× vs. ①②③ at N = 100–1 000 (gas structures, k ≈ 0.7, -O3 -std=c++17).

When the extension is absent the function falls back to a sparse Python/NumPy implementation using scipy.spatial.cKDTree for neighbor enumeration and np.bincount for accumulation. Both paths have the same O(N*k) complexity.

Parameters:
  • pts – Positions array of shape (n, 3).

  • l_values – List of l values (e.g. [4, 6, 8]).

  • cutoff – Neighbor distance cutoff (Å).

Returns:

  • dict mapping "Q{l}" to a numpy.ndarray of shape (n,).

  • Atoms with no neighbors within *cutoff are assigned Q_l = 0.*

pasted._metrics.passes_filters(metrics: dict[str, float], filters: list[tuple[str, float, float]]) bool[source]#

Return True iff metrics satisfies every (metric, lo, hi) filter.

MM-level structural descriptors#

The following three functions document the MM-level structural descriptors in ALL_METRICS. ring_fraction and charge_frustration were added in v0.1.9 and revised in v0.1.13; moran_I_chi was added in v0.1.12. All three use the same cutoff distance threshold as graph_lcc and graph_cc, so all five cutoff-based metrics share a single unified adjacency definition.

All metrics are included in ALL_METRICS and can therefore be used as --filter targets on the CLI and in the StructureGenerator filters= parameter.

pasted._metrics.compute_ring_fraction(atoms: list[str], dmat: ndarray, cutoff: float) float[source]#

Fraction of atoms that belong to at least one ring.

Builds a neighbor graph from the distance matrix and finds all bridge edges using Tarjan’s iterative DFS algorithm (O(N + E)). An atom is considered to be in a ring if and only if at least one of its incident edges is a non-bridge — that is, it participates in a cycle.

This correctly identifies all members of every cycle, including atoms reached only via tree edges that connect into a cycle. The previous Union-Find implementation only marked the two direct endpoints of each detected back-edge, which systematically undercounted ring membership (e.g. a 3-cycle was reported as 2/3 instead of 3/3, a 6-cycle as 2/6 instead of 6/6).

Note

The C++ graph_metrics_cpp path (HAS_GRAPH = True) has been updated in the same release to use the same Tarjan algorithm, so both paths now return consistent values.

Parameters:
  • atoms – Element symbols (unused; retained for API symmetry with other metrics).

  • dmat – Full n x n pairwise distance matrix (Å).

  • cutoff – Distance cutoff (Å). A pair is counted as bonded when d_ij <= cutoff.

Returns:

Fraction of atoms in at least one ring, in [0, 1]. Returns 0.0 for structures with fewer than three atoms or no cycles.

Return type:

float

pasted._metrics.compute_charge_frustration(atoms: list[str], dmat: ndarray, cutoff: float) float[source]#

Variance of Pauling electronegativity differences across neighbor pairs.

For each neighbor pair (i, j) within cutoff, the absolute electronegativity difference abs(chi_i - chi_j) is computed. The metric is the variance of these differences over all neighbor pairs.

A high value indicates a structure where electronegativity differences are inconsistently distributed across bonds: some neighbors are well matched while others are highly mismatched. This is analogous to charge frustration in disordered materials, where local charge neutrality cannot be satisfied simultaneously at every site.

Noble gases and elements without a Pauling value use the module-level fallback of 1.0 (see pauling_electronegativity()).

Parameters:
  • atoms – Element symbols.

  • dmat – Full n x n pairwise distance matrix (Å).

  • cutoff – Distance cutoff (Å). A pair is counted as connected when d_ij <= cutoff.

Returns:

Variance of abs(delta-chi) across all neighbor pairs. Returns 0.0 when fewer than two neighbor pairs are detected (variance is undefined for a single observation).

Return type:

float

pasted._metrics.compute_moran_I_chi(atoms: list[str], dmat: ndarray, cutoff: float) float[source]#

Moran's I spatial autocorrelation for Pauling electronegativity.

Measures whether atoms with similar electronegativity cluster spatially.

\[I = \frac{N}{W} \frac{\sum_{i \neq j} w_{ij}(\chi_i - \bar{\chi}) (\chi_j - \bar{\chi})}{\sum_i (\chi_i - \bar{\chi})^2}\]

where \(w_{ij} = 1\) when \(d_{ij} \leq\) cutoff and 0 otherwise.

Parameters:
  • atoms – Element symbols.

  • dmat – Full n x n pairwise distance matrix (Å).

  • cutoff – Distance cutoff for the step-function weight matrix (Å).

Returns:

Moran's I clamped to (-∞, 1]. Returns 0.0 when all atoms share electronegativity or no pair falls within cutoff.

  • I ~= 0 : random spatial arrangement (target for disordered structures)

  • I > 0 : same-electronegativity atoms cluster spatially

  • I < 0 : alternating high/low electronegativity (ionic-crystal-like)

Note

Binary (0/1) weights are used rather than row-standardised weights, so the raw (n/W) * numer/denom can exceed +1 when the graph is very sparse (W < n). The result is clamped to 1.0 to honour the documented upper bound.

Return type:

float

Metric overview

Key

Description

Range

H_atom

Shannon entropy of the element composition histogram.

0 – ln(k) for k distinct elements

H_spatial

Shannon entropy of the radial-distribution histogram within cutoff.

0 – ln(n_bins)

H_total

w_atom · H_atom + w_spatial · H_spatial

0 – depends on weights

RDF_dev

RMS deviation of the normalized RDF from a flat distribution. High = crystalline; low = amorphous.

≥ 0

shape_aniso

Relative shape anisotropy κ² from the gyration tensor (κ² = 1.5·Σλᵢ²/(Σλᵢ)² − 0.5). 0 = perfectly spherical; 1 = perfectly rod-like.

[0, 1]

Q4, Q6, Q8

Steinhardt bond-orientational order parameters (l = 4, 6, 8). 0 = amorphous; 1 = crystalline.

[0, 1]

graph_lcc

Fraction of atoms in the largest connected component of the covalent-bond graph. 1.0 = fully connected.

[0, 1]

graph_cc

Mean local clustering coefficient of the covalent-bond graph. 0 = no triangles among neighbors; 1 = all neighbor pairs are mutually bonded.

[0, 1]

ring_fraction

Fraction of atoms that participate in at least one ring, determined by Tarjan’s iterative bridge-finding algorithm (O(N + E)). A bond is a bridge if its removal disconnects the graph; an atom is counted as a ring member when at least one of its incident bonds is a non-bridge.

[0, 1]

charge_frustration

Population variance of |Δχ| (Pauling electronegativity difference) across all cutoff-adjacent atom pairs. High = inconsistent bond-polarity landscape; 0 = all bonds equally polar (or fewer than two pairs detected).

≥ 0

moran_I_chi

Moran’s I spatial autocorrelation of electronegativity on the bond graph. +1 = clustered; −1 = alternating; 0 = no pattern. Clamped to 1.0 from above: binary-weight graphs with fewer edges than atoms can produce raw values > 1 due to the n/W prefactor.

(-∞, 1]

bond_angle_entropy

Mean per-atom Shannon entropy of the bond-angle (θ_ijk) distribution, binned into 36 uniform bins over [0°, 180°]. High = angles spread uniformly; low = narrow angular distribution. Added in v0.4.0.

[0, ln 36]

coordination_variance

Population variance of per-atom coordination numbers within cutoff. 0 = all atoms have the same number of neighbors; high = heterogeneous local environments. Added in v0.4.0.

≥ 0

radial_variance

Mean per-atom variance of neighbor distances (Ų). 0 = all neighbors equidistant; high = large spread of bond lengths. Added in v0.4.0.

≥ 0 Ų

local_anisotropy

Mean per-atom anisotropy of the local covariance tensor built from neighbor direction vectors. 0 = isotropic; 1 = fully anisotropic (all neighbors in one plane or along one axis). Added in v0.4.0.

[0, 1]

Note

C++ acceleration flags.

HAS_GRAPH enables O(N·k) pair enumeration for graph_lcc, graph_cc, ring_fraction, charge_frustration, moran_I_chi, H_spatial, and RDF_dev. The C++ implementation uses a single shared adjacency list (no duplicate allocations), sorted adjacency for O(log k) triangle lookup in graph_cc, and streaming histogram construction in rdf_h_cpp (no intermediate distance vector).

HAS_STEINHARDT enables sparse per-atom Steinhardt computation for Q4, Q6, and Q8 (~2000× vs. the dense Python fallback).

HAS_BA_CPP (v0.4.0) enables the C++ bond_angle_entropy_cpp path for bond_angle_entropy. When absent, the NumPy fallback in pasted._metrics._compute_bond_angle_entropy() is used; both paths use 36 bins and produce numerically identical results.

HAS_COMBINED (v0.4.0) enables the single-pass all_metrics_cpp kernel, which accumulates all 17 metrics in one shared FlatCellList traversal (~1.9× speedup at N=1000 vs. calling the individual C++ modules separately). When HAS_COMBINED = True, HAS_GRAPH, HAS_STEINHARDT, and HAS_BA_CPP are not consulted by compute_all_metrics().

Note

Steinhardt optimizations (v0.3.6 + v0.3.7).

v0.3.6 — accumulator buffer transpose. Layout changed from (n_l, l_max+1, N) to (N, n_l, l_max+1) (atom index outermost), making every bond’s writes contiguous (stride 8 B) and eliminating the L2→L3 spill that caused superlinear wall-time growth at N ≈ 1 000.

v0.3.7 — per-bond arithmetic. atan2 replaced by sqrt + div (cos_phi/sin_phi); cos(m·phi)/sin(m·phi) via Chebyshev recurrence (2 mults + 1 sub each) instead of 18 libm calls per bond; P_lm table stack-allocated (double[13][13]) instead of heap per bond.

Combined speedup: ~2.1–2.3× on compute_steinhardt and ~1.3× on compute_all_metrics at N = 500–1 000. See docs/architecture.mdPer-bond arithmetic optimizations.

Note

Bug fix — ``shape_aniso`` RuntimeWarning for Inf coordinates (v0.4.0).

In prior versions, a structure whose positions contained Inf triggered a NumPy RuntimeWarning: invalid value encountered in subtract during mean-centering in compute_shape_anisotropy(). An explicit np.isfinite guard now returns NaN cleanly before the subtraction, suppressing the warning without affecting valid inputs.

Note

Bug fix — ``moran_I_chi`` upper-bound clamp (v0.3.8).

Prior to v0.3.8, moran_I_chi could return values above +1.0 on structures whose cutoff graph was very sparse (fewer edges than atoms, i.e. W < N). The N / W prefactor in Moran’s I formula inflated the result when un-normalised binary weights were used. Both the C++ path (graph_metrics_cpp) and the Python fallback (compute_moran_I_chi) now clamp the result to min(raw, 1.0) before returning. Results for connected graphs (graph_lcc 1.0) are unaffected.

Note

Performance — real spherical harmonics fast-path for l=4,6,8 (v0.3.8, ④).

When l_values = [4, 6, 8] (the default), compute_steinhardt uses hardcoded Cartesian polynomial arithmetic instead of the associated-Legendre recurrence. Every real spherical harmonic S_lm(x,y,z) is a pure integer-coefficient polynomial on the unit sphere; SymPy joint CSE across all three l values yields 84 intermediates + 39 accumulation lines with no sqrt, no atan2, and no std::pow. Speedup: 1.4–1.6× at N = 100–1 000 vs. the ①②③ generic path. Other l combinations are unaffected.

Warning

When HAS_GRAPH = False, the five graph/ring/charge/Moran metrics fall back to a pure-Python path that builds a full N×N distance matrix (O(N²) memory and time). This is ~100× slower than the C++ path at N=500 (~100 ms vs. ~1 ms) and is intended only for environments where the C++ extension cannot be compiled. Reinstall with a C++17 compiler (pip install pybind11 && pip install -e .) to enable HAS_GRAPH = True.

Note

Distance cutoff.

All local metrics (H_spatial, RDF_dev, graph_*, Q*, ring_fraction, charge_frustration, moran_I_chi) share a single cutoff threshold. When cutoff=None (the default) it is auto-computed as 1.5 × median(rᵢ + rⱼ) over covalent radii.

For reproducible comparisons — especially with reference data — always pass cutoff= explicitly to StructureGenerator, StructureOptimizer, or compute_all_metrics().

When calling compute_all_metrics() directly, n_bins, w_atom, w_spatial, and cutoff all have sensible defaults (20, 0.5, 0.5, None), so the minimal explicit-cutoff form is:

from pasted._metrics import compute_all_metrics
metrics = compute_all_metrics(atoms, positions, cutoff=4.5)