stochkin.uncertainty

Monte Carlo uncertainty propagation for CTMC kinetics.

This module provides bootstrap-style uncertainty estimation: given credible intervals (or standard deviations) on the free energy F(s) and the diffusion coefficient D(s), it generates perturbed replicates, runs the full 1-D CTMC pipeline on each, and collects statistics (mean, std, confidence intervals) on rates, exit times, and branching probabilities.

Key functions

stochkin.uncertainty.bootstrap_ctmc_1d(s, F, ...)

Propagate F(s) / D(s) uncertainties through the 1-D CTMC pipeline.

stochkin.uncertainty.bootstrap_ctmc_with_hummer_D(...)

Propagate Hummer-posterior uncertainties through the 1-D CTMC pipeline.

stochkin.uncertainty.UncertaintyResult(...)

Container for bootstrap uncertainty estimates.

Detailed API

stochkin.uncertainty

Monte Carlo uncertainty propagation for CTMC kinetics on 1-D free-energy surfaces.

Strategy. Perturb the inputs F(s) and D(s) within their error bars, re-run the CTMC pipeline for each perturbed sample, and collect statistics (mean, standard deviation, percentile-based confidence intervals) on every predicted quantity (rates, exit times, branching probabilities).

Two perturbation models are provided:

  • Gaussian (additive) – suitable for free energies F(s).

  • Log-normal (multiplicative) – suitable for diffusion coefficients D(s), which must remain positive.
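The two perturbation models can be sketched in plain NumPy as follows. This is a minimal illustration, not the module's implementation: the values of `sigma_F` and `sigma_logD` and the test potential are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative inputs (assumed for this sketch only)
s = np.linspace(0.0, 1.0, 50)
F = 5.0 * (1.0 - (2.0 * s - 1.0) ** 2) ** 2
D = np.full_like(s, 0.01)

sigma_F = 0.5      # assumed standard deviation of F(s)
sigma_logD = 0.3   # assumed standard deviation of ln D(s)

# Gaussian (additive): noise is added to F, which may take either sign
F_pert = F + sigma_F * rng.standard_normal(F.shape)

# Log-normal (multiplicative): D is scaled by exp(noise), so it stays positive
D_pert = D * np.exp(sigma_logD * rng.standard_normal(D.shape))
```

Because the log-normal draw multiplies D by a strictly positive factor, no clamping is needed in this variant.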

When the Hummer Bayesian estimator supplies posterior credible intervals (F_lo/F_hi, D_lo/D_hi), the module converts them to standard deviations automatically.
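One common way to turn a central credible interval into a standard deviation is to assume a Gaussian posterior (or a Gaussian posterior in ln D) and divide the interval width by twice the matching normal quantile. The helper names below are hypothetical, and the formula is an assumption about the conversion rather than a statement of what the module does internally.

```python
import numpy as np
from statistics import NormalDist

def interval_to_sigma(lo, hi, ci_level=0.95):
    """Gaussian sigma from a central interval [lo, hi] (hypothetical helper)."""
    z = NormalDist().inv_cdf(0.5 + ci_level / 2.0)  # about 1.96 for 95 %
    return (np.asarray(hi, float) - np.asarray(lo, float)) / (2.0 * z)

def interval_to_log_sigma(lo, hi, ci_level=0.95):
    """Sigma of ln(x) from a central interval on a positive quantity x."""
    return interval_to_sigma(np.log(lo), np.log(hi), ci_level)

# Example values (made up for illustration)
sigma_F = interval_to_sigma([0.0, 1.0], [3.92, 2.96])
sigma_logD = interval_to_log_sigma([0.005, 0.010], [0.020, 0.040])
print(sigma_F, sigma_logD)
```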

Main entry points

bootstrap_ctmc_1d

Propagate F and D uncertainties through run_1d_ctmc().

bootstrap_ctmc_with_hummer_D

Convenience wrapper that reads Hummer posterior intervals from CSV and propagates them through run_1d_ctmc_with_hummer_D().

class stochkin.uncertainty.UncertaintyResult(reference, n_bootstrap, n_failed, confidence_level, K_mean, K_std, K_ci_lo, K_ci_hi, K_samples, K_ps_mean, K_ps_std, K_ps_ci_lo, K_ps_ci_hi, exit_mean_mean, exit_mean_std, exit_mean_ci_lo, exit_mean_ci_hi, exit_mean_samples, k_out_mean, k_out_std, k_out_ci_lo, k_out_ci_hi, k_out_samples, p_branch_mean, p_branch_std, p_branch_ci_lo, p_branch_ci_hi, p_branch_samples)[source]

Bases: object

Container for bootstrap uncertainty estimates.

All *_mean, *_std, *_ci_lo, *_ci_hi arrays have the same shape as the corresponding quantity in the reference (unperturbed) result. The *_samples arrays have an extra leading axis of size n_bootstrap.
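The shape convention can be illustrated with synthetic samples. The sketch below reduces a stand-in `K_samples` array along its leading axis; the use of `ddof=1` and of `np.percentile` for the interval bounds are assumptions, and the random data are not real rates.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bootstrap, n_basins = 200, 2

# Synthetic stand-in for K_samples, shape (n_bootstrap, n_basins, n_basins)
K_samples = rng.lognormal(mean=-3.0, sigma=0.4,
                          size=(n_bootstrap, n_basins, n_basins))

confidence_level = 0.95
alpha = 100.0 * (1.0 - confidence_level) / 2.0   # lower percentile (2.5 for 95 %)

# Reduce over the leading (replicate) axis; results have shape (n_basins, n_basins)
K_mean = K_samples.mean(axis=0)
K_std = K_samples.std(axis=0, ddof=1)
K_ci_lo, K_ci_hi = np.percentile(K_samples, [alpha, 100.0 - alpha], axis=0)
```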

Parameters:
reference

Full result dictionary from the unperturbed run (see run_1d_ctmc()).

Type:

dict

n_bootstrap

Number of successful bootstrap replicates.

Type:

int

n_failed

Number of failed replicates (basin detection changed, solver diverged, etc.).

Type:

int

confidence_level

Confidence level for the reported CI (default 0.95).

Type:

float

K_mean, K_std, K_ci_lo, K_ci_hi

Statistics of the rate matrix K [1/time_unit].

Type:

ndarray

K_ps_mean, K_ps_std, K_ps_ci_lo, K_ps_ci_hi

Statistics of K in ps⁻¹.

Type:

ndarray

exit_mean_mean, exit_mean_std, exit_mean_ci_lo, exit_mean_ci_hi

Statistics of the mean exit time per basin [time_unit].

Type:

ndarray

k_out_mean, k_out_std, k_out_ci_lo, k_out_ci_hi

Statistics of the total exit rate per basin [1/time_unit].

Type:

ndarray

p_branch_mean, p_branch_std, p_branch_ci_lo, p_branch_ci_hi

Statistics of the branching-probability matrix.

Type:

ndarray

K_samples

Raw bootstrap samples of K.

Type:

ndarray, shape (n_bootstrap, n_basins, n_basins)

exit_mean_samples

Raw samples of exit times.

Type:

ndarray, shape (n_bootstrap, n_basins)

k_out_samples

Raw samples of exit rates.

Type:

ndarray, shape (n_bootstrap, n_basins)

p_branch_samples

Raw samples of branching probabilities.

Type:

ndarray, shape (n_bootstrap, n_basins, n_basins)
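For orientation, the sketch below shows how exit rates and branching probabilities relate to a rate matrix in a generic CTMC. It assumes that the off-diagonal entry `K[i, j]` is the rate from basin i to basin j; this page does not state the index convention used by the module, and the module may compute exit times by a different route than the memoryless `1 / k_out` used here. The numbers are made up.

```python
import numpy as np

# Made-up 3-basin rate matrix; ASSUMED convention: K[i, j] = rate i -> j
K = np.array([[0.0,   0.02,  0.01],
              [0.05,  0.0,   0.05],
              [0.004, 0.012, 0.0]])

K_off = K - np.diag(np.diag(K))      # keep only the off-diagonal rates
k_out = K_off.sum(axis=1)            # total exit rate per basin
p_branch = K_off / k_out[:, None]    # probability that an exit from i lands in j
exit_mean = 1.0 / k_out              # mean exit time of a memoryless state
```

Each row of `p_branch` sums to one under this convention.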

reference: dict
n_bootstrap: int
n_failed: int
confidence_level: float
K_mean: ndarray
K_std: ndarray
K_ci_lo: ndarray
K_ci_hi: ndarray
K_samples: ndarray
K_ps_mean: ndarray
K_ps_std: ndarray
K_ps_ci_lo: ndarray
K_ps_ci_hi: ndarray
exit_mean_mean: ndarray
exit_mean_std: ndarray
exit_mean_ci_lo: ndarray
exit_mean_ci_hi: ndarray
exit_mean_samples: ndarray
k_out_mean: ndarray
k_out_std: ndarray
k_out_ci_lo: ndarray
k_out_ci_hi: ndarray
k_out_samples: ndarray
p_branch_mean: ndarray
p_branch_std: ndarray
p_branch_ci_lo: ndarray
p_branch_ci_hi: ndarray
p_branch_samples: ndarray
summary(time_unit='')[source]

Return a human-readable summary string.

Parameters:

time_unit (str)

Return type:

str

stochkin.uncertainty.bootstrap_ctmc_1d(s, F, D, *, F_err=None, F_lo=None, F_hi=None, D_err=None, D_rel_err=None, D_lo=None, D_hi=None, n_bootstrap=200, ci_level=0.95, confidence=None, corr_length=None, seed=None, T=300.0, time_unit='ps', max_basins=None, core_fraction=0.05, init_weight='boltzmann', verbose=False)[source]

Propagate F(s) / D(s) uncertainties through the 1-D CTMC pipeline.

For each bootstrap replicate:

  1. Draw a perturbed F(s) from N(F, σ_F) (Gaussian, additive).

  2. Draw a perturbed D(s) from LogNormal(D, σ_log D) (when D_rel_err or D_lo/D_hi is given) or from N(D, σ_D) (when D_err is given); the result is always clamped to D > 0.

  3. Run run_1d_ctmc() on the perturbed inputs.

If the number of detected basins changes for a given replicate (different topology), that replicate is discarded.
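The replicate loop and the discard rule can be sketched with a toy stand-in for the pipeline. `toy_pipeline` below is hypothetical: it only counts interior local minima and reports a barrier height, and it is not `run_1d_ctmc()`. The potential, the noise levels, and the grid are likewise assumptions for illustration.

```python
import numpy as np

def toy_pipeline(s, F, D):
    """Hypothetical stand-in for the CTMC pipeline (NOT run_1d_ctmc)."""
    # Basins are taken to be strict interior local minima of F
    is_min = (F[1:-1] < F[:-2]) & (F[1:-1] < F[2:])
    minima = np.flatnonzero(is_min) + 1
    barrier = np.nan
    if minima.size >= 2:
        # Highest point between the outermost minima, relative to the lowest minimum
        barrier = F[minima[0]:minima[-1] + 1].max() - F[minima].min()
    return {"n_basins": minima.size, "barrier": barrier}

rng = np.random.default_rng(42)
s = np.linspace(-1.5, 1.5, 31)
F = 5.0 * (s**2 - 1.0) ** 2          # double well with minima near s = -1 and s = +1
D = np.full_like(s, 0.01)
sigma_F, sigma_logD, n_bootstrap = 0.05, 0.3, 100

reference = toy_pipeline(s, F, D)    # unperturbed run
samples, n_failed = [], 0
for _ in range(n_bootstrap):
    F_b = F + sigma_F * rng.standard_normal(F.shape)              # step 1
    D_b = D * np.exp(sigma_logD * rng.standard_normal(D.shape))   # step 2
    out = toy_pipeline(s, F_b, D_b)                               # step 3
    if out["n_basins"] != reference["n_basins"]:
        n_failed += 1                # topology changed: discard this replicate
        continue
    samples.append(out["barrier"])

samples = np.asarray(samples)
print(f"kept {samples.size} replicates, discarded {n_failed}")
```

Discarding replicates keeps every retained sample comparable with the reference, at the cost of conditioning the statistics on an unchanged basin count.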

Parameters:
  • s (array-like) – Central (best-estimate) grid of the coordinate s. Same semantics as run_1d_ctmc().

  • F (array-like) – Central (best-estimate) free energy F(s) on that grid. Same semantics as run_1d_ctmc().

  • D (array-like) – Central (best-estimate) diffusion coefficient D(s). Same semantics as run_1d_ctmc().

  • F_err (float or array, optional) – Standard deviation of F(s). Scalar → uniform error.

  • F_lo (array, optional) – Lower bound of a credible interval on F. Used together with F_hi to compute σ_F when F_err is not given.

  • F_hi (array, optional) – Upper bound of a credible interval on F. Used together with F_lo to compute σ_F when F_err is not given.

  • D_err (float or array, optional) – Absolute standard deviation of D(s) (Gaussian perturbation).

  • D_rel_err (float or array, optional) – Relative error of D(s) (e.g. 0.3 = 30 %). Converted to a log-normal σ.

  • D_lo (array, optional) – Lower bound of a credible interval on D(s). Together with D_hi, converted to a log-normal σ when neither D_err nor D_rel_err is given.

  • D_hi (array, optional) – Upper bound of a credible interval on D(s). Together with D_lo, converted to a log-normal σ when neither D_err nor D_rel_err is given.

  • n_bootstrap (int) – Number of bootstrap replicates (default 200).

  • ci_level (float) – Credible-interval level used to interpret F_lo/F_hi and D_lo/D_hi (default 0.95 → 95 % CI).

  • confidence (float, optional) – Confidence level for the output intervals (defaults to ci_level).

  • corr_length (float, optional) – Spatial correlation length (in CV units) for the perturbation noise. When set, point-wise i.i.d. noise is smoothed by a Gaussian kernel of this width, producing correlated perturbations. None → independent noise at each grid point.

  • seed (int, optional) – Random seed for reproducibility.

  • T (float) – Forwarded to run_1d_ctmc().

  • time_unit (str) – Forwarded to run_1d_ctmc().

  • max_basins (int | None) – Forwarded to run_1d_ctmc().

  • core_fraction (float | None) – Forwarded to run_1d_ctmc().

  • init_weight (str) – Forwarded to run_1d_ctmc().

  • verbose (bool) – If True, print a progress counter.
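The effect of corr_length can be sketched by passing white noise through a Gaussian kernel. The function name `correlated_noise` is hypothetical, and the row normalization that restores unit point-wise variance is a design choice of this sketch; whether the module rescales the smoothed noise is not stated on this page.

```python
import numpy as np

def correlated_noise(s, sigma, corr_length, rng):
    """Noise with std sigma at each grid point (hypothetical helper).

    corr_length=None gives independent noise; otherwise white noise is
    smoothed by a Gaussian kernel of width corr_length (same units as s).
    """
    s = np.asarray(s, float)
    white = rng.standard_normal(s.size)
    if corr_length is None:
        return sigma * white
    # Kernel weights between every pair of grid points
    W = np.exp(-0.5 * ((s[:, None] - s[None, :]) / corr_length) ** 2)
    # Normalize each row so the smoothed noise keeps unit variance per point
    W /= np.sqrt((W**2).sum(axis=1, keepdims=True))
    return sigma * (W @ white)

rng = np.random.default_rng(0)
s = np.linspace(0.0, 1.0, 200)
noise_iid = correlated_noise(s, 0.5, None, rng)
noise_corr = correlated_noise(s, 0.5, 0.1, rng)
```

Correlated noise shifts whole regions of the profile together, which tends to create fewer spurious local minima than point-wise noise of the same amplitude.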

Returns:

Dataclass with *_mean, *_std, *_ci_lo, *_ci_hi for every CTMC output, plus the full *_samples arrays and the reference (unperturbed) result.

Return type:

UncertaintyResult

Examples

>>> import numpy as np, stochkin as sk
>>> s = np.linspace(0, 1, 200)
>>> F = 5.0 * (1 - (2*s - 1)**2)**2; F -= F.min()
>>> res = sk.bootstrap_ctmc_1d(s, F, D=0.01, F_err=0.5,
...                            n_bootstrap=50, seed=42)
>>> print(res.summary("ps"))

stochkin.uncertainty.bootstrap_ctmc_with_hummer_D(fes_path, d_csv, *, fes_err_path=None, F_err=None, F_lo_col='F_lo', F_hi_col='F_hi', D_lo_col='D_lo', D_hi_col='D_hi', perturb_D=True, perturb_F=True, n_bootstrap=200, ci_level=0.95, confidence=None, corr_length=None, seed=None, T=300.0, time_unit='ps', d_xcol='x_interface', d_col='D_med', d_grid='interface', d_interface_mode='harmonic', d_time_unit='ps', d_interp_method='linear', crop=None, resample_n=500, s_col=0, F_col=1, max_basins=None, core_fraction=0.05, init_weight='boltzmann', verbose=False)[source]

Propagate Hummer-posterior uncertainties through the 1-D CTMC pipeline.

This is a convenience wrapper around bootstrap_ctmc_1d() that:

  1. Loads the PLUMED 1-D FES and Hummer D-profile CSV.

  2. Reads D_lo / D_hi columns (and optionally F_lo / F_hi) from the CSV to construct per-point error bars.

  3. Converts both inputs to the common uniform grid.

  4. Calls bootstrap_ctmc_1d() with the resolved σ arrays.
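Steps 1 to 3 can be sketched with NumPy alone. The CSV text below is made up and uses the default column names from the signature (`x_interface`, `D_med`, `D_lo`, `D_hi`); the FES arrays are stand-ins, since the PLUMED loader is not shown here, and plain linear interpolation is an assumption of this sketch.

```python
import io
import numpy as np

# Made-up Hummer-style D profile with the default column names
csv_text = """x_interface,D_med,D_lo,D_hi
0.0,0.010,0.006,0.016
0.5,0.020,0.012,0.033
1.0,0.012,0.007,0.020
"""
d_tab = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

# Stand-in for a loaded 1-D FES (columns s and F)
s_fes = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
F_fes = np.array([0.0, 2.8, 5.0, 2.8, 0.0])

# Common uniform grid, then linear interpolation of every profile onto it
resample_n = 11
s = np.linspace(s_fes.min(), s_fes.max(), resample_n)
F = np.interp(s, s_fes, F_fes)
D = np.interp(s, d_tab["x_interface"], d_tab["D_med"])
D_lo = np.interp(s, d_tab["x_interface"], d_tab["D_lo"])
D_hi = np.interp(s, d_tab["x_interface"], d_tab["D_hi"])
```

The resampled `D_lo` and `D_hi` arrays are what would then be handed to bootstrap_ctmc_1d() through its D_lo and D_hi arguments.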

Parameters:
  • fes_path (str or Path) – PLUMED 1-D FES file.

  • d_csv (str or Path) – Hummer diffusion-profile CSV. Must contain the columns named by d_xcol and d_col and, if perturb_D is True, those named by D_lo_col and D_hi_col.

  • fes_err_path (str or Path, optional) – Separate CSV with the FES credible interval. If given, must contain columns x_center, F_lo, F_hi (or names set by F_lo_col / F_hi_col). When the FES uncertainty lives in the same CSV as D, set this to the same file.

  • F_err (float or array, optional) – Uniform FES standard deviation. Overrides fes_err_path.

  • perturb_D (bool) – If True, perturb D(s) (default True).

  • perturb_F (bool) – If True, perturb F(s) (default True).

  • n_bootstrap (int) – Number of bootstrap replicates; see bootstrap_ctmc_1d().

  • ci_level (float) – Credible-interval level used to interpret the lo/hi columns; see bootstrap_ctmc_1d().

  • confidence (float | None) – Confidence level for the output intervals; see bootstrap_ctmc_1d().

  • corr_length (float | None) – Spatial correlation length of the perturbation noise; see bootstrap_ctmc_1d().

  • seed (int | None) – Random seed; see bootstrap_ctmc_1d().

  • T (float)

  • time_unit (str)

  • d_xcol (str)

  • d_col (str)

  • d_grid (str)

  • d_interface_mode (str)

  • d_time_unit (str)

  • F_lo_col (str)

  • F_hi_col (str)

  • D_lo_col (str)

  • D_hi_col (str)

  • d_interp_method (str)

  • crop (Tuple[float, float] | None)

  • resample_n (int)

  • s_col (int)

  • F_col (int)

  • max_basins (int | None)

  • core_fraction (float | None)

  • init_weight (str)

  • verbose (bool)

Return type:

UncertaintyResult

The parameters d_interp_method, crop, resample_n, s_col, F_col, max_basins, and core_fraction are forwarded to the underlying CTMC pipeline with unchanged semantics.
