Predictive Uncertainty¶
UFP uncertainty workflows build Bayesian posteriors for coefficient-linear
models. The posterior is exact for the flattened coefficient blocks exposed to
LinearFitter; nonlinear parameters are treated as fixed unless the workflow
first converts them into a coefficient-linear proxy.
Linear Coefficient Posterior¶
Use fit_linear_uncertainty_model when the model is already coefficient-linear,
or when you have built a coefficient-linear proxy for a trained model:
from ufp.leastsquares import LinearFitter
from ufp.uncertainty import fit_linear_uncertainty_model
# model, samples, dtype, cache_dir, and posterior_dir are supplied by the caller.
fitter = LinearFitter(
    model,
    fit_energy=True,
    fit_forces=True,
    solver="normal_equation_direct",
    ridge=1.0e-8,
    dtype=dtype,
)
# Build the design problem once so it can be inspected or reused.
problem = fitter.build_problem(samples, batch_size=32, cache_directory=cache_dir)
posterior = fit_linear_uncertainty_model(
    model,
    samples,
    fitter=fitter,
    problem=problem,
    refit_mean=False,
)
posterior.save_memmap(posterior_dir)
Passing a prebuilt problem is useful when a workflow needs to inspect row
counts, train an aleatoric noise head, or reuse a cached design assembly without
triggering another LinearFitter.build_problem call.
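For intuition, a Gaussian posterior over linear coefficients has a closed form. The sketch below is generic NumPy and is not UFP's implementation: it assumes i.i.d. Gaussian noise with a known variance and treats the ridge term as an isotropic Gaussian prior precision. The names gaussian_linear_posterior and noise_variance are illustrative only.

```python
import numpy as np


def gaussian_linear_posterior(design, targets, noise_variance=1.0, ridge=1.0e-8):
    """Return (mean, covariance) of a Gaussian posterior over linear coefficients.

    Assumes targets = design @ coefficients + noise, with i.i.d. Gaussian noise
    of variance `noise_variance` and a zero-mean isotropic Gaussian prior whose
    precision is `ridge`. These are assumptions of this sketch.
    """
    design = np.asarray(design, dtype=float)
    targets = np.asarray(targets, dtype=float)
    n_coefficients = design.shape[1]
    # Posterior precision: data term plus prior (ridge) term.
    precision = design.T @ design / noise_variance + ridge * np.eye(n_coefficients)
    covariance = np.linalg.inv(precision)
    mean = covariance @ (design.T @ targets) / noise_variance
    return mean, covariance


# Synthetic data with three coefficients and small observation noise.
rng = np.random.default_rng(0)
design = rng.normal(size=(200, 3))
true_coefficients = np.array([1.0, -2.0, 0.5])
targets = design @ true_coefficients + 0.1 * rng.normal(size=200)

mean, covariance = gaussian_linear_posterior(design, targets, noise_variance=0.01)

# Epistemic variance of a prediction with design row x is x @ covariance @ x.
new_row = np.array([1.0, 0.0, 0.0])
epistemic_variance = new_row @ covariance @ new_row
```

Under these assumptions the posterior is exact because the model is linear in its coefficients, which is why nonlinear parameters have to be frozen or replaced by a linear proxy first.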
Sparse Prediction Rows¶
Prediction variances use sparse rows and the dense coefficient covariance:
from ufp.uncertainty import (
    combine_total_energy_rows,
    predict_with_uncertainty,
    variance_for_energy_row,
)
prediction_a = predict_with_uncertainty(
    model,
    atoms_a,
    posterior,
    fitter=fitter,
    return_rows=True,
)
prediction_b = predict_with_uncertainty(
    model,
    atoms_b,
    posterior,
    fitter=fitter,
    return_rows=True,
)
# Row for the energy difference E(a) - E(b).
delta_row = combine_total_energy_rows(
    [
        (prediction_a.rows.total_energy_row, 1.0),
        (prediction_b.rows.total_energy_row, -1.0),
    ]
)
delta_variance = variance_for_energy_row(delta_row, posterior)
Atomic energy and force-component variances are computed per row, as the
diagonal entries of the row covariance. UFP does not materialize an
atom-by-atom covariance matrix. Use
variance_for_sparse_rows(rows, posterior, chunk_size=...) when evaluating many
atomic or force rows; it batches the gathers from the dense covariance without
forming the full row covariance.
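The reason for combining rows before taking a variance can be illustrated without UFP. The sketch below is a plain-Python stand-in, not the library's data structures: sparse rows are hypothetical {column_index: value} dicts, and the covariance is a small nested list. It shows that the variance of an energy difference includes the cross term between the two structures, so it is not the sum of the two individual variances.

```python
def combine_rows(weighted_rows):
    """Linearly combine sparse rows given as {column_index: value} dicts."""
    combined = {}
    for row, weight in weighted_rows:
        for column, value in row.items():
            combined[column] = combined.get(column, 0.0) + weight * value
    # Drop exact cancellations so the combined row stays sparse.
    return {column: value for column, value in combined.items() if value != 0.0}


def bilinear(row_left, row_right, covariance):
    """Evaluate row_left^T C row_right over the nonzero columns only."""
    return sum(
        left * covariance[i][j] * right
        for i, left in row_left.items()
        for j, right in row_right.items()
    )


def row_variance(row, covariance):
    """Variance of a linear prediction with sparse design row `row`."""
    return bilinear(row, row, covariance)


# Toy dense coefficient covariance (symmetric, positive definite).
covariance = [
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.16],
]
row_a = {0: 2.0, 1: 1.0}
row_b = {1: 1.0, 2: 3.0}

delta_row = combine_rows([(row_a, 1.0), (row_b, -1.0)])
delta_variance = row_variance(delta_row, covariance)

# Equivalent decomposition: Var(a) + Var(b) - 2 Cov(a, b).
var_a = row_variance(row_a, covariance)
var_b = row_variance(row_b, covariance)
cov_ab = bilinear(row_a, row_b, covariance)
```

In this toy case the shared column 1 cancels in delta_row, and delta_variance is smaller than var_a + var_b because the two predictions are positively correlated through the shared coefficients.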
Alchemical Models¶
fit_alchemical_uncertainty_model freezes non-identity alchemical mixing
weights, then builds one fixed-weight direct/proxy posterior. The posterior
covers direct coefficients and proxy coefficients; mixing weights remain point
estimates.
Aleatoric Noise¶
SplineAleatoricNoiseModel is a positive spline variance head using
softplus(raw) + variance_floor. The V2 path uses
SplineAleatoricNoiseBundle, which stores separate optional heads for
structure energy per atom, per-atom energy decomposition, and force components.
The default AleatoricFeatureSpec(kind="log_num_atoms") evaluates heads from
log1p(n_atoms), so prediction-time aleatoric variances can vary by structure
size without rebuilding the least-squares design matrix.
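The positivity construction can be sketched in a few lines. The code below is not SplineAleatoricNoiseModel: it substitutes clamped piecewise-linear interpolation for the spline, and the knot positions and raw values are made up for illustration. Only the softplus(raw) + variance_floor form and the log1p(n_atoms) feature come from the description above.

```python
import math
from bisect import bisect_right


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def piecewise_linear(x, knots, values):
    """Stand-in for the spline: linear interpolation, clamped at the end knots."""
    if x <= knots[0]:
        return values[0]
    if x >= knots[-1]:
        return values[-1]
    k = bisect_right(knots, x) - 1
    t = (x - knots[k]) / (knots[k + 1] - knots[k])
    return (1.0 - t) * values[k] + t * values[k + 1]


def aleatoric_variance(n_atoms, knots, raw_values, variance_floor=1.0e-8):
    """Positive variance as softplus(raw(feature)) + floor, feature = log1p(n_atoms)."""
    feature = math.log1p(n_atoms)
    raw = piecewise_linear(feature, knots, raw_values)
    return softplus(raw) + variance_floor


# Hypothetical knots in feature space and unconstrained raw head values.
knots = [0.0, 3.0, 6.0]
raw_values = [-2.0, -4.0, -6.0]
variances = {n: aleatoric_variance(n, knots, raw_values) for n in (1, 16, 256)}
```

Because the head depends only on the structure size, it can be evaluated at prediction time from the atom count alone, which matches the point above about not rebuilding the design matrix.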
Pass an initialized bundle to fit_linear_uncertainty_model or
fit_alchemical_uncertainty_model:
from ufp.uncertainty import (
    SplineAleatoricNoiseBundle,
    SplineAleatoricNoiseModel,
    fit_linear_uncertainty_model,
    save_uncertainty_prediction_bundle,
)
# model, samples, fitter, linearized_model, and bundle_dir are supplied by the caller.
noise_bundle = SplineAleatoricNoiseBundle(
    energy_per_atom=SplineAleatoricNoiseModel(...),
    force_component=SplineAleatoricNoiseModel(...),
)
posterior = fit_linear_uncertainty_model(
    model,
    samples,
    fitter=fitter,
    aleatoric_noise_bundle=noise_bundle,
    aleatoric_steps=200,
)
save_uncertainty_prediction_bundle(
    bundle_dir,
    model=linearized_model,
    posterior=posterior,
    aleatoric_noise_bundle=noise_bundle,
)
make_predictions.py --uncertainty-bundle ... evaluates the serialized bundle
for each structure. It writes energy and per-atom aleatoric arrays by default,
and force-component aleatoric arrays when --uncertainty-forces is supplied.
The older scalar aleatoric_variance bundle field is still supported for
backward-compatible prediction files.
Prediction Bundles¶
Use save_uncertainty_prediction_bundle to persist the model, posterior
memmap, posterior layout, optional aleatoric noise bundle, optional calibration
state, and manifest needed for standalone prediction:
from ufp.uncertainty import save_uncertainty_prediction_bundle
save_uncertainty_prediction_bundle(
    bundle_dir,
    model=linearized_model,
    posterior=posterior,
    source_checkpoint=checkpoint_path,
    aleatoric_noise_bundle=noise_bundle,
)
The bundle is a model artifact, not a least-squares cache. Prediction with
examples/make_predictions.py --uncertainty-bundle bundle_dir does not need
the training-set design cache; that cache only accelerates posterior fitting.
The manifest records hashes for the model checkpoint, posterior files,
serialized aleatoric artifacts, and calibration files when present, so stale
bundle members are rejected on load. Use
examples/inspect_uncertainty_bundle.py bundle_dir to print schema version,
posterior size/layout, aleatoric state, energy variance scale, source
checkpoint metadata, and validation status.
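The stale-member check can be pictured as a hash manifest. The sketch below is a generic standard-library illustration, not the bundle's actual schema: the file names, the manifest.json name, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in blocks so large posterior files are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def write_manifest(bundle_dir, member_names):
    """Record one hash per bundle member in a (hypothetical) manifest.json."""
    bundle_dir = Path(bundle_dir)
    manifest = {name: sha256_of(bundle_dir / name) for name in member_names}
    (bundle_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))


def stale_members(bundle_dir):
    """Return members that are missing or whose hash differs from the manifest."""
    bundle_dir = Path(bundle_dir)
    manifest = json.loads((bundle_dir / "manifest.json").read_text())
    return [
        name
        for name, expected in manifest.items()
        if not (bundle_dir / name).exists() or sha256_of(bundle_dir / name) != expected
    ]


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "model.pt").write_bytes(b"model-v1")
    Path(tmp, "posterior.bin").write_bytes(b"posterior-v1")
    write_manifest(tmp, ["model.pt", "posterior.bin"])
    clean = stale_members(tmp)
    # Simulate a posterior that was refitted without updating the manifest.
    Path(tmp, "posterior.bin").write_bytes(b"posterior-v2")
    stale = stale_members(tmp)
```

A loader built this way would refuse the bundle whenever the returned list is non-empty, which prevents pairing a new posterior with an old model checkpoint.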
Variance Scaling¶
Calibration can fit a post-hoc multiplicative energy variance scale from prediction files:
python examples/calibrate_uncertainty.py \
    predictions_holdout.npz \
    --fit-energy-scale \
    --save-scale-to-bundle path/to/uncertainty_bundle
When a bundle has a saved energy scale, make_predictions.py applies it to the
energy epistemic, aleatoric, and total variance arrays, and to the per-atom
energy variance and standard-deviation arrays. Force variance scaling is not
applied in V2; force calibration is diagnostic-only.
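One common way to fit a single multiplicative variance scale is to minimize the Gaussian negative log-likelihood, which has a closed-form solution. The sketch below shows that estimator on made-up numbers. This page does not state which estimator calibrate_uncertainty.py uses, so treat the formula as an assumption rather than a description of the script.

```python
import math


def fit_variance_scale(residuals, variances):
    """Scale s that minimizes the Gaussian NLL of residuals under variances s * v.

    Setting the derivative of the NLL with respect to s to zero gives
    s = mean(r^2 / v).
    """
    ratios = [r * r / v for r, v in zip(residuals, variances)]
    return sum(ratios) / len(ratios)


def gaussian_nll(residuals, variances):
    """Mean Gaussian negative log-likelihood of zero-mean residuals."""
    terms = [
        0.5 * (math.log(2.0 * math.pi * v) + r * r / v)
        for r, v in zip(residuals, variances)
    ]
    return sum(terms) / len(terms)


# Made-up holdout residuals and predicted variances (overconfident predictions).
residuals = [0.20, -0.10, 0.30, -0.25]
variances = [0.01, 0.01, 0.04, 0.04]

scale = fit_variance_scale(residuals, variances)
scaled_variances = [scale * v for v in variances]
```

After scaling, the mean squared normalized residual equals one by construction, and the NLL on the fitting data cannot increase.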
Calibration¶
After writing uncertainty-enabled prediction files, run calibration diagnostics
on the split .npz outputs:
python examples/calibrate_uncertainty.py \
    examples/02-tungsten/tungsten_holdout_predictions.npz \
    --plot-dir examples/02-tungsten/uncertainty_plots
When examples/make_predictions.py is run with --uncertainty-bundle, it
prints the matching examples/calibrate_uncertainty.py command after writing
prediction files.
The calibration helper compares per-atom energy residuals with predicted
per-atom energy standard deviations derived from energy_total_variance by
default. It reports Gaussian NLL, normalized residual mean/std, empirical
coverage at common nominal intervals, a calibration slope, and the correlation
between absolute residual and predicted standard deviation. Use
--variance-key energy_epistemic_variance to inspect epistemic-only calibration.
Add --include-forces to compute the same diagnostics for
force_total_variance_components when prediction files were written with
make_predictions.py --uncertainty-forces.
examples/plot_prediction_density.py --with-uncertainty can also write energy
and force calibration plots next to the usual density plots for prediction
files that contain uncertainty arrays.
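Several of the diagnostics named above are short formulas over residuals and predicted standard deviations. The sketch below computes three of them on synthetic, well-calibrated data using only the standard library. It does not read prediction files, and the function name and the returned keys are illustrative, not the helper's actual output format.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def calibration_summary(residuals, std_devs, nominal_levels=(0.68, 0.95)):
    """Gaussian NLL, normalized-residual moments, and empirical coverage."""
    z = [r / s for r, s in zip(residuals, std_devs)]
    nll_terms = [
        0.5 * math.log(2.0 * math.pi * s * s) + 0.5 * zi * zi
        for zi, s in zip(z, std_devs)
    ]
    coverage = {}
    for level in nominal_levels:
        # Half-width of the central interval in units of standard deviations.
        half_width = NormalDist().inv_cdf(0.5 + level / 2.0)
        inside = [1.0 if abs(zi) <= half_width else 0.0 for zi in z]
        coverage[level] = fmean(inside)
    return {
        "gaussian_nll": fmean(nll_terms),
        "z_mean": fmean(z),
        "z_std": pstdev(z),
        "coverage": coverage,
    }


# Synthetic residuals drawn from exactly the predicted standard deviations.
rng = random.Random(0)
std_devs = [0.01 + 0.04 * rng.random() for _ in range(2000)]
residuals = [rng.gauss(0.0, s) for s in std_devs]
summary = calibration_summary(residuals, std_devs)
```

For well-calibrated uncertainties the normalized residuals have mean near zero and standard deviation near one, and the empirical coverage tracks each nominal level. A z_std well above one indicates overconfident predictions.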
Minimal Alchemical Example¶
examples/alchemical_uncertainty_demo.py is a small synthetic fixed-weight
alchemical example. It fits an alchemical proxy posterior, saves a reusable
bundle, reloads it, and verifies prediction uncertainty without requiring an
external dataset.
Li-P-S Alchemical Example¶
examples/05-lips/alchemical_uncertainty.py is the real alchemical uncertainty
workflow. It loads examples/05-lips/lips_alchemical_uf23_model_lstsq.pt,
builds a fixed-weight direct/proxy posterior without rerunning ALS by default,
saves a reusable bundle under examples/05-lips/uncertainty_models/, and
prints follow-on make_predictions.py --uncertainty-bundle ... and
calibrate_uncertainty.py --include-forces ... commands. Use
--max-training-structures, --max-prediction-structures, and
--aleatoric-steps for bounded smoke runs.
Tungsten Example¶
examples/02-tungsten/uf23_constrained_wall_uncertainty_demo.py demonstrates
the full workflow on the constrained-wall tungsten training checkpoint. It loads
uf23_constrained_wall_training_best.pt, converts the constrained wall into an
equivalent ordinary spline pair term, fits the posterior over the resulting
coefficient-linear proxy, saves a reusable uncertainty bundle under
uncertainty_models/, and saves an ignored .npz summary of holdout
uncertainties. The script prints the follow-on make_predictions.py and
calibrate_uncertainty.py commands for full prediction-file calibration.