Fitting Workflows¶

UFP has two complementary fitting paths:

direct least squares for models that are linear in their fitted coefficients;
gradient training for general differentiable models.

This page keeps the detailed fitting examples in one place. For a shorter map of coefficient selection, coefficient copying, projection, freezing, staged workflows, checkpoints, and prepared-geometry caches, start with coefficient workflows and workflow management.

Direct Least Squares¶

For fixed geometry, one-body and spline coefficient blocks are linear in their parameters. UFP assembles weighted design blocks and target vectors from FitSample objects. In the simplest energy-only case,

\[ \mathbf A \mathbf c \approx \mathbf y, \]

where \(\mathbf c\) is the flattened coefficient vector and \(\mathbf y\) contains target energies. Force fitting adds rows corresponding to

\[ -\frac{\partial E}{\partial \mathbf R_i}, \]

which are still linear in spline coefficients because the basis derivatives are known once geometry is fixed.

The least-squares objective is conceptually

\[ \min_{\mathbf c} \left\| \mathbf W(\mathbf A\mathbf c - \mathbf y) \right\|_2^2 + \lambda \left\| \mathbf c \right\|_2^2 + \Omega(\mathbf c), \]

where \(\mathbf W\) contains target weights, \(\lambda\) is ridge regularization, and \(\Omega\) can include block-specific shape penalties.

Regularization¶

LinearFitter supports global ridge values and term-family-specific ridge values for one-body, two-body, and three-body blocks. Two-body spline shape penalties can discourage rough or unphysical coefficient curves by penalizing third finite differences, early positive slopes, initial curvature, or low first coefficients.

These penalties operate on coefficients, not on sampled energies. They are therefore cheap and stable, but they are priors on the representation rather than direct physical constraints at every distance.

When a selected two-body fit covers only part of a pair channel or knot range, third-difference shape penalties use boundary-aware partial semantics. Rows that touch both selected and fixed coefficients still contribute selected-coefficient terms to the compact solve. The fixed neighboring coefficients are moved to the regularization right-hand side, so refitting a middle knot range with nonzero outside knots preserves the same physical prior as the corresponding full-block penalty.

Data-Scaled Ridge Guesses¶

For ridge values, raw neighbor, pair, or triplet counts are only indirect proxies. They affect the scale and conditioning of the design matrix, but the quantity that matters for a coefficient group is the weighted block curvature:

\[ \lambda_g = \alpha \frac{\operatorname{trace}(\mathbf A_g^\mathsf T \mathbf A_g)} {n_g}, \]

where \(\mathbf A_g\) is the weighted least-squares design block for group \(g\) and \(n_g\) is that group’s parameter count. ufp.workflows exposes this as a semi-automated starting point:

from ufp.workflows import estimate_linear_ridge_scales

estimate = estimate_linear_ridge_scales(
    model,
    training_samples,
    fitter_kwargs={
        "fit_energy": True,
        "fit_forces": True,
        "solver": "normal_equation_direct",
        "dtype": dtype,
    },
    fit_kwargs={"batch_size": 64},
    subset_size=128,
    alpha=1.0e-6,
)

ridge_kwargs = estimate.candidate().as_fitter_kwargs()

For larger studies, use progressive regularization tuning. It estimates the same data-scaled seed, evaluates log-spaced candidates on deterministic training and validation subsets, then rechecks the best candidates on the full validation split:

from ufp.workflows import RegularizationSearchConfig, tune_linear_regularization

search = tune_linear_regularization(
    make_model,
    dataset,
    config=RegularizationSearchConfig(
        stage_subset_sizes=(64, 256),
        top_k_per_stage=5,
        refit_full=True,
    ),
    fitter_kwargs={
        "fit_energy": True,
        "fit_forces": True,
        "solver": "cg",
        "dtype": dtype,
    },
    fit_kwargs={"batch_size": 64},
)

best_model = search.final_model
best_ridges = search.best_candidate.as_fitter_kwargs()

The tuner only adjusts coefficient ridge weights in v1. It preserves any explicit two-body shape penalty passed through fitter_kwargs, but does not tune that deprecated penalty family.

Selective Coefficient Fitting¶

LinearFitter can fit a subset of coefficients while preserving the current values of all other fittable coefficients. Existing string selectors still select whole blocks or body-order groups, for example fit_blocks=["threebody"] or freeze_blocks=["twobody"]. For finer control, use CoefficientSelector:

from ufp.leastsquares import (
    CoefficientSelector,
    LinearFitter,
    ParameterLayout,
    resolve_coefficient_selection_summary,
)

# Fit only three-body coefficients.
LinearFitter(model, fit_blocks=["threebody"])

# Keep an existing two-body term fixed and fit the remaining coefficients as a
# correction. Explicit freeze selectors take precedence over fit selectors, and
# fixed coefficients are subtracted from the least-squares target internally.
LinearFitter(model, freeze_blocks=["twobody"])

# Fit one pair channel inside a categorized two-body term. Pair selectors are
# canonicalized with the term's symmetry convention.
LinearFitter(
    model,
    fit_blocks=[CoefficientSelector(block="twobody", channel=(1, 8))],
)

# Fit one source-distinguished three-body triplet channel.
LinearFitter(
    model,
    fit_blocks=[CoefficientSelector(block="threebody", channel=(8, 1, 1))],
)

# Refit only the first four knots for one pair channel.
LinearFitter(
    model,
    fit_blocks=[
        CoefficientSelector(
            block="twobody",
            channel=(1, 8),
            coeff_slice=slice(0, 4),
        )
    ],
)

# Inspect how selectors resolve before starting a fit.
layout = ParameterLayout.from_model(model, include_frozen=True)
summary = resolve_coefficient_selection_summary(
    layout,
    fit_blocks=[
        CoefficientSelector(
            block="twobody",
            channel=(1, 8),
            coeff_slice=slice(0, 4),
        )
    ],
)
summary.selected_block_labels
summary.fit_entries[0].original_indices
summary.fit_entries[0].compact_slice

Pair and two-body selectors use pair channels (Z_i, Z_j). Three-body and 2D-triplet selectors use source-distinguished triplet channels (Z_center, Z_neighbor_1, Z_neighbor_2), with the two neighbor entries canonicalized into the term’s symmetric order. Coefficient slices address the trailing spline dimensions: one dimension for pair terms, three dimensions for SplineThreeBodyTerm, and two dimensions for SplineTriplet2DTerm.

Partial refits preserve non-selected coefficients exactly and subtract their current linear contribution from the target internally. With the default assembly_contract="term", terms that implement selected-column assembly build only the requested compact columns. Unsupported blocks and assembly_contract="block" keep the full assembly plus slicing path as the compatibility fallback, so missing selected assemblers are a performance limitation rather than a correctness gap.

For example, this refits four middle pair knots while keeping nonzero boundary neighbors fixed. Legacy two-body shape penalties, when present in older scripts, continue to use boundary-aware partial semantics, but new workflows should start with ridge values and validation-driven tuning:

from ufp.leastsquares import CoefficientSelector, LinearFitter

fitter = LinearFitter(
    model,
    fit_blocks=[CoefficientSelector(block="pair", coeff_slice=slice(2, 6))],
    pair_ridge=1.0e-8,
)
fitter.fit(samples, batch_size=32)

Large partial three-body refits can use the same selector syntax. The default term assembly contract will use selected assembly when the term provides it and will otherwise fall back to full assembly plus slicing:

fitter = LinearFitter(
    model,
    fit_blocks=[
        CoefficientSelector(
            block="threebody",
            channel=(74, 74, 74),
            coeff_slice=(slice(0, 8), slice(0, 8), slice(0, 8)),
        )
    ],
    solver="cg",
    matrix_storage="column_chunked",
)
fitter.fit(samples, batch_size=8, cache_directory="cache/partial-w")

Coefficient Interchange¶

The ufp.coefficients namespace provides model-surgery helpers for compatible physical coefficient channels. They build on CoefficientSelector, validate channel identity, spline family, grid, cutoff, dtype, device, active masks, symmetry, and category ordering, then copy only compatible values.

A common staged workflow is to fit elemental pair models first, then initialize the matching channels of a mixed model while leaving cross terms at their template values:

from ufp.coefficients import (
    clone_model_with_zeroed_coefficients,
    copy_matching_coefficients,
)

mixed = clone_model_with_zeroed_coefficients(mixed_template)
hh_report = copy_matching_coefficients(hh_pair_model, mixed)
oo_report = copy_matching_coefficients(oo_pair_model, mixed)

hh_report.copied_count
oo_report.copied_count

Use copy_coefficient_channel() when the source and target selectors differ or when a workflow should fail unless one specific channel is compatible.

Offline Projection¶

The ufp.projection namespace contains offline helpers for turning prior functions or existing spline channels into coefficient blocks. Projection is a model-construction step: it writes ordinary spline coefficients and returns diagnostics, but it does not add projection logic to runtime term evaluation.

Use project_radial_function() to sample a callable pair prior and solve for a 1D spline row. Use project_pair_prior() when the source is one of the analytic pair priors from ufp.terms; it calls the prior’s radial_values(pair, distances) method and attaches the prior’s projection_metadata() payload to result.metadata. Both results store detailed sampled values on the result object and shared ProjectionDiagnostics under result.diagnostics:

import torch

from ufp import SplinePairTerm, SplineTwoBodyTerm, UFPModel
from ufp.coefficients import copy_matching_coefficients
from ufp.leastsquares import CoefficientSelector, LinearFitter
from ufp.projection import project_radial_function

hh_prior = project_radial_function(
    lambda r: 0.2 * r.square() + 0.05 * r + 0.3,
    coeff_size=8,
    full_support_start=0.0,
    cutoff=2.0,
    dtype=torch.float64,
    channel_label="pair[(1, 1)]",
)

hh_model = UFPModel(
    pair_terms=[
        SplinePairTerm(
            cutoff=2.0,
            pair=(1, 1),
            coeffs=hh_prior.coeffs.clone(),
            full_support_start=0.0,
        )
    ],
    atomic_types=[1],
)
mixed_model = UFPModel(
    pair_terms=[
        SplineTwoBodyTerm(
            cutoff=2.0,
            atomic_types=[1, 8],
            coeffs_by_pair=torch.zeros((3, 8), dtype=torch.float64),
            full_support_start=0.0,
        )
    ],
    atomic_types=[1, 8],
)

copy_matching_coefficients(hh_model, mixed_model)

LinearFitter(
    mixed_model,
    fit_blocks=[
        CoefficientSelector(
            block="twobody",
            channel=(1, 1),
            coeff_slice=slice(2, 6),
        )
    ],
    fit_forces=False,
    solver="normal_equation_direct",
    ridge=1.0e-8,
).fit(samples, batch_size=32)

hh_prior.diagnostics.summary.max_value_rmse

Spline-to-spline helpers such as project_pair_to_twobody(), project_triplet2d_channel(), and project_threebody_channel() support compatible channel projection between existing terms. Their diagnostics use the same channel-level objects and include derivative-error summaries when the projection includes derivative rows.

Analytic Priors and Residualized Corrections¶

Stage-wise workflows can combine nonlinear physical priors with linear spline corrections. Keep the three mechanisms distinct:

project_pair_prior() turns an analytic pair prior into ordinary spline coefficients. After projection, least squares sees only a spline block.
HybridLinearFitter subtracts frozen nonlinear terms on the fly, then delegates the remaining selected coefficient solve to LinearFitter.
materialize_residual_dataset() writes residual energy, force, or stress labels into an ASEAtomsDataset for longer training runs.

The public analytic prior names are InversePowerPairPrior, DampedInversePowerPairPrior, ExponentialRepulsionPairPrior, MorsePairPrior, and ScaledZBLPairPrior. CutoffEnvelope and the helper functions cutoff_envelope_values(), cutoff_envelope_derivatives(), normalize_cutoff_envelope(), and apply_cutoff_envelope() live under ufp.terms.

Project an analytic prior into a pair spline, then fine-tune only selected knots:

import torch

from ufp import ExponentialRepulsionPairPrior, SplinePairTerm, UFPModel
from ufp.leastsquares import CoefficientSelector, LinearFitter
from ufp.projection import project_pair_prior

prior = ExponentialRepulsionPairPrior(
    cutoff=2.0,
    atomic_types=[1],
    amplitudes_by_pair={(1, 1): 0.9},
    decay_rates_by_pair={(1, 1): 1.4},
    cutoff_envelope="none",
    trainable=False,
    dtype=torch.float64,
)
projection = project_pair_prior(
    prior,
    (1, 1),
    coeff_size=10,
    full_support_start=0.2,
)
model = UFPModel(
    pair_terms=[
        SplinePairTerm(
            cutoff=projection.cutoff,
            pair=(1, 1),
            coeffs=projection.coeffs.clone(),
            spline=projection.spline,
            full_support_start=projection.full_support_start,
        )
    ],
    atomic_types=[1],
)

LinearFitter(
    model,
    fit_blocks=[CoefficientSelector(block="pair", coeff_slice=slice(2, 6))],
    fit_forces=False,
).fit(samples, batch_size=32)

projection.metadata["prior"]["parameter_state_hash"]

Use HybridLinearFitter when the prior should stay nonlinear and frozen while a selected spline correction is solved. This is not the same as LinearFitter(freeze_blocks=...): freeze selectors only subtract fixed linear coefficient blocks already represented in the least-squares layout.

import torch

from ufp import ExponentialRepulsionPairPrior, SplineThreeBodyTerm, UFPModel
from ufp.leastsquares import CoefficientSelector, HybridLinearFitter

prior = ExponentialRepulsionPairPrior(
    cutoff=2.0,
    atomic_types=[1],
    amplitudes_by_pair={(1, 1): 0.4},
    decay_rates_by_pair={(1, 1): 0.8},
    cutoff_envelope="none",
    trainable=False,
    dtype=torch.float64,
)
threebody = SplineThreeBodyTerm(
    cutoff=2.0,
    atomic_types=[1],
    coeffs_by_triplet=torch.zeros((1, 8, 8, 8), dtype=torch.float64),
    full_support_start_xy=0.0,
    full_support_start_z=0.0,
)
model = UFPModel(
    pair_terms=[prior],
    threebody_terms=[threebody],
    atomic_types=[1],
)

HybridLinearFitter(
    model,
    frozen_terms=[prior],
    fit_blocks=[
        CoefficientSelector(
            block="threebody",
            channel=(1, 1, 1),
            coeff_slice=(slice(0, 4), slice(0, 4), slice(0, 4)),
        )
    ],
    fit_forces=False,
).fit(samples, batch_size=16, cache_directory="cache/hybrid")

For longer optimizer runs, materialize residual labels once and carry strict metadata with every sample:

from ufp.workflows import materialize_residual_dataset

residual = materialize_residual_dataset(
    model,
    dataset,
    selectors=("pair",),
    targets=("energy", "forces"),
    target_weights={"energy": 1.0, "forces": 0.1},
    metadata_key="frozen_prior_residuals",
)

residual.dataset[0].metadata["frozen_prior_residuals"]["metadata_hash"]
residual.metadata["frozen_terms_state_hash"]

Residual metadata hashes include the selected frozen term states, selectors, target keys, target weights, units, and optional analytic prior projection metadata. Hybrid cache directories use the same frozen-term state hash under HybridLinearFitter.frozen_terms_state_hash.

Optimizer-Time Freezing¶

Gradient training can freeze the same coefficient selections by building a mask-and-restore wrapper around a PyTorch optimizer:

import torch

from ufp.leastsquares import CoefficientSelector
from ufp.training import freeze_model_coefficients

freeze_state = freeze_model_coefficients(
    model,
    [
        CoefficientSelector(
            block="twobody",
            channel=(1, 8),
            coeff_slice=slice(0, 4),
        )
    ],
)

freeze_state.affected_parameter_names
freeze_state.frozen_counts
freeze_state.selector_metadata

optimizer = torch.optim.AdamW(model.parameters(), lr=1.0e-3, weight_decay=1.0e-2)
freeze_state.wrap_optimizer(optimizer)

The wrapper masks gradients, restores frozen entries after every optimizer step, and clears optimizer state entries for frozen coefficients. This prevents AdamW weight decay and stale moment buffers from drifting frozen coefficients. Calling wrap_optimizer() with the same CoefficientFreezeState is idempotent; applying a different freeze state to an already wrapped optimizer raises a clear error.

Workflow Stages and Checkpoints¶

The ufp.workflows namespace includes small stage objects for explicit staged fitting scripts. They are wrappers around existing primitives, not a hidden workflow runner: users still build a sequence, run each stage, and own the context mapping between stages.

Available stage objects are ProjectStage, LinearFitStage, ResidualizeStage, TrainStage, and ValidateStage. Each declares required context keys, produced context keys, and checkpoint-ready metadata. Use workflow_stage_metadata() to collect those stage records, then pass the result to save_workflow_checkpoint():

import torch

from ufp.leastsquares import CoefficientSelector
from ufp.projection import project_radial_function
from ufp.training import LossWeights
from ufp.workflows import (
    LinearFitStage,
    ProjectStage,
    TrainStage,
    ValidateStage,
    save_workflow_checkpoint,
    workflow_stage_metadata,
)

fit_selector = CoefficientSelector(block="pair", coeff_slice=slice(2, 6))
freeze_selector = CoefficientSelector(block="pair", coeff_slice=slice(0, 2))

context = {"model": model, "fit_samples": samples}
stages = [
    ProjectStage(projector=project_radial_function, projector_kwargs=projection_args),
    LinearFitStage(
        fitter_kwargs={
            "fit_blocks": [fit_selector],
            "fit_forces": False,
            "solver": "normal_equation_direct",
        },
        fit_kwargs={"batch_size": 16},
    ),
    TrainStage(
        freeze_selectors=[freeze_selector],
        fit_kwargs={
            "epochs": 5,
            "loss_weights": LossWeights(energy=1.0, forces=0.0),
            "dtype": torch.float64,
        },
    ),
    ValidateStage(evaluate_kwargs={"dtype": torch.float64}),
]

results = []
for stage in stages:
    result = stage.run(context)
    result.update_context(context)
    results.append(result)

save_workflow_checkpoint(
    "workflow.pt",
    model,
    fit_blocks=[fit_selector],
    freeze_blocks=[freeze_selector],
    stage_metadata=workflow_stage_metadata(results, name="pair-refit"),
    projection_diagnostics=context["projection_result"].diagnostics,
    validation_metrics=context["validation_metrics"],
)

Workflow checkpoints store the package version, model and term metadata, coefficient layout, selector metadata, fixed-coefficient hashes, stage metadata, projection diagnostics, validation metrics, user metadata, and the model state_dict. load_workflow_checkpoint() validates the schema and can reject model-layout mismatches before loading parameters.

ufp.workflows.prepared is an advanced prepared-geometry helper for workflow-level cache reuse. It can materialize tensorized geometry, neighbor lists, pair categories, optional triplet cache metadata, and strict source signatures from UFPInput, ASE batches, training batches, or least-squares prepared batches. It is intentionally not exported from top-level ufp.workflows and is not a runtime input path for model evaluation; import it directly only for workflows that need reusable prepared geometry.

Caching¶

Large three-body fits can spend significant time assembling design blocks. UFP can write assembled batches, normal-equation components, and dense three-body feature caches to disk. Cache manifests include enough metadata to reject incompatible requests when samples, targets, dtype, backend settings, or coefficient bases change. Assembled least-squares caches use semantic coefficient-channel metadata for built-in one-body, pair/two-body spline, and three-body spline terms, so a full non-alchemical cache can be streamed into a compatible alchemical fit without writing a second adapted cache. This cache format is intentionally not compatible with assembled caches written by older UFP versions.

Cache directories are settings-addressed. The library derives child directory names from a stable SHA-256 digest of the cache identity metadata, and each written cache also contains a cache_settings.json summary next to the manifest. That file is intended for humans: it records the cache kind, digest, manifest name, schema version, and the invalidating settings used to derive the folder. Example and workflow code can use the public ufp.cache.CacheIdentity API, or the top-level ufp.suggest_cache_directory() helper, when it needs a deterministic parent folder name from notebook or CLI parameters.

Training dense three-body feature caches use a separate V2 manifest with input signatures, dtype and batch size, atomic and triplet category order, active triplet indices, spline support/grid metadata, coefficient shape, and row semantics. feature_cache_mode="read" requires an exact cache or a compatible active-triplet superset and fails when the cache is missing or stale. feature_cache_mode="refresh" intentionally rebuilds the cache. The Li-P-S training examples use read by default, so first-time runs require --refresh-cache.

For selective fits, cache and CG checkpoint compatibility metadata includes the stable keys coefficient_selection, fixed_coefficients_signature, and regularization_semantics. coefficient_selection records the compact selected layout for each block, and fixed_coefficients_signature changes when preserved coefficient values outside the solve layout change. regularization_semantics changes when the meaning of partial shape regularization changes. CG checkpoints also store the concrete regularization weights in a regularization payload. Selected assembly does not need separate correctness metadata because selected-column hooks and full assembly plus slicing are required to produce the same compact problem.

Use disk-backed caches when repeated solves or checkpointed experiments reuse the same fixed geometries. Prefer ordinary in-memory assembly for small models and early debugging.

Gradient Training¶

The training stack keeps ASE structures in ASEAtomsDataset, collates them into ASEAtomsBatch, prepares UFPInput, and optimizes model parameters with PyTorch. Energy, forces, and stress can contribute to the loss when labels are available.

Training is more general than linear fitting because it can optimize nonlinear parameters and alchemical mixing weights directly. Least squares is often faster and more deterministic when the coefficient-linear assumption applies.