# Fitting Workflows UFP has two complementary fitting paths: - direct least squares for models that are linear in their fitted coefficients; - gradient training for general differentiable models. This page keeps the detailed fitting examples in one place. For a shorter map of coefficient selection, coefficient copying, projection, freezing, staged workflows, checkpoints, and prepared-geometry caches, start with [coefficient workflows](coefficient-workflows.md) and [workflow management](workflow-management.md). ## Direct Least Squares For fixed geometry, one-body and spline coefficient blocks are linear in their parameters. UFP assembles weighted design blocks and target vectors from `FitSample` objects. In the simplest energy-only case, $$ \mathbf A \mathbf c \approx \mathbf y, $$ where $\mathbf c$ is the flattened coefficient vector and $\mathbf y$ contains target energies. Force fitting adds rows corresponding to $$ -\frac{\partial E}{\partial \mathbf R_i}, $$ which are still linear in spline coefficients because the basis derivatives are known once geometry is fixed. The least-squares objective is conceptually $$ \min_{\mathbf c} \left\| \mathbf W(\mathbf A\mathbf c - \mathbf y) \right\|_2^2 + \lambda \left\| \mathbf c \right\|_2^2 + \Omega(\mathbf c), $$ where $\mathbf W$ contains target weights, $\lambda$ is ridge regularization, and $\Omega$ can include block-specific shape penalties. ## Regularization `LinearFitter` supports global ridge values and term-family-specific ridge values for one-body, two-body, and three-body blocks. Two-body spline shape penalties can discourage rough or unphysical coefficient curves by penalizing third finite differences, early positive slopes, initial curvature, or low first coefficients. These penalties operate on coefficients, not on sampled energies. They are therefore cheap and stable, but they are priors on the representation rather than direct physical constraints at every distance. When a selected two-body fit covers only part of a pair channel or knot range, third-difference shape penalties use boundary-aware partial semantics. Rows that touch both selected and fixed coefficients still contribute selected-coefficient terms to the compact solve. The fixed neighboring coefficients are moved to the regularization right-hand side, so refitting a middle knot range with nonzero outside knots preserves the same physical prior as the corresponding full-block penalty. ### Data-Scaled Ridge Guesses For ridge values, raw neighbor, pair, or triplet counts are only indirect proxies. They affect the scale and conditioning of the design matrix, but the quantity that matters for a coefficient group is the weighted block curvature: $$ \lambda_g = \alpha \frac{\operatorname{trace}(\mathbf A_g^\mathsf T \mathbf A_g)} {n_g}, $$ where $\mathbf A_g$ is the weighted least-squares design block for group $g$ and $n_g$ is that group's parameter count. `ufp.workflows` exposes this as a semi-automated starting point: ```python from ufp.workflows import estimate_linear_ridge_scales estimate = estimate_linear_ridge_scales( model, training_samples, fitter_kwargs={ "fit_energy": True, "fit_forces": True, "solver": "normal_equation_direct", "dtype": dtype, }, fit_kwargs={"batch_size": 64}, subset_size=128, alpha=1.0e-6, ) ridge_kwargs = estimate.candidate().as_fitter_kwargs() ``` For larger studies, use progressive regularization tuning. It estimates the same data-scaled seed, evaluates log-spaced candidates on deterministic training and validation subsets, then rechecks the best candidates on the full validation split: ```python from ufp.workflows import RegularizationSearchConfig, tune_linear_regularization search = tune_linear_regularization( make_model, dataset, config=RegularizationSearchConfig( stage_subset_sizes=(64, 256), top_k_per_stage=5, refit_full=True, ), fitter_kwargs={ "fit_energy": True, "fit_forces": True, "solver": "cg", "dtype": dtype, }, fit_kwargs={"batch_size": 64}, ) best_model = search.final_model best_ridges = search.best_candidate.as_fitter_kwargs() ``` The tuner only adjusts coefficient ridge weights in v1. It preserves any explicit two-body shape penalty passed through `fitter_kwargs`, but does not tune that deprecated penalty family. ## Selective Coefficient Fitting `LinearFitter` can fit a subset of coefficients while preserving the current values of all other fittable coefficients. Existing string selectors still select whole blocks or body-order groups, for example `fit_blocks=["threebody"]` or `freeze_blocks=["twobody"]`. For finer control, use `CoefficientSelector`: ```python from ufp.leastsquares import ( CoefficientSelector, LinearFitter, ParameterLayout, resolve_coefficient_selection_summary, ) # Fit only three-body coefficients. LinearFitter(model, fit_blocks=["threebody"]) # Keep an existing two-body term fixed and fit the remaining coefficients as a # correction. Explicit freeze selectors take precedence over fit selectors, and # fixed coefficients are subtracted from the least-squares target internally. LinearFitter(model, freeze_blocks=["twobody"]) # Fit one pair channel inside a categorized two-body term. Pair selectors are # canonicalized with the term's symmetry convention. LinearFitter( model, fit_blocks=[CoefficientSelector(block="twobody", channel=(1, 8))], ) # Fit one source-distinguished three-body triplet channel. LinearFitter( model, fit_blocks=[CoefficientSelector(block="threebody", channel=(8, 1, 1))], ) # Refit only the first four knots for one pair channel. LinearFitter( model, fit_blocks=[ CoefficientSelector( block="twobody", channel=(1, 8), coeff_slice=slice(0, 4), ) ], ) # Inspect how selectors resolve before starting a fit. layout = ParameterLayout.from_model(model, include_frozen=True) summary = resolve_coefficient_selection_summary( layout, fit_blocks=[ CoefficientSelector( block="twobody", channel=(1, 8), coeff_slice=slice(0, 4), ) ], ) summary.selected_block_labels summary.fit_entries[0].original_indices summary.fit_entries[0].compact_slice ``` Pair and two-body selectors use pair channels `(Z_i, Z_j)`. Three-body and 2D-triplet selectors use source-distinguished triplet channels `(Z_center, Z_neighbor_1, Z_neighbor_2)`, with the two neighbor entries canonicalized into the term's symmetric order. Coefficient slices address the trailing spline dimensions: one dimension for pair terms, three dimensions for `SplineThreeBodyTerm`, and two dimensions for `SplineTriplet2DTerm`. Partial refits preserve non-selected coefficients exactly and subtract their current linear contribution from the target internally. With the default `assembly_contract="term"`, terms that implement selected-column assembly build only the requested compact columns. Unsupported blocks and `assembly_contract="block"` keep the full assembly plus slicing path as the compatibility fallback, so missing selected assemblers are a performance limitation rather than a correctness gap. For example, this refits four middle pair knots while keeping nonzero boundary neighbors fixed. Legacy two-body shape penalties, when present in older scripts, continue to use boundary-aware partial semantics, but new workflows should start with ridge values and validation-driven tuning: ```python from ufp.leastsquares import CoefficientSelector, LinearFitter fitter = LinearFitter( model, fit_blocks=[CoefficientSelector(block="pair", coeff_slice=slice(2, 6))], pair_ridge=1.0e-8, ) fitter.fit(samples, batch_size=32) ``` Large partial three-body refits can use the same selector syntax. The default term assembly contract will use selected assembly when the term provides it and will otherwise fall back to full assembly plus slicing: ```python fitter = LinearFitter( model, fit_blocks=[ CoefficientSelector( block="threebody", channel=(74, 74, 74), coeff_slice=(slice(0, 8), slice(0, 8), slice(0, 8)), ) ], solver="cg", matrix_storage="column_chunked", ) fitter.fit(samples, batch_size=8, cache_directory="cache/partial-w") ``` ## Coefficient Interchange The `ufp.coefficients` namespace provides model-surgery helpers for compatible physical coefficient channels. They build on `CoefficientSelector`, validate channel identity, spline family, grid, cutoff, dtype, device, active masks, symmetry, and category ordering, then copy only compatible values. A common staged workflow is to fit elemental pair models first, then initialize the matching channels of a mixed model while leaving cross terms at their template values: ```python from ufp.coefficients import ( clone_model_with_zeroed_coefficients, copy_matching_coefficients, ) mixed = clone_model_with_zeroed_coefficients(mixed_template) hh_report = copy_matching_coefficients(hh_pair_model, mixed) oo_report = copy_matching_coefficients(oo_pair_model, mixed) hh_report.copied_count oo_report.copied_count ``` Use `copy_coefficient_channel()` when the source and target selectors differ or when a workflow should fail unless one specific channel is compatible. ## Offline Projection The `ufp.projection` namespace contains offline helpers for turning prior functions or existing spline channels into coefficient blocks. Projection is a model-construction step: it writes ordinary spline coefficients and returns diagnostics, but it does not add projection logic to runtime term evaluation. Use `project_radial_function()` to sample a callable pair prior and solve for a 1D spline row. Use `project_pair_prior()` when the source is one of the analytic pair priors from `ufp.terms`; it calls the prior's `radial_values(pair, distances)` method and attaches the prior's `projection_metadata()` payload to `result.metadata`. Both results store detailed sampled values on the result object and shared `ProjectionDiagnostics` under `result.diagnostics`: ```python import torch from ufp import SplinePairTerm, SplineTwoBodyTerm, UFPModel from ufp.coefficients import copy_matching_coefficients from ufp.leastsquares import CoefficientSelector, LinearFitter from ufp.projection import project_radial_function hh_prior = project_radial_function( lambda r: 0.2 * r.square() + 0.05 * r + 0.3, coeff_size=8, full_support_start=0.0, cutoff=2.0, dtype=torch.float64, channel_label="pair[(1, 1)]", ) hh_model = UFPModel( pair_terms=[ SplinePairTerm( cutoff=2.0, pair=(1, 1), coeffs=hh_prior.coeffs.clone(), full_support_start=0.0, ) ], atomic_types=[1], ) mixed_model = UFPModel( pair_terms=[ SplineTwoBodyTerm( cutoff=2.0, atomic_types=[1, 8], coeffs_by_pair=torch.zeros((3, 8), dtype=torch.float64), full_support_start=0.0, ) ], atomic_types=[1, 8], ) copy_matching_coefficients(hh_model, mixed_model) LinearFitter( mixed_model, fit_blocks=[ CoefficientSelector( block="twobody", channel=(1, 1), coeff_slice=slice(2, 6), ) ], fit_forces=False, solver="normal_equation_direct", ridge=1.0e-8, ).fit(samples, batch_size=32) hh_prior.diagnostics.summary.max_value_rmse ``` Spline-to-spline helpers such as `project_pair_to_twobody()`, `project_triplet2d_channel()`, and `project_threebody_channel()` support compatible channel projection between existing terms. Their diagnostics use the same channel-level objects and include derivative-error summaries when the projection includes derivative rows. ## Analytic Priors and Residualized Corrections Stage-wise workflows can combine nonlinear physical priors with linear spline corrections. Keep the three mechanisms distinct: - `project_pair_prior()` turns an analytic pair prior into ordinary spline coefficients. After projection, least squares sees only a spline block. - `HybridLinearFitter` subtracts frozen nonlinear terms on the fly, then delegates the remaining selected coefficient solve to `LinearFitter`. - `materialize_residual_dataset()` writes residual energy, force, or stress labels into an `ASEAtomsDataset` for longer training runs. The public analytic prior names are `InversePowerPairPrior`, `DampedInversePowerPairPrior`, `ExponentialRepulsionPairPrior`, `MorsePairPrior`, and `ScaledZBLPairPrior`. `CutoffEnvelope` and the helper functions `cutoff_envelope_values()`, `cutoff_envelope_derivatives()`, `normalize_cutoff_envelope()`, and `apply_cutoff_envelope()` live under `ufp.terms`. Project an analytic prior into a pair spline, then fine-tune only selected knots: ```python import torch from ufp import ExponentialRepulsionPairPrior, SplinePairTerm, UFPModel from ufp.leastsquares import CoefficientSelector, LinearFitter from ufp.projection import project_pair_prior prior = ExponentialRepulsionPairPrior( cutoff=2.0, atomic_types=[1], amplitudes_by_pair={(1, 1): 0.9}, decay_rates_by_pair={(1, 1): 1.4}, cutoff_envelope="none", trainable=False, dtype=torch.float64, ) projection = project_pair_prior( prior, (1, 1), coeff_size=10, full_support_start=0.2, ) model = UFPModel( pair_terms=[ SplinePairTerm( cutoff=projection.cutoff, pair=(1, 1), coeffs=projection.coeffs.clone(), spline=projection.spline, full_support_start=projection.full_support_start, ) ], atomic_types=[1], ) LinearFitter( model, fit_blocks=[CoefficientSelector(block="pair", coeff_slice=slice(2, 6))], fit_forces=False, ).fit(samples, batch_size=32) projection.metadata["prior"]["parameter_state_hash"] ``` Use `HybridLinearFitter` when the prior should stay nonlinear and frozen while a selected spline correction is solved. This is not the same as `LinearFitter(freeze_blocks=...)`: freeze selectors only subtract fixed linear coefficient blocks already represented in the least-squares layout. ```python import torch from ufp import ExponentialRepulsionPairPrior, SplineThreeBodyTerm, UFPModel from ufp.leastsquares import CoefficientSelector, HybridLinearFitter prior = ExponentialRepulsionPairPrior( cutoff=2.0, atomic_types=[1], amplitudes_by_pair={(1, 1): 0.4}, decay_rates_by_pair={(1, 1): 0.8}, cutoff_envelope="none", trainable=False, dtype=torch.float64, ) threebody = SplineThreeBodyTerm( cutoff=2.0, atomic_types=[1], coeffs_by_triplet=torch.zeros((1, 8, 8, 8), dtype=torch.float64), full_support_start_xy=0.0, full_support_start_z=0.0, ) model = UFPModel( pair_terms=[prior], threebody_terms=[threebody], atomic_types=[1], ) HybridLinearFitter( model, frozen_terms=[prior], fit_blocks=[ CoefficientSelector( block="threebody", channel=(1, 1, 1), coeff_slice=(slice(0, 4), slice(0, 4), slice(0, 4)), ) ], fit_forces=False, ).fit(samples, batch_size=16, cache_directory="cache/hybrid") ``` For longer optimizer runs, materialize residual labels once and carry strict metadata with every sample: ```python from ufp.workflows import materialize_residual_dataset residual = materialize_residual_dataset( model, dataset, selectors=("pair",), targets=("energy", "forces"), target_weights={"energy": 1.0, "forces": 0.1}, metadata_key="frozen_prior_residuals", ) residual.dataset[0].metadata["frozen_prior_residuals"]["metadata_hash"] residual.metadata["frozen_terms_state_hash"] ``` Residual metadata hashes include the selected frozen term states, selectors, target keys, target weights, units, and optional analytic prior projection metadata. Hybrid cache directories use the same frozen-term state hash under `HybridLinearFitter.frozen_terms_state_hash`. ## Optimizer-Time Freezing Gradient training can freeze the same coefficient selections by building a mask-and-restore wrapper around a PyTorch optimizer: ```python import torch from ufp.leastsquares import CoefficientSelector from ufp.training import freeze_model_coefficients freeze_state = freeze_model_coefficients( model, [ CoefficientSelector( block="twobody", channel=(1, 8), coeff_slice=slice(0, 4), ) ], ) freeze_state.affected_parameter_names freeze_state.frozen_counts freeze_state.selector_metadata optimizer = torch.optim.AdamW(model.parameters(), lr=1.0e-3, weight_decay=1.0e-2) freeze_state.wrap_optimizer(optimizer) ``` The wrapper masks gradients, restores frozen entries after every optimizer step, and clears optimizer state entries for frozen coefficients. This prevents AdamW weight decay and stale moment buffers from drifting frozen coefficients. Calling `wrap_optimizer()` with the same `CoefficientFreezeState` is idempotent; applying a different freeze state to an already wrapped optimizer raises a clear error. ## Workflow Stages and Checkpoints The `ufp.workflows` namespace includes small stage objects for explicit staged fitting scripts. They are wrappers around existing primitives, not a hidden workflow runner: users still build a sequence, run each stage, and own the context mapping between stages. Available stage objects are `ProjectStage`, `LinearFitStage`, `ResidualizeStage`, `TrainStage`, and `ValidateStage`. Each declares required context keys, produced context keys, and checkpoint-ready metadata. Use `workflow_stage_metadata()` to collect those stage records, then pass the result to `save_workflow_checkpoint()`: ```python import torch from ufp.leastsquares import CoefficientSelector from ufp.projection import project_radial_function from ufp.training import LossWeights from ufp.workflows import ( LinearFitStage, ProjectStage, TrainStage, ValidateStage, save_workflow_checkpoint, workflow_stage_metadata, ) fit_selector = CoefficientSelector(block="pair", coeff_slice=slice(2, 6)) freeze_selector = CoefficientSelector(block="pair", coeff_slice=slice(0, 2)) context = {"model": model, "fit_samples": samples} stages = [ ProjectStage(projector=project_radial_function, projector_kwargs=projection_args), LinearFitStage( fitter_kwargs={ "fit_blocks": [fit_selector], "fit_forces": False, "solver": "normal_equation_direct", }, fit_kwargs={"batch_size": 16}, ), TrainStage( freeze_selectors=[freeze_selector], fit_kwargs={ "epochs": 5, "loss_weights": LossWeights(energy=1.0, forces=0.0), "dtype": torch.float64, }, ), ValidateStage(evaluate_kwargs={"dtype": torch.float64}), ] results = [] for stage in stages: result = stage.run(context) result.update_context(context) results.append(result) save_workflow_checkpoint( "workflow.pt", model, fit_blocks=[fit_selector], freeze_blocks=[freeze_selector], stage_metadata=workflow_stage_metadata(results, name="pair-refit"), projection_diagnostics=context["projection_result"].diagnostics, validation_metrics=context["validation_metrics"], ) ``` Workflow checkpoints store the package version, model and term metadata, coefficient layout, selector metadata, fixed-coefficient hashes, stage metadata, projection diagnostics, validation metrics, user metadata, and the model `state_dict`. `load_workflow_checkpoint()` validates the schema and can reject model-layout mismatches before loading parameters. `ufp.workflows.prepared` is an advanced prepared-geometry helper for workflow-level cache reuse. It can materialize tensorized geometry, neighbor lists, pair categories, optional triplet cache metadata, and strict source signatures from `UFPInput`, ASE batches, training batches, or least-squares prepared batches. It is intentionally not exported from top-level `ufp.workflows` and is not a runtime input path for model evaluation; import it directly only for workflows that need reusable prepared geometry. ## Caching Large three-body fits can spend significant time assembling design blocks. UFP can write assembled batches, normal-equation components, and dense three-body feature caches to disk. Cache manifests include enough metadata to reject incompatible requests when samples, targets, dtype, backend settings, or coefficient bases change. Assembled least-squares caches use semantic coefficient-channel metadata for built-in one-body, pair/two-body spline, and three-body spline terms, so a full non-alchemical cache can be streamed into a compatible alchemical fit without writing a second adapted cache. This cache format is intentionally not compatible with assembled caches written by older UFP versions. Cache directories are settings-addressed. The library derives child directory names from a stable SHA-256 digest of the cache identity metadata, and each written cache also contains a `cache_settings.json` summary next to the manifest. That file is intended for humans: it records the cache kind, digest, manifest name, schema version, and the invalidating settings used to derive the folder. Example and workflow code can use the public `ufp.cache.CacheIdentity` API, or the top-level `ufp.suggest_cache_directory()` helper, when it needs a deterministic parent folder name from notebook or CLI parameters. Training dense three-body feature caches use a separate V2 manifest with input signatures, dtype and batch size, atomic and triplet category order, active triplet indices, spline support/grid metadata, coefficient shape, and row semantics. `feature_cache_mode="read"` requires an exact cache or a compatible active-triplet superset and fails when the cache is missing or stale. `feature_cache_mode="refresh"` intentionally rebuilds the cache. The Li-P-S training examples use `read` by default, so first-time runs require `--refresh-cache`. For selective fits, cache and CG checkpoint compatibility metadata includes the stable keys `coefficient_selection`, `fixed_coefficients_signature`, and `regularization_semantics`. `coefficient_selection` records the compact selected layout for each block, and `fixed_coefficients_signature` changes when preserved coefficient values outside the solve layout change. `regularization_semantics` changes when the meaning of partial shape regularization changes. CG checkpoints also store the concrete regularization weights in a `regularization` payload. Selected assembly does not need separate correctness metadata because selected-column hooks and full assembly plus slicing are required to produce the same compact problem. Use disk-backed caches when repeated solves or checkpointed experiments reuse the same fixed geometries. Prefer ordinary in-memory assembly for small models and early debugging. ## Gradient Training The training stack keeps ASE structures in `ASEAtomsDataset`, collates them into `ASEAtomsBatch`, prepares `UFPInput`, and optimizes model parameters with PyTorch. Energy, forces, and stress can contribute to the loss when labels are available. Training is more general than linear fitting because it can optimize nonlinear parameters and alchemical mixing weights directly. Least squares is often faster and more deterministic when the coefficient-linear assumption applies.