# Architecture Ownership

This page describes module ownership and dependency direction for contributors. It is a refactor guardrail: performance-sensitive code is allowed to keep local specialization when that preserves measured behavior.

## Package Boundaries

- `ufp.core` owns normalized inputs, outputs, potentials, device and dtype movement, and small disk-cache primitives. Other packages may depend on `ufp.core`; `ufp.core` should not depend on model terms, training, adapters, or workflows.
- `ufp.cache` owns public settings-addressed cache identity helpers. It may delegate low-level digest and name formatting to `ufp.core._disk_cache`, but workflow modules should not define competing cache namespaces.
- `ufp.neighbors` owns neighbor-list construction and conversion to `NeighborListData`. It is the boundary between ASE, `vesin`, `metatomic`, and the tensor layout consumed by `UFPInput`.
- `ufp.splines` owns spline basis and stencil math. Term modules and fitting assembly may call these helpers directly, but spline modules should remain independent of model terms.
- `ufp.terms` owns additive energy terms and their optional extension hooks. Public term classes live in literal modules such as `onebody.py`, `twobody.py`, and `threebody.py`; shared implementation details stay in underscore-prefixed modules.
- `ufp.leastsquares` owns parameter layouts, design-matrix assembly, cache IO, normal equations, and linear solve orchestration. It may know about built-in terms for optimized assembly, but shared containers live in dependency-light modules such as `ufp.leastsquares._types`.
- `ufp.coefficients` owns coefficient-channel discovery, compatibility checks, copying, zeroing, and staged-fit selector aliases. It may use least-squares layout metadata to identify blocks, but it should not own solvers or assembly code.
- `ufp.projection` owns offline projection of analytic priors and existing spline channels into ordinary coefficient tensors. Projection is a model preparation tool; runtime term evaluation should not depend on projection modules.
- `ufp.training` owns ASE-oriented datasets, batches, losses, metrics, and gradient training loops. It may warm term caches through the term contract but should not duplicate least-squares assembly policy.
- `ufp.adapters` owns simulation-engine integration. Adapter modules should keep engine-specific objects at the boundary and pass normalized tensors or UFP models inward.
- `ufp.workflows` owns high-level data, model, prediction, training, and runtime selection helpers. Workflows may combine lower layers but should not become the owner of kernels, cache formats, or term internals.
- `ufp.analysis` and `ufp.benchmarks` own plotting, diagnostics, and benchmark harnesses. They may depend on the rest of the package but should not be imported by runtime hot paths.

## Dependency Direction

The usual dependency direction is:

```text
core + neighbors
  -> splines
  -> terms
  -> leastsquares + training
  -> coefficients + projection
  -> adapters + workflows
  -> examples + benchmarks
```

Private modules can be shared within a package when they clarify ownership, but cross-package imports should prefer public modules or explicit contract modules. For custom terms, import extension types from `ufp.terms.contracts` instead of `ufp.terms._base` or `ufp.terms._parameters`.

## Public And Internal APIs

- Stable user-facing APIs are exported from package `__init__.py` files and documented in `docs/api.md`.
- Extension-facing APIs that are not normal user entry points live in explicit contract namespaces such as `ufp.terms.contracts`; spline stencils, derivative rows, and fitting helpers in `ufp.splines` are also supported for term and fitting extensions.
- Underscore-prefixed modules are implementation details unless a contract page explicitly says otherwise.
- `ufp.workflows.cache` is a compatibility alias for `ufp.cache`, not a second public cache identity namespace.
- Cache data classes, memmap layout helpers, and kernel dispatch functions are internal to their owning package.
- `ufp.coefficients.CoefficientSelector` is an additive public alias of `ufp.leastsquares.CoefficientSelector`; keep both imports working because selectors are used by least squares, coefficient interchange, and training freeze helpers.

UFP uses these API stability tiers:

| Tier | Policy |
| --- | --- |
| Stable user API | Normal user imports intended to remain compatible across minor releases. |
| Extension API | Supported hooks for third-party/custom terms and fitting integrations. |
| Expert diagnostics | Importable debugging, profiling, benchmark, or inspection helpers. These remain supported for their documented use, but they may expose lower-level implementation details. |
| Experimental workflow API | High-level orchestration helpers for examples and studies that can change faster than the core model and term APIs. |
| Compatibility-only | Transition and compatibility tools, including `compat/`, that preserve migration paths without becoming new model-building APIs. |
| Internal/private | Underscore modules, runtime dispatch helpers, cache layouts, and hot-path implementation details unless explicitly documented otherwise. |

`ufp.benchmarks` is an expert benchmark API: it is installed and importable for speed gates and performance comparisons, but runtime packages should not depend on it. Three-body bucket/evaluator helpers exposed from `ufp.terms.threebody` are expert diagnostics for benchmark and debugging workflows, not normal model assembly entry points.

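The rule that runtime code should not import `ufp.benchmarks` or `ufp.analysis` can be checked mechanically. The following is a minimal sketch, not part of UFP: `find_forbidden_imports`, `FORBIDDEN_PREFIXES`, and the sample source string (including the submodule names in it) are hypothetical, and a real check would read module files from disk instead.

```python
import ast

# Hypothetical policy for illustration: packages runtime code should not import.
FORBIDDEN_PREFIXES = ("ufp.benchmarks", "ufp.analysis")


def _is_forbidden(name: str) -> bool:
    """True if ``name`` is a forbidden package or one of its submodules."""
    return any(name == p or name.startswith(p + ".") for p in FORBIDDEN_PREFIXES)


def find_forbidden_imports(source: str) -> list[str]:
    """Return forbidden absolute imports found in ``source``.

    The source is only parsed, never executed, so the imported
    packages do not need to be installed.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if _is_forbidden(node.module):
                names = [node.module]
            else:
                # Catch ``from ufp import benchmarks`` as ``ufp.benchmarks``.
                names = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        hits.extend(name for name in names if _is_forbidden(name))
    return hits


# Made-up module text standing in for a runtime module.
SAMPLE = """
import torch
from ufp.core import something
from ufp import benchmarks
import ufp.benchmarks.speed as speed
from ufp.analysis.plots import parity
"""

print(find_forbidden_imports(SAMPLE))
```

A check like this only sees static, absolute imports; relative imports and imports performed through `importlib` would need separate handling.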
## Hot Paths

Treat the following as performance-sensitive boundaries:

- `UFPInput` pair, category, and atomic metadata caches;
- pair and two-body spline evaluation;
- three-body bucketing, evaluator dispatch, dense and sparse feature caches, and memmap cache loading;
- least-squares assembly, normal-equation construction, and block-matrix matvec/rmatvec helpers;
- cached ASE training batches and cache warming.

Refactors in these areas should preserve dispatch decisions outside inner loops, avoid extra tensor materialization, preserve dtype and device metadata, and run the relevant speed-gate or benchmark checks listed in `docs/benchmarks.md`.

## Least-Squares Setup Split

`LinearFitter` is the orchestration entry point, but setup-time responsibilities are separated from assembled-matrix and solver hot paths:

```text
model + fitter options
  -> ParameterLayout
  -> selection plan and checkpoint/write-back metadata
  -> cache write/projection plan
  -> assembled batches or cached batches
  -> BlockLinearProblem.solve()
```

Selection planning lives in `ufp.leastsquares._setup`. It resolves fit/freeze selectors once during `LinearFitter.__init__`, stores primitive metadata such as selected block ids and compact size, and provides checkpoint signatures plus selected-vector read/write helpers. Cache layout planning remains in `ufp.leastsquares._cache_layout`, normal-equation cache construction remains in `ufp.leastsquares._normal_equations`, and block matrix operations remain in `ufp.leastsquares._block`.

Do not pass setup objects into repeated row assembly, `BlockLinearProblem` `matvec`/`rmatvec`, normal-equation accumulation, or CG loops. Those paths should continue to consume concrete tensors, compact block matrices, primitive slices, and locally-bound functions. Additional split points should follow the same rule: normalize or plan before assembly starts, then unpack the result before the measured loop.

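The plan-then-unpack rule can be sketched in plain Python. Everything here is hypothetical: `SelectionPlan`, `plan_selection`, and `accumulate_selected` are illustrative stand-ins rather than UFP classes, and lists of floats stand in for tensors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionPlan:
    """Hypothetical setup-time result holding primitive metadata only."""

    selected_block_ids: tuple[int, ...]
    compact_size: int


def plan_selection(block_sizes: dict[int, int], frozen: set[int]) -> SelectionPlan:
    """Resolve the selection once, before any assembly starts."""
    selected = tuple(sorted(b for b in block_sizes if b not in frozen))
    return SelectionPlan(selected, sum(block_sizes[b] for b in selected))


def accumulate_selected(
    rows: list[dict[int, list[float]]], plan: SelectionPlan
) -> list[float]:
    """Sum the selected blocks of every row into one compact vector."""
    # Unpack the plan into primitives first; the loop never touches ``plan``.
    block_ids = plan.selected_block_ids
    out = [0.0] * plan.compact_size
    for row in rows:
        offset = 0
        for block_id in block_ids:
            values = row[block_id]
            for i, value in enumerate(values):
                out[offset + i] += value
            offset += len(values)
    return out


block_sizes = {0: 2, 1: 1, 2: 2}
plan = plan_selection(block_sizes, frozen={1})  # block 1 is frozen
rows = [
    {0: [1.0, 2.0], 1: [9.0], 2: [3.0, 4.0]},
    {0: [0.5, 0.5], 1: [9.0], 2: [1.0, 1.0]},
]
print(accumulate_selected(rows, plan))  # [1.5, 2.5, 4.0, 5.0]
```

The point of the shape is that attribute lookups and selector resolution happen once, outside the loop, so the loop body works only on local names and concrete values.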
## Intentional Duplication

Small duplicated loops or validation branches are acceptable when they keep a hot path direct, avoid a tensor allocation, or preserve TorchScript and native dispatch constraints. Before removing duplication in a hot path, identify the benchmark or speed-gate that protects it and compare behavior before and after the change.

Prefer shared helpers for metadata hashing, public contract validation, docs, configuration parsing, and non-hot-path orchestration. Prefer local specialized code for kernel-adjacent tensor layout and high-frequency matrix operations.

## Known Technical Debt

The least-squares package still reaches into selected term-private assembly helpers for optimized design-matrix construction. That dependency is deliberate for now because moving it would touch performance-sensitive fitting paths. Treat it as technical debt to document and measure before refactoring, not as a cleanup candidate for onboarding-only passes.
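
Measuring before refactoring, as asked for here and under Intentional Duplication, can be sketched with the standard library. This is a minimal sketch under stated assumptions: `specialized_dot` and `shared_dot` are hypothetical stand-ins for a duplicated hot-path loop and its deduplicated replacement, and `timeit` stands in for the project's real speed gates listed in `docs/benchmarks.md`.

```python
import timeit


def specialized_dot(a, b):
    """Hypothetical 'before': a direct local loop with no intermediate list."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def _pairwise_products(a, b):
    """Hypothetical shared helper; it allocates an intermediate list."""
    return [x * y for x, y in zip(a, b)]


def shared_dot(a, b):
    """Hypothetical 'after': the same result routed through the shared helper."""
    return sum(_pairwise_products(a, b))


a = [float(i) for i in range(1000)]
b = [float(i % 7) for i in range(1000)]

# 1. Behavior check first: timing is meaningless if the results differ.
same = specialized_dot(a, b) == shared_dot(a, b)

# 2. Timing: keep the best of several repeats to reduce scheduling noise.
before = min(timeit.repeat(lambda: specialized_dot(a, b), number=50, repeat=5))
after = min(timeit.repeat(lambda: shared_dot(a, b), number=50, repeat=5))

print(f"same result: {same}, before: {before:.4f}s, after: {after:.4f}s")
```

Which variant is faster depends on the interpreter and the workload, so the sketch prints both timings rather than asserting a winner.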