Runtime Backends and Integrations
Neighbor-List Backends
build_neighbor_list normalizes backend-specific neighbor-list data into
NeighborListData.
- ase uses ase.neighborlist and is always available with the base dependency set.
- vesin is optional and can be installed with .[vesin].
- auto prefers vesin when importable, then falls back to ASE.
- metatomic neighbor lists are consumed when running through the metatomic adapter.
ASE currently supports full lists in this package. Three-body terms also require full neighbor lists because source-centered neighbor pairs must be visible.
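The `auto` rule described above can be sketched as a small resolver. This is a minimal illustration rather than the package's own code: `resolve_neighbor_backend` and its `is_importable` parameter are hypothetical names, and the sketch assumes the optional dependency is importable as a module named `vesin`. The metatomic path is left out because those lists arrive through the adapter.

```python
import importlib.util


def _module_available(name: str) -> bool:
    """Return True when `name` can be imported, without importing it."""
    return importlib.util.find_spec(name) is not None


def resolve_neighbor_backend(requested: str = "auto", is_importable=_module_available) -> str:
    """Hypothetical helper: map a requested backend name to a concrete one."""
    if requested == "auto":
        # Prefer vesin when it is importable, otherwise fall back to ASE.
        return "vesin" if is_importable("vesin") else "ase"
    if requested not in {"ase", "vesin"}:
        raise ValueError(f"unknown neighbor-list backend: {requested!r}")
    if requested == "vesin" and not is_importable("vesin"):
        # Assumption: an explicit request fails loudly instead of falling back.
        raise ImportError("vesin is not installed; install the optional [vesin] extra")
    return requested
```

Passing the importability check in as a parameter keeps the rule easy to test without installing or uninstalling vesin.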
Native Three-Body Kernels
The default installation is pure Python and Torch. Native C++ and CUDA extensions are optional and are built only when requested:
UFP_BUILD_NATIVE=1 python setup.py build_ext --inplace
UFP_BUILD_NATIVE=1 UFP_BUILD_CUDA=1 python setup.py build_ext --inplace
At runtime, high-level helpers use auto backend selection and fall back to
Torch when native kernels are unavailable. Set environment variables when you
need explicit control:
- UFP_THREEBODY_BACKEND controls dynamic three-body evaluation.
- UFP_THREEBODY_BUCKET_BACKEND controls source bucketing.
- UFP_THREEBODY_LSTSQ_BACKEND controls least-squares assembly.
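To make the three settings concrete, the sketch below collects them from an environment mapping and treats an unset variable as `auto`. The variable names come from the list above; `read_threebody_backends`, the stage labels, and the whitespace/case normalization are assumptions made for illustration, not the package's own parsing.

```python
import os

# Variable names are taken from the documentation; the stage labels are
# hypothetical keys used only in this sketch.
_BACKEND_VARS = {
    "evaluation": "UFP_THREEBODY_BACKEND",
    "bucketing": "UFP_THREEBODY_BUCKET_BACKEND",
    "lstsq": "UFP_THREEBODY_LSTSQ_BACKEND",
}


def read_threebody_backends(environ=None) -> dict:
    """Hypothetical helper: return the requested backend for each stage."""
    env = os.environ if environ is None else environ
    return {
        # Assumption: unset or empty values mean "auto"; names are lowercased.
        stage: env.get(var, "auto").strip().lower() or "auto"
        for stage, var in _BACKEND_VARS.items()
    }


# Example: force the Torch evaluator and leave the other stages on auto.
settings = read_threebody_backends({"UFP_THREEBODY_BACKEND": "torch"})
```

Accepting a mapping instead of always reading `os.environ` lets the same helper be exercised in tests without mutating the process environment.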
Three-Body Runtime Map
Three-body execution deliberately keeps several specialized paths instead of a single generic visitor. The specialization keeps dispatch and cache decisions outside the innermost spline and matrix loops.
UFPInput with full neighbor list
  -> SplineThreeBodyTerm._bucket_triplets()
  -> preprocess_sources_native_or_torch()
  -> Buckets plus optional tensor pattern plans
  -> dynamic evaluation, feature-cache construction, or least-squares assembly
Source bucketing starts from supported center-neighbor pairs, then groups rows by
source atom and neighbor-category pattern. UFP_THREEBODY_BUCKET_BACKEND accepts
auto, native, python, or tensor; torch is accepted as an alias for
python. The native source-preprocessing backend is CPU-only. In auto, CUDA
inputs use the Python/Torch path for bucketing, while CPU inputs use native
preprocessing only when the extension and dtype/device contract are available.
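The bucket-backend rules in this paragraph can be written down as a short resolver. This is a sketch under stated assumptions: `resolve_bucket_backend` is a hypothetical function, `native_ready` stands in for the extension and dtype/device contract check, and raising on an explicit `native` request that cannot be honored is an assumption, since the text only states that native preprocessing is CPU-only.

```python
_ACCEPTED = {"auto", "native", "python", "tensor"}


def resolve_bucket_backend(requested: str, device_type: str, native_ready: bool) -> str:
    """Hypothetical resolver for the source-bucketing backend.

    `device_type` is "cpu" or "cuda"; `native_ready` summarizes whether the
    native extension and its dtype/device contract are available.
    """
    name = requested.strip().lower()
    if name == "torch":
        name = "python"  # documented alias
    if name not in _ACCEPTED:
        raise ValueError(f"unsupported bucket backend: {requested!r}")
    if name == "auto":
        # Native preprocessing is CPU-only, so CUDA inputs stay on Python/Torch.
        if device_type == "cpu" and native_ready:
            return "native"
        return "python"
    if name == "native" and (device_type != "cpu" or not native_ready):
        # Assumption: an explicit native request fails rather than falling back.
        raise RuntimeError("native source preprocessing needs a CPU input and the extension")
    return name
```

For example, `resolve_bucket_backend("auto", "cuda", True)` resolves to the Python/Torch path even though the extension is present, which mirrors the CPU-only constraint.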
Dynamic energy/force evaluation is selected by UFP_THREEBODY_BACKEND or the
resolved ThreeBodyRuntimeConfig. auto tries the native operator when the
optional extension is registered, the spline and dtype are supported, the device
has a kernel, and the tensors are not participating in autograd. Otherwise it
falls back to the Torch evaluator. Explicit native raises when those contracts
are not met; explicit torch bypasses native dispatch.
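The `auto`/`native`/`torch` rules for dynamic evaluation amount to checking a handful of conditions before dispatch. The sketch below models them with a plain dataclass; `NativeContract`, its field names, and `select_evaluator` are hypothetical and only mirror the conditions named in the text, not the real `ThreeBodyRuntimeConfig`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NativeContract:
    """Hypothetical record of the conditions that gate native dispatch."""
    extension_registered: bool
    spline_supported: bool
    dtype_supported: bool
    device_has_kernel: bool
    requires_autograd: bool

    def unmet(self) -> list:
        """Return the names of the conditions that block native dispatch."""
        checks = {
            "extension_registered": self.extension_registered,
            "spline_supported": self.spline_supported,
            "dtype_supported": self.dtype_supported,
            "device_has_kernel": self.device_has_kernel,
            "no_autograd": not self.requires_autograd,
        }
        return [name for name, ok in checks.items() if not ok]


def select_evaluator(requested: str, contract: NativeContract) -> str:
    """Return "native" or "torch" following the auto/native/torch rules."""
    if requested == "torch":
        return "torch"  # explicit torch bypasses native dispatch
    unmet = contract.unmet()
    if requested == "native":
        if unmet:
            raise RuntimeError("native backend unavailable: " + ", ".join(unmet))
        return "native"
    if requested == "auto":
        return "torch" if unmet else "native"
    raise ValueError(f"unknown three-body backend: {requested!r}")
```

Listing the unmet conditions by name makes the error from an explicit `native` request say which part of the contract failed, for instance `no_autograd` when the tensors participate in autograd.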
Feature caches use the same bucket representation but have their own storage
policy. feature_cache_mode="auto" loads a compatible disk cache when one is
available and builds one otherwise. read requires a compatible disk cache and
raises if none is found. refresh rebuilds and overwrites the settings-named
cache entry. CPU feature caches are held as dense Torch blocks; disk feature
caches are loaded as .npy memmaps through the V2 manifest format.
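The three `feature_cache_mode` values reduce to a choice between loading and building. A minimal sketch of that decision follows; `plan_feature_cache` is a hypothetical helper, and the exception type raised for `read` is an assumption, since the text only says that it raises.

```python
def plan_feature_cache(mode: str, compatible_on_disk: bool) -> str:
    """Hypothetical planner: return "load" or "build" for a feature cache."""
    if mode == "auto":
        # Reuse a compatible disk cache when present, otherwise build one.
        return "load" if compatible_on_disk else "build"
    if mode == "read":
        if not compatible_on_disk:
            # Assumption: the real error type may differ.
            raise FileNotFoundError("no compatible feature cache on disk")
        return "load"
    if mode == "refresh":
        # Always rebuild and overwrite the settings-named cache entry.
        return "build"
    raise ValueError(f"unknown feature_cache_mode: {mode!r}")
```

Under these rules `refresh` ignores whatever is on disk, while `read` is the only mode that can fail because of a missing cache.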
Dense and memmap cache compatibility is checked in ufp.terms._threebody_cache
against metadata that includes the input geometry signature, atomic/triplet
categories, coefficient shape, active triplets, spline support parameters,
row semantics, and cache schema version. A cache with a superset of active
triplets can satisfy a narrower request when the remaining metadata matches.
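The superset rule can be illustrated with plain dictionaries. This is a sketch, not the logic in `ufp.terms._threebody_cache`: `cache_satisfies` and the metadata keys used in the example (`schema_version`, `geometry_signature`, `active_triplets`) are hypothetical stand-ins for the fields listed above.

```python
def cache_satisfies(cached: dict, requested: dict) -> bool:
    """Hypothetical check: can a cache with `cached` metadata serve `requested`?"""
    # The cache may hold more active triplets than the request needs.
    if not set(requested["active_triplets"]) <= set(cached["active_triplets"]):
        return False
    # Every other metadata field must be present on both sides and equal.
    keys = (set(cached) | set(requested)) - {"active_triplets"}
    return all(
        key in cached and key in requested and cached[key] == requested[key]
        for key in keys
    )


# Example metadata with hypothetical field names.
cached_meta = {
    "schema_version": 2,
    "geometry_signature": "abc123",
    "active_triplets": [("W", "W", "W"), ("W", "W", "O")],
}
narrow_request = dict(cached_meta, active_triplets=[("W", "W", "W")])
```

Here `cache_satisfies(cached_meta, narrow_request)` holds because the request asks for a subset of the cached triplets, while changing any other field, such as the schema version, makes the cache incompatible.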
Native availability and fallback checks live in ufp.terms._threebody_kernels.
They test operator registration, device kernel availability, spline family,
dtype, autograd requirements, active-mask placement, and CPU-only constraints
for source preprocessing and dense cache construction. These checks happen at
dispatch boundaries before the hot tensor loops or native calls.
Least-squares assembly is selected separately by UFP_THREEBODY_LSTSQ_BACKEND
or LinearFitter(threebody_lstsq_backend=...). auto can use the native/CUDA
assembly operator when supported and otherwise uses Torch assembly. The cache
metadata written by LinearFitter records both the least-squares assembly
backend and the bucket backend so assembled-batch caches are invalidated when
backend choices change.
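One way to obtain this invalidation behavior is to fold both backend names into the key that identifies an assembled-batch cache. The sketch below shows the idea with a hash over JSON-serialized settings; it is an assumption about how such a key could be built, not a description of what `LinearFitter` writes, and `assembled_batch_cache_key` is a hypothetical name.

```python
import hashlib
import json


def assembled_batch_cache_key(settings: dict, lstsq_backend: str, bucket_backend: str) -> str:
    """Hypothetical key: hash fit settings together with both backend names."""
    payload = dict(settings, lstsq_backend=lstsq_backend, bucket_backend=bucket_backend)
    # sort_keys makes the serialization, and therefore the digest, deterministic.
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


# Changing either backend yields a different key, so stale entries are not reused.
key_torch = assembled_batch_cache_key({"cutoff": 5.0}, "torch", "python")
key_native = assembled_batch_cache_key({"cutoff": 5.0}, "native", "python")
```

The `cutoff` setting in the example is illustrative only; any JSON-serializable settings dictionary works the same way.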
ASE
UFPASECalculator exposes UFP models through the ASE calculator interface. It
is the simplest integration path for prediction, relaxation, and example
workflows.
Metatomic
wrap_atomistic_model and UFPMetatomicModule convert between metatomic
systems, metatensor outputs, and UFP tensor inputs. Optional imports are guarded
so the base package can import without metatomic installed.
This adapter is a Python integration path for prototyping, torch-sim use, and tests. Production LAMMPS export uses the dedicated UF2+3 exporter instead:
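from ufp.adapters.metatomic_export import export_uf23_checkpoint

# build_model_architecture is a user-supplied callable that rebuilds the
# model architecture before the checkpoint weights are loaded.
export_uf23_checkpoint(
    "best.pt",
    model_factory=build_model_architecture,
    output_path="exported-uf23.pt",
    collect_extensions="torch-extensions",
)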
The checkpoint must contain a state_dict or model_state_dict; the model
architecture is rebuilt by model_factory before loading weights. Legacy
single-element workflow checkpoints with onebody_energy are handled when the
factory returns either a compatible one-body term or a single-element interaction
model.
Install optional metatomic dependencies with ufp[metatomic]. For CUDA UF2+3
production runs with three-body terms, build the native extension with:
UFP_BUILD_NATIVE=1 UFP_BUILD_CUDA=1 python setup.py build_ext --inplace
The first LAMMPS target is NVE/NVT molecular dynamics with total energy and
direct non_conservative_forces. Direct stress/virial output is not implemented
yet, so production NPT workflows should wait for validated stress support.
Single-element LAMMPS usage maps the atom type to the exported atomic number:
pair_style metatomic exported-uf23.pt device cuda extensions ./torch-extensions non_conservative on
pair_coeff * * 74
For multi-element exports, list one atomic number per LAMMPS atom type in the same order used by the data file:
pair_coeff * * 6 8
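Because the order of atomic numbers must follow the LAMMPS atom types, it can help to generate the line from the ordered element list instead of typing it by hand. The helper below is a hypothetical convenience, not part of UFP or LAMMPS, and its element table is deliberately small.

```python
# Small illustrative table; extend it or use a full periodic-table lookup.
ATOMIC_NUMBERS = {"H": 1, "C": 6, "N": 7, "O": 8, "Si": 14, "W": 74}


def pair_coeff_line(type_symbols) -> str:
    """Build a pair_coeff line from element symbols ordered by LAMMPS atom type."""
    if not type_symbols:
        raise ValueError("at least one atom type is required")
    # Atom type 1 maps to the first symbol, type 2 to the second, and so on.
    numbers = [str(ATOMIC_NUMBERS[symbol]) for symbol in type_symbols]
    return "pair_coeff * * " + " ".join(numbers)


# Reproduces the two examples above.
single = pair_coeff_line(["W"])
multi = pair_coeff_line(["C", "O"])
```

Swapping the list to `["O", "C"]` would silently assign the wrong species to each atom type, so the list should be read from the same source as the data file.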
Torch-Sim
build_torchsim_model prefers the metatomic-backed torch-sim path when
metatomic-torchsim is installed. An ASE-backed fallback is available for
debugging and compatibility, but it should not be treated as the high-performance
or fully differentiable path for large simulations.