Development¶
For the full contributor guide, see
CONTRIBUTING.md.
This page focuses on setup, validation, and documentation commands.
For implementation guidance on new interaction terms, see Adding Model Terms. That page covers input requirements, state plumbing, least-squares parameter blocks, cache descriptors, workflow schema codecs, and the tests expected for new term families.
First Day Setup¶
Install the package from the repository root with the development and docs extras:
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e ".[all,dev,docs]"
For narrower work, install only the extras you need:
python -m pip install -e .
python -m pip install -e ".[dev]"
python -m pip install -e ".[docs]"
Optional engine integrations and native extensions are intentionally opt-in.
Base imports should work without vesin, metatomic, metatomic-torchsim, or
native ufp._C.
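One way to see which optional pieces are present in an environment is to probe for them without importing them eagerly. The sketch below is illustrative only: `is_available` is a hypothetical helper, and it assumes the optional packages are importable as `vesin` and `metatomic`. The import name for metatomic-torchsim is not stated on this page, so it is left out.

```python
import importlib.util


def is_available(module_name: str) -> bool:
    """Return True if `module_name` can be located without raising."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A dotted name such as "ufp._C" imports its parent package first;
        # a missing or broken parent is treated as "not available".
        return False


# Assumed import names for the optional pieces named above.
OPTIONAL_MODULES = ("vesin", "metatomic", "ufp._C")

if __name__ == "__main__":
    for name in OPTIONAL_MODULES:
        status = "available" if is_available(name) else "missing (optional)"
        print(f"{name}: {status}")
```

A missing entry here is not an error: the base install is expected to import without any of these modules.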
Validation Commands¶
Run the main test suite:
python -m pytest
Run a deterministic CPU-only subset when local CUDA drivers are visible but not compatible with the installed PyTorch build:
tox -e tests-cpu -- tests/workflows tests/projection tests/coefficients
The tests-cpu tox environment hides CUDA with CUDA_VISIBLE_DEVICES=-1 and
filters PyTorch CUDA-driver initialization warnings that are irrelevant to
CPU-only validation. This is the recommended first validation target for
workflow, projection, coefficient, docs, and example refactors. CUDA/native
coverage remains in the normal test and speed paths.
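When debugging outside tox, the CUDA-hiding part of tests-cpu can be approximated by hand. The following is a minimal sketch, not the project's tox configuration: `cpu_only_env` and `run_cpu_subset` are hypothetical helpers, and the warning filters that tests-cpu applies are not reproduced here.

```python
import os
import subprocess
import sys


def cpu_only_env(base=None):
    """Return a copy of the environment with CUDA devices hidden."""
    env = dict(os.environ if base is None else base)
    # Same variable and value the tests-cpu environment is described as using.
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


def run_cpu_subset(paths):
    """Run pytest on the given paths in a child process with CUDA hidden."""
    cmd = [sys.executable, "-m", "pytest", *paths]
    return subprocess.run(cmd, env=cpu_only_env(), check=False).returncode
```

For example, `run_cpu_subset(["tests/workflows", "tests/projection"])` would return pytest's exit code for that subset. Setting the variable in a child process rather than in the current one avoids depending on whether PyTorch has already initialized CUDA.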
Run targeted speed gates and benchmark smoke tests:
python -m pytest tests/speed
tox -e speed
Run docstring doctests:
tox -e docs-tests
Build documentation:
tox -e docs
Run lint and formatting checks:
tox -e lint
Run static type checks:
tox -e type
Type-Check Baseline¶
tox -e type is expected to pass. The mypy baseline is gradual: post-init
tensor-normalization internals plus dynamic term, least-squares, adapter, and
benchmark modules are explicitly baselined in pyproject.toml while the rest
of ufp is checked.
The current ignore_errors debt covers:
ufp.adapters.*
ufp.benchmarks.*
ufp.core._execution
ufp.core.input
ufp.core.output
ufp.leastsquares.*
ufp.neighbors._data
ufp.terms.*
ufp.training.batch
New public APIs should be typed precisely even when their implementation lives near a baselined module. Remove this debt one module or subpackage at a time, with focused tests and without broad rewrites in runtime hot paths.
Apply formatting and safe Ruff fixes:
tox -e format
Generate or check local notebooks from the example scripts:
python examples/generate_notebooks.py
python examples/generate_notebooks.py --list
python examples/generate_notebooks.py --check
Edit .py example sources first. Generated .ipynb notebooks are local
presentation artifacts and are ignored by git.
Generated Artifacts¶
The following files are normally generated or local-only and should not be treated as source in reviews:
example notebooks generated from tracked .py scripts;
example datasets, checkpoints, predictions, plots, cache arrays, and cache manifests;
Sphinx build output;
optional native extension build output.
Small benchmark .xyz files and split metadata under examples are tracked and
should remain stable unless a change intentionally updates the example data
contract.
Repository Audits¶
Use git ls-files for repository-wide source audits, ownership reviews, and
mechanical scans that should reflect tracked project surface:
git ls-files
git ls-files '*.py'
Ignored generated notebooks, notebook checkpoints, backup directories, local
planning notes, caches, checkpoints, and large experiment outputs may exist in a
developer workspace. They are not repository surface unless they are tracked by
git. Avoid broad find . scans for review decisions unless ignored local files
are intentionally part of the question.
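For scripted audits, the same tracked-file view can be obtained from Python. This is a minimal sketch with hypothetical helper names (`parse_ls_files`, `tracked_files`). It relies only on `git ls-files -z`, which separates paths with NUL bytes so that unusual file names survive parsing.

```python
import subprocess


def parse_ls_files(raw: bytes) -> list[str]:
    """Split NUL-separated `git ls-files -z` output into path strings."""
    return [
        part.decode("utf-8", errors="surrogateescape")
        for part in raw.split(b"\0")
        if part  # drop the empty piece after the trailing NUL
    ]


def tracked_files(*patterns: str) -> list[str]:
    """Return tracked paths matching the given pathspecs (all if none given)."""
    cmd = ["git", "ls-files", "-z", "--", *patterns]
    result = subprocess.run(cmd, capture_output=True, check=True)
    return parse_ls_files(result.stdout)
```

Calling `tracked_files("*.py")` from the repository root would mirror `git ls-files '*.py'`, so ignored notebooks and local caches never enter the scan.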
Speed Gates¶
Performance-sensitive code includes pair evaluation, three-body bucketing and
feature caches, least-squares assembly, block-matrix matvecs, and cached
training batches. Keep refactors in those areas direct, avoid added tensor
materialization or diagnostic-only runtime checks, and run the matching speed
gate or benchmark smoke test from docs/benchmarks.md before changing
behavior.
Docs, examples, workflow orchestration, checkpoint serialization, and cache publishing utilities are better candidates for cleanup when the goal is team onboarding rather than numerical behavior changes.
Documentation Style¶
Use Markdown for narrative documentation. Sphinx is configured through MyST, so Markdown pages can include math, admonitions, and autodoc directives.
Python docstrings should use Google style compatible with Sphinx Napoleon:
concise summary line;
Args, Returns, Raises, Yields, and Attributes sections when useful;
examples only when they clarify nontrivial behavior;
no self or cls parameter documentation.
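The conventions above might look like this in practice. The function `rescale` is hypothetical and exists only to show the docstring layout; it is not part of ufp.

```python
import math


def rescale(values, factor=1.0):
    """Multiply each value by a constant factor.

    Args:
        values: Iterable of numbers to rescale.
        factor: Multiplier applied to every element.

    Returns:
        A list with one rescaled number per input element.

    Raises:
        ValueError: If ``factor`` is not finite.
    """
    if not math.isfinite(factor):
        raise ValueError(f"factor must be finite, got {factor!r}")
    return [value * factor for value in values]
```

The summary fits on one line, each section appears only because it carries information, and no example is included because the behavior is simple.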
Keep technical documentation factual and close to implementation. When a detail depends on benchmarks, optional native kernels, or external engines, state the scope and fallback behavior.