# Development

For the full contributor guide, see
[`CONTRIBUTING.md`](https://github.com/uf3c/ufp/blob/main/CONTRIBUTING.md).
This page focuses on setup, validation, and documentation commands.

For implementation guidance on new interaction terms, see
[Adding Model Terms](adding-terms.md). That page covers input requirements,
state plumbing, least-squares parameter blocks, cache descriptors, workflow
schema codecs, and the tests expected for new term families.

## First Day Setup

Install the package from the repository root with the development and docs
extras:

```sh
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e ".[all,dev,docs]"
```

For narrower work, install only the extras you need:

```sh
python -m pip install -e .
python -m pip install -e ".[dev]"
python -m pip install -e ".[docs]"
```

Optional engine integrations and native extensions are intentionally opt-in.
Base imports should work without `vesin`, `metatomic`, `metatomic-torchsim`, or
native `ufp._C`.

## Validation Commands

Run the main test suite:

```sh
python -m pytest
```

Run a deterministic CPU-only subset when local CUDA drivers are visible but not
compatible with the installed PyTorch build:

```sh
tox -e tests-cpu -- tests/workflows tests/projection tests/coefficients
```

The `tests-cpu` tox environment hides CUDA with `CUDA_VISIBLE_DEVICES=-1` and
filters PyTorch CUDA-driver initialization warnings that are irrelevant to
CPU-only validation. This is the recommended first validation target for
workflow, projection, coefficient, docs, and example refactors. CUDA/native
coverage remains in the normal test and speed paths.
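
To make the CUDA-hiding idea concrete, the following is a minimal sketch of running a command with `CUDA_VISIBLE_DEVICES=-1` from Python. It is not how the `tests-cpu` tox environment is implemented (that configuration lives in the repository's tox settings), and it does not reproduce the warning filters. The helper names `cpu_only_env` and `run_cpu_only` are hypothetical and exist only for illustration.

```python
import os
import subprocess
import sys


def cpu_only_env(base=None):
    """Return a copy of an environment mapping with CUDA devices hidden.

    Hypothetical helper: mirrors only the CUDA_VISIBLE_DEVICES=-1 setting
    described for the tests-cpu environment, nothing else.
    """
    env = dict(os.environ if base is None else base)
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


def run_cpu_only(args):
    """Run a command with CUDA hidden and return its exit code."""
    completed = subprocess.run(args, env=cpu_only_env(), check=False)
    return completed.returncode
```

For example, `run_cpu_only([sys.executable, "-m", "pytest", "tests/workflows"])` would start pytest in a child process that sees no CUDA devices. The tox environment remains the supported entry point.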

Run targeted speed gates and benchmark smoke tests:

```sh
python -m pytest tests/speed
tox -e speed
```

Run docstring doctests:

```sh
tox -e docs-tests
```

Build documentation:

```sh
tox -e docs
```

Run lint and formatting checks:

```sh
tox -e lint
```

Run static type checks:

```sh
tox -e type
```

## Type-Check Baseline

`tox -e type` is expected to pass. The mypy baseline is gradual: post-init
tensor-normalization internals plus dynamic term, least-squares, adapter, and
benchmark modules are explicitly baselined in `pyproject.toml` while the rest of
`ufp` is checked.

The current `ignore_errors` debt covers:

- `ufp.adapters.*`
- `ufp.benchmarks.*`
- `ufp.core._execution`
- `ufp.core.input`
- `ufp.core.output`
- `ufp.leastsquares.*`
- `ufp.neighbors._data`
- `ufp.terms.*`
- `ufp.training.batch`

New public APIs should be typed precisely even when their implementation lives
near a baselined module. Remove this debt one module or subpackage at a time,
with focused tests and without broad rewrites in runtime hot paths.

Apply formatting and safe Ruff fixes:

```sh
tox -e format
```

Generate or check local notebooks from the example scripts:

```sh
python examples/generate_notebooks.py
python examples/generate_notebooks.py --list
python examples/generate_notebooks.py --check
```

Edit `.py` example sources first. Generated `.ipynb` notebooks are local
presentation artifacts and are ignored by git.

## Generated Artifacts

The following files are normally generated or local-only and should not be
treated as source in reviews:

- example notebooks generated from tracked `.py` scripts;
- example datasets, checkpoints, predictions, plots, cache arrays, and cache
  manifests;
- Sphinx build output;
- optional native extension build output.

Small benchmark `.xyz` files and split metadata under examples are tracked and
should remain stable unless a change intentionally updates the example data
contract.

## Repository Audits

Use `git ls-files` for repository-wide source audits, ownership reviews, and
mechanical scans that should reflect tracked project surface:

```sh
git ls-files
git ls-files '*.py'
```

Ignored generated notebooks, notebook checkpoints, backup directories, local
planning notes, caches, checkpoints, and large experiment outputs may exist in a
developer workspace. They are not repository surface unless they are tracked by
git. Avoid broad `find .` scans for review decisions unless ignored local files
are intentionally part of the question.

## Speed Gates

Performance-sensitive code includes pair evaluation, three-body bucketing and
feature caches, least-squares assembly, block-matrix matvecs, and cached
training batches. Keep refactors in those areas direct, avoid added tensor
materialization or diagnostic-only runtime checks, and run the matching speed
gate or benchmark smoke test from `docs/benchmarks.md` before changing behavior.

Docs, examples, workflow orchestration, checkpoint serialization, and cache
publishing utilities are better candidates for cleanup when the goal is team
onboarding rather than numerical behavior changes.

## Documentation Style

Use Markdown for narrative documentation. Sphinx is configured through MyST, so
Markdown pages can include math, admonitions, and autodoc directives.

Python docstrings should use Google style compatible with Sphinx Napoleon:

- concise summary line;
- `Args`, `Returns`, `Raises`, `Yields`, and `Attributes` sections when useful;
- examples only when they clarify nontrivial behavior;
- no `self` or `cls` parameter documentation.

Keep technical documentation factual and close to implementation. When a detail
depends on benchmarks, optional native kernels, or external engines, state the
scope and fallback behavior.
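
As an illustration of the docstring conventions listed in this section, here is a small sketch in Google style. The function `scale_values` and the class `RunningMean` are hypothetical and are not part of `ufp`. They only show a one-line summary, the `Args`, `Returns`, `Raises`, and `Attributes` sections, and the absence of `self` documentation.

```python
def scale_values(values, factor=1.0):
    """Scale each value by a constant factor.

    Args:
        values: Non-empty sequence of numbers to scale.
        factor: Multiplier applied to every element.

    Returns:
        A new list with each input multiplied by ``factor``.

    Raises:
        ValueError: If ``values`` is empty.
    """
    if not values:
        raise ValueError("values must not be empty")
    return [value * factor for value in values]


class RunningMean:
    """Accumulate a running mean over scalar samples.

    Attributes:
        count: Number of samples seen so far.
        total: Sum of all samples seen so far.
    """

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, sample):
        """Add one sample and return the updated mean.

        Args:
            sample: Scalar value to include in the mean.

        Returns:
            The mean of all samples seen so far.
        """
        # Note that ``self`` is deliberately not listed under Args.
        self.count += 1
        self.total += sample
        return self.total / self.count
```

No `Examples` section is included because the behavior here is trivial, which matches the guidance to add examples only when they clarify nontrivial behavior.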