# Development

For the full contributor guide, see
[`CONTRIBUTING.md`](https://github.com/uf3c/ufp/blob/main/CONTRIBUTING.md).
This page focuses on setup, validation, and documentation commands.

For implementation guidance on new interaction terms, see
[Adding Model Terms](adding-terms.md). That page covers input requirements,
state plumbing, least-squares parameter blocks, cache descriptors, workflow
schema codecs, and the tests expected for new term families.

## First Day Setup

Install the package in editable mode from the repository root with the `all`,
`dev`, and `docs` extras:

```sh
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e ".[all,dev,docs]"
```

For narrower work, install only the extras you need:

```sh
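# Base package only, no optional extras
python -m pip install -e .
# Base package plus the development extras
python -m pip install -e ".[dev]"
# Base package plus the documentation extras
python -m pip install -e ".[docs]"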
```

Optional engine integrations and native extensions are intentionally opt-in.
Base imports should work without `vesin`, `metatomic`, `metatomic-torchsim`, or
the native `ufp._C` extension.
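To see which optional pieces are present in a local environment, a small probe along these lines may help. It is a sketch, not a project utility: `has_module` and `optional_status` are hypothetical helper names, and `metatomic-torchsim` is left out because its import name is not stated on this page.

```python
import importlib.util

# Import names taken from the paragraph above; extend as needed.
OPTIONAL_MODULES = ("vesin", "metatomic", "ufp._C")


def has_module(name: str) -> bool:
    """Return True if ``name`` can be located by the import system."""
    try:
        # For dotted names, find_spec imports the parent package first.
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package (for example no ``ufp`` when probing
        # ``ufp._C``) raises instead of returning None.
        return False


def optional_status() -> dict[str, bool]:
    """Map each optional module name to whether it is available."""
    return {name: has_module(name) for name in OPTIONAL_MODULES}


if __name__ == "__main__":
    for name, available in optional_status().items():
        print(f"{name}: {'available' if available else 'missing'}")
```

A missing entry here is expected in a base install and is not an error by itself.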

## Validation Commands

Run the main test suite:

```sh
python -m pytest
```

Run a deterministic CPU-only subset when local CUDA drivers are visible but not
compatible with the installed PyTorch build:

```sh
tox -e tests-cpu -- tests/workflows tests/projection tests/coefficients
```

The `tests-cpu` tox environment hides CUDA with `CUDA_VISIBLE_DEVICES=-1` and
filters PyTorch CUDA-driver initialization warnings that are irrelevant to
CPU-only validation. This is the recommended first validation target for
workflow, projection, coefficient, docs, and example refactors. CUDA/native
coverage remains in the normal test and speed paths.
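Outside tox, the same CUDA-hiding idea can be approximated from Python. The sketch below only sets `CUDA_VISIBLE_DEVICES=-1` for a child `pytest` process; it does not reproduce the warning filters that the `tests-cpu` environment applies, and `cpu_only_env` and `run_cpu_only_tests` are hypothetical helper names.

```python
import os
import subprocess
import sys


def cpu_only_env(base=None):
    """Return a copy of the environment with CUDA devices hidden."""
    env = dict(os.environ if base is None else base)
    # The variable must be set before the child process initializes CUDA.
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


def run_cpu_only_tests(paths):
    """Run pytest on ``paths`` in a child process that cannot see CUDA."""
    cmd = [sys.executable, "-m", "pytest", *paths]
    return subprocess.run(cmd, env=cpu_only_env(), check=False).returncode
```

For example, `run_cpu_only_tests(["tests/workflows"])` returns the pytest exit code. Prefer the tox environment when it is available, since it is the configuration the project maintains.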

Run targeted speed gates and benchmark smoke tests:

```sh
python -m pytest tests/speed
tox -e speed
```

Run docstring doctests:

```sh
tox -e docs-tests
```

Build documentation:

```sh
tox -e docs
```

Run lint and formatting checks:

```sh
tox -e lint
```

Run static type checks:

```sh
tox -e type
```

## Type-Check Baseline

`tox -e type` is expected to pass. The mypy baseline is gradual: post-init
tensor-normalization internals plus dynamic term, least-squares, adapter, and
benchmark modules are explicitly baselined in `pyproject.toml` while the rest
of `ufp` is checked.

The current `ignore_errors` debt covers:

- `ufp.adapters.*`
- `ufp.benchmarks.*`
- `ufp.core._execution`
- `ufp.core.input`
- `ufp.core.output`
- `ufp.leastsquares.*`
- `ufp.neighbors._data`
- `ufp.terms.*`
- `ufp.training.batch`

New public APIs should be typed precisely even when their implementation lives
near a baselined module. Remove this debt one module or subpackage at a time,
with focused tests and without broad rewrites in runtime hot paths.

## Formatting and Notebooks

Apply formatting and safe Ruff fixes:

```sh
tox -e format
```

Generate or check local notebooks from the example scripts:

```sh
python examples/generate_notebooks.py
python examples/generate_notebooks.py --list
python examples/generate_notebooks.py --check
```

Edit `.py` example sources first. Generated `.ipynb` notebooks are local
presentation artifacts and are ignored by git.

## Generated Artifacts

The following files are normally generated or local-only and should not be
treated as source in reviews:

- example notebooks generated from tracked `.py` scripts;
- example datasets, checkpoints, predictions, plots, cache arrays, and cache
  manifests;
- Sphinx build output;
- optional native extension build output.

Small benchmark `.xyz` files and split metadata under examples are tracked and
should remain stable unless a change intentionally updates the example data
contract.

## Repository Audits

Use `git ls-files` for repository-wide source audits, ownership reviews, and
mechanical scans that should reflect tracked project surface:

```sh
git ls-files
git ls-files '*.py'
```

Ignored generated notebooks, notebook checkpoints, backup directories, local
planning notes, caches, checkpoints, and large experiment outputs may exist in a
developer workspace. They are not repository surface unless they are tracked by
git. Avoid broad `find .` scans for review decisions unless ignored local files
are intentionally part of the question.
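For audits driven from a script, the tracked file list can be read in the same way. This is a minimal sketch with hypothetical helper names (`parse_ls_files`, `tracked_files`); it uses `git ls-files -z` so that paths containing spaces or newlines are split correctly.

```python
import subprocess


def parse_ls_files(output: bytes) -> list[str]:
    """Split NUL-delimited ``git ls-files -z`` output into paths."""
    return [
        part.decode("utf-8", "surrogateescape")
        for part in output.split(b"\0")
        if part
    ]


def tracked_files(repo_root=".", pattern=None):
    """Return tracked paths, optionally limited to a pathspec such as '*.py'."""
    cmd = ["git", "ls-files", "-z"]
    if pattern is not None:
        cmd += ["--", pattern]
    result = subprocess.run(cmd, cwd=repo_root, check=True, capture_output=True)
    return parse_ls_files(result.stdout)
```

Calling `tracked_files(pattern="*.py")` from the repository root mirrors `git ls-files '*.py'` and never sees ignored local files.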

## Speed Gates

Performance-sensitive code includes pair evaluation, three-body bucketing and
feature caches, least-squares assembly, block-matrix matvecs, and cached
training batches. Keep refactors in those areas direct, avoid added tensor
materialization or diagnostic-only runtime checks, and run the matching speed
gate or benchmark smoke test from `docs/benchmarks.md` before changing
behavior.

Docs, examples, workflow orchestration, checkpoint serialization, and cache
publishing utilities are better candidates for cleanup when the goal is team
onboarding rather than numerical behavior changes.

## Documentation Style

Use Markdown for narrative documentation. Sphinx is configured through MyST, so
Markdown pages can include math, admonitions, and autodoc directives.

Python docstrings should use Google style compatible with Sphinx Napoleon:

- concise summary line;
- `Args`, `Returns`, `Raises`, `Yields`, and `Attributes` sections when useful;
- examples only when they clarify nontrivial behavior;
- no `self` or `cls` parameter documentation.
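The conventions above can be illustrated with a small docstring. `cutoff_mask` is a hypothetical function written for this page, not part of the `ufp` API.

```python
def cutoff_mask(distances: list[float], cutoff: float) -> list[bool]:
    """Flag distances that fall inside a radial cutoff.

    Args:
        distances: Pair distances, in the same length unit as ``cutoff``.
        cutoff: Exclusive upper bound on the distance. Must be positive.

    Returns:
        One boolean per input distance, ``True`` where
        ``distance < cutoff``.

    Raises:
        ValueError: If ``cutoff`` is not positive.
    """
    if cutoff <= 0:
        raise ValueError("cutoff must be positive")
    return [distance < cutoff for distance in distances]
```

The summary fits on one line, only the sections that carry information are present, and no example is included because the behavior is simple.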

Keep technical documentation factual and close to implementation. When a detail
depends on benchmarks, optional native kernels, or external engines, state the
scope and fallback behavior.