Installation
You need to have Python 3.12 or newer installed on your system. If you don’t have Python installed, we recommend installing uv.
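To confirm that the interpreter you plan to use meets the 3.12 requirement, a short check along these lines may help. This is only a sketch; `meets_minimum` is a hypothetical helper name and is not part of annbatch.

```python
import sys

MIN_VERSION = (3, 12)  # minimum Python version stated above


def meets_minimum(version=None, minimum=MIN_VERSION):
    """Return True if `version` (default: the running interpreter) is new enough.

    Hypothetical helper for illustration; not part of annbatch.
    """
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); tuples compare element by element.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    status = "OK" if meets_minimum() else "too old for annbatch"
    print(f"Python {current}: {status}")
```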
PyPI
Install the latest release of annbatch from PyPI:
pip install "annbatch[zarrs]"
Important
zarrs-python provides the performance boost that makes the sharded data produced by our
preprocessing functions practical to load from a local filesystem, so we recommend
installing the zarrs extra and using it when working with local filesystems.
Otherwise, be sure to install the [remote] extra of zarr-python so that you can use zarr.storage.ObjectStore for the best performance on remote stores.
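Installing the zarrs extra does not by itself tell zarr-python to use it. The sketch below follows the configuration call documented by zarrs-python (`codec_pipeline.path` set to `zarrs.ZarrsCodecPipeline`). Whether annbatch applies this setting for you is not covered here, so treat the explicit call as an assumption; `enable_zarrs_pipeline` is a hypothetical helper name.

```python
def enable_zarrs_pipeline():
    """Switch zarr-python to the zarrs codec pipeline when both packages are installed.

    Returns True if the pipeline was enabled, False if either package is missing.
    Hypothetical helper for illustration; not part of annbatch.
    """
    try:
        import zarr
        import zarrs  # noqa: F401  (imported so the pipeline class can be resolved)
    except ImportError:
        return False

    # Configuration key and value as documented by zarrs-python.
    zarr.config.set({"codec_pipeline.path": "zarrs.ZarrsCodecPipeline"})
    return True


if __name__ == "__main__":
    print("zarrs pipeline enabled:", enable_zarrs_pipeline())
```

Returning a boolean instead of raising lets the same script run on machines where only the remote setup is installed.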
Optional dependencies
annbatch ships several extras that you can mix and match:
| Extra | What it adds |
|---|---|
| zarrs | High-performance zarr codec pipeline via zarrs-python for local filesystems; strongly recommended. |
| torch | Yields batches as 0-copy |
| | GPU acceleration via |
| | GPU acceleration via |
cupy provides accelerated handling of the data via preload_to_gpu once it has been read off disk, and does not need to be used in conjunction with torch.
cupy is also compatible with ROCm (AMD) devices, although we do not provide an extra for installing it.
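After installing, it can be useful to see which of the optional packages are actually importable in the current environment. The following is a minimal sketch using only the standard library; it assumes the import names `zarrs`, `torch`, and `cupy`, and `available_optional_modules` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Import names of the optional packages discussed above (assumed, not read from annbatch).
OPTIONAL_MODULES = ("zarrs", "torch", "cupy")


def available_optional_modules(names=OPTIONAL_MODULES):
    """Map each top-level module name to whether it can be found, without importing it."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in available_optional_modules().items():
        print(f"{name}: {'installed' if present else 'missing'}")
```

`find_spec` only locates the module, so the check stays fast and does not initialise a GPU runtime.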
To install several extras at once:
pip install "annbatch[zarrs,torch,cupy-cuda13]"
(Replace cupy-cuda13 with the extra matching your local CUDA version.)
Important
Always quote the package specifier ("annbatch[zarrs,torch]") and do not put spaces between
the extras. Shells such as bash and zsh treat square brackets as glob patterns and split
arguments on spaces, so an unquoted annbatch[zarrs,torch], or one written as
annbatch[zarrs, torch], can fail to install.
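If you install from a script, one way to sidestep shell quoting is to build the specifier programmatically and pass it to pip as a single argument. This is a sketch under that assumption; `requirement` and `pip_command` are hypothetical helper names.

```python
import shlex
import sys


def requirement(package, extras=()):
    """Build a specifier such as 'annbatch[zarrs,torch]' with no spaces between extras."""
    cleaned = [e.strip() for e in extras if e.strip()]
    return f"{package}[{','.join(cleaned)}]" if cleaned else package


def pip_command(package, extras=()):
    """Return the argument list that installs `package` with the given extras."""
    return [sys.executable, "-m", "pip", "install", requirement(package, extras)]


if __name__ == "__main__":
    cmd = pip_command("annbatch", ["zarrs", "torch"])
    # shlex.join quotes the bracketed specifier, so the printed line is safe to paste.
    print(shlex.join(cmd))
    # To install directly without going through a shell (and its globbing):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list means no shell ever sees the square brackets, so neither globbing nor word splitting can alter the specifier.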