annbatch.ChunkSampler#
- class annbatch.ChunkSampler(chunk_size, preload_nchunks, batch_size, *, replacement=False, num_samples=None, shuffle=False, drop_last=False, mask=None, rng=None)#
Chunk-based sampler for batched data access.
Deprecated since version 0.1.0: Use RandomSampler (for shuffled access) or SequentialSampler (for ordered access) instead.
This is the monolithic sampler that powers both RandomSampler and SequentialSampler. It supports with-replacement (shuffle/no-shuffle) and without-replacement sampling.
- Parameters:
- batch_size
int Number of observations per batch.
- chunk_size
int Size of each chunk, i.e. the length of the range covered by each chunk yielded.
- mask
slice|None(default:None) A slice defining the observation range to sample from (start:stop).
- shuffle
bool(default:False) Whether to shuffle chunk and index order.
- preload_nchunks
int Number of chunks to load per iteration.
- drop_last
bool(default:False) Whether to drop the last incomplete batch.
- rng
Generator|None(default:None) Random number generator for shuffling. Note that torch.manual_seed() has no effect on reproducibility here; pass a seeded numpy.random.Generator to control randomness.
- replacement
bool(default:False) If True, draw random chunks with replacement, allowing the same observations to appear more than once.
- num_samples
int|None(default:None) Total number of observations to draw. When None (the default), it equals the size of the effective observation range. Must be positive when set and less than the number of observations to be yielded when replacement=False.
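How the parameters above could interact can be sketched in plain NumPy. This is an illustrative stand-in, not annbatch's implementation: the function name `sketch_chunk_batches` is hypothetical, the mask is assumed to have step 1, and carrying leftover indices from one preloaded group into the next is an assumption made for this sketch.

```python
import numpy as np


def sketch_chunk_batches(n_obs, chunk_size, preload_nchunks, batch_size,
                         *, shuffle=False, drop_last=False, mask=None, rng=None):
    """Hypothetical illustration of chunked batching (without replacement)."""
    rng = np.random.default_rng() if rng is None else rng
    # Resolve the mask (start:stop) against n_obs; None means the full range.
    start, stop, _ = (mask if mask is not None else slice(None)).indices(n_obs)
    # Cut the observation range into contiguous chunks of chunk_size.
    chunks = [(s, min(s + chunk_size, stop)) for s in range(start, stop, chunk_size)]
    if shuffle:
        chunks = [chunks[i] for i in rng.permutation(len(chunks))]
    leftover = np.empty(0, dtype=np.int64)
    # Take preload_nchunks chunks per iteration, then cut them into batches.
    for g in range(0, len(chunks), preload_nchunks):
        group = chunks[g:g + preload_nchunks]
        idx = np.concatenate([np.arange(a, b, dtype=np.int64) for a, b in group])
        if shuffle:
            idx = rng.permutation(idx)  # shuffle index order within the group
        idx = np.concatenate([leftover, idx])
        n_full = (len(idx) // batch_size) * batch_size
        for b in range(0, n_full, batch_size):
            yield idx[b:b + batch_size]
        leftover = idx[n_full:]  # assumption: carried into the next group
    # The final incomplete batch is kept unless drop_last is set.
    if len(leftover) > 0 and not drop_last:
        yield leftover
```

With `n_obs=10`, `chunk_size=4`, `preload_nchunks=2`, `batch_size=3` and no shuffling, this sketch yields three full batches followed by one batch of a single index; passing `drop_last=True` omits that final batch.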
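Attributes table#
batch_size | The batch size for data loading.
mask | The observation range this sampler operates on.
rng | The random number generator used by this sampler.
shuffle | Whether data is shuffled.
Methods table#
n_iters(n_obs) | Return the number of batches.
sample(n_obs) | Sample load requests given the total number of observations.
validate(n_obs) | Validate the sampler configuration against the loader’s n_obs.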
Attributes#
- ChunkSampler.batch_size#
The batch size for data loading.
- ChunkSampler.mask#
The observation range this sampler operates on.
- ChunkSampler.rng#
The random number generator used by this sampler.
- ChunkSampler.shuffle#
Whether data is shuffled.
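Because the `rng` attribute is a `numpy.random.Generator`, reproducibility comes from seeding that generator. The snippet below only demonstrates the NumPy side of this; `shuffled_order` is a hypothetical helper, and a generator built this way is what one would presumably pass as `rng=`.

```python
import numpy as np


def shuffled_order(n_chunks, rng):
    """Hypothetical helper: a shuffled chunk order drawn from the given Generator."""
    return rng.permutation(n_chunks).tolist()


# Two generators created from the same seed produce the same order.
first = shuffled_order(8, np.random.default_rng(42))
second = shuffled_order(8, np.random.default_rng(42))
print(first == second)  # True
```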
Methods#
- ChunkSampler.n_iters(n_obs)#
Return the number of batches.
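As a rough sketch of the arithmetic involved, assuming the batch count depends only on the number of observations to yield, `batch_size`, and `drop_last` (the helper name `sketch_n_iters` and this formula are assumptions, not taken from the library):

```python
import math


def sketch_n_iters(n_effective, batch_size, drop_last=False):
    """Hypothetical batch count for n_effective observations."""
    if drop_last:
        # The trailing incomplete batch is discarded.
        return n_effective // batch_size
    # The trailing incomplete batch counts as one more batch.
    return math.ceil(n_effective / batch_size)


print(sketch_n_iters(10, 3))                  # 4
print(sketch_n_iters(10, 3, drop_last=True))  # 3
```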
- ChunkSampler.sample(n_obs)#
Sample load requests given the total number of observations.
The base implementation simply calls validate() and then yields via _sample().
- Parameters:
- n_obs
int The total number of observations available.
- Yields:
LoadRequest – Load requests for batching data.
- Return type:
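The validate-then-yield structure described for `sample()` can be sketched as a small template-method pattern. Everything here is hypothetical: `SketchSampler` and `SketchLoadRequest` are stand-ins, and the real `LoadRequest` fields are not shown in this reference.

```python
from dataclasses import dataclass


@dataclass
class SketchLoadRequest:
    """Hypothetical stand-in for LoadRequest (real fields unknown here)."""
    start: int
    stop: int


class SketchSampler:
    def __init__(self, chunk_size):
        self.chunk_size = chunk_size

    def validate(self, n_obs):
        # Hypothetical check; a real sampler may validate more.
        if n_obs <= 0:
            raise ValueError("n_obs must be positive")

    def _sample(self, n_obs):
        # Subclasses would override this to define the sampling strategy.
        for start in range(0, n_obs, self.chunk_size):
            yield SketchLoadRequest(start, min(start + self.chunk_size, n_obs))

    def sample(self, n_obs):
        # Validate first, then delegate to the strategy-specific generator.
        self.validate(n_obs)
        yield from self._sample(n_obs)
```

Since `sample()` is written as a generator in this sketch, `validate()` only runs when iteration starts, so an invalid configuration surfaces on the first `next()` call, not when `sample()` is called.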
- ChunkSampler.validate(n_obs)#
Validate the sampler configuration against the loader’s n_obs.
- Parameters:
- n_obs
int The total number of observations in the loader.
- Raises:
ValueError – If the sampler configuration is invalid for the given n_obs.
- Return type:
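The kind of checks such a validation step might perform can be sketched as follows. The specific checks and the name `sketch_validate` are assumptions for illustration; only the `ValueError` on invalid configuration and the positivity requirement on `num_samples` come from the reference above.

```python
def sketch_validate(n_obs, *, batch_size, mask=None, num_samples=None):
    """Hypothetical validation of a sampler configuration against n_obs."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Resolve the mask bounds, treating missing bounds as the full range.
    start = 0 if mask is None or mask.start is None else mask.start
    stop = n_obs if mask is None or mask.stop is None else mask.stop
    if not (0 <= start < stop <= n_obs):
        raise ValueError(f"mask {start}:{stop} does not fit n_obs={n_obs}")
    if num_samples is not None and num_samples <= 0:
        raise ValueError("num_samples must be positive when set")


# A mask inside the available range passes silently.
sketch_validate(100, batch_size=8, mask=slice(10, 50))
```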