annbatch.samplers.RandomSampler#
- class annbatch.samplers.RandomSampler(chunk_size, preload_nchunks, batch_size, *, replacement=False, num_samples=None, drop_last=False, mask=None, rng=None)#
Shuffled chunk-based sampler for batched data access.
Chunks are drawn in random order. With replacement=False (the default), every observation in the range is visited exactly once per epoch, except for observations in a final incomplete batch that is dropped when drop_last=True. With replacement=True, chunks are drawn independently at random and num_samples controls the total number of observations drawn.
When the observation range is smaller than chunk_size, sampling without replacement works normally (a single smaller chunk is yielded). With replacement, this is only allowed when num_samples does not exceed the observation range.
See SequentialSampler for an ordered (non-shuffled) alternative.
- Parameters:
- batch_size
int Number of observations per batch.
- chunk_size
int Size of each chunk, i.e. the number of observations spanned by each chunk yielded.
- mask
slice | None (default: None) A slice defining the observation range to sample from (start:stop).
- preload_nchunks
int Number of chunks to load per iteration.
- drop_last
bool (default: False) Whether to drop the last incomplete batch.
- rng
Generator | None (default: None) Random number generator for shuffling. Note that torch.manual_seed() has no effect on reproducibility here; pass a seeded numpy.random.Generator to control randomness.
- replacement
bool (default: False) If True, draw random chunks with replacement, allowing the same observations to appear more than once.
- num_samples
int | None (default: None) Total number of observations to draw. When None (the default), equals the effective observation range. Must be positive when set and less than the number of observations to be yielded when replacement=False.
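The two sampling modes described above can be sketched in plain Python. This is an illustration of the documented behaviour, not annbatch's implementation: the helper name chunk_ranges is hypothetical, random.Random stands in for the numpy.random.Generator the sampler expects, and the way chunks drawn with replacement are positioned and truncated is an assumption, since the source does not specify it.

```python
import random


def chunk_ranges(start, stop, chunk_size, *, replacement=False,
                 num_samples=None, rng=None):
    """Yield (lo, hi) chunk ranges over [start, stop) in random order.

    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    n_obs = stop - start
    if not replacement:
        # Partition the range into consecutive chunks, then shuffle the
        # chunk order, so every observation is covered exactly once.
        starts = list(range(start, stop, chunk_size))
        rng.shuffle(starts)
        for lo in starts:
            yield lo, min(lo + chunk_size, stop)
        return
    # With replacement: draw chunk positions independently until the
    # requested number of observations has been produced.
    # Assumption: chunks may start at any offset and the last one is cut short.
    total = n_obs if num_samples is None else num_samples
    size = min(chunk_size, n_obs)
    drawn = 0
    while drawn < total:
        lo = rng.randrange(start, stop - size + 1)
        take = min(size, total - drawn)
        yield lo, lo + take
        drawn += take


# Same seed, same chunk order.
first = list(chunk_ranges(0, 10, 4, rng=random.Random(0)))
second = list(chunk_ranges(0, 10, 4, rng=random.Random(0)))
print(first == second)
```

Passing an explicitly seeded generator mirrors the advice for the rng parameter: reproducibility comes from the generator handed to the sampler, not from a global seed.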
Attributes table#
batch_size | The batch size for data loading.
mask | The observation range this sampler operates on.
rng | The random number generator used by this sampler.
shuffle | Whether data is shuffled.
Methods table#
Attributes#
- RandomSampler.batch_size#
The batch size for data loading.
- RandomSampler.mask#
The observation range this sampler operates on.
- RandomSampler.rng#
The random number generator used by this sampler.
- RandomSampler.shuffle#
Whether data is shuffled.
Methods#
- RandomSampler.n_iters(n_obs)#
Return the number of batches.
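As a rough guide to what a batch count depends on, the usual batching arithmetic is sketched below. The helper expected_batches is hypothetical, and the assumption that n_iters follows this arithmetic (and how mask and num_samples determine the number of observations drawn) is not confirmed by the source.

```python
def expected_batches(n_drawn, batch_size, drop_last=False):
    """Number of batches produced from n_drawn observations.

    Hypothetical helper: standard batching arithmetic, not annbatch code.
    """
    if drop_last:
        # A trailing incomplete batch is discarded.
        return n_drawn // batch_size
    # A trailing incomplete batch still counts (integer ceiling division).
    return -(-n_drawn // batch_size)


print(expected_batches(10, 4))                  # 3
print(expected_batches(10, 4, drop_last=True))  # 2
```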
- RandomSampler.sample(n_obs)#
Sample load requests given the total number of observations.
Base implementation simply calls validate() and then yields via _sample().
- Parameters:
- n_obs
int The total number of observations available.
- Yields:
LoadRequest – Load requests for batching data.
- Return type:
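The validate-then-yield pattern described for sample() can be sketched as a small template-method class. The class names SketchSampler and FixedChunks are hypothetical, plain slice objects stand in for LoadRequest (whose structure is not shown in the source), and the validation rule used here is only a placeholder.

```python
from abc import ABC, abstractmethod


class SketchSampler(ABC):
    """Hypothetical base class mirroring the validate()/_sample() split."""

    def validate(self, n_obs):
        # Placeholder rule for the sketch; real samplers check their own config.
        if n_obs <= 0:
            raise ValueError("n_obs must be positive")

    def sample(self, n_obs):
        # Validate first, then delegate the actual work to _sample().
        self.validate(n_obs)
        yield from self._sample(n_obs)

    @abstractmethod
    def _sample(self, n_obs):
        """Yield load requests; subclasses supply the strategy."""


class FixedChunks(SketchSampler):
    """Yields consecutive chunks as slices (stand-ins for LoadRequest)."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size

    def _sample(self, n_obs):
        for lo in range(0, n_obs, self.chunk_size):
            yield slice(lo, min(lo + self.chunk_size, n_obs))


print(list(FixedChunks(4).sample(10)))
```

Because sample() in this sketch is a generator function, validate() only runs once iteration starts, so an invalid configuration surfaces on the first next() call rather than when sample() is called.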
- RandomSampler.validate(n_obs)#
Validate the sampler configuration against the loader’s n_obs.
- Parameters:
- n_obs
int The total number of observations in the loader.
- Raises:
ValueError – If the sampler configuration is invalid for the given n_obs.
- Return type:
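Two of the constraints stated in the class description lend themselves to a short validation sketch: num_samples must be positive when set, and sampling with replacement from a range smaller than chunk_size is only allowed when num_samples does not exceed that range. The function check_config is hypothetical and covers only these two rules; resolving the mask with slice.indices (which clamps to n_obs) is an assumption, not the library's documented behaviour.

```python
def check_config(n_obs, chunk_size, *, mask=None, replacement=False,
                 num_samples=None):
    """Return the size of the observation range, or raise ValueError.

    Hypothetical helper covering two rules from the class description.
    """
    mask = mask if mask is not None else slice(None)
    # Assumption: the mask is resolved against n_obs like an ordinary slice.
    start, stop, _ = mask.indices(n_obs)
    n_range = max(stop - start, 0)

    if num_samples is not None and num_samples <= 0:
        raise ValueError("num_samples must be positive when set")

    if replacement and n_range < chunk_size:
        # num_samples defaults to the effective observation range.
        effective = n_range if num_samples is None else num_samples
        if effective > n_range:
            raise ValueError(
                "with replacement and a range smaller than chunk_size, "
                "num_samples must not exceed the observation range"
            )
    return n_range


print(check_config(100, 10, mask=slice(20, 50)))  # 30
```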