You are implementing a mini data-loading component for model training.
Design a ResumableDataLoader that iterates over a dataset and yields mini-batches, but can also save its state and later resume from exactly where it left off.
The dataset is indexable as dataset[0..N-1].
Batches have size B and are yielded as lists of dataset items (or indices).
Shuffling is controlled by shuffle=True/False.
Randomness is controlled by seed.
__iter__() / next() (or equivalent) to iterate batches.
state_dict() → returns a serializable object capturing everything needed to resume.
load_state_dict(state) → restores the loader to continue iteration.
After saving state mid-epoch and restoring, the sequence of items produced must be identical to an uninterrupted run.
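One possible shape for such a loader is sketched below. It is only an illustration under several assumptions that the problem statement does not fix: the state consists of seed, epoch, and cursor (hypothetical field names); the per-epoch order is rebuilt from a string seed derived from seed and epoch; the last batch of an epoch may be shorter than B; and the loader advances to the next epoch after raising StopIteration.

```python
import random


class ResumableDataLoader:
    """Sketch of a batch iterator whose position can be saved and restored."""

    def __init__(self, dataset, batch_size, shuffle=False, seed=0):
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self.dataset = dataset
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.seed = seed
        self.epoch = 0      # assumed: epochs are counted from 0
        self.cursor = 0     # number of items already yielded in this epoch
        self._order = None  # index order for the current epoch, built lazily

    def _build_order(self):
        # Only indices are stored, never a copy of the dataset items.
        order = list(range(len(self.dataset)))
        if self.shuffle:
            # The order depends only on (seed, epoch), so it can be
            # regenerated after a restore without being saved.
            random.Random(f"{self.seed}:{self.epoch}").shuffle(order)
        return order

    def __iter__(self):
        return self

    def __next__(self):
        if self._order is None:
            self._order = self._build_order()
        if self.cursor >= len(self._order):
            # End of epoch: prepare the next one, then signal exhaustion.
            # Iterating again afterwards starts the next epoch.
            self.epoch += 1
            self.cursor = 0
            self._order = None
            raise StopIteration
        indices = self._order[self.cursor:self.cursor + self.batch_size]
        self.cursor += len(indices)  # the final batch may be short
        return [self.dataset[i] for i in indices]

    def state_dict(self):
        # Plain ints only, so the state is trivially serializable.
        return {"seed": self.seed, "epoch": self.epoch, "cursor": self.cursor}

    def load_state_dict(self, state):
        self.seed = state["seed"]
        self.epoch = state["epoch"]
        self.cursor = state["cursor"]
        self._order = None  # force a rebuild from the restored seed and epoch


# Usage: interrupt after two batches, restore into a fresh loader, compare.
data = list(range(10))
uninterrupted = list(ResumableDataLoader(data, 3, shuffle=True, seed=7))

first = ResumableDataLoader(data, 3, shuffle=True, seed=7)
head = [next(first), next(first)]
saved = first.state_dict()

second = ResumableDataLoader(data, 3, shuffle=True, seed=7)
second.load_state_dict(saved)
assert head + list(second) == uninterrupted
```

Saving the cursor rather than the RNG state keeps the checkpoint small, at the cost of rebuilding the index order once after each restore.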
With shuffle=True, how do you ensure the shuffle order is reproducible across a resume?
B?
N can be large; avoid storing unnecessary full copies of the dataset.
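One way to think about the reproducibility and memory questions together is to make the shuffled order a pure function of (seed, epoch, position), so nothing of size N has to be stored or checkpointed. The sketch below swaps in an affine index map, i -> (a*i + b) mod N with gcd(a, N) = 1, which is a bijection on 0..N-1 and needs O(1) memory. This is an assumption chosen for illustration, not something the problem prescribes: the resulting order is far from a uniform random shuffle, and the helper names make_index_map and batch_indices are hypothetical.

```python
import math
import random


def make_index_map(n, seed, epoch):
    """Return f where f(0), ..., f(n-1) is a permutation of 0..n-1.

    Uses O(1) memory. The mixing is weak (affine), so this trades
    shuffle quality for memory.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Same (seed, epoch) always yields the same a and b.
    rng = random.Random(f"{seed}:{epoch}")
    a = rng.randrange(1, n) if n > 1 else 1
    # Walk to a multiplier coprime with n; a == 1 always qualifies,
    # so the loop terminates.
    while math.gcd(a, n) != 1:
        a = a + 1 if a + 1 < n else 1
    b = rng.randrange(n)
    return lambda i: (a * i + b) % n


def batch_indices(n, batch_size, seed, epoch, cursor):
    """Dataset indices for the batch starting at position `cursor`."""
    f = make_index_map(n, seed, epoch)
    stop = min(cursor + batch_size, n)  # the final batch may be short
    return [f(i) for i in range(cursor, stop)]


# Usage: resuming at cursor=6 reproduces the uninterrupted order exactly.
n = 10
f = make_index_map(n, seed=7, epoch=0)
order = [f(i) for i in range(n)]
assert sorted(order) == list(range(n))
assert batch_indices(n, 3, seed=7, epoch=0, cursor=6) == order[6:9]
```

With this layout a checkpoint only needs seed, epoch, and cursor. If a materialized index list of N integers is affordable, shuffling that list with a seeded RNG gives a better-quality order under the same checkpoint format.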