returnn.torch.data.returnn_dataset_wrapper

Wrapper for RETURNN datasets.

We make use of TorchData data pipelines.

class returnn.torch.data.returnn_dataset_wrapper.ReturnnDatasetResetDefaultEpochCounterCallback(dataset: Dataset)[source]

Default for reset_callback. Has an internal counter for the epoch, starting at epoch 1 (RETURNN convention).
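The counting behaviour described above can be sketched in plain Python. This is only an illustration, not the RETURNN implementation: ToyDataset and EpochCounterCallback are hypothetical stand-ins, and the sketch assumes the callback passes the epoch on through an init_seq_order(epoch=...) call.

```python
class ToyDataset:
    """Hypothetical stand-in for a RETURNN Dataset; remembers the last epoch it was given."""

    def __init__(self):
        self.epoch = None

    def init_seq_order(self, epoch=None):
        # Assumption: the epoch is handed to the dataset through this method.
        self.epoch = epoch


class EpochCounterCallback:
    """Illustrative reset callback that keeps its own epoch counter."""

    def __init__(self, dataset):
        self.dataset = dataset
        self.epoch = 0  # incremented before use, so the first reset yields epoch 1

    def __call__(self):
        self.epoch += 1
        self.dataset.init_seq_order(epoch=self.epoch)


dataset = ToyDataset()
callback = EpochCounterCallback(dataset)

callback()  # first reset
first_epoch = dataset.epoch

callback()  # second reset
second_epoch = dataset.epoch
```

Each call advances the counter by one, so the dataset sees epochs 1, 2, ... in order.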

class returnn.torch.data.returnn_dataset_wrapper.ReturnnDatasetResetMpSharedEpochCallback(dataset: Dataset, epoch_mp_shared: Value)[source]

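Alternative reset_callback. Takes the current epoch from epoch_mp_shared, a value shared between processes, instead of keeping an internal counter.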
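A rough sketch of how a shared-epoch callback could work, assuming a standard-library multiprocessing.Value stands in for epoch_mp_shared. ToyDataset and SharedEpochCallback are hypothetical names used only for illustration, and the init_seq_order(epoch=...) hand-off is an assumption.

```python
import multiprocessing


class ToyDataset:
    """Hypothetical stand-in for a RETURNN Dataset; remembers the last epoch it was given."""

    def __init__(self):
        self.epoch = None

    def init_seq_order(self, epoch=None):
        self.epoch = epoch


class SharedEpochCallback:
    """Illustrative reset callback that reads the epoch from a shared value."""

    def __init__(self, dataset, epoch_mp_shared):
        self.dataset = dataset
        self.epoch_mp_shared = epoch_mp_shared

    def __call__(self):
        # The epoch is owned by whoever writes the shared value (e.g. the main process).
        self.dataset.init_seq_order(epoch=self.epoch_mp_shared.value)


epoch_mp_shared = multiprocessing.Value("i", 0)  # shared integer, initially 0
dataset = ToyDataset()
callback = SharedEpochCallback(dataset, epoch_mp_shared)

epoch_mp_shared.value = 3  # the controlling process announces epoch 3
callback()
shared_epoch_seen = dataset.epoch
```

The design point is that the callback holds no epoch state of its own, so copies of it in worker processes all read the same epoch.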

class returnn.torch.data.returnn_dataset_wrapper.ReturnnDatasetIterDataPipe(returnn_dataset: Dataset, *, reset_callback: Callable[[], None] | None = None)[source]

Converts a RETURNN dataset into a PyTorch IterableDataset.

Parameters:
  • returnn_dataset – dataset to be wrapped

  • reset_callback – callback invoked whenever the dataset is reset, e.g. to initialize the epoch. Defaults to ReturnnDatasetResetDefaultEpochCounterCallback(returnn_dataset).
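The interplay between iteration and reset_callback can be sketched without PyTorch. Everything here is hypothetical: ToyDataset, its num_seqs and get_seq helpers, and IterableWrapper are stand-ins, and calling reset() at the start of each iteration is an assumption of this sketch rather than confirmed behaviour of the real class.

```python
class ToyDataset:
    """Hypothetical stand-in for a RETURNN Dataset holding a few sequences."""

    def __init__(self, seqs):
        self.seqs = list(seqs)
        self.epoch = None

    def init_seq_order(self, epoch=None):
        self.epoch = epoch

    def num_seqs(self):
        return len(self.seqs)

    def get_seq(self, seq_idx):
        return self.seqs[seq_idx]


def make_epoch_counter(dataset):
    """Builds a default reset callback that counts epochs starting at 1."""
    state = {"epoch": 0}

    def reset_callback():
        state["epoch"] += 1
        dataset.init_seq_order(epoch=state["epoch"])

    return reset_callback


class IterableWrapper:
    """Illustrative iterable-style wrapper; a real one would derive from the PyTorch class."""

    def __init__(self, returnn_dataset, *, reset_callback=None):
        self.dataset = returnn_dataset
        # Fall back to the epoch counter when no callback is given.
        self.reset_callback = reset_callback or make_epoch_counter(returnn_dataset)

    def reset(self):
        self.reset_callback()

    def __iter__(self):
        self.reset()  # assumption: every new pass over the data starts with a reset
        for seq_idx in range(self.dataset.num_seqs()):
            yield self.dataset.get_seq(seq_idx)


toy = ToyDataset(["seq-a", "seq-b", "seq-c"])
wrapper = IterableWrapper(toy)

first_pass = list(wrapper)
epoch_after_first = toy.epoch
second_pass = list(wrapper)
epoch_after_second = toy.epoch
```

Passing a different reset_callback changes only how the epoch is determined; the iteration logic stays the same.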

reset()[source]

class returnn.torch.data.returnn_dataset_wrapper.ReturnnDatasetPerEpochMapDataPipe[source]

Converts a RETURNN dataset into a PyTorch map-style Dataset.

reset()[source]

class returnn.torch.data.returnn_dataset_wrapper.ReturnnDatasetFullMapDataPipe[source]

Converts a RETURNN dataset into a PyTorch map-style Dataset. This is over the full dataset, using the default ordering. RETURNN-dataset-side sorting/shuffling is not supported here. Sorting/shuffling is intended to be done in the further PyTorch data pipeline.
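A minimal sketch of the map-style idea, written without PyTorch so it runs on its own. ToyDataset and FullMapWrapper are hypothetical stand-ins: the wrapper only exposes length and index access in the default order, and shuffling happens in the consuming code. With PyTorch, that downstream step would typically be a sampler or DataLoader(shuffle=True) rather than the manual index shuffle shown here.

```python
import random


class ToyDataset:
    """Hypothetical stand-in for a RETURNN Dataset holding a few sequences."""

    def __init__(self, seqs):
        self.seqs = list(seqs)

    def num_seqs(self):
        return len(self.seqs)

    def get_seq(self, seq_idx):
        return self.seqs[seq_idx]


class FullMapWrapper:
    """Illustrative map-style wrapper: index access over the full dataset, default order."""

    def __init__(self, returnn_dataset):
        self.dataset = returnn_dataset

    def __len__(self):
        return self.dataset.num_seqs()

    def __getitem__(self, index):
        return self.dataset.get_seq(index)


wrapper = FullMapWrapper(ToyDataset(["a", "b", "c", "d"]))

# Default ordering, exactly as stored in the dataset.
default_order = [wrapper[i] for i in range(len(wrapper))]

# Shuffling is done by the consumer, here with a seeded index permutation.
indices = list(range(len(wrapper)))
random.Random(42).shuffle(indices)
shuffled = [wrapper[i] for i in indices]
```

Because the wrapper itself never reorders anything, the same index always maps to the same sequence, which is what lets a downstream sampler control the order.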