returnn.datasets.stereo#

This module contains dataset implementations that provide an easy-to-use interface for using RETURNN for regression. Example applications are speech enhancement and mask estimation.

class returnn.datasets.stereo.StereoDataset(partition_epoch=1, **kwargs)[source]#

Base class for datasets that provide an easy-to-use interface for using RETURNN as a regression tool.

Constructor

initialize()[source]#

Does the main initialization before it can be used. This needs to be called before self.load_seqs() can be used.

property num_seqs[source]#

Returns the number of sequences of the dataset.

Return type:

int

property seqs_per_epoch[source]#

init_seq_order(epoch=None, seq_list=None, seq_order=None)[source]#

Parameters:
  • epoch (int|None) – epoch number

  • seq_list (list[str]|None) – only None is currently supported

  • seq_order (list[int]|None) –

Initialize lists:

self.seq_index # sorted seq idx

class returnn.datasets.stereo.StereoHdfDataset(hdfFile, num_outputs=None, normalizationFile=None, flag_normalizeInputs=True, flag_normalizeTargets=True, **kwargs)[source]#

A stereo dataset which reads its data from an HDF file. The HDF file must always contain the group ‘inputs’, and for training data it must also contain the group ‘outputs’. Each group contains one dataset per sequence; the dataset names are consecutive numbers starting at 0.

The datasets are 2D numpy arrays, where dimension 0 is the time axis and dimension 1 is the feature axis. Dimension 0 of each ‘inputs’ dataset must therefore match that of the corresponding ‘outputs’ dataset.
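The expected file layout can be sketched with h5py; the file name, sequence lengths, and feature dimensions below are made up for illustration:

```python
import h5py
import numpy as np

# Build a toy HDF file with the layout described above: group "inputs"
# (and, for training data, "outputs"), one 2D dataset per sequence,
# named "0", "1", ... with time on axis 0 and features on axis 1.
seq_lengths = [50, 70]            # arbitrary example sequence lengths
input_dim, output_dim = 257, 257  # arbitrary example feature dims

with h5py.File("toy_stereo.hdf", "w") as f:
    inputs = f.create_group("inputs")
    outputs = f.create_group("outputs")
    for i, T in enumerate(seq_lengths):
        # dimension 0 (time) must match between inputs and outputs
        inputs.create_dataset(str(i), data=np.random.rand(T, input_dim).astype("float32"))
        outputs.create_dataset(str(i), data=np.random.rand(T, output_dim).astype("float32"))
```

A file written this way can then be passed as hdfFile to StereoHdfDataset.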

Constructor

Parameters:
  • hdfFile (str) – path to the HDF file. If a bundle file (*.bundle) is given, all HDF files listed in it will be used for the dataset. :see: BundleFile.BundleFile

  • num_outputs (int) – needs to be set if the HDF file contains only ‘inputs’ data (e.g. for the extraction process); it is used only if no ‘outputs’ group exists in the HDF file.

  • normalizationFile (str | None) – path to an HDF file with normalization data. The file is optional: if it is not provided, no normalization is performed. :see: NormalizationData.NormalizationData

  • flag_normalizeInputs (bool) – if True, inputs will be normalized, provided that the normalization HDF file contains the necessary datasets (i.e. mean and variance)

  • flag_normalizeTargets (bool) – if True, targets will be normalized, provided that the normalization HDF file contains the necessary datasets (i.e. mean and variance)

get_data_dim(key)[source]#

This is copied from CachedDataset2 but the assertion is removed (see CachedDataset2.py)

Return type:

int

Returns:

number of classes, no matter if sparse or not

property num_seqs[source]#

Returns the number of sequences of the dataset

Return type:

int

Returns:

the number of sequences of the dataset.

class returnn.datasets.stereo.DatasetWithTimeContext(hdfFile, tau=1, **kwargs)[source]#

This dataset composes a context feature by stacking together time frames.

Constructor

Parameters:
  • hdfFile (str) – see StereoHdfDataset

  • tau (int) – how many time frames to stack on the left and on the right. E.g. if tau = 2, the context feature is created by stacking the two neighboring time frames from the left and the two from the right: newInputFeature = [x_{t-2}, x_{t-1}, x_t, x_{t+1}, x_{t+2}]. In general, the new feature has dimensionality (2 * tau + 1) * originalFeatureDimensionality. Output features are not changed.

  • kwargs (dict) – the remaining arguments, passed on to StereoHdfDataset
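The frame-stacking idea can be sketched in plain numpy. This is an illustrative re-implementation, not the actual DatasetWithTimeContext code; in particular, the zero-padding at the sequence edges is an assumption:

```python
import numpy as np

def stack_time_context(features, tau=1):
    """Stack tau left and tau right neighbor frames onto each frame.

    features: 2D array of shape (time, feature_dim).
    Returns an array of shape (time, (2 * tau + 1) * feature_dim).
    Edges are zero-padded (an assumption for this sketch).
    """
    T, D = features.shape
    padded = np.pad(features, ((tau, tau), (0, 0)), mode="constant")
    # frame t of the result is [x_{t-tau}, ..., x_t, ..., x_{t+tau}] concatenated
    return np.concatenate([padded[k:k + T] for k in range(2 * tau + 1)], axis=1)

x = np.arange(12, dtype="float32").reshape(4, 3)  # 4 frames, feature dim 3
y = stack_time_context(x, tau=2)
print(y.shape)  # (4, 15), i.e. (time, (2 * tau + 1) * feature_dim)
```

The center slot of each stacked frame holds the original frame x_t, with the tau left neighbors before it and the tau right neighbors after it.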