returnn.datasets.raw_wav

Provide RawWavDataset.

class returnn.datasets.raw_wav.RawWavDataset(listFile, frameLength, frameShift, num_outputs=None, **kwargs)

This dataset returns the raw waveform information of wav files as sequence input data. It uses temporary HDF files to buffer the data, to avoid repeatedly reading the wav files.

constructor

Parameters:
  • listFile (string) – path to a file containing a list of wav file paths, one path per line; each line must contain exactly one wav file, which is treated as one sequence

  • frameLength (int) – length of one frame in samples

  • frameShift (int) – shift length of frame in samples

  • num_outputs (int) – must be set if the dataset is used with input data only (e.g. for the extraction process)
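The parameters above can be illustrated with a minimal sketch. It writes two silent wav files with the standard-library `wave` module, builds a list file with one path per line, and assembles a dataset dict of the kind RETURNN configs typically use. The file names, the values 400 and 160, and the helper `count_full_frames` are hypothetical and only for illustration. The frame count assumes plain sliding-window framing without padding of the tail; whether RawWavDataset handles the last partial frame this way is not stated here.

```python
import os
import tempfile
import wave


def write_silent_wav(path, num_samples, sample_rate=16000):
    """Write a mono 16-bit wav file containing only zeros."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * num_samples)


def count_full_frames(num_samples, frame_length, frame_shift):
    """Hypothetical helper: number of complete windows, assuming no tail padding."""
    if num_samples < frame_length:
        return 0
    return 1 + (num_samples - frame_length) // frame_shift


tmp_dir = tempfile.mkdtemp()
wav_paths = [os.path.join(tmp_dir, "a.wav"), os.path.join(tmp_dir, "b.wav")]
write_silent_wav(wav_paths[0], num_samples=16000)
write_silent_wav(wav_paths[1], num_samples=8000)

# List file: exactly one wav file path per line, each one a sequence.
list_file = os.path.join(tmp_dir, "wav_files.list")
with open(list_file, "w") as f:
    for path in wav_paths:
        f.write(path + "\n")

# Assumed config-style dict; keys mirror the constructor parameters.
frame_length = 400  # samples per frame (illustrative value)
frame_shift = 160   # samples between frame starts (illustrative value)
dataset_config = {
    "class": "RawWavDataset",
    "listFile": list_file,
    "frameLength": frame_length,
    "frameShift": frame_shift,
}

# Frames per sequence under the no-padding assumption.
frames_per_file = []
for path in wav_paths:
    with wave.open(path, "rb") as wav_file:
        num_samples = wav_file.getnframes()
    frames_per_file.append(count_full_frames(num_samples, frame_length, frame_shift))
```

With `frameShift` smaller than `frameLength`, consecutive frames overlap, so the number of frames per sequence is governed mainly by the shift.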

get_data_dim(key)

This is copied from CachedDataset2 but the assertion is removed (see CachedDataset2.py)

Return type:

int

Returns:

number of classes, no matter if sparse or not

init_seq_order(epoch=None, seq_list=None, seq_order=None)

Parameters:
  • epoch (int|None) – epoch number

  • seq_list (list[str]|None) – only None is currently supported

  • seq_order (list[int]|None) –

Initialize lists:

self.seq_index # sorted seq idx

property num_seqs

Returns the number of sequences in the dataset.

Return type:

int