Provide RawWavDataset.

class RawWavDataset.RawWavDataset(listFile, frameLength, frameShift, num_outputs=None, **kwargs)

This dataset returns the raw waveform of wav files as sequence input data. It buffers the data in temporary HDF files to avoid repeatedly reading the wav files.


  • listFile (string) – path to a file listing wav file paths, one path per line; each line must contain exactly one wav file, which is treated as one sequence
  • frameLength (int) – length of one frame in samples
  • frameShift (int) – shift length of frame in samples
  • num_outputs (int) – must be set if the dataset is used with input data only (e.g. for the extraction process)
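
To make the roles of frameLength and frameShift concrete, the following is a minimal sketch, not the class's actual implementation. The helper names read_wav_list and frame_signal are hypothetical, and two behaviours are assumptions rather than documented facts: blank lines in the list file are skipped, and trailing samples that do not fill a whole frame are dropped.

```python
from typing import List, Sequence


def read_wav_list(text: str) -> List[str]:
    """Parse list-file contents: one wav path per line.

    Hypothetical helper; skipping blank lines is an assumption.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def frame_signal(
    samples: Sequence[float], frame_length: int, frame_shift: int
) -> List[List[float]]:
    """Cut a waveform into frames of frame_length samples, one every frame_shift samples.

    Hypothetical helper; dropping an incomplete trailing frame is an assumption.
    """
    if frame_length <= 0 or frame_shift <= 0:
        raise ValueError("frame_length and frame_shift must be positive")
    if len(samples) < frame_length:
        return []
    num_frames = 1 + (len(samples) - frame_length) // frame_shift
    return [
        list(samples[i * frame_shift : i * frame_shift + frame_length])
        for i in range(num_frames)
    ]


# Two sequences, one wav path per line.
paths = read_wav_list("a.wav\nb.wav\n")

# Ten samples, 4-sample frames, shifted by 2 samples: consecutive frames overlap by 2.
frames = frame_signal(list(range(10)), frame_length=4, frame_shift=2)
```

With a shift smaller than the frame length, consecutive frames overlap; with the two values equal, the frames tile the signal without overlap.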
get_data_dim(self, key)

This is copied from CachedDataset2, with the assertion removed (see CachedDataset2.py).

Return type: int
Returns: number of classes, no matter if sparse or not
init_seq_order(self, epoch=None, seq_list=None)
  • epoch (int|None) – epoch number
  • seq_list (list[str]|None) – predefined sequence order, in case one is wanted; only None is currently supported
Initializes the list self.seq_index (sorted sequence indices).

Return type: int
Returns: number of sequences in the dataset