CachedDataset

class CachedDataset.CachedDataset(cache_byte_size=0, **kwargs)[source]
alloc_interval_index(ids)[source]
Parameters:ids (int) – sorted seq idx

Returns:index in self.alloc_intervals
Return type:int

batch_set_generator_cache_whole_epoch()[source]
delete(nframes)[source]
Parameters:nframes (int|None) – maximum number of frames to delete. Note that this limit is not strict: we can end up deleting more than nframes.
Returns:number of frames deleted
Return type:int
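The non-strict limit can be illustrated with a minimal sketch. It assumes that frames are freed one whole sequence at a time, which is one plausible reason the total can exceed nframes; it is not the actual CachedDataset implementation. The function name delete_frames, the list cached_seq_lengths, and the treatment of nframes=None as "no limit" are all hypothetical.

```python
def delete_frames(cached_seq_lengths, nframes):
    """Evict whole sequences until at least `nframes` frames are freed.

    cached_seq_lengths: hypothetical list of per-sequence frame counts,
        evicted from the end of the list.
    nframes: soft limit; None is assumed here to mean "no limit".
    Returns the number of frames actually deleted.
    """
    deleted = 0
    while cached_seq_lengths and (nframes is None or deleted < nframes):
        # A sequence is removed as a whole, so we may overshoot nframes.
        deleted += cached_seq_lengths.pop()
    return deleted


lengths = [10, 20, 30]
freed = delete_frames(lengths, nframes=25)
```

Here the request is for 25 frames, but the last sequence holds 30, so 30 frames are freed and returned.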
get_ctc_targets(sorted_seq_idx)[source]
get_data_dim(key)[source]
get_input_data(sorted_seq_idx)[source]
get_seq_length(seq_idx)[source]
Return type:NumbersDict
get_seq_length_2d(sorted_seq_idx)[source]
Return type:(int,int)
get_seq_start(sorted_seq_idx)[source]
Return type:(int,int)
get_tag(sorted_seq_idx)[source]
get_target_list()[source]
get_targets(target, sorted_seq_idx)[source]
get_times(sorted_seq_idx)[source]
has_ctc_targets()[source]
init_seq_order(epoch=None, seq_list=None)[source]
Parameters:seq_list (list[str] | None) – In case we want to set a predefined order.
Initialize lists:
self.seq_index # sorted seq idx
initialize()[source]
insert_alloc_interval(start, end=None)[source]
is_cached(start, end)[source]
Parameters:
  • start (int) – like in load_seqs(), sorted seq idx
  • end (int) – like in load_seqs(), sorted seq idx
Returns:whether the full range [start, end) of sorted seq idx is cached in self.alloc_intervals (end is exclusive)
Return type:bool
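As a rough illustration of the half-open range check described for is_cached, the sketch below tests whether [start, end) is fully covered by a list of cached intervals. It assumes the intervals are sorted, non-overlapping (lo, hi) pairs with hi exclusive. The real self.alloc_intervals entries may carry more data, and the name is_range_cached is hypothetical.

```python
def is_range_cached(alloc_intervals, start, end):
    """Return True if every index in [start, end) is covered.

    alloc_intervals: hypothetical sorted list of non-overlapping
        (lo, hi) pairs of sorted seq idx, with hi exclusive.
    """
    pos = start  # first index not yet known to be covered
    for lo, hi in alloc_intervals:
        if pos >= end:
            break
        if lo > pos:
            return False  # gap before this interval
        if hi > pos:
            pos = hi  # interval covers pos, advance past it
    return pos >= end


intervals = [(0, 3), (3, 5), (8, 10)]
```

With these intervals, the range (1, 5) is covered by two adjacent intervals, while (4, 9) is not, because indices 5 to 7 are missing.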
load_seqs(start, end, with_cache=True)[source]

Load data sequences. As a side effect, will modify / fill-up:

  • self.alloc_intervals
  • self.targets

This does some extra logic for the cache and calls self._load_seqs() for the real loading.

Parameters:
  • start (int) – start sorted seq idx
  • end (int) – end sorted seq idx
  • with_cache (bool) – handle cache
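The split between load_seqs() (cache logic) and _load_seqs() (real loading) can be sketched with a toy class. This is only an illustration under stated assumptions: TinyCachedLoader, its cached set, and load_calls are hypothetical, and treating with_cache=False as "always call _load_seqs()" is an assumption, not documented behaviour.

```python
class TinyCachedLoader:
    """Toy stand-in for a cached dataset; not the real CachedDataset."""

    def __init__(self):
        self.cached = set()   # hypothetical: sorted seq idx already loaded
        self.load_calls = []  # records each real load, for inspection

    def _load_seqs(self, start, end):
        # Real loading would read data here; we only record the range.
        self.load_calls.append((start, end))
        self.cached.update(range(start, end))

    def is_cached(self, start, end):
        # end is exclusive, as in load_seqs()
        return all(i in self.cached for i in range(start, end))

    def load_seqs(self, start, end, with_cache=True):
        # Assumed behaviour: skip the real load on a full cache hit.
        if with_cache and self.is_cached(start, end):
            return
        self._load_seqs(start, end)


loader = TinyCachedLoader()
loader.load_seqs(0, 4)  # cache miss: triggers a real load
loader.load_seqs(1, 3)  # fully cached: no real load
loader.load_seqs(1, 3, with_cache=False)  # bypasses the cache check
```

After these three calls, load_calls holds two entries: the initial load and the one that bypassed the cache.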
num_seqs[source]
remove_alloc_interval(start, end=None)[source]