CachedDataset2

class CachedDataset2.CachedDataset2(**kwargs)[source]

Similar in purpose to CachedDataset, but simpler in some respects and more generic. Its caching might be worse than that of CachedDataset.

If you derive from this class:
  • you must override _collect_single_seq
  • you must set num_inputs (dense-dim of “data” key) and num_outputs (dict key -> dim, ndim-1)
  • you should set labels
  • handle seq ordering by overriding init_seq_order
  • you can set _estimated_num_seqs
  • you can set _num_seqs or _num_timesteps if you know them in advance
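The derivation checklist above could look roughly like the following. This is a minimal sketch that does not import RETURNN: `StubCachedDataset2` is a hypothetical stand-in for the real base class, and `ToyDataset` is made up for illustration. The return value of `_collect_single_seq` (a plain dict here), returning None past the last sequence, the structure of `labels`, and the reading of "ndim-1" as the number of axes without the batch axis are all assumptions, since this page does not specify them.

```python
class StubCachedDataset2:
    """Hypothetical stand-in for CachedDataset2, so the sketch runs without RETURNN."""

    def __init__(self, **kwargs):
        self.num_inputs = 0
        self.num_outputs = {}
        self.labels = {}
        self._num_seqs = None

    def _collect_single_seq(self, seq_idx):
        raise NotImplementedError


class ToyDataset(StubCachedDataset2):
    """Three tiny sequences with 2-dim features and one class index per frame."""

    _SEQS = [
        {"data": [[0.0, 1.0], [1.0, 0.0]], "classes": [0, 1]},
        {"data": [[0.5, 0.5]], "classes": [2]},
        {"data": [[1.0, 1.0], [0.0, 0.0], [0.2, 0.8]], "classes": [1, 1, 0]},
    ]

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Must set: dense dim of the "data" key.
        self.num_inputs = 2
        # Must set: key -> (dim, ndim-1). Assumed reading: number of axes
        # without the batch axis, i.e. 2 for (time, feature) and 1 for (time).
        self.num_outputs = {"data": (2, 2), "classes": (3, 1)}
        # Should set: labels. The dict-of-lists layout is an assumption.
        self.labels = {"classes": ["a", "b", "c"]}
        # Can set: the number of seqs, because it is known in advance here.
        self._num_seqs = len(self._SEQS)

    def _collect_single_seq(self, seq_idx):
        # Must override. Assumption: return None once seq_idx is past the end.
        if seq_idx >= self._num_seqs:
            return None
        return self._SEQS[seq_idx]


dataset = ToyDataset()
first_seq = dataset._collect_single_seq(0)
```

In the real class, the base implementation is expected to call `_collect_single_seq` itself; the direct call at the end is only there to show the hook in isolation.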

init_seq_order(epoch=None, seq_list=None)[source]
Parameters:
  • epoch (int|None) –
  • seq_list (list[str]|None) – In case we want to set a predefined order.
Return type:bool
Returns:whether the order changed (True is always safe to return)

This is called when we start a new epoch, or at initialization. Call this when you reset the seq list.
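One way to honor this contract in an override is sketched below. Only the `init_seq_order(epoch=None, seq_list=None)` signature and the bool return value come from this page; the `SeqOrder` helper, its `tags` and `order` attributes, and the per-epoch shuffling policy are hypothetical and chosen only for illustration.

```python
import random


class SeqOrder:
    """Hypothetical helper that tracks the current order of sequence indices."""

    def __init__(self, tags):
        self.tags = list(tags)
        self.order = list(range(len(self.tags)))

    def init_seq_order(self, epoch=None, seq_list=None):
        if seq_list is not None:
            # Predefined order: map each requested tag to its index.
            index_of = {tag: i for i, tag in enumerate(self.tags)}
            new_order = [index_of[tag] for tag in seq_list]
        else:
            # Assumption: deterministic shuffle seeded by the epoch;
            # epoch=None keeps the natural order.
            new_order = list(range(len(self.tags)))
            if epoch is not None:
                random.Random(epoch).shuffle(new_order)
        changed = new_order != self.order
        self.order = new_order
        return changed
```

Reporting the exact `changed` flag is optional here: as noted above, returning True unconditionally is always safe.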

is_cached(start, end)[source]
Parameters:
  • start (int) – like in load_seqs(), sorted seq idx
  • end (int) – like in load_seqs(), sorted seq idx
Return type:bool
Returns:whether we have the full range (start,end) of sorted seq idx.
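A range check of this kind might be sketched as follows. The `SeqWindow` class and its `first` and `seqs` attributes are hypothetical, and treating the range as half-open, i.e. `start` inclusive and `end` exclusive, is an assumption that this page does not confirm.

```python
class SeqWindow:
    """Hypothetical cache holding consecutive sorted seq indices,
    starting at `first` and covering len(seqs) entries."""

    def __init__(self):
        self.first = 0
        self.seqs = []

    def is_cached(self, start, end):
        # Assumption: the range is half-open, [start, end).
        if start >= end:
            # Assumption: an empty range counts as cached.
            return True
        return self.first <= start and end <= self.first + len(self.seqs)
```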

num_seqs[source]
is_less_than_num_seqs(n)[source]
Return type:bool

Returns:whether n < num_seqs. If num_seqs is not known in advance, this waits until it knows either that n is past the end or that seq n is available.
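The waiting behavior can be illustrated with a lazily consumed source. This is a minimal sketch, not the class's actual implementation: `LazySeqs` and its attributes are hypothetical, and "waiting" is modeled as pulling items from an iterator until the answer is known.

```python
class LazySeqs:
    """Hypothetical dataset whose length is known only once its source is exhausted."""

    def __init__(self, source):
        self._source = iter(source)
        self._loaded = []
        self._num_seqs = None  # unknown until the iterator ends

    def is_less_than_num_seqs(self, n):
        # Load until we either have seq n or know the total count.
        while self._num_seqs is None and n >= len(self._loaded):
            try:
                self._loaded.append(next(self._source))
            except StopIteration:
                self._num_seqs = len(self._loaded)
        if self._num_seqs is not None:
            return n < self._num_seqs
        # Seq n has been loaded, so it exists even though the total is unknown.
        return True
```

A loop of the form `while dataset.is_less_than_num_seqs(i): ...` can then iterate over all sequences without needing num_seqs up front.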

get_num_timesteps()[source]
get_seq_length(sorted_seq_idx)[source]
Return type:int
get_data(seq_idx, key)[source]
Parameters:
  • seq_idx (int) – sorted seq idx
  • key (str) – data-key, e.g. “data” or “classes”
Return type:numpy.ndarray
Returns features or targets:
 format 2d (time,feature) (float)

get_input_data(seq_idx)[source]
Return type:numpy.ndarray
Returns features:
 format 2d (time,feature) (float)
get_targets(target, seq_idx)[source]
Return type:numpy.ndarray
Returns targets:
 format 1d (time) (int: idx of output-feature)
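The two array formats described for get_input_data and get_targets can be made concrete with NumPy. The helper name `has_expected_format` and the example sizes are hypothetical, and the check that features and targets have the same number of frames is an assumption that need not hold for every target type.

```python
import numpy as np


def has_expected_format(features, targets, num_classes):
    """Check the documented layouts: features 2d (time, feature) float,
    targets 1d (time) int, each entry indexing an output feature."""
    if features.ndim != 2 or features.dtype.kind != "f":
        return False
    if targets.ndim != 1 or targets.dtype.kind != "i":
        return False
    # Assumption: one target per frame.
    if features.shape[0] != targets.shape[0]:
        return False
    if targets.size == 0:
        return True
    return bool(targets.min() >= 0 and targets.max() < num_classes)


# Example: 4 frames, 3 features per frame, 5 classes.
features = np.zeros((4, 3), dtype="float32")
targets = np.array([0, 2, 2, 4], dtype="int32")
ok = has_expected_format(features, targets, num_classes=5)
```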
get_ctc_targets(sorted_seq_idx)[source]
get_tag(sorted_seq_idx)[source]
Parameters:sorted_seq_idx (int) –
Return type:str
get_data_keys()[source]
get_target_list()[source]
is_data_sparse(key)[source]
Parameters:key (str) – e.g. “data” or “classes”
Return type:bool
get_data_dim(key)[source]
Parameters:key (str) – e.g. “data” or “classes”
Return type:int
Returns:number of classes, no matter if sparse or not
get_data_dtype(key)[source]
Parameters:key (str) – e.g. “data” or “classes”
Returns:dtype as str, e.g. “int32” or “float32”
Return type:str