HDFDataset

class HDFDataset.HDFDataset(*args, **kwargs)
files = None
Type: list[str]
file_seq_start = None
Type: list[list[int]]
file_index = None
Type: list[int]
data_dtype = None
Type: dict[str,str]
data_sparse = None
Type: dict[str,bool]
add_file(filename)
Sets up the following data:
self.seq_lengths, self.file_index, self.file_start, self.file_seq_start

Use load_seqs() to load the actual data.
Parameters: filename (str) –
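
The bookkeeping set up by add_file() can be pictured with a small stand-in. This is only a sketch under stated assumptions, not the actual implementation: ToyFileIndex and its seq_lengths argument are hypothetical (the documented add_file() takes only a filename and reads from the HDF file), and the exact meaning of file_index and file_seq_start is inferred from their documented types.

```python
class ToyFileIndex:
    """Hypothetical stand-in that mimics the bookkeeping attributes without reading HDF."""

    def __init__(self):
        self.files = []           # list[str]: file names in the order they were added
        self.file_index = []      # list[int]: for each sequence, an index into self.files
        self.file_seq_start = []  # list[list[int]]: per file, the start offset of each sequence

    def add_file(self, filename, seq_lengths):
        # seq_lengths is an assumption; the real class would read lengths from the file.
        starts, offset = [], 0
        for length in seq_lengths:
            starts.append(offset)
            offset += length
        self.file_index.extend([len(self.files)] * len(seq_lengths))
        self.files.append(filename)
        self.file_seq_start.append(starts)


index = ToyFileIndex()
index.add_file("a.hdf", [3, 5])
index.add_file("b.hdf", [2])
print(index.file_index)      # [0, 0, 1]
print(index.file_seq_start)  # [[0, 3], [0]]
```

Only the metadata is touched here, which matches the note that the actual data is loaded later through load_seqs().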

get_tag(sorted_seq_idx)
Parameters: sorted_seq_idx (int) –
Return type: str
is_data_sparse(key)
Parameters: key (str) – e.g. “data” or “classes”
Returns: whether the data is sparse
Return type: bool
get_data_dtype(key)
Parameters: key (str) – e.g. “data” or “classes”
Returns: dtype as str, e.g. “int32” or “float32”
Return type: str
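
As a rough illustration of how the two lookups above might be used together, the sketch below queries both for a data key. ToyDataset and describe() are hypothetical helpers written for this example; they only assume that data_dtype and data_sparse are plain dicts keyed by data key, as their documented types suggest.

```python
class ToyDataset:
    """Hypothetical stand-in exposing the two documented lookups."""

    def __init__(self, data_dtype, data_sparse):
        self.data_dtype = data_dtype    # dict[str, str]
        self.data_sparse = data_sparse  # dict[str, bool]

    def get_data_dtype(self, key):
        return self.data_dtype[key]

    def is_data_sparse(self, key):
        return self.data_sparse[key]


def describe(dataset, key):
    """Build a one-line summary of a data key from the two lookups."""
    kind = "sparse" if dataset.is_data_sparse(key) else "dense"
    return "%s: %s %s" % (key, kind, dataset.get_data_dtype(key))


dataset = ToyDataset(
    data_dtype={"data": "float32", "classes": "int32"},
    data_sparse={"data": False, "classes": True},
)
print(describe(dataset, "data"))     # data: dense float32
print(describe(dataset, "classes"))  # classes: sparse int32
```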
len_info()
Returns: a string that presents information about the dataset length to the user. Depending on the implementation, it can give more or less detail.
Return type: str

class HDFDataset.StreamParser(seq_names, stream)
get_data(seq_name)
get_seq_length(seq_name)
get_dtype()
class HDFDataset.FeatureSequenceStreamParser(*args, **kwargs)
get_data(seq_name)
get_seq_length(seq_name)
class HDFDataset.SparseStreamParser(*args, **kwargs)
get_data(seq_name)
get_seq_length(seq_name)
class HDFDataset.SegmentAlignmentStreamParser(*args, **kwargs)
get_data(seq_name)
get_seq_length(seq_name)
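
The parser classes above share one interface: a constructor taking seq_names and stream, plus get_data() and get_seq_length() overridden per subclass. The sketch below mirrors that shape with hypothetical classes (ToyStreamParser, ToyFeatureParser) and a plain dict standing in for the HDF stream. How the real parsers read from the stream is not documented here, so the method bodies are assumptions.

```python
class ToyStreamParser:
    """Hypothetical base class with the same method names as StreamParser."""

    def __init__(self, seq_names, stream):
        self.seq_names = seq_names
        self.stream = stream  # assumption: a mapping from sequence name to its frames

    def get_data(self, seq_name):
        raise NotImplementedError

    def get_seq_length(self, seq_name):
        raise NotImplementedError


class ToyFeatureParser(ToyStreamParser):
    """Hypothetical subclass: one feature vector per frame."""

    def get_data(self, seq_name):
        return self.stream[seq_name]

    def get_seq_length(self, seq_name):
        # The sequence length is the number of frames.
        return len(self.stream[seq_name])


stream = {"seq-0": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]}
parser = ToyFeatureParser(seq_names=["seq-0"], stream=stream)
print(parser.get_seq_length("seq-0"))  # 3
print(parser.get_data("seq-0")[0])     # [0.1, 0.2]
```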
class HDFDataset.NextGenHDFDataset(input_stream_name, partition_epoch=1, *args, **kwargs)
parsers = {'feature_sequence': <class 'HDFDataset.FeatureSequenceStreamParser'>, 'segment_alignment': <class 'HDFDataset.SegmentAlignmentStreamParser'>, 'sparse': <class 'HDFDataset.SparseStreamParser'>}
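
The parsers attribute is a name-to-class registry. A minimal sketch of how such a registry can be used for dispatch follows. The toy classes, toy_parsers and pick_parser_class() are hypothetical, and how NextGenHDFDataset itself decides which key to look up is not stated here.

```python
class ToyFeatureParser:
    """Hypothetical stand-in for a feature-sequence parser."""


class ToySparseParser:
    """Hypothetical stand-in for a sparse parser."""


# Same shape as the documented `parsers` attribute: parser name -> class.
toy_parsers = {
    "feature_sequence": ToyFeatureParser,
    "sparse": ToySparseParser,
}


def pick_parser_class(parser_name):
    """Return the class registered under parser_name, or raise a readable error."""
    try:
        return toy_parsers[parser_name]
    except KeyError:
        known = ", ".join(sorted(toy_parsers))
        raise ValueError("unknown parser %r, expected one of: %s" % (parser_name, known))


print(pick_parser_class("sparse").__name__)  # ToySparseParser
```

Keeping the mapping as a class attribute means a new stream type only needs a new entry, with no change to the lookup code.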
add_file(path)
initialize()

Performs the main initialization. This must be called before self.load_seqs() can be used.
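
The call-order requirement can be sketched as follows. ToyNextGenDataset is a hypothetical stand-in that models only the ordering: its load_seqs() takes no arguments and simply returns the registered paths, which is an assumption made for the example and not the behaviour of the real method. Calling add_file() before initialize() is likewise assumed.

```python
class ToyNextGenDataset:
    """Hypothetical stand-in that only models the required call order."""

    def __init__(self):
        self.paths = []
        self.initialized = False

    def add_file(self, path):
        self.paths.append(path)

    def initialize(self):
        # The real method does the actual setup work; here we only record that it ran.
        self.initialized = True

    def load_seqs(self):
        if not self.initialized:
            raise RuntimeError("call initialize() before load_seqs()")
        return list(self.paths)


dataset = ToyNextGenDataset()
dataset.add_file("train.hdf")
dataset.initialize()
print(dataset.load_seqs())  # ['train.hdf']
```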

init_seq_order(epoch=None, seq_list=None)
Parameters: seq_list (list[str] | None) – a predefined sequence order, in case we want to set one.
get_data_dtype(key)
Parameters: key (str) – e.g. “data” or “classes”
Returns: dtype as str, e.g. “int32” or “float32”
Return type: str