returnn.engine.batch#

Defines BatchSeqCopyPart and other batch related helpers. This is shared across different backends.

class returnn.engine.batch.BatchSeqCopyPart(seq_idx, seq_start_frame, seq_end_frame, batch_slice, batch_frame_offset)[source]#
A batch used for training in RETURNN can consist of several parts from sequences, ordered in various ways. The dataset, depending on the configuration, can generate these. For the non-recurrent case, we usually concatenate them together into one slice. For the recurrent case, we have a single slice per sequence, or even multiple slices per sequence in case of chunking.

This class represents one single such part and where it is going to be stored in the batch.

property frame_length[source]#
Return type:

NumbersDict
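The bookkeeping a part carries can be illustrated with a simplified stand-in. This is not the RETURNN implementation: the real class stores frame counts as NumbersDict instances (one value per data key), not plain ints, and the class name below is invented for this sketch.

```python
from dataclasses import dataclass


# Simplified stand-in for illustration only; the real BatchSeqCopyPart
# uses NumbersDict frame counts per data key.
@dataclass
class SeqCopyPartSketch:
    seq_idx: int             # which sequence in the dataset
    seq_start_frame: int     # first frame of the part within that sequence
    seq_end_frame: int       # one past the last frame within that sequence
    batch_slice: int         # which slice (batch-dim index) the part goes into
    batch_frame_offset: int  # time offset inside that slice

    @property
    def frame_length(self) -> int:
        # number of frames this part contributes to the batch
        return self.seq_end_frame - self.seq_start_frame


# e.g. frames 50..150 of sequence 3, stored at time offset 0 of slice 1
part = SeqCopyPartSketch(seq_idx=3, seq_start_frame=50, seq_end_frame=150,
                         batch_slice=1, batch_frame_offset=0)
print(part.frame_length)  # 100
```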

class returnn.engine.batch.Batch[source]#

A batch can consist of several sequences (= segments). This is basically just a list of BatchSeqCopyPart.

try_sequence_as_slice(length)[source]#
Parameters:

length (NumbersDict) – number of (time) frames

Returns:

new shape which covers the old shape and one more data-batch, format (time,batch)

Return type:

(NumbersDict,int)
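The shape arithmetic behind this method can be sketched with plain ints (the real method works per data key via NumbersDict): adding a sequence as a new slice keeps the time dimension at the maximum of the old time and the new length, and grows the batch dimension by one.

```python
# Illustrative sketch only, assuming plain int frame counts instead of
# NumbersDict; the function name is invented for this sketch.
def try_sequence_as_slice_sketch(shape, length):
    """shape: (time, batch); length: number of frames of the new sequence."""
    time, batch = shape
    # new shape covers the old shape plus one more data-batch
    return max(time, length), batch + 1


# batch currently covers (100 frames, 2 slices); adding an 80-frame sequence
print(try_sequence_as_slice_sketch((100, 2), 80))   # (100, 3)
# a 120-frame sequence grows the time dimension as well
print(try_sequence_as_slice_sketch((100, 2), 120))  # (120, 3)
```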

add_sequence_as_slice(seq_idx, seq_start_frame, length)[source]#

Adds one data-batch in an additional slice.

Parameters:
  • seq_idx (int) –

  • seq_start_frame (NumbersDict|int) –

  • length (NumbersDict) – number of (time) frames
add_frames(seq_idx, seq_start_frame, length, frame_dim_corresponds=True)[source]#

Adds frames to all data-batches. Will add one data-batch if we don’t have one yet.

Parameters:
  • seq_idx (int) –

  • seq_start_frame (NumbersDict|int) –

  • length (NumbersDict) – number of (time) frames

  • frame_dim_corresponds (bool) – if the batch frame offset should always be the same (max value) for all keys
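The contrast between the two append modes can be sketched with plain ints (not the RETURNN implementation, and the function names are invented): add_frames grows the current slice along the time axis, opening a first slice if none exists yet, while add_sequence_as_slice always opens a new slice.

```python
# Illustrative contrast of the two ways a batch shape (time, batch) grows.
def add_frames_sketch(shape, length):
    # extend the current slice along the time axis;
    # ensure at least one slice exists ("add one data-batch if we
    # don't have one yet")
    time, batch = shape
    return time + length, max(batch, 1)


def add_sequence_as_slice_sketch(shape, length):
    # open a new slice; time covers the longest slice
    time, batch = shape
    return max(time, length), batch + 1


shape = (0, 0)
shape = add_frames_sketch(shape, 100)             # -> (100, 1)
shape = add_frames_sketch(shape, 50)              # -> (150, 1), concatenated
print(shape)
shape = add_sequence_as_slice_sketch(shape, 120)  # -> (150, 2), new slice
print(shape)
```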

init_with_one_full_sequence(seq_idx, dataset)[source]#
Parameters:
  • seq_idx (int) –

  • dataset (Dataset.Dataset) –

get_all_slices_num_frames()[source]#

Note that this is only an upper limit in case of data_shape[1] > 1 because data_shape[0] is the max frame len of all seqs.

Returns:

related to the data-key with max length

Return type:

NumbersDict
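The "upper limit" note above can be made concrete with plain ints (a sketch, not the RETURNN implementation): the padded batch covers (max slice length) × (number of slices) frames, which overcounts whenever the slices have different lengths. The comparison with the actual frame total is an assumption based on the method names.

```python
# Why the all-slices frame count is only an upper bound when there is
# more than one slice: padding fills every slice up to the max length.
slice_lengths = [100, 80, 60]  # actual frames per slice

all_slices_frames = max(slice_lengths) * len(slice_lengths)  # padded shape
total_frames = sum(slice_lengths)                            # no padding

print(all_slices_frames)  # 300
print(total_frames)       # 240
assert all_slices_frames >= total_frames
```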

get_total_num_frames()[source]#
Return type:

NumbersDict

property start_seq[source]#
Return type:

int|None

property end_seq[source]#
Return type:

int|None

get_num_seqs()[source]#
Return type:

int

class returnn.engine.batch.BatchSetGenerator(dataset, generator, shuffle_batches=False, cache_whole_epoch=True)[source]#

This will give you the next batches (list[Batch]) such that you can use them for assign_dev_data(). We get those batches from a generator, i.e. lazily on-the-fly. This is the whole point of BatchSetGenerator: we do not need to know the whole list of batches in advance. As assign_dev_data() can fail for various reasons, we buffer the list of batches, and you call self.advance() explicitly to go forward to the next batches.

Parameters:
  • shuffle_batches (bool) –

  • cache_whole_epoch (bool) –
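The buffered peek/advance protocol described above can be sketched with a simplified stand-in (the class name is invented, the real class pulls Batch objects from the dataset's generator, and the assignment step is only mimicked here):

```python
# Simplified stand-in for the peek/advance buffering protocol.
class BatchBufferSketch:
    def __init__(self, generator):
        self._generator = generator
        self._buffer = []  # batches peeked but not yet advanced over

    def peek_next_n(self, n):
        # fill the buffer lazily from the generator; may return fewer than n
        while len(self._buffer) < n:
            try:
                self._buffer.append(next(self._generator))
            except StopIteration:
                break
        return self._buffer[:n]

    def advance(self, n):
        # called only after the caller consumed the batches successfully,
        # so a failed assignment can retry the same batches later
        self._buffer = self._buffer[n:]

    def has_more(self):
        return bool(self.peek_next_n(1))


gen = BatchBufferSketch(iter(["batch0", "batch1", "batch2"]))
consumed = []
while gen.has_more():
    batches = gen.peek_next_n(2)
    # ... try to assign the batches; on failure, skip advance() and retry ...
    consumed.extend(batches)
    gen.advance(len(batches))
print(consumed)  # ['batch0', 'batch1', 'batch2']
```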

reset()[source]#

Call this after one epoch to reuse the previously cached batches.

peek_next_n(n)[source]#
Returns:

the next (up to) n batches; it might return fewer, and there is no way to know in advance. If self.has_more() is True, it will return at least one.

Return type:

list[Batch]

advance(n)[source]#

Advances by n batches, i.e. goes forward to the next batches (see peek_next_n()).
completed_frac()[source]#
Returns:

fraction of completion, in the range 0-1, always > 0

Return type:

float

has_more()[source]#

This would also try to advance further in the dataset, thus it might block. If it returns False, no more data is available in the dataset.

Return type:

bool

get_current_batch_idx()[source]#
Return type:

int