Shape and Type Modification

Cast Layer

class returnn.tf.layers.basic.CastLayer(dtype, output, **kwargs)[source]

Cast to some other dtype.

Parameters:
  • dtype (str) –
  • output (Data) –
layer_class = 'cast'[source]
classmethod get_out_data_from_opts(dtype, **kwargs)[source]
Parameters:dtype (str) –
Return type:Data
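
A minimal usage sketch (the layer name "cast_enc" and the source "encoder" are illustrative, not from the source); the numpy cast below shows the equivalent dtype conversion:

```python
import numpy as np

# Hypothetical network config entry: cast the output of "encoder" to float16.
network = {
    "cast_enc": {"class": "cast", "from": "encoder", "dtype": "float16"},
}

# The operation itself is just a dtype cast:
x = np.array([1.5, 2.5], dtype=np.float32)
y = x.astype("float16")
assert y.dtype == np.float16
```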

Expand Dimensions Layer

class returnn.tf.layers.basic.ExpandDimsLayer(axis, dim=1, **kwargs)[source]

Adds some axis.

Parameters:
  • axis (str|int) – axis to add, e.g. “F”|”feature” or “spatial”|”time”|”T”. If this is an integer, the input data is first converted into batch-major mode, and then this axis index is counted including the batch dim.
  • dim (int) – dimension of new axis (1 by default)
layer_class = 'expand_dims'[source]
classmethod get_out_data_from_opts(name, axis, dim=1, sources=(), **kwargs)[source]
Parameters:
  • name (str) –
  • axis (str) –
  • dim (int) –
  • sources (list[LayerBase]) –
Return type:

Data
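
As a sketch (the config entry is illustrative), adding a trailing feature axis of size 1 corresponds to np.expand_dims:

```python
import numpy as np

# Hypothetical config entry: add a new feature axis of size 1.
network = {
    "expanded": {"class": "expand_dims", "from": "data", "axis": "F", "dim": 1},
}

# Equivalent tensor operation: (batch, time) -> (batch, time, 1)
x = np.zeros((2, 7))
y = np.expand_dims(x, axis=-1)
assert y.shape == (2, 7, 1)
```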

Merge Dimensions Layer

class returnn.tf.layers.basic.MergeDimsLayer(axes, n_out=None, **kwargs)[source]

Merges a list of axes into a single one (i.e. flattens those dims). E.g. if the input is (batch, width, height, dim) and axes=(1,2), we get (batch, width*height, dim). Or if the input is (batch, time, height, dim) and axes=”except_time”, we get (batch, time, height*dim). See also CombineDimsLayer. When batch and time get merged, SplitBatchTimeLayer can undo this.

Parameters:
  • axes (str|list[str]|list[int]) – see Data.get_axes_from_description(), e.g. “except_time”
  • n_out (int|None) –
layer_class = 'merge_dims'[source]
classmethod get_out_data_from_opts(name, axes, sources=(), n_out=<class 'returnn.util.basic.NotSpecified'>, out_type=None, **kwargs)[source]
Parameters:
  • name (str) –
  • axes (str|list[str]) –
  • sources (list[LayerBase]) –
  • n_out (int|None|NotSpecified) –
  • out_type (None|dict[str]) –
Return type:

Data
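
The (batch, width, height, dim) example above corresponds to a plain reshape; a minimal numpy sketch (the config entry is illustrative):

```python
import numpy as np

# axes=(1,2): merge width and height into one axis.
x = np.zeros((3, 4, 5, 6))          # (batch, width, height, dim)
b, w, h, d = x.shape
y = x.reshape(b, w * h, d)
assert y.shape == (3, 20, 6)

# Hypothetical config entry for the same merge:
network = {"merged": {"class": "merge_dims", "from": "data", "axes": (1, 2)}}
```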

Length Layer

class returnn.tf.layers.basic.LengthLayer(add_time_axis=False, dtype='int32', sparse=False, **kwargs)[source]

Returns the length of sources as (B,), via input size_placeholder.

Parameters:
  • add_time_axis (bool) –
  • dtype (str) –
  • sparse (bool) –
layer_class = 'length'[source]
classmethod get_out_data_from_opts(name, sources, add_time_axis=False, dtype='int32', sparse=False, **kwargs)[source]
Parameters:
  • name (str) –
  • sources (list[LayerBase]) –
  • add_time_axis (bool) –
  • dtype (str) –
  • sparse (bool) –
Return type:

Data

Pad Layer

class returnn.tf.layers.basic.PadLayer(axes, padding, value=0, mode='constant', **kwargs)[source]

Adds (e.g. zero) padding in some axis or axes.

Parameters:
  • axes (str|list[str]) – e.g. “F” etc. See Data.get_axes_from_description().
  • padding (list[(int,int)]|(int,int)|int) – how much to pad left/right in each axis
  • value (int|float) – what constant value to pad, with mode==”constant”
  • mode (str) – “constant”, “reflect” or “symmetric”
layer_class = 'pad'[source]
classmethod get_out_data_from_opts(name, axes, padding, sources=(), **kwargs)[source]
Parameters:
  • name (str) –
  • axes (str|list[str]) –
  • padding (list[(int,int)]|(int,int)|int) –
  • sources (list[LayerBase]) –
Return type:

Data
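
The supported padding modes map directly onto np.pad; a sketch (the config entry is illustrative):

```python
import numpy as np

# Pad 2 left / 1 right with zeros in the last axis, mode "constant":
x = np.array([[1, 2, 3]])
y = np.pad(x, ((0, 0), (2, 1)), mode="constant", constant_values=0)
assert y.tolist() == [[0, 0, 1, 2, 3, 0]]

# Hypothetical config entry for the same padding on the feature axis:
network = {"padded": {"class": "pad", "from": "data",
                      "axes": "F", "padding": (2, 1), "value": 0}}
```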

Postfix (in Time) Layer

class returnn.tf.layers.basic.PostfixInTimeLayer(postfix=0.0, repeat=1, **kwargs)[source]

Adds some postfix in the time dimension.

Parameters:
  • postfix (float|int) – constant
  • repeat (int) – how often to repeat the postfix
layer_class = 'postfix_in_time'[source]
recurrent = True[source]
classmethod get_out_data_from_opts(name, sources, **kwargs)[source]
Parameters:
  • name (str) –
  • sources (list[LayerBase]) –
Return type:

Data
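
A numpy sketch of what appending a constant postfix along the time axis amounts to (the config entry is illustrative):

```python
import numpy as np

# Append a constant frame at the end of the time axis, repeated twice:
x = np.ones((2, 3, 4))                      # (batch, time, feature)
postfix = np.full((2, 2, 4), 0.0)           # repeat=2 postfix frames
y = np.concatenate([x, postfix], axis=1)
assert y.shape == (2, 5, 4)

# Hypothetical config entry:
network = {"with_postfix": {"class": "postfix_in_time", "from": "data",
                            "postfix": 0.0, "repeat": 2}}
```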

Prefix (in Time) Layer

class returnn.tf.layers.basic.PrefixInTimeLayer(prefix=0.0, repeat=1, size_base=None, **kwargs)[source]

Adds some prefix in the time dimension. This is roughly the reverse of what SliceNdLayer does.

Parameters:
  • prefix (float|str) – either some constant or another layer
  • repeat (int|LayerBase) – how often to repeat the prefix
  • size_base (LayerBase|None) – copy seq-lens from here
layer_class = 'prefix_in_time'[source]
recurrent = True[source]
get_dep_layers()[source]
Return type:list[LayerBase]
classmethod transform_config_dict(d, network, get_layer)[source]
Parameters:
  • d (dict[str]) – will modify inplace
  • network (returnn.tf.network.TFNetwork) –
  • get_layer ((str) -> LayerBase) – function to get or construct another layer
classmethod get_out_data_from_opts(name, sources, size_base=None, repeat=None, **kwargs)[source]
Parameters:
  • name (str) –
  • sources (list[LayerBase]) –
  • size_base (LayerBase|None) –
  • repeat (LayerBase|int|None) –
Return type:

Data

Resize Layer

class returnn.tf.layers.basic.ResizeLayer(factor, axis, kind='nn', fill_value=None, fill_dropout=None, **kwargs)[source]

Resizes the input, i.e. upsampling or downsampling. Supports different kinds, such as linear interpolation or nearest-neighbor.

Parameters:
  • factor (int) –
  • axis (str|int) – the axis to resize, counted with batch-dim. can also be “T” for time
  • kind (str) – “linear”, “nn”/”nearest_neighbor”, “cubic”, “fill”
  • fill_value (None|int|float) – if kind==”fill”
  • fill_dropout (float) – if set, will dropout in the same axis
layer_class = 'resize'[source]
classmethod get_out_data_from_opts(factor, axis, sources, name, **kwargs)[source]
Parameters:
  • factor (int) –
  • axis (str) –
  • sources (list[LayerBase]) –
  • name (str) –
Return type:

Data
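
For kind=”nn”, upsampling by an integer factor corresponds to repeating each entry along the axis; a numpy sketch (the config entry is illustrative):

```python
import numpy as np

# Nearest-neighbor upsampling by factor 2 along the time axis:
x = np.array([[[1.0], [2.0], [3.0]]])       # (batch, time, feature)
y = np.repeat(x, repeats=2, axis=1)
assert y.shape == (1, 6, 1)
assert y[0, :, 0].tolist() == [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]

# Hypothetical config entry:
network = {"upsampled": {"class": "resize", "from": "data",
                         "factor": 2, "axis": "T", "kind": "nn"}}
```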

Reinterpret Data Layer

class returnn.tf.layers.basic.ReinterpretDataLayer(switch_axes=None, size_base=None, set_axes=None, enforce_batch_major=False, enforce_time_major=False, set_sparse=None, set_sparse_dim=<class 'returnn.util.basic.NotSpecified'>, increase_sparse_dim=None, **kwargs)[source]

Acts like the CopyLayer but reinterprets the role of some axes or data.

Parameters:
  • switch_axes (str|list[str]) – e.g. “bt” to switch batch and time axes
  • size_base (LayerBase|None) – copy the size_placeholder from the given layer
  • set_axes (dict[str,int|str]) – the key is “B”,”T”,”F”, value is via Data.get_axis_from_description()
  • enforce_batch_major (bool) –
  • enforce_time_major (bool) –
  • set_sparse (bool|None) – if bool, set sparse value to this
  • set_sparse_dim (int|None|NotSpecified) – set sparse dim to this. assumes that it is sparse
  • increase_sparse_dim (int|None) – add this to the dim. assumes that it is sparse
layer_class = 'reinterpret_data'[source]
get_dep_layers()[source]
Return type:list[LayerBase]
classmethod transform_config_dict(d, network, get_layer)[source]
Parameters:
classmethod get_out_data_from_opts(name, sources, switch_axes=None, size_base=None, set_axes=None, enforce_batch_major=False, enforce_time_major=False, set_sparse=None, set_sparse_dim=<class 'returnn.util.basic.NotSpecified'>, increase_sparse_dim=None, **kwargs)[source]
Parameters:
  • name (str) –
  • sources (list[LayerBase]) –
  • switch_axes (str|list[str]) – e.g. “bt” to switch batch and time axes
  • size_base (LayerBase|None) – similar as size_target
  • set_axes (dict[str,int]) –
  • enforce_batch_major (bool) –
  • enforce_time_major (bool) –
  • set_sparse (bool|None) – if bool, set sparse value to this
  • set_sparse_dim (int|None|NotSpecified) – set sparse dim to this. assumes that it is sparse
  • increase_sparse_dim (int|None) – add this to the dim. assumes that it is sparse

Scatter n-dim Layer

class returnn.tf.layers.basic.ScatterNdLayer(position, position_axis, output_dim_via_time_from, filter_invalid_indices=False, **kwargs)[source]

The inverse of GatherNdLayer. Mostly a wrapper for tf.scatter_nd.

The input to the layer are the updates; the indices are given via the position argument and refer into the newly constructed output dimension. The output shape is constructed via the common shape of the input and the position; the unique common axis (if not unique, we would need to introduce an option to specify it) is replaced by the given output dimension (currently via output_dim_via_time_from).

Examples:

position (indices): (B,eTs)
input (updates): (eTs,D) or (B,eTs,D) -> expanded to (B,eTs,D)
output shape: (B,eT,D)

position (indices): (B,dT,eTs)
input (updates): (eTs,D) -> expanded to (B,dT,eTs,D)
output shape: (B,dT,eT,D)

position (indices): (dT,eTs)
input (updates): (eTs,D) -> expanded to (dT,eTs,D)
output shape: (dT,eT,D)

position (indices): (dT,eTs)
input (updates): (B,eTs,D) -> expanded to (dT,eTs,B,D)
output shape: (dT,eT,B,D)

In all these examples, output_dim_via_time_from is (B,eT,F), and eTs gets replaced by eT.

Parameters:
  • position (LayerBase) – indices into first axis (excluding batch) of the output
  • position_axis (str|int) – axis in position to replace by the output-dim
  • output_dim_via_time_from (LayerBase) – use the time-dim from this layer as the output-dim
  • filter_invalid_indices (bool) – allow for indices <0 or >= output_dim, which will be discarded in the output
layer_class = 'scatter_nd'[source]
get_dep_layers()[source]
Return type:list[LayerBase]
classmethod get_out_data_from_opts(name, sources, position, position_axis, output_dim_via_time_from, **kwargs)[source]
Parameters:
  • name (str) –
  • sources (list[LayerBase]) –
  • position (LayerBase) –
  • position_axis (str|int) – axis in position to replace by the output-dim
  • output_dim_via_time_from (LayerBase) –
Return type:

Data

classmethod transform_config_dict(d, network, get_layer)[source]
Parameters:
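
A numpy sketch of the first example above (position (B,eTs), updates (B,eTs,D), output (B,eT,D)); like tf.scatter_nd, duplicate indices accumulate. All names and sizes here are illustrative:

```python
import numpy as np

B, eTs, eT, D = 2, 3, 5, 4
position = np.array([[0, 2, 4], [1, 0, 3]])   # (B, eTs), indices into the new eT axis
updates = np.arange(B * eTs * D, dtype=float).reshape(B, eTs, D)
out = np.zeros((B, eT, D))
for b in range(B):
    for i in range(eTs):
        out[b, position[b, i]] += updates[b, i]   # duplicates would sum up
assert out.shape == (B, eT, D)
assert np.allclose(out[0, 2], updates[0, 1])      # index 2 came from updates[0, 1]
```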

ShiftAxisLayer

class returnn.tf.layers.basic.ShiftAxisLayer(axis, amount, pad=True, adjust_size_info=True, **kwargs)[source]

Shifts the elements along an axis around (optionally with padding to preserve the shape). This layer may change the axis dimension.

The name might be confusing: no axis itself is moved here. See SwapAxesLayer for that.

Parameters:
  • axis (str|int) – single axis to shift
  • amount (int) – number of elements to shift (<0 for left-shift, >0 for right-shift)
  • pad (bool) – preserve shape by padding
  • adjust_size_info (bool) – whether to adjust the size_placeholder
layer_class = 'shift_axis'[source]
classmethod get_out_data_from_opts(name, amount, axis, pad, sources=(), **kwargs)[source]
Parameters:
  • name (str) –
  • amount (int) –
  • axis (str) –
  • pad (bool) –
  • sources (list[LayerBase]) –
Return type:

Data
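
A numpy sketch of a right-shift by 1 along time with pad=True (the config entry is illustrative):

```python
import numpy as np

x = np.array([[1, 2, 3, 4]])                 # (batch, time)
amount = 1
# Shift right by `amount`, zero-padding on the left and keeping the shape:
y = np.pad(x, ((0, 0), (amount, 0)))[:, :x.shape[1]]
assert y.tolist() == [[0, 1, 2, 3]]

# Hypothetical config entry:
network = {"shifted": {"class": "shift_axis", "from": "data",
                       "axis": "T", "amount": 1, "pad": True}}
```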

Slice Layer

class returnn.tf.layers.basic.SliceLayer(axis, slice_start=None, slice_end=None, slice_step=None, **kwargs)[source]

Slicing on the input, i.e. x[start:end:step] in some axis. See also SliceNdLayer.

Parameters:
  • axis (int|str) –
  • axis_kind (str|None) – “T” for time, “B” for batch, “F” for feature
  • slice_start (int|None) –
  • slice_end (int|None) –
  • slice_step (int|None) –
layer_class = 'slice'[source]
classmethod get_out_data_from_opts(name, axis, sources=(), slice_start=None, slice_end=None, slice_step=None, **kwargs)[source]
Parameters:
  • name (str) –
  • axis (str) –
  • sources (list[LayerBase]) –
  • slice_start (int|None) –
  • slice_end (int|None) –
  • slice_step (int|None) –
Return type:

Data
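
The layer implements exactly Python-style slicing along one axis; a sketch (the config entry is illustrative):

```python
import numpy as np

# x[start:end:step] along the time axis:
x = np.arange(10).reshape(1, 10)             # (batch, time)
y = x[:, 2:8:2]
assert y.tolist() == [[2, 4, 6]]

# Hypothetical config entry for the same slicing:
network = {"sliced": {"class": "slice", "from": "data", "axis": "T",
                      "slice_start": 2, "slice_end": 8, "slice_step": 2}}
```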

Slice n-dim Layer

class returnn.tf.layers.basic.SliceNdLayer(start, size, min_size=None, **kwargs)[source]

This takes out a slice-range from some axis, e.g. x[start:start + size]. This layer allows a different slice start point for each batch entry, in contrast to SliceLayer, i.e. the start is variable. See also GatherNdLayer. PrefixInTimeLayer can recover the original shape (by zero-padding).

Parameters:
  • start (LayerBase) –
  • size (int|None) – if None, it uses the max possible size, and it becomes a dynamic axis
  • min_size (int|None) – if size is None, but we want to have a min-size, set this
layer_class = 'slice_nd'[source]
recurrent = True[source]
get_dep_layers()[source]
Return type:list[LayerBase]
classmethod get_out_data_from_opts(name, sources=(), start=None, size=None, **kwargs)[source]
Parameters:
  • name (str) –
  • sources (list[LayerBase]) –
  • start (LayerBase|None) –
  • size (int|None) –
Return type:

Data

classmethod transform_config_dict(d, network, get_layer)[source]
Parameters:
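
A numpy sketch of the per-batch start behavior that distinguishes this from SliceLayer (all sizes are illustrative):

```python
import numpy as np

# Per-batch start indices, fixed size:
x = np.arange(20).reshape(2, 10)             # (batch, time)
start = np.array([1, 4])                     # one start per batch entry
size = 3
y = np.stack([x[b, start[b]:start[b] + size] for b in range(len(start))])
assert y.tolist() == [[1, 2, 3], [14, 15, 16]]
```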

Split Batch Time Layer

class returnn.tf.layers.basic.SplitBatchTimeLayer(base, **kwargs)[source]

A very specific layer which expects to get input of shape (batch * time, …) and converts it into (batch, time, …), where it recovers the seq-lens from some other layer. See SplitDimsLayer for a more generic layer.

Parameters:base (LayerBase) – used to recover the seq-lens
layer_class = 'split_batch_time'[source]
get_dep_layers()[source]
Return type:list[LayerBase]
classmethod transform_config_dict(d, network, get_layer)[source]
Parameters:
classmethod get_out_data_from_opts(name, base, sources=(), **kwargs)[source]
Parameters:
Return type:

Data
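
The shape transformation is a reshape (the layer additionally recovers the seq-lens from the base layer); a numpy sketch with illustrative sizes:

```python
import numpy as np

# (batch * time, feature) -> (batch, time, feature):
batch, time, feat = 2, 5, 3
x = np.arange(batch * time * feat).reshape(batch * time, feat)
y = x.reshape(batch, time, feat)
assert y.shape == (2, 5, 3)
assert (y[1, 0] == x[time]).all()   # second batch entry starts at flat row `time`
```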

Split Dimensions Layer

class returnn.tf.layers.basic.SplitDimsLayer(axis, dims, **kwargs)[source]

Splits one axis into multiple axes. E.g. if you know that your feature-dim is composed by a window, i.e. the input is (batch, time, window * feature), you can set axis=”F”, dims=(window, -1), and you will get the output (batch, time, window, feature). Also see SplitBatchTimeLayer.

Parameters:
  • axis (str) – e.g. “F”
  • dims (tuple[int]) – what the axis should be split into. e.g. (window, -1)
layer_class = 'split_dims'[source]
classmethod get_out_data_from_opts(name, axis, dims, sources=(), **kwargs)[source]
Parameters:
  • name (str) –
  • axis (str|int) –
  • dims (tuple[int]) –
  • sources (list[LayerBase]) –
Return type:

Data
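
The window example above corresponds to a reshape with an inferred dim; a sketch (the config entry is illustrative):

```python
import numpy as np

# (batch, time, window * feature) -> (batch, time, window, feature):
window, feature = 3, 4
x = np.zeros((2, 7, window * feature))
y = x.reshape(x.shape[0], x.shape[1], window, -1)
assert y.shape == (2, 7, 3, 4)

# Hypothetical config entry:
network = {"split": {"class": "split_dims", "from": "data",
                     "axis": "F", "dims": (window, -1)}}
```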

Squeeze Layer

class returnn.tf.layers.basic.SqueezeLayer(axis, enforce_batch_dim_axis=None, allow_no_op=False, **kwargs)[source]

Removes an axis with dimension 1. This is basically a wrapper around tf.squeeze.

Parameters:
  • axis (int|list[int]|str) – one axis or multiple axes to squeeze. This is counted with batch-dim, which by default is axis 0 (see enforce_batch_dim_axis). It also accepts the special tokens “B”|”batch”, “spatial”, “spatial_except_time”, or “F”|”feature”
  • enforce_batch_dim_axis (int|None) –
  • allow_no_op (bool) –
layer_class = 'squeeze'[source]
classmethod get_out_data_from_opts(axis, enforce_batch_dim_axis=None, allow_no_op=False, sources=(), **kwargs)[source]
Parameters:
  • axis (int|list[int]|str) –
  • enforce_batch_dim_axis (int|None) –
  • allow_no_op (bool) –
  • sources (list[LayerBase]) –
Return type:

Data
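
A numpy sketch of removing a size-1 feature axis, as tf.squeeze would (the config entry is illustrative):

```python
import numpy as np

x = np.zeros((2, 7, 1))                      # (batch, time, 1)
y = np.squeeze(x, axis=-1)
assert y.shape == (2, 7)

# Hypothetical config entry:
network = {"squeezed": {"class": "squeeze", "from": "data", "axis": "F"}}
```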

Stack Layer

class returnn.tf.layers.basic.StackLayer(**kwargs)[source]

Stacks multiple inputs together using tf.stack().

layer_class = 'stack'[source]
classmethod get_out_data_from_opts(name, sources, **kwargs)[source]
Parameters:
  • name (str) –
  • sources (list[LayerBase]) –
Return type:

Data
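
A numpy sketch of stacking two same-shaped sources along a new axis, as tf.stack() does (sizes are illustrative):

```python
import numpy as np

a = np.zeros((2, 7))
b = np.ones((2, 7))
y = np.stack([a, b], axis=0)   # new leading axis over the sources
assert y.shape == (2, 2, 7)
```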

Swap Axes Layer

class returnn.tf.layers.basic.SwapAxesLayer(axis1, axis2, **kwargs)[source]

Swaps two axes. Basically a wrapper around TFUtil.swapaxes(). See also ReinterpretDataLayer.

Parameters:
  • axis1 (int|str) –
  • axis2 (int|str) –
layer_class = 'swap_axes'[source]
classmethod get_out_data_from_opts(name, sources, axis1, axis2, **kwargs)[source]
Parameters:
  • name (str) –
  • sources (list[LayerBase]) –
  • axis1 (int|str) –
  • axis2 (int|str) –
Return type:

Data
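
A numpy sketch of swapping the time and feature axes (sizes are illustrative):

```python
import numpy as np

x = np.zeros((2, 7, 5))                      # (batch, time, feature)
y = np.swapaxes(x, 1, 2)                     # axis1=1, axis2=2
assert y.shape == (2, 5, 7)
```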

Time Chunking Layer

class returnn.tf.layers.basic.TimeChunkingLayer(chunk_size, chunk_step, **kwargs)[source]

Performs chunking in time. See TFNativeOp.chunk().

Parameters:
  • chunk_size (int) –
  • chunk_step (int) –
layer_class = 'time_chunking'[source]
recurrent = True[source]
classmethod get_out_data_from_opts(name, sources, **kwargs)[source]
Parameters:
  • name (str) –
  • sources (list[LayerBase]) –
Return type:

Data

Time Un-Chunking Layer

class returnn.tf.layers.basic.TimeUnChunkingLayer(chunking_layer, **kwargs)[source]

Reverses the chunking in time, i.e. the inverse of TimeChunkingLayer. See TFNativeOp.chunk().

Parameters:chunking_layer (TimeChunkingLayer) –
layer_class = 'time_unchunking'[source]
recurrent = True[source]
get_dep_layers()[source]
Return type:list[LayerBase]
classmethod transform_config_dict(d, network, get_layer)[source]
Parameters:
classmethod get_out_data_from_opts(name, sources, **kwargs)[source]
Parameters:
  • name (str) –
  • sources (list[LayerBase]) –
Return type:

Data