Shape and Type Modification¶
Cast Layer¶
- class returnn.tf.layers.basic.CastLayer(dtype, output, **kwargs)[source]¶
Cast to some other dtype.
- Parameters:
dtype (str)
output (Data)
- classmethod get_out_data_from_opts(dtype, **kwargs)[source]¶
- Parameters:
dtype (str)
- Return type:
Data
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
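Example (a minimal config sketch; the source layer name "encoder" is a placeholder, assuming the usual layer class name "cast"):

network = {
    # cast the float32 output of "encoder" to float16
    "encoder_f16": {"class": "cast", "from": "encoder", "dtype": "float16"},
}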
Expand Dimensions Layer¶
- class returnn.tf.layers.basic.ExpandDimsLayer(axis, dim=1, **kwargs)[source]¶
Adds some axis.
- Parameters:
axis (str|int) – axis to add, e.g. "F"|"feature" or "spatial"|"time"|"T". If this is an integer, the input data is first converted into batch-major mode, and then this is counted with batch-dim.
dim (int|Dim) – dimension of new axis (1 by default)
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
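Example (a sketch, assuming an input "data" of shape (B,T) and the layer class name "expand_dims"):

network = {
    # add a new feature axis of size 1: (B,T) -> (B,T,1)
    "expanded": {"class": "expand_dims", "from": "data", "axis": "F", "dim": 1},
}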
Gather Layer¶
- class returnn.tf.layers.basic.GatherLayer(position: LayerBase | int, axis: Dim | str, clip_to_valid: bool = False, **kwargs)[source]¶
Gathers slices on a specified axis from the input layer using indices from a position layer. If the input is a layer of shape [B,D,F1] and position has shape [B,F2], this will yield output of shape [B,F2,F1], where output[b,f2,f1] = input[b,position[b,f2],f1] (if D is the axis to gather from). In general, all shared axes of the input and the positions will be considered as batch axes.
The position argument can also be an int. In this case, this simply gives input[position] on the specified axis.
It is basically a wrapper around tf.gather. It provides the same functionality as the deprecated GatherNdLayer, but is more generic. See also GatherNdLayer.
- Parameters:
position – indices used to select the slices of the input from. If another layer, it must be of type int32 or int64. Can also be a constant int.
axis – the axis of the input to gather from
clip_to_valid – if True, the indices will be clipped to the valid range of the input, also taking seq lengths into account
- classmethod transform_config_dict(d, network, get_layer)[source]¶
- Parameters:
d (dict[str])
network (returnn.tf.network.TFNetwork)
get_layer
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
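Example (a sketch; "encoder" and "positions" are placeholder layers):

network = {
    # encoder: (B,T,F); positions: int32 indices of shape (B,F2) into the T axis
    # output: (B,F2,F), output[b,f2,f] = encoder[b, positions[b,f2], f]
    "gathered": {"class": "gather", "from": "encoder", "position": "positions", "axis": "T"},
}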
Merge Dimensions Layer¶
- class returnn.tf.layers.basic.MergeDimsLayer(axes, keep_order=<class 'returnn.util.basic.NotSpecified'>, n_out=None, out_dim=None, **kwargs)[source]¶
Merges a list of axes into a single one. (Flattens the dims.) E.g. if the input is (batch, width, height, dim) and axes=(1,2), then we get (batch, width*height, dim). Or if the input is (batch, time, height, dim) and axes="except_time", then we get (batch, time, height*dim). See also CombineDimsLayer. When batch and time are merged, SplitBatchTimeLayer can undo this. When you want to merge batch and time, but remove the padding efficiently, i.e. flatten it, see FlattenBatchLayer.
- Parameters:
axes (Sequence[Dim|str]) – see Data.get_axis_from_description()
keep_order (bool|NotSpecified) – The old default was: the axes are sorted, and then merged. Thus, the order of incoming axes would influence the result. E.g. inputs [B,S,F] and [B,F,S], with axes=["S","F"], would get different results, although the output shape is [B,S*F] in both cases. This is bad: in general, other layers in RETURNN might reorder the axes for various reasons, and all layers should behave in the same way, no matter the order. It is recommended to set keep_order=True, such that the order defined in axes defines the behavior, not the incoming axis order. Since behavior version 6, this is already the case.
n_out (int|None)
out_dim (Dim|None)
- classmethod get_out_data_from_opts(name, axes, keep_order=<class 'returnn.util.basic.NotSpecified'>, sources=(), n_out=<class 'returnn.util.basic.NotSpecified'>, out_type=None, out_dim=None, **kwargs)[source]¶
- Parameters:
name (str)
axes (Sequence[Dim|str])
keep_order (bool|NotSpecified)
sources (list[LayerBase])
n_out (int|None|NotSpecified)
out_type (None|dict[str])
out_dim (Dim|None)
- Return type:
Data
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
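Example (a sketch, reusing the axes="except_time" pattern from the description above; "windowed" is a placeholder layer of shape (B,T,window,F)):

network = {
    # (B,T,window,F) -> (B,T,window*F)
    "merged": {"class": "merge_dims", "from": "windowed", "axes": "except_time"},
}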
Length Layer¶
- class returnn.tf.layers.basic.LengthLayer(axis='T', add_time_axis=False, dtype='int32', sparse=False, **kwargs)[source]¶
Returns the length of sources as (B,), via input size_placeholder.
- Parameters:
axis (str|Dim)
add_time_axis (bool) – should not be used
dtype (str)
sparse (bool)
- classmethod get_out_data_from_opts(name, sources, axis='T', add_time_axis=False, dtype='int32', sparse=False, **kwargs)[source]¶
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
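Example (a sketch; "data" refers to the network input):

network = {
    # int32 sequence lengths of shape (B,) for the time axis of "data"
    "seq_lens": {"class": "length", "from": "data", "axis": "T"},
}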
Pad Layer¶
- class returnn.tf.layers.basic.PadLayer(*, axes: Dim | str | Sequence[Dim | str], padding: int | Dim | Tuple[int | Dim, int | Dim] | Sequence[Tuple[int | Dim, int | Dim]], out_dims: Dim | Sequence[Dim] | None = None, handle_dynamic_dims: bool | None = None, value: int | float = 0, mode: str = 'constant', **kwargs)[source]¶
Adds (e.g. zero) padding in some axis or axes. Also see PrefixInTimeLayer for dynamic dims.
- Parameters:
axes – e.g. "F" etc., see Data.get_axes_from_description()
padding – how much to pad left/right in each axis
out_dims
handle_dynamic_dims – True: when doing right padding on a dynamic dim, the value will be added after the seq end, not at the end of the dimension. False: the value will be added at the end of the dimension. By default, this is True for behavior version >= 21, and False for older versions.
value – what constant value to pad with, for mode=="constant"
mode – "constant", "reflect", "symmetric" or "replication"
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
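Example (a sketch, assuming input "data" of shape (B,T,F)):

network = {
    # zero-pad the feature axis by one on each side: (B,T,F) -> (B,T,F+2)
    "padded": {"class": "pad", "from": "data", "axes": "F", "padding": (1, 1), "value": 0},
}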
Postfix (in Time) Layer¶
- class returnn.tf.layers.basic.PostfixInTimeLayer(axis='T', out_dim=None, postfix=0.0, repeat=1, **kwargs)[source]¶
Adds some postfix in the time dimension. Also see PrefixInTimeLayer.
- Parameters:
- classmethod get_out_data_from_opts(name, sources, axis='T', out_dim=None, postfix=0.0, repeat=1, **kwargs)[source]¶
- classmethod transform_config_dict(d, network, get_layer)[source]¶
- Parameters:
d (dict[str])
network (returnn.tf.network.TFNetwork)
get_layer
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
Prefix (in Time) Layer¶
- class returnn.tf.layers.basic.PrefixInTimeLayer(axis='T', out_dim=None, prefix=0.0, repeat=1, size_base=None, **kwargs)[source]¶
Adds some prefix in the time dimension. This is kind of the reverse of what SliceNdLayer does. Also see PadLayer for static dimensions, and PostfixInTimeLayer.
- Parameters:
- classmethod transform_config_dict(d, network, get_layer)[source]¶
- Parameters:
d (dict[str]) – will modify inplace
network (returnn.tf.network.TFNetwork)
get_layer (((str) -> LayerBase)) – function to get or construct another layer
- classmethod get_out_data_from_opts(name, sources, axis='T', out_dim=None, size_base=None, repeat=1, **kwargs)[source]¶
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
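Example (a sketch, assuming input "data" of shape (B,T,F)):

network = {
    # prepend two zero frames in time: (B,T,F) -> (B,2+T,F)
    "prefixed": {"class": "prefix_in_time", "from": "data", "prefix": 0.0, "repeat": 2},
}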
Reinterpret Data Layer¶
- class returnn.tf.layers.basic.ReinterpretDataLayer(switch_axes=None, size_base=None, batch_dim_base=None, set_axes=None, set_dim_tags=None, enforce_batch_major=False, enforce_time_major=False, set_sparse=None, set_sparse_dim=<class 'returnn.util.basic.NotSpecified'>, increase_sparse_dim=None, **kwargs)[source]¶
Acts like the CopyLayer but reinterprets the role of some axes or data.
- Parameters:
switch_axes (str|list[str]) – e.g. “bt” to switch batch and time axes
size_base (LayerBase|None) – copy the size_placeholder from the given layer
batch_dim_base (LayerBase|None) – copy the batch dim from this layer
set_axes (dict[str,Dim|str|None]) – This can be used to overwrite the special axes like time_dim_axis or feature_dim_axis. For that, use keys "B", "T" or "F", and a value via Data.get_axis_from_description().
set_dim_tags (dict[str|Dim,Dim]|Sequence[Tuple[Dim,Dim]]|None) – axis -> new dim tag. Assigns new dim tags. If the passed dim tag is yet undefined, this will not use same_dim_tags_as (declare_same_as) but create a new dim tag. This option is useful for generalized self-attention (https://github.com/rwth-i6/returnn/issues/391).
enforce_batch_major (bool)
enforce_time_major (bool)
set_sparse (bool|None) – if bool, set sparse value to this
set_sparse_dim (Dim|int|None|NotSpecified) – set sparse dim to this. assumes that it is sparse
increase_sparse_dim (int|None) – add this to the dim. assumes that it is sparse
- output_before_activation: Optional[OutputWithActivation][source]¶
- classmethod transform_config_dict(d, network, get_layer)[source]¶
- Parameters:
d (dict[str])
network (returnn.tf.network.TFNetwork)
get_layer
- classmethod get_out_data_from_opts(name, sources, switch_axes=None, size_base=None, batch_dim_base=None, set_axes=None, set_dim_tags=None, enforce_batch_major=False, enforce_time_major=False, set_sparse=None, set_sparse_dim=<class 'returnn.util.basic.NotSpecified'>, increase_sparse_dim=None, **kwargs)[source]¶
- Parameters:
name (str)
sources (list[LayerBase])
switch_axes (str|list[str]) – e.g. “bt” to switch batch and time axes
size_base (LayerBase|None) – similar as size_target
batch_dim_base (LayerBase|None)
set_axes (dict[str,Dim|str|None])
set_dim_tags (dict[str|Dim,Dim]|Sequence[Tuple[Dim,Dim]]|None)
enforce_batch_major (bool)
enforce_time_major (bool)
set_sparse (bool|None) – if bool, set sparse value to this
set_sparse_dim (Dim|int|None|NotSpecified) – set sparse dim to this. assumes that it is sparse
increase_sparse_dim (int|None) – add this to the dim. assumes that it is sparse
- search_choices: Optional[SearchChoices][source]¶
Repeat Layer¶
- class returnn.tf.layers.basic.RepeatLayer(repetitions, axis='T', out_dim=None, **kwargs)[source]¶
A wrapper around tf.repeat, but supports an additional batch axis for the durations. The sum of the repetitions has to be non-zero for each sequence in the batch.
This layer can only be used with TensorFlow 1.15.0 or newer.
- Parameters:
- classmethod transform_config_dict(d, network, get_layer)[source]¶
- Parameters:
d (dict[str])
network (returnn.tf.network.TFNetwork)
get_layer
- classmethod get_out_data_from_opts(name, sources, axis, repetitions, out_dim=None, **kwargs)[source]¶
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
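Example (a sketch; "encoder" and "durations" are placeholders, with "durations" an int32 layer of shape (B,T) giving per-frame repetitions):

network = {
    # repeat each time frame of "encoder" according to "durations": (B,T,F) -> (B,T',F)
    "upsampled": {"class": "repeat", "from": "encoder", "repetitions": "durations", "axis": "T"},
}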
Resize Layer¶
- class returnn.tf.layers.basic.ResizeLayer(factor, axis, out_dim=None, kind='nn', fill_value=None, fill_dropout=None, **kwargs)[source]¶
Resizes the input, i.e. upsampling or downsampling. Supports different kinds, such as linear interpolation or nearest-neighbor.
- Parameters:
- classmethod transform_config_dict(d, network, get_layer)[source]¶
- Parameters:
d (dict[str])
network (returnn.tf.network.TFNetwork)
get_layer ((str)->LayerBase)
- classmethod get_out_data_from_opts(factor, axis, sources, name, fill_dropout=None, out_dim=None, **kwargs)[source]¶
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
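Example (a sketch, assuming input "data" of shape (B,T,F)):

network = {
    # upsample the time axis by factor 2 with nearest-neighbor copies
    "upsampled": {"class": "resize", "from": "data", "factor": 2, "axis": "T", "kind": "nn"},
}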
Scatter n-dim Layer¶
- class returnn.tf.layers.basic.ScatterNdLayer(position, position_axis, output_dim_via_time_from=None, out_spatial_dim=None, filter_invalid_indices=False, **kwargs)[source]¶
The inverse of GatherNdLayer. Mostly a wrapper for tf.scatter_nd.
Note that "nd" is maybe a bit misleading. While we operate on N-D tensors, the indices (position) are into a single new dimension.
The input to the layer are the updates, and the indices are given via the position argument. The indices are into the newly constructed output dimension. The output shape is constructed via the common shape of the input and the position, where the unique common axis (if not unique, we would need to introduce an option to specify it) is replaced by the given output dimension (currently via output_dim_via_time_from).
Examples:

position (indices): (B,eTs)
input (updates): (eTs,D) or (B,eTs,D) -> expanded to (B,eTs,D)
output shape: (B,eT,D)

position (indices): (B,dT,eTs)
input (updates): (eTs,D) -> expanded to (B,dT,eTs,D)
output shape: (B,dT,eT,D)

position (indices): (dT,eTs)
input (updates): (eTs,D) -> expanded to (dT,eTs,D)
output shape: (dT,eT,D)

position (indices): (dT,eTs)
input (updates): (B,eTs,D) -> expanded to (dT,eTs,B,D)
output shape: (dT,eT,B,D)

In all these examples, output_dim_via_time_from is (B,eT,F), and eTs gets replaced by eT.
- Parameters:
position (LayerBase) – indices into first axis (excluding batch) of the output
position_axis (Dim|str) – axis in position to replace by the output-dim
output_dim_via_time_from (LayerBase|None) – use the time-dim from this layer as the output-dim
out_spatial_dim (Dim|None)
filter_invalid_indices (bool) – allow for indices <0 or >= output_dim, which will be discarded in the output
- classmethod get_out_data_from_opts(name, sources, position, position_axis, output_dim_via_time_from=None, out_spatial_dim=None, **kwargs)[source]¶
- classmethod transform_config_dict(d, network, get_layer)[source]¶
- Parameters:
d (dict[str])
network (returnn.tf.network.TFNetwork)
get_layer ((str)->LayerBase)
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
Shift Axis Layer¶
- class returnn.tf.layers.basic.ShiftAxisLayer(axis, amount, pad=True, pad_value=0, adjust_size_info=True, **kwargs)[source]¶
Shifts the dimensions in an axis around by slicing and optional padding. This layer may change the axis-dimension.
This name might be confusing. No axis will be shifted here. See SwapAxesLayer for that.
Also see SliceLayer.
- Parameters:
axis (str|Dim|int) – single axis to shift
amount (int) – number of elements to shift (<0 for left-shift, >0 for right-shift)
pad (bool) – preserve shape by padding
pad_value (int|float|bool) – padding value
adjust_size_info (bool) – whether to adjust the size_placeholder
- classmethod get_out_data_from_opts(name, sources, amount, axis, pad=True, adjust_size_info=True, **kwargs)[source]¶
- Parameters:
name (str)
sources (list[LayerBase])
amount (int)
axis (str)
pad (bool)
adjust_size_info (bool)
- Return type:
Data
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
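Example (a sketch, assuming input "data" of shape (B,T,F)):

network = {
    # right-shift by one frame in time, padding the front with zeros
    "shifted": {"class": "shift_axis", "from": "data", "axis": "T", "amount": 1, "pad": True, "pad_value": 0},
}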
Slice Layer¶
- class returnn.tf.layers.basic.SliceLayer(axis, slice_start=None, slice_end=None, slice_step=None, out_dim=None, **kwargs)[source]¶
Slicing on the input, i.e. x[start:end:step] in some axis. See also SliceNdLayer for a variable start, and GatherLayer for one single position.
Note that __getitem__ on a TF tensor (or also a NumPy ndarray) is more generic, and supports slices in multiple axes, as well as adding new dimensions, etc. It even allows to get boolean values, and then applies a boolean mask. See TF _slice_helper (== tf.Tensor.__getitem__) for a generic implementation, which calls tf.strided_slice. If we ever need such more generic support, we might consider adding a new layer, like GenericSliceLayer, which gets a slice_spec, just like _slice_helper (the argument to __getitem__). But any such slice can already be constructed with multiple individual layers, which perform individual slices (per axis).
We just support slicing in a single axis here, with optional striding (slice_step).
- Parameters:
- classmethod get_out_data_from_opts(name, axis, sources=(), slice_start=None, slice_end=None, slice_step=None, out_dim=None, **kwargs)[source]¶
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
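Example (a sketch, assuming input "data" of shape (B,T,F)):

network = {
    # keep every second time frame, i.e. x[::2] on the time axis
    "strided": {"class": "slice", "from": "data", "axis": "T", "slice_step": 2},
}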
Slice n-dim Layer¶
- class returnn.tf.layers.basic.SliceNdLayer(size, start=None, min_size=None, axis='T', out_spatial_dim=None, **kwargs)[source]¶
This takes out a slice-range from the time axis, e.g. x[start:start + size]. If the input is of shape (B,T,F) and start is of shape (B,), then the output will be of shape (B,size,F). If the input is of shape (B,T,F) and start is of shape (B,T), then the output will be of shape (B,T,size,F). This layer allows a different start slice point for each batch, in contrast to SliceLayer, and the start is variable. See also GatherNdLayer. PrefixInTimeLayer can recover the original shape (by zero-padding).
- Parameters:
start (int|LayerBase|None) – (B,…)
size (int|LayerBase|Dim|None) – We assume that this is >= 0. If this might not be the case, use min_size=0. If None, it uses the max possible size, and it becomes a dynamic axis.
min_size (int|None) – if size is None, but we want to have a min-size
axis (Dim|str)
out_spatial_dim (Dim|None)
- classmethod get_out_data_from_opts(name, sources=(), start=None, size=None, axis='T', out_spatial_dim=None, **kwargs)[source]¶
- classmethod transform_config_dict(d, network, get_layer)[source]¶
- Parameters:
d (dict[str])
network (returnn.tf.network.TFNetwork)
get_layer
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
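Example (a sketch; "starts" is a placeholder int32 layer of shape (B,)):

network = {
    # take data[b, starts[b]:starts[b]+10, :] per sequence: (B,T,F) -> (B,10,F)
    "sliced": {"class": "slice_nd", "from": "data", "start": "starts", "size": 10},
}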
Split Batch Time Layer¶
- class returnn.tf.layers.basic.SplitBatchTimeLayer(base, **kwargs)[source]¶
A very specific layer which expects to get input of shape (batch * time, …) and converts it into (batch, time, …), where it recovers the seq-lens from some other layer. See SplitDimsLayer for a more generic layer.
- Parameters:
base (LayerBase) – used to recover the seq-lens
- classmethod transform_config_dict(d, network, get_layer)[source]¶
- Parameters:
d (dict[str])
network (returnn.tf.network.TFNetwork)
get_layer
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
Split Dimensions Layer¶
- class returnn.tf.layers.basic.SplitDimsLayer(axis, dims, pad_to_multiples=None, pad_value=0, **kwargs)[source]¶
Splits one axis into multiple axes. E.g. if you know that your feature-dim is composed of a window, i.e. the input is (batch, time, window * feature), you can set axis="F", dims=(window, -1), and you will get the output (batch, time, window, feature).
If the split axis has a dynamic length, exactly one of the axes that we split into needs to also have a dynamic length. You can e.g. use this to split the input dimension into smaller "chunks" of a fixed window size. E.g. you could have input (batch, time, feature) and set axis="T", dims=(-1, window), to get output (batch, split_time, window, feature). In this case, the exact sequence lengths are lost and everything is padded to multiples of the window size using the given padding value. Use ReinterpretDataLayer to receive back the original sequence lengths after merging.
Also see SplitBatchTimeLayer. Also see MergeDimsLayer, which can undo this operation.
- Parameters:
axis (Dim|str) – e.g. “F”
dims (tuple[Dim|int]|list[Dim|int]) – what the axis should be split into. e.g. (window, -1)
pad_to_multiples (bool|None) – If true, input will be padded to the next multiple of the product of the static dims, such that splitting is actually possible. By default this is done iff the axis has a dynamic size
pad_value (int|float) – What pad value to use for pad_to_multiples
- classmethod get_out_data_from_opts(name, axis, dims, pad_to_multiples=None, sources=(), **kwargs)[source]¶
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
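Example (a sketch, assuming input "data" of shape (B,T,3*F) with a statically known window of 3):

network = {
    # (B,T,3*F) -> (B,T,3,F)
    "split": {"class": "split_dims", "from": "data", "axis": "F", "dims": (3, -1)},
}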
Squeeze Layer¶
- class returnn.tf.layers.basic.SqueezeLayer(axis, enforce_batch_dim_axis=None, allow_no_op=False, **kwargs)[source]¶
Removes an axis with dimension 1. This is basically a wrapper around tf.squeeze.
- Parameters:
axis (Dim|int|list[int]|str) – one axis or multiple axes to squeeze. This is counted with batch-dim, which by default is axis 0 (see enforce_batch_dim_axis). It also accepts the special tokens "B"|"batch", "spatial", "spatial_except_time", or "F"|"feature".
enforce_batch_dim_axis (int|None)
allow_no_op (bool)
- classmethod get_out_data_from_opts(axis, enforce_batch_dim_axis=None, allow_no_op=False, sources=(), **kwargs)[source]¶
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
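Example (a sketch, assuming input "data" of shape (B,T,1)):

network = {
    # remove the size-1 feature axis: (B,T,1) -> (B,T)
    "squeezed": {"class": "squeeze", "from": "data", "axis": "F"},
}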
Stack Layer¶
- class returnn.tf.layers.basic.StackLayer(axis=None, out_spatial_dim=None, **kwargs)[source]¶
Stacks multiple inputs together using tf.stack(). This creates a new dimension for the stack.
For concatenation (in the feature dimension), see CopyLayer.
- Parameters:
axis (int|None) – new axis. If not given, will use Data.get_default_new_axis_for_dim_tag(<spatial>), i.e. some reasonable default for a new spatial axis.
out_spatial_dim (Dim|None)
- classmethod get_out_data_from_opts(name, sources, axis=None, out_spatial_dim=None, **kwargs)[source]¶
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
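Example (a sketch; "layer_a" and "layer_b" are placeholder layers of identical shape (B,T,F); the new axis lands at the default spatial position described above):

network = {
    # stack the two inputs along a new axis, e.g. (B,T,F) x2 -> (B,T,2,F)
    "stacked": {"class": "stack", "from": ["layer_a", "layer_b"]},
}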
Swap Axes Layer¶
- class returnn.tf.layers.basic.SwapAxesLayer(axis1, axis2, **kwargs)[source]¶
Swaps two axes. Basically a wrapper around returnn.tf.util.basic.swapaxes(). Note that usually, this should not be needed, and it is recommended not to be used, as this will be unnecessarily inefficient. Normally, all RETURNN layers will automatically transpose the input data into whatever format they need.
All axes always have a special meaning (e.g. feature dim or time dim) or dimension tag (e.g. for time axes, including dyn seq lengths). If you need to change the meaning (and not actually transpose / swap axes), you need to use ReinterpretDataLayer.
See also TransposeLayer for a more generic variant.
See also ReinterpretDataLayer, which does not swap/transpose axes, but allows to reinterpret their meaning / dim tags.
- Parameters:
axis1 (int|str)
axis2 (int|str)
- classmethod get_out_data_from_opts(name, sources, axis1, axis2, **kwargs)[source]¶
- Parameters:
name (str)
sources (list[LayerBase])
axis1 (int|str)
axis2 (int|str)
- Return type:
Data
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
Time Chunking Layer¶
- class returnn.tf.layers.basic.TimeChunkingLayer(chunk_size, chunk_step, axis='T', out_dim=None, **kwargs)[source]¶
Performs chunking in time. See returnn.tf.native_op.chunk(). See also WindowLayer and TimeUnChunkingLayer. It is very similar to WindowLayer, but this case is more optimized here, and it also modifies the batch dim. The output is of shape (chunk_size, n_batch * n_chunks, …).
- Parameters:
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
Time Un-Chunking Layer¶
- class returnn.tf.layers.basic.TimeUnChunkingLayer(chunking_layer, **kwargs)[source]¶
Performs un-chunking in time, i.e. the inverse of TimeChunkingLayer. See returnn.tf.native_op.chunk().
- Parameters:
chunking_layer (TimeChunkingLayer)
- classmethod transform_config_dict(d, network, get_layer)[source]¶
- Parameters:
d (dict[str])
network (returnn.tf.network.TFNetwork)
get_layer
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
Window Layer¶
- class returnn.tf.layers.basic.WindowLayer(window_size=None, window_dim=None, window_left=None, window_right=None, axis='T', out_spatial_dim=None, padding='same', stride=1, _use_opt_dim_order=None, **kwargs)[source]¶
Adds a window dimension. By default, uses the time axis and goes over it with a sliding window. The new axis for the window is created right after the time axis. In PyTorch, this is called unfold. We sometimes call this "chunking". There is also the similar TimeChunkingLayer.
E.g. if the input is (batch, time, dim), the output is (batch, time, window_size, dim). If you want to merge (window_size, dim) together into (window_size * dim,), you can use MergeDimsLayer, e.g. {"class": "merge_dims", "axes": "except_time"}.
Use stride == window_size and window_right = window_size - 1 in combination with a MergeDimsLayer to achieve feature stacking with right-hand zero padding.
This is not for taking out a single window from the time dimension; see SliceLayer or SliceNdLayer for that.
The inverse layer is FoldLayer.
- Parameters:
- classmethod get_out_data_from_opts(name, network, sources, window_size=None, window_dim=None, axis='T', out_spatial_dim=None, padding='same', stride=1, _use_opt_dim_order=None, **kwargs)[source]¶
- Parameters:
name (str)
network (returnn.tf.network.TFNetwork)
sources (list[LayerBase])
window_size (int|None)
window_dim (Dim|None)
axis (Dim|str)
out_spatial_dim (Dim|None)
padding (str)
stride (int)
_use_opt_dim_order (bool|None)
- Return type:
Data
- classmethod get_rec_initial_extra_outputs(network, batch_dim, rec_layer, window_size=None, window_dim=None, axis='T', sources=(), **kwargs)[source]¶
- Parameters:
network (returnn.tf.network.TFNetwork)
batch_dim (tf.Tensor)
rec_layer (returnn.tf.layers.rec.RecLayer|LayerBase)
window_size (int|None)
window_dim (Dim|None)
axis (Dim|str)
sources (list[LayerBase])
- Return type:
dict[str,tf.Tensor]
- output_before_activation: Optional[OutputWithActivation][source]¶
- search_choices: Optional[SearchChoices][source]¶
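Example (a sketch combining this with the MergeDimsLayer pattern from the description above; "data" has shape (B,T,F)):

network = {
    # sliding window of 5 frames over time: (B,T,F) -> (B,T,5,F)
    "windows": {"class": "window", "from": "data", "window_size": 5},
    # flatten each window into the feature dim: (B,T,5,F) -> (B,T,5*F)
    "stacked": {"class": "merge_dims", "from": "windows", "axes": "except_time"},
}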