returnn.frontend.array_

Array (Tensor) functions

returnn.frontend.array_.convert_to_tensor(value: Tensor | T | int | float | complex | number | ndarray | bool | str, *, dims: Sequence[Dim] | None = None, dtype: str | None = None, sparse_dim: Dim | None = None, shape: Sequence[Dim] | None = None, device: str | None = None, keep_scalar_on_cpu: bool = False, name: str | None = None, _backend: Type[Backend] | None = None) Tensor[T][source]
Parameters:
  • value – tensor, scalar raw tensor, or some other scalar value

  • dims

  • dtype

  • sparse_dim

  • shape – alias for dims, for some older code

  • name

  • device

  • keep_scalar_on_cpu – if the value is already on the CPU, keep it there, even if device is something else

  • _backend

Returns:

tensor
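
For example, a Python scalar or a NumPy array can be wrapped into a Tensor with explicit dims. This is a minimal sketch; the backend selection call and the dim setup are assumptions for illustration, not part of this function's contract:

    import numpy as np
    import returnn.frontend as rf
    from returnn.tensor import Dim

    rf.select_backend_torch()  # assumed helper to select an eager (PyTorch) backend; adjust to your setup

    time_dim = Dim(5, name="time")
    feat_dim = Dim(2, name="feature")

    x = rf.convert_to_tensor(3.14, dtype="float32")  # scalar -> tensor without dims
    raw = np.zeros((5, 2), dtype="float32")
    y = rf.convert_to_tensor(raw, dims=[time_dim, feat_dim])  # ndarray -> Tensor with dims (time, feature)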

returnn.frontend.array_.constant(value: Tensor | T | int | float | complex | number | ndarray | bool | str, *, dims: Sequence[Dim] | None = None, dtype: str | None = None, sparse_dim: Dim | None = None, shape: Sequence[Dim] | None = None, device: str | None = None, keep_scalar_on_cpu: bool = False, name: str | None = None, _backend: Type[Backend] | None = None) Tensor[T][source]
Parameters:
  • value – tensor, scalar raw tensor, or some other scalar value

  • dims

  • dtype

  • sparse_dim

  • shape – alias for dims, for some older code

  • name

  • device

  • keep_scalar_on_cpu – if the value is already on the CPU, keep it there, even if device is something else

  • _backend

Returns:

tensor

returnn.frontend.array_.copy(tensor: Tensor) Tensor[source]
Parameters:

tensor

Returns:

copy of the tensor. In eager-based frameworks, it is a real copy. In graph-based frameworks, it might just be a copied reference if the tensor is immutable. This is really only relevant when operating on tensors which can conceptually be mutated, such as variables (Parameter).

returnn.frontend.array_.cast(tensor: Tensor, dtype: str) Tensor[source]
Parameters:
  • tensor

  • dtype

Returns:

tensor with the same data, but with a different dtype

returnn.frontend.array_.merge_dims(source: Tensor, *, dims: Sequence[Dim], out_dim: Dim | None = None) Tuple[Tensor, Dim][source]

Merges a list of axes into a single one (flattening the dims). E.g. if the input is (batch, width, height, dim) and dims=(width, height), we get (batch, width*height, dim). Or if the input is (batch, time, height, dim) and dims=(height, dim), we get (batch, time, height*dim).

rf.split_dims() is the reverse operation.

Parameters:
  • source

  • dims

  • out_dim

Returns:

tensor, out_dim
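
A minimal sketch of merging two static dims, assuming the same setup as in the convert_to_tensor() example above (backend selected) and rf.zeros() as a tensor constructor:

    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    width = Dim(3, name="width")
    height = Dim(4, name="height")
    feat = Dim(5, name="feature")

    x = rf.zeros([batch, width, height, feat])
    y, merged = rf.merge_dims(x, dims=(width, height))
    # y has dims {batch, merged, feature}, with merged.dimension == 3 * 4 == 12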

returnn.frontend.array_.split_dims(source: Tensor, *, axis: Dim, dims: Sequence[Dim], pad_to_multiples: bool | None = None, pad_value: None | int | float = None) Tensor[source]

Splits one axis into multiple axes. E.g. if you know that your feature dim is composed of a window, i.e. the input is (batch, time, window * feature), you can set axis="F", dims=(window, -1), and you will get the output (batch, time, window, feature).

If the split axis has a dynamic length, exactly one of the axes that we split into needs to also have a dynamic length. You can e.g. use this to split the input dimension into smaller "chunks" of a fixed window size. E.g. you could have input (batch, time, feature) and set axis="T", dims=(-1, window), to get output (batch, split_time, window, feature). In this case, the exact sequence lengths are lost and everything is padded to multiples of the window size using the given padding value. Use ReinterpretDataLayer to receive back the original sequence lengths after merging.

Also see rf.merge_dims() which can undo this operation.

Parameters:
  • source

  • axis – e.g. "F"

  • dims – what the axis should be split into. e.g. (window, -1)

  • pad_to_multiples – If true, the input will be padded to the next multiple of the product of the static dims, such that splitting is actually possible. By default, this is done iff the axis has a dynamic size.

  • pad_value – What pad value to use for pad_to_multiples

Returns:

source with axis replaced by dims
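
A minimal sketch of splitting a static dim into (window, feature), with the same assumed setup as above:

    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    time = Dim(7, name="time")
    window = Dim(3, name="window")
    feat = Dim(4, name="feature")
    merged = Dim(12, name="window_x_feature")

    x = rf.zeros([batch, time, merged])
    y = rf.split_dims(x, axis=merged, dims=(window, feat))
    # y has dims {batch, time, window, feature}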

returnn.frontend.array_.reshape(source: Tensor, in_dims: Sequence[Dim], out_dims: Sequence[Dim]) Tensor[source]

Wraps tf.reshape.

You should use split_dims() or merge_dims() when you want to split or merge dimensions. This is for any other kind of reshape, e.g. for clever indexing, slicing, or padding tricks.

Parameters:
  • source – e.g. (…, old_dims, …)

  • in_dims – the old dims which should be reshaped into new_dims. This should only cover those dims which should be reshaped, not all the dims of the source.

  • out_dims – the new dims which should be reshaped from old_dims. This is excluding any of the other dims in the source.

Returns:

e.g. (…, new_dims, …)
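
A minimal sketch of the in_dims/out_dims interface (for a plain split like this, split_dims() would normally be preferred; same assumed setup as above):

    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    d_in = Dim(6, name="in")
    d_out1 = Dim(2, name="out1")
    d_out2 = Dim(3, name="out2")

    x = rf.zeros([batch, d_in])
    y = rf.reshape(x, in_dims=[d_in], out_dims=[d_out1, d_out2])
    # y has dims {batch, out1, out2}; only the listed in_dims are reshaped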

returnn.frontend.array_.split(source: Tensor, *, axis: Dim, out_dims: Sequence[Dim]) Tuple[Tensor, ...][source]

Split the input on the specified axis. Basically a wrapper around tf.split.

Parameters:
  • source – {…, axis}

  • axis – some static axis

  • out_dims – list of dims where sum(out_dims) == axis

Returns:

tuple of tensors, same amount as out_dims, with the same shape as source, but with the specified axis replaced by the out_dims
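
A minimal sketch of splitting a feature dim into two parts (same assumed setup as above):

    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    feat = Dim(7, name="feature")
    part1 = Dim(3, name="part1")
    part2 = Dim(4, name="part2")

    x = rf.zeros([batch, feat])
    a, b = rf.split(x, axis=feat, out_dims=[part1, part2])
    # a has dims {batch, part1}, b has dims {batch, part2}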

returnn.frontend.array_.expand_dim(source: Tensor, dim: Dim) Tensor[source]

Expand the source by the given dimension.

Note that this is never needed for broadcasting. All broadcasting should always happen automatically.

This might be needed for convolution or concatenation.

returnn.frontend.array_.squeeze(source: Tensor, axis: Dim) Tensor[source]

Removes the given axis, which must be of extent (dimension) 1, from the source.

returnn.frontend.array_.window(source: Tensor, *, spatial_dim: Dim, window_dim: Dim, window_right: Dim | int | None = None, window_left: Dim | int | None = None, padding: str = 'same', pad_value: None | int | float = None, stride: int = 1) Tuple[Tensor, Dim][source]

Follows the same idea as RETURNN tf_util.windowed, using clever padding and reshaping.

Parameters:
  • source

  • spatial_dim

  • window_dim

  • window_left

  • window_right

  • padding – "same" or "valid"

  • pad_value

  • stride

Returns:

out, out_spatial_dim
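
A minimal sketch of a sliding window over the time dim (same assumed setup as above):

    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    time = Dim(10, name="time")
    feat = Dim(4, name="feature")
    win = Dim(5, name="window")

    x = rf.zeros([batch, time, feat])
    y, out_time = rf.window(x, spatial_dim=time, window_dim=win, padding="same")
    # y has dims {batch, out_time, window, feature};
    # with padding="same" and stride=1, out_time should have the same length as time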

returnn.frontend.array_.concat(*sources: Tuple[Tensor, Dim], allow_broadcast: bool = False, out_dim: Dim | None = None) Tuple[Tensor, Dim][source]

Concatenates multiple sources in the specified dimension.
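
A minimal sketch of concatenating two tensors along their (different) feature dims; each source is given as a (tensor, dim) pair (same assumed setup as above):

    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    feat_a = Dim(3, name="feat_a")
    feat_b = Dim(5, name="feat_b")

    x = rf.zeros([batch, feat_a])
    y = rf.zeros([batch, feat_b])
    z, out_feat = rf.concat((x, feat_a), (y, feat_b))
    # z has dims {batch, out_feat}, with out_feat.dimension == 3 + 5 == 8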

returnn.frontend.array_.concat_features(*sources: Tensor, allow_broadcast=False) Tensor[source]

Concatenates multiple sources, using feature_dim of each source, so make sure that the feature_dim is correctly set.

returnn.frontend.array_.pad(source: Tensor, *, axes: Sequence[Dim], padding: Sequence[Tuple[Dim | int, Dim | int]], out_dims: Sequence[Dim] | None = None, mode: str = 'constant', value: int | float | complex | number | ndarray | bool | str | Tensor | None = None, handle_dynamic_dims: bool | None = None) Tuple[Tensor, Sequence[Dim]][source]

Pad values left/right in the specified axes.

Parameters:
  • source

  • axes – which axes to add padding to

  • padding – list of (left, right) padding for each axis

  • out_dims – (optional) predefined out dims for each padded dim in axes. Will be created automatically if not given.

  • mode – ‘constant’, ‘reflect’, ‘replicate’ or ‘circular’

  • value – (optional) value to pad with in “constant” mode

  • handle_dynamic_dims – True: when doing right padding on a dynamic dim, the value will be added after the seq end, not at the end of the dimension. False: the value will be added at the end of the dimension. By default, this is True with behavior version >= 21 and False in older versions.

Returns:

padded tensor, out_dims. out dims are for each dim in axes
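
A minimal sketch of constant padding on a static time dim (same assumed setup as above):

    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    time = Dim(6, name="time")

    x = rf.zeros([batch, time])
    y, out_dims = rf.pad(x, axes=[time], padding=[(1, 2)], value=0.0)
    padded_time = out_dims[0]
    # padded_time.dimension == 1 + 6 + 2 == 9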

returnn.frontend.array_.cum_concat_step(source: Tensor, *, prev_accum: Tensor, axis: Dim, out_spatial_dim: Dim | None = None) Tuple[Tensor, Dim][source]

Concatenates all previous frames over a time-axis. See RETURNN CumConcatLayer for details.

Parameters:
  • source – same dims as prev_accum except for the accum axis

  • prev_accum – previous accumulated tensor, shape {…, axis}

  • axis – the axis to accumulate over

  • out_spatial_dim – if given, the spatial dim of the output will be this dim. It corresponds to axis + 1.

Returns:

(accumulated, out_spatial_dim). accumulated shape {…, out_spatial_dim}, same shape as prev_accum with axis replaced by out_spatial_dim.

returnn.frontend.array_.masked_select(tensor: Tensor, *, mask: Tensor, dims: Sequence[Dim], out_dim: Dim | None = None) Tuple[Tensor, Dim][source]

In TF, this is boolean_mask. The inverse of this is masked_scatter().

Parameters:
  • tensor

  • mask

  • dims – the order of the dims defines the format. Those dims should be exactly the dims of the mask.

  • out_dim

Returns:

tensor where all dims in mask/dims are removed and replaced by a new dim. The new dim is also returned. If mask==True for all elements, the returned tensor is simply the flattened input tensor.
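
A minimal sketch using a boolean mask over (batch, time), with the same assumed setup as above:

    import numpy as np
    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    time = Dim(3, name="time")
    feat = Dim(4, name="feature")

    x = rf.zeros([batch, time, feat])
    mask = rf.convert_to_tensor(
        np.array([[True, False, True], [False, True, True]]), dims=[batch, time])
    y, packed_dim = rf.masked_select(x, mask=mask, dims=[batch, time])
    # y has dims {packed_dim, feature}; packed_dim has length 4 (number of True entries)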

returnn.frontend.array_.masked_scatter(source: Tensor, *, mask: Tensor, dims: Sequence[Dim], in_dim: Dim) Tensor[source]

The inverse of masked_select().

Parameters:
  • source – [in_dim, F…]

  • mask – [dims…] -> bool (e.g. [B,T])

  • dims – the order of the dims defines the format. Those dims should be exactly the dims of the mask.

  • in_dim – the dim of the source which should be scattered into the mask.

Returns:

[dims…, F…]
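
A minimal round-trip sketch together with masked_select() (same assumed setup as above):

    import numpy as np
    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    time = Dim(3, name="time")
    feat = Dim(4, name="feature")

    x = rf.zeros([batch, time, feat])
    mask = rf.convert_to_tensor(
        np.array([[True, False, True], [False, True, True]]), dims=[batch, time])
    packed, packed_dim = rf.masked_select(x, mask=mask, dims=[batch, time])
    restored = rf.masked_scatter(packed, mask=mask, dims=[batch, time], in_dim=packed_dim)
    # restored has dims {batch, time, feature} again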

returnn.frontend.array_.sequence_mask(dims: Dim | Sequence[Dim], *, device: str | None = None) Tensor[source]
Parameters:
  • dims

  • device

returnn.frontend.array_.pack_padded(source: Tensor, *, dims: Sequence[Dim], enforce_sorted: bool = False, out_dim: Dim | None = None) Tuple[Tensor, Dim][source]

Like pack_padded_sequence. Usually the sequences are padded when they have different lengths. Packing means to only store the non-padded frames. This uses masked_select() internally based on the mask of non-masked frames.

Parameters:
  • source

  • dims – dims in source to pack. The order defines the format; the first dim is major, etc. If there are no padded frames, e.g. dims=[B,T] would just result in the [B*T,…] reshaped tensor.

  • enforce_sorted – seqs in the dims are reordered (stable sort) such that longest seqs come first.

  • out_dim

Returns:

packed tensor, new packed dim
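
A minimal sketch with purely static dims, so nothing is actually masked out (same assumed setup as above):

    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    time = Dim(3, name="time")
    feat = Dim(4, name="feature")

    x = rf.zeros([batch, time, feat])
    packed, packed_dim = rf.pack_padded(x, dims=[batch, time])
    # packed has dims {packed_dim, feature}; with no padded frames,
    # packed_dim has length batch * time == 6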

returnn.frontend.array_.gather(source: Tensor, *, indices: Tensor | int, axis: Dim | None = None, clip_to_valid: bool = False) Tensor[source]

Gathers slices on a specified axis from the source using indices. If the source is of the shape [B,D,F1], and indices of shape [B,F2], this will yield output of the shape [B,F2,F1] where

output[b,f2,f1] = source[b,indices[b,f2],f1]

(if D is the axis to gather from). In general, all shared axes of the source and the indices will be considered as batch axes.

The indices argument can also be an int. In this case, this simply gives source[indices] on the specified axis.

scatter() is the inverse.

Parameters:
  • source – [batch_dims…, axis, feature_dims…]

  • indices – [batch_dims…, indices_dims…] indices used to select slices from the source. If given as a tensor, it must be of type int32 or int64. A constant int can also be given. Batch dims are automatically determined as the common dims of source and indices.

  • axis – the axis to gather from. If not given, indices must be a tensor and its sparse_dim will be used.

  • clip_to_valid – if True, the indices will be clipped to the valid range of the input, also taking seq lengths into account.

Returns:

[batch_dims…, indices_dims…, feature_dims…] gathered values
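
A minimal sketch of gathering two class indices per batch entry (same assumed setup as above):

    import numpy as np
    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    classes = Dim(5, name="classes")
    feat = Dim(3, name="feature")
    num_idx = Dim(2, name="num_indices")

    x = rf.zeros([batch, classes, feat])
    indices = rf.convert_to_tensor(
        np.array([[0, 2], [4, 1]], dtype="int32"), dims=[batch, num_idx], sparse_dim=classes)
    y = rf.gather(x, indices=indices, axis=classes)
    # y has dims {batch, num_indices, feature}; y[b, i, f] == x[b, indices[b, i], f]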

returnn.frontend.array_.scatter(source: Tensor, *, indices: Tensor, indices_dim: Dim | Sequence[Dim], out_dim: Dim | Sequence[Dim] | None = None) Tensor[source]

Scatters into new zero-tensor. If entries in indices are duplicated, the corresponding values in source will be added together (scatter_add in PyTorch). (TF segment_sum can be implemented via this.)

Parameters:
  • source – [batch_dims…, indices_dim(s)…, feature_dims…]

  • indices – [batch_dims…, indices_dim(s)…] -> out_dim

  • indices_dim

  • out_dim – The indices target dim. If not given, it will be automatically determined as the sparse_dim from indices. If multiple out dims are given, the indices refer to the merged out dims, and rf.split_dims() is applied afterwards.

Returns:

[batch_dims…, out_dim(s)…, feature_dims…]
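
A minimal sketch of scattering values into a classes dim, with duplicated indices being added together (rf.ones() is assumed as a constructor; same assumed setup as above):

    import numpy as np
    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    in_time = Dim(3, name="in_time")
    feat = Dim(4, name="feature")
    classes = Dim(5, name="classes")

    src = rf.ones([batch, in_time, feat])
    indices = rf.convert_to_tensor(
        np.array([[0, 2, 2], [1, 4, 3]], dtype="int32"), dims=[batch, in_time], sparse_dim=classes)
    y = rf.scatter(src, indices=indices, indices_dim=in_time)
    # y has dims {batch, classes, feature}; class 2 in the first batch entry
    # accumulates two contributions (scatter-add)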

returnn.frontend.array_.slice(source: Tensor, *, axis: Dim, start: int | Tensor | None = None, end: int | Tensor | None = None, step: int | Tensor | None = None, size: int | Tensor | Dim | None = None, out_dim: Dim | None = None) Tuple[Tensor, Dim][source]

Slicing on the input, i.e. x[start:end:step] in some axis.

If size is given, it takes out a slice-range like x[start:start + size].

This function also allows non-scalar start points.

Parameters:
  • source

  • axis

  • start

  • end

  • step

  • size

  • out_dim

Returns:

tensor, out_dim
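
A minimal sketch of slicing a static time dim (same assumed setup as above):

    import returnn.frontend as rf
    from returnn.tensor import Dim

    batch = Dim(2, name="batch")
    time = Dim(10, name="time")

    x = rf.zeros([batch, time])
    y, out_time = rf.slice(x, axis=time, start=2, end=8)
    # y has dims {batch, out_time}, with out_time.dimension == 6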

returnn.frontend.array_.shift_right(source: Tensor, *, axis: Dim, pad_value: int | float | complex | number | ndarray | bool | str | Tensor, amount: int = 1) Tensor[source]

Shift right by amount, pad left with pad_value.

returnn.frontend.array_.reverse_sequence(tensor: Tensor, *, axis: Dim) Tensor[source]

Similar to tf.reverse_sequence, or Torch flip (but taking seq lengths into account).

Parameters:
  • tensor

  • axis

Returns:

reversed tensor, same dims

returnn.frontend.array_.where(cond: Tensor | int | float | complex | number | ndarray | bool | str, true_: Tensor | int | float | complex | number | ndarray | bool | str, false_: Tensor | int | float | complex | number | ndarray | bool | str, *, allow_broadcast_all_sources: bool = False) Tensor[source]

Wraps tf.where, which is SwitchLayer in RETURNN.

Returns:

true_ if cond else false_, elemwise.
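
A minimal sketch of an elementwise select (same assumed setup as above):

    import numpy as np
    import returnn.frontend as rf
    from returnn.tensor import Dim

    time = Dim(4, name="time")

    cond = rf.convert_to_tensor(np.array([True, False, True, False]), dims=[time])
    x = rf.zeros([time])
    y = rf.ones([time])
    z = rf.where(cond, x, y)
    # z[t] == x[t] where cond[t] is True, else y[t]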

returnn.frontend.array_.sparse_to_dense(labels: Tensor | int | float | complex | number | ndarray | bool | str, *, label_value: Tensor | int | float | complex | number | ndarray | bool | str, other_value: Tensor | int | float | complex | number | ndarray | bool | str, axis: Dim | None = None) Tensor[source]

Converts a sparse tensor to a dense one.

This is a more generic variant of one_hot().

Note that usually this is not needed as most other functions should handle sparse tensors just fine and much more efficiently than they would be with dense tensors.

returnn.frontend.array_.one_hot(source: Tensor) Tensor[source]

One-hot encoding; a special case of sparse_to_dense().

Note that usually this is not needed as most other functions should handle sparse tensors just fine and much more efficiently than they would be with dense tensors.
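
A minimal sketch of one-hot encoding a sparse label tensor (same assumed setup as above):

    import numpy as np
    import returnn.frontend as rf
    from returnn.tensor import Dim

    time = Dim(3, name="time")
    classes = Dim(5, name="classes")

    labels = rf.convert_to_tensor(
        np.array([0, 3, 4], dtype="int32"), dims=[time], sparse_dim=classes)
    dense = rf.one_hot(labels)
    # dense has dims {time, classes}, with 1.0 at each label position and 0.0 elsewhere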