returnn.frontend._backend

Backends for the frontend API

class returnn.frontend._backend.Backend[source]

Abstract base class for the backend, operating on tensor type T, i.e. Tensor[T].

This class and instances do not have any state, and all functions are staticmethod (or classmethod).

name: str | None = None[source]
RawTensorType: Type[T][source]
is_tensorflow: bool = False[source]
is_backend_raw_tensor_dim_tag_independent: bool = True[source]
static executing_eagerly() bool[source]
Returns:

whether we are in eager execution mode

static get_tensor_dependencies(x: Tensor) Sequence[Tensor][source]
Parameters:

x – tensor

Returns:

list of all tensors which are inputs to x, ancestor tensors, dependencies. E.g. tf.Tensor.op.inputs(). This mostly makes sense for graph-based frameworks but eager-based frameworks might have this too with enabled gradient tape, as they should know the inputs.

static get_tensor_consumers(x: Tensor) Sequence[Tensor][source]
Parameters:

x – tensor

Returns:

list of all tensors depending on x, descendant tensors, used by. E.g. tf.Tensor.consumers(). This mostly makes sense for graph-based frameworks but eager-based frameworks might have this too with enabled gradient tape, as they should know the consumers.

static cond(pred: Tensor, true_fn: Callable, false_fn: Callable)[source]

cond: conditional execution.

Note that this does not need an implementation for eager-based frameworks (executing_eagerly() returns True), as the returnn.frontend.cond() function already covers that case.
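A minimal usage sketch via the frontend wrapper returnn.frontend.cond() mentioned above; the tensors x, y and the scalar bool pred are assumptions:

    import returnn.frontend as rf

    # run (or trace) one of the two branches depending on the scalar bool `pred`
    out = rf.cond(
        pred,                    # scalar bool Tensor
        true_fn=lambda: x + y,   # used if pred is True
        false_fn=lambda: x - y,  # used if pred is False
    )

In eager mode, rf.cond() simply calls the selected branch directly; only graph-based backends need this Backend.cond implementation.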

static while_loop(cond: Callable[[S], bool | Tensor], body: Callable[[S], S], initial: S) S[source]

while loop
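A minimal sketch of the contract, assuming the frontend exposes a matching rf.while_loop wrapper and an integer scalar Tensor i0 as initial state:

    import returnn.frontend as rf

    # thread the state S (here a single scalar) through cond and body until cond is False
    final = rf.while_loop(
        cond=lambda i: i < 10,  # keep looping while this holds
        body=lambda i: i + 1,   # compute the next state from the current one
        initial=i0,             # initial state S
    )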

static set_random_seed(seed: int)[source]
Parameters:

seed

static get_random_state() Dict[str, bytes][source]
Returns:

random state

static set_random_state(state: Dict[str, bytes])[source]
Parameters:

state – as returned by get_random_state(). This might not always be successful (e.g. different hardware, different backend version), so the calling code should always have called set_random_seed before to have the random generators in a reasonable fallback state.

static get_dtype_name_raw(raw_tensor: T) str[source]
Returns:

dtype of raw tensor, as string

static as_dtype_raw(dtype_name: str) Any[source]
Parameters:

dtype_name – e.g. “float32”

Returns:

dtype object

static get_ndim_raw(raw_tensor: T) int[source]
Returns:

ndim of raw tensor. assumes it is known

static get_shape_raw(raw_tensor: T) T | Tuple[int | T][source]
Returns:

shape of raw tensor

static get_shape_tuple_raw(raw_tensor: T) Tuple[int | T][source]
Returns:

shape of raw tensor. assumes that ndim is known. In eager frameworks, all dims are int.

static get_known_shape_raw(raw_tensor: T) Tuple[int | None][source]
Returns:

shape of raw tensor, int for static known, None otherwise. assumes that ndim is known. This will not create any ops. In eager frameworks, all dims are known.

static set_known_shape_raw(raw_tensor: T, shape: Tuple[int | None]) None[source]

Sets the known shape of the raw tensor. This is only supported in graph-based frameworks, and just performs a check in eager frameworks.

static get_new_dim_raw(raw_tensor: T, axis: int, *, name: str) Dim[source]
Parameters:
  • raw_tensor

  • axis

  • name

Returns:

dim tag of axis

static get_device(x: Tensor) str | None[source]
Parameters:

x

Returns:

device, or None if unknown or the logic is not supported by the backend

static copy_to_device(x: Tensor, device: str | None) Tensor[source]
Parameters:
  • x – tensor

  • device – e.g. “cpu” or “gpu”

Returns:

tensor on device

static fill_raw(shape: Sequence[int | T] | T, value: Any | T) T[source]
Parameters:
  • shape – shape

  • value – scalar value to fill

Returns:

raw tensor filled with value everywhere

static compare_raw(a: T, kind: str, b: T) T[source]
Parameters:
  • a

  • kind – “equal”, “less”, “less_equal”, “greater”, “greater_equal”, “not_equal”

  • b

Returns:

a kind b

static combine_raw(a: T, kind: str, b: T) T[source]
Parameters:
  • a

  • kind – “add”, “sub”, “mul”, “truediv”, “floordiv”, “mod”, “pow”, “maximum”, “minimum”, “logical_and”, “logical_or”, “squared_difference”

  • b

Returns:

a kind b

static reshape_raw(raw_tensor: T, shape: Sequence[int | T] | T) T[source]
Parameters:
  • raw_tensor – raw tensor

  • shape – new shape

Returns:

reshaped raw tensor

classmethod squeeze_raw(raw_tensor: T, axes: Sequence[int]) T[source]
Parameters:
  • raw_tensor – raw tensor

  • axes – axes to squeeze

Returns:

squeezed raw tensor

static transpose_raw(raw_tensor: T, perm: Sequence[int]) T[source]
Parameters:
  • raw_tensor – raw tensor

  • perm – permutation

Returns:

transposed raw tensor

static make_output_tensor(tensor: Tensor, dims: Sequence[Dim], *, name: str) Tensor[source]
Parameters:
  • tensor

  • dims

  • name

Returns:

tensor with dims order like in dims

static expand_dims_raw(raw_tensor: T, axis: int) T[source]
Parameters:
  • raw_tensor

  • axis

Returns:

raw tensor with new axis

static expand_raw(raw_tensor: T, axis: int, dim: int | T) T[source]
Parameters:
  • raw_tensor

  • axis – shape[axis] must be 1

  • dim – the new dim for shape[axis]

Returns:

raw tensor where shape[axis] is expanded to dim. In PyTorch or other frameworks which support custom strides, this is an efficient view and not a copy.

static copy(tensor: Tensor) Tensor[source]
static cast_raw(raw_tensor: T, dtype: str) T[source]
Parameters:
  • raw_tensor

  • dtype – e.g. “float32”

Returns:

raw tensor with dtype casted

static cast(tensor: Tensor, dtype: str) Tensor[source]
Parameters:
  • tensor

  • dtype – e.g. “float32”

Returns:

tensor with dtype casted

static set_requires_gradient(tensor: Tensor)[source]
Parameters:

tensor

static gradient(y: Tensor, x: Tensor) Tensor[source]
Parameters:
  • y

  • x

Returns:

gradient of y w.r.t. x

static stop_gradient(tensor: Tensor) Tensor[source]
Parameters:

tensor

Returns:

tensor with stopped gradient

static scaled_gradient(tensor: Tensor, scale: float | Tensor) Tensor[source]
Parameters:
  • tensor

  • scale

Returns:

tensor with scaled gradient

static scaled_gradient_ext(x: Tensor, *, scale: float | Tensor = 1.0, shift: float | Tensor | None = None, scale_shift_by_sum_over_axis: Dim | None = None)[source]
Parameters:
  • x

  • scale – will scale gradient by this value

  • shift – will shift gradient by this value

  • scale_shift_by_sum_over_axis – if given, will scale and shift by the sum over the given axis

Returns:

just x, but gradient in backward pass will be transformed accordingly

static merge_dims(source: Tensor, *, dims: Sequence[Dim], out_dim: Dim | None = None) Tuple[Tensor, Dim][source]

Merges a list of axes into a single one. (Flatten the dims.) E.g. input is (batch, width, height, dim) and dims=(width, height), then we get (batch, width*height, dim). Or input is (batch, time, height, dim) and dims=(height, dim), then we get (batch, time, height*dim).

Parameters:
  • source

  • dims

  • out_dim

Returns:

tensor, out_dim
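A small sketch of the shape behavior described above, using the frontend wrapper rf.merge_dims; the Dims and the input tensor x are assumptions:

    import returnn.frontend as rf

    # x has dims [batch_dim, width_dim, height_dim, feat_dim]
    merged, merged_dim = rf.merge_dims(x, dims=(width_dim, height_dim))
    # merged has dims [batch_dim, merged_dim, feat_dim], where merged_dim covers width*height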

static split_dims(source: Tensor, *, axis: Dim, dims: Sequence[Dim], pad_to_multiples: bool | None = None, pad_value: None | int | float = None) Tensor[source]
Parameters:
  • source

  • axis

  • dims

  • pad_to_multiples

  • pad_value

Returns:

source with axis replaced by dims

static reshape(source: Tensor, in_dims: Sequence[Dim], out_dims: Sequence[Dim]) Tensor[source]
Parameters:
  • source – e.g. (…, old_dims, …)

  • in_dims – the old dims which should be reshaped into new_dims. This should only cover those dims which should be reshaped, not all the dims of the source.

  • out_dims – the new dims which should be reshaped from old_dims. This is excluding any of the other dims in the source.

Returns:

e.g. (…, new_dims, …)

static split(source: Tensor, *, axis: Dim, out_dims: Sequence[Dim]) Tuple[Tensor, ...][source]

Split the input on the specified axis. Basically a wrapper around tf.split (or the backend's equivalent).

Parameters:
  • source – {…, axis}

  • axis – some static axis

  • out_dims – list of dims where sum(out_dims) == axis

Returns:

tuple of tensors, same amount as out_dims, with the same shape as source, but with the specified axis replaced by the out_dims
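A small sketch, assuming a static in_dim of size 6 and two target Dims part1 (size 2) and part2 (size 4), via the frontend wrapper rf.split:

    import returnn.frontend as rf

    # x has some dims [..., in_dim]; the sizes of out_dims must sum to in_dim (2 + 4 == 6)
    a, b = rf.split(x, axis=in_dim, out_dims=(part1, part2))
    # a has in_dim replaced by part1, b has in_dim replaced by part2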

static expand_dim(source: Tensor, dim: Dim) Tensor[source]
Parameters:
  • source

  • dim

Returns:

source with dim added

static squeeze(source: Tensor, axis: Dim) Tensor[source]
Parameters:
  • source

  • axis

Returns:

source with axis removed

static concat(*sources: Tuple[Tensor, Dim], allow_broadcast: bool = False, out_dim: Dim) Tensor[source]
static pad(source: Tensor, *, axes: Sequence[Dim], padding: Sequence[Tuple[Dim | int, Dim | int]], out_dims: Sequence[Dim], handle_dynamic_dims: bool, mode: str = 'constant', value: int | float | complex | number | ndarray | bool | str | Tensor | None = None) Tensor[source]
Parameters:
  • source

  • axes

  • padding

  • out_dims

  • handle_dynamic_dims

  • mode

  • value

Returns:

padded tensor

static cum_concat_step(source: Tensor, *, prev_accum: Tensor, axis: Dim, out_spatial_dim: Dim) Tensor[source]

Concatenates all previous frames over a time-axis. See RETURNN CumConcatLayer for details.

Parameters:
  • source – same dims as prev_accum except for the accum axis

  • prev_accum – previous accumulated tensor, shape {…, axis}

  • axis – the axis to accumulate over

  • out_spatial_dim – the spatial dim of the output will be this dim. like axis+1.

Returns:

accumulated tensor of shape {…, out_spatial_dim}, i.e. the same as prev_accum with axis replaced by out_spatial_dim.

static activation(tensor: Tensor, func: str) Tensor[source]
Parameters:
  • tensor

  • func – “tanh”, “sigmoid”, “relu”, …

Returns:

tensor with elementwise activation applied

static activation_raw(raw_tensor: T, func: str) T[source]
Parameters:
  • raw_tensor

  • func – “tanh”, “sigmoid”, “relu”, …

Returns:

raw tensor with elementwise activation applied

static safe_log(tensor: Tensor, *, eps: float) Tensor[source]
Parameters:
  • tensor

  • eps

Returns:

log(tensor + eps) in the default case. However, some backends might do more, e.g. if tensor = softmax(logits), this could compute log_softmax(logits) instead.

static softmax(tensor: Tensor, *, axis: Dim, use_mask: bool = True) Tensor[source]
Parameters:
  • tensor

  • axis

  • use_mask

Returns:

softmax over axis

static log_softmax(tensor: Tensor, *, axis: Dim, use_mask: bool = True) Tensor[source]
Parameters:
  • tensor

  • axis

  • use_mask

Returns:

log_softmax over axis

static softmax_cross_entropy_with_logits(*, logits: Tensor, targets: Tensor, axis: Dim)[source]

Efficient cross entropy.

Parameters:
  • logits – target estimates given as inputs to softmax (i.e. unnormalized)

  • targets – probabilities, i.e. normalized, can also be sparse

  • axis – class labels dim over which softmax is computed

Returns:

cross entropy (same Dims as ‘logits’ but without ‘axis’)
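For dense (normalized) targets, the result corresponds to the following reference computation; this sketch assumes frontend wrappers rf.log_softmax and rf.reduce_sum, elementwise Tensor multiplication, and a classes_dim shared by logits and targets:

    import returnn.frontend as rf

    # reference: ce[b] = -sum_c targets[b, c] * log_softmax(logits)[b, c]
    ce_ref = -rf.reduce_sum(targets * rf.log_softmax(logits, axis=classes_dim), axis=classes_dim)
    # softmax_cross_entropy_with_logits(logits=logits, targets=targets, axis=classes_dim)
    # is expected to match this, just computed more efficiently and numerically more stably.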

static ctc_loss(*, logits: Tensor, targets: Tensor, input_spatial_dim: Dim, targets_spatial_dim: Dim, blank_index: int, max_approx: bool = False) Tensor[source]

Calculates the CTC loss.

static have_sequence_mask_raw() bool[source]
Returns:

whether we have a sequence_mask_raw implementation

static sequence_mask_raw(lengths: T, *, batch_major: bool = True) T[source]

Like tf.sequence_mask().

Parameters:
  • lengths – shape (batch,)

  • batch_major

Returns:

tensor mask of shape (batch,maxlen) if batch_major else (maxlen,batch) of type bool

static name_scope_raw(name: str) Any[source]

Default implementation for eager-based frameworks: Do nothing, tensors do not have a name.

Parameters:

name

Returns:

context manager

static control_dependencies_raw(dependencies: Sequence[Any]) Any[source]

Default implementation for eager-based frameworks: Do nothing, we expect that the dependencies are already executed.

Parameters:

dependencies – raw tensors or ops

Returns:

context manager

static identity_with_control_dependencies_raw(raw_tensor: T, dependencies: Sequence[Any]) T[source]

Default implementation for eager-based frameworks: Do nothing, we expect that the dependencies are already executed.

Parameters:
  • raw_tensor – raw tensor

  • dependencies – raw tensors or ops

Returns:

raw tensor

static create_placeholder_raw(tensor: Tensor) T[source]
Returns:

tf.placeholder in TF

This is really only for TensorFlow for the deprecated option auto_create_placeholders and should not be used in other backends, even in graph-based backends. Rather, the logic to create placeholders should be done elsewhere.

static create_parameter_raw(tensor: Parameter, *, device: str | None = None) T[source]
Returns:

parameter (by default trainable)

static set_parameter_initial_value(param: Parameter, value: None | Tensor | int | float | complex | number | ndarray | bool | str) None[source]
Parameters:
  • param – parameter

  • value – initial value

static set_parameter_trainable(param: Parameter, trainable: bool) None[source]
Parameters:
  • param – parameter

  • trainable – whether the parameter should be trainable

static parameter_assign(param: Parameter, value: Tensor, *, op: str = 'assign') None[source]
Parameters:
  • param – parameter

  • value – new value

  • op – “assign” or “add”

static parameter_assign_key(param: Parameter, key: int | float | complex | number | ndarray | bool | str | Tensor | slice | Sequence[int | float | complex | number | ndarray | bool | str | Tensor | slice], value: Tensor, *, op: str = 'assign', axis: Dim | Sequence[Dim] | None = None, key_dim: None | Dim | Sequence[None | Dim] = None) None[source]
Parameters:
  • param – parameter

  • key – optional key for slice assign, like var[key] = value or var[key] += value.

  • value – new value

  • op – “assign” or “add”

  • axis – if key is given, this axis is used. If key consists of indices (without a specified sparse_dim), axis must be specified.

  • key_dim – resulting dim after slicing with key

static parameter_move_to(param: Parameter, *, device: str | None = None, dtype: str | None = None)[source]

Updates param inplace, but param.raw_tensor might be a new instance.

Parameters:
  • param

  • device

  • dtype

static runtime_sanity_checks(tensor: Tensor) Any[source]

Checks whether the tensor.raw_tensor is consistent with the tensor metadata.

In graph-based frameworks (TF graph), we return some operation here. In eager frameworks, we would not return anything but instead directly perform the checks.

static is_valid_in_current_graph(tensor: Tensor) bool[source]
Returns:

whether the raw tensor is valid in the current graph. In eager-mode frameworks, this is always true – there is no graph.

static format_graph_output(raw_tensor: T, *, max_depth: int | None = None) str[source]
Returns:

the computation graph leading to this tensor formatted. In eager-mode frameworks, this is not supported and returns None.

static convert_to_tensor(value: Tensor | T | int | float | complex | number | ndarray | bool | str, *, dims: Sequence[Dim], dtype: str, sparse_dim: Dim | None = None, device: str | None = None, name: str | None = None) Tensor[T][source]
Parameters:
  • value – tensor, or scalar raw tensor or some other scalar value

  • dims

  • dtype

  • sparse_dim

  • device

  • name

Returns:

tensor

static full(dims: Sequence[Dim], fill_value: int | float | complex | number | ndarray | bool | str | Tensor, *, dtype: str, device: str | None = None, sparse_dim: Dim | None = None, feature_dim: Dim | None = None) Tensor[source]

https://data-apis.org/array-api/latest/API_specification/generated/array_api.full.html

Parameters:
  • dims

  • fill_value

  • dtype

  • device

  • sparse_dim

  • feature_dim

Returns:

tensor
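A minimal usage sketch via a frontend wrapper; rf.full and the Dim names used here are assumptions:

    import returnn.frontend as rf

    # float32 tensor with dims [batch_dim, feat_dim], filled with 1.0 everywhere
    ones = rf.full(dims=[batch_dim, feat_dim], fill_value=1.0, dtype="float32")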

classmethod compare(a: Tensor | int | float | complex | number | ndarray | bool | str, kind: str, b: Tensor | int | float | complex | number | ndarray | bool | str, *, allow_broadcast_all_sources: bool | None = None, dim_order: Sequence[Dim] | None = None) Tensor[source]

compare, default implementation using compare_raw

classmethod combine(a: Tensor | int | float | complex | number | ndarray | bool | str, kind: str, b: Tensor | int | float | complex | number | ndarray | bool | str, *, allow_broadcast_all_sources: bool | None = None, dim_order: Sequence[Dim] | None = None) Tensor[source]

combine, default implementation using combine_raw

static gather(source: Tensor, *, indices: Tensor | int, axis: Dim, clip_to_valid: bool = False) Tensor[source]

Gathers slices on a specified axis from the source using indices. If the source is of the shape [B,D,F1], and indices of shape [B,F2], this will yield output of the shape [B,F2,F1] where

output[b,f2,f1] = source[b,indices[b,f2],f1]

(if D is the axis to gather from). In general, all shared axes of the source and the indices will be treated as batch axes.

The indices argument can also be an int. In this case, this simply gives source[indices] on the specified axis.

Parameters:
  • source

  • indices – indices used to select the slices of the source from. If another tensor, must be of type int32 or int64. Can also specify a constant int.

  • axis – the axis to gather over, i.e. the axis of the source which the indices refer to

  • clip_to_valid – if True, the indices will be clipped to the valid range of the input, also taking seq lengths into account.

Returns:

gathered values
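A sketch matching the [B,D,F1] / [B,F2] example above, using the frontend wrapper rf.gather; the Dims and tensors are assumptions:

    import returnn.frontend as rf

    # source has dims [batch_dim, d_dim, f1_dim]; idx is int32 with dims [batch_dim, f2_dim],
    # and its values index into d_dim
    out = rf.gather(source, indices=idx, axis=d_dim, clip_to_valid=True)
    # out has dims {batch_dim, f2_dim, f1_dim}: d_dim is replaced by the dims of idx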

static scatter(source: Tensor, *, indices: Tensor, indices_dim: Dim | Sequence[Dim], out_dim: Dim | Sequence[Dim]) Tensor[source]

Scatters into new zero-tensor. If entries in indices are duplicated, the corresponding values in source will be added together (scatter_add in PyTorch). (TF segment_sum can be implemented via this.)

Parameters:
  • source – [batch_dims…, indices_dim(s)…, feature_dims…]

  • indices – [batch_dims…, indices_dim(s)…] -> out_dim

  • indices_dim

  • out_dim

Returns:

[batch_dims…, out_dim, feature_dims…]
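A sketch of the shape contract, assuming a frontend wrapper rf.scatter with the same arguments; duplicate entries in indices are summed:

    import returnn.frontend as rf

    # values: [batch_dim, idx_dim, feat_dim]; indices: [batch_dim, idx_dim] with values in range(out_dim)
    out = rf.scatter(values, indices=indices, indices_dim=idx_dim, out_dim=out_dim)
    # out: [batch_dim, out_dim, feat_dim]; positions hit multiple times accumulate (scatter-add)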

static slice(source: Tensor, *, axis: Dim, start: int | Tensor | None = None, end: int | Tensor | None = None, step: int | Tensor | None = None, size: int | Tensor | Dim | None = None, out_dim: Dim) Tensor[source]
static where(cond: Tensor, true_: Tensor | int | float | complex | number | ndarray | bool | str, false_: Tensor | int | float | complex | number | ndarray | bool | str, *, allow_broadcast_all_sources: bool = False) Tensor[source]
static clip_by_value(x: Tensor, clip_value_min: Tensor | int | float | complex | number | ndarray | bool | str, clip_value_max: Tensor | int | float | complex | number | ndarray | bool | str, *, allow_broadcast_all_sources: bool = False) Tensor[source]

clip by value

static matmul(a: Tensor[T], b: Tensor[T], *, reduce: Dim | Sequence[Dim], use_mask: bool = True) Tensor[T][source]

This performs a batched matmul of two sources a and b (non-batched matmul and dot product are special cases). The underlying operation is a batched matmul (shared…, I, J) * (shared…, J, K) -> (shared…, I, K). The inputs a and b are transformed internally into the required shapes in the following way: The axis J is specified via the Dim given as ‘reduce’. If multiple reduce Dims are given, the corresponding axes are merged into one before the matmul via a reshape. All other matching Dims in a and b will be treated as batch dimensions (‘shared…’). Dims unique to a and b define the axes I and K, respectively. (Multiple or no unique axes in a and b are supported too.)

Depending on which Dims exist in a, b and reduce this dot operation can be used to compute scaling, scalar product, outer product, matrix-vector multiplication, matrix-matrix multiplication etc. (all possibly batched).

Parameters:
  • a

  • b

  • reduce – Dims over which to perform the product, have to be present in both a and b

  • use_mask – If the reduction is over dynamic axes, we need to apply masking to one of the inputs to get the correct sum reduction. This is done automatically unless this flag is disabled.

Returns:

result of dot product, Dim order: common axes as sorted in a, unique axes of a (in order), unique axes of b (in order)
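A small sketch of a batched linear projection, using the frontend wrapper rf.matmul; the Dims and tensors are assumptions:

    import returnn.frontend as rf

    # x: [batch_dim, in_dim], w: [in_dim, out_dim]; reduce over in_dim (the J axis above)
    y = rf.matmul(x, w, reduce=in_dim)
    # unique dim of x is batch_dim (I), unique dim of w is out_dim (K) -> y: [batch_dim, out_dim]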

static range_over_dim(dim: Dim, *, dtype: str | None = None, device: str | None = None) Tensor[T][source]
Parameters:
  • dim

  • dtype

  • device

Returns:

tensor with shape [dim]

static replace_dim(source: Tensor, *, in_dim: Dim, out_dim: Dim) Tensor[source]
Parameters:
  • source

  • in_dim

  • out_dim

Returns:

source with in_dim replaced by out_dim.

static reduce(source: Tensor[T], *, mode: str, axis: Dim | Sequence[Dim], use_mask: bool = True) Tensor[T][source]

Reduce the tensor along the given axis

Parameters:
  • source

  • mode – “sum”, “max”, “min”, “mean”, “logsumexp”, “any”, “all”, “argmin”, “argmax”

  • axis

  • use_mask – if True (default), use the time mask (part of dim tag) to ignore padding frames

Returns:

tensor with axis removed
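A small sketch, assuming a frontend wrapper rf.reduce with the same arguments and a dynamic time_dim:

    import returnn.frontend as rf

    # mean over the (dynamic) time axis; padded frames are ignored via the seq mask
    mean = rf.reduce(x, mode="mean", axis=time_dim, use_mask=True)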

static top_k(source: Tensor, *, axis: Dim | Sequence[Dim], k: int | Tensor, k_dim: Dim | None = None, sorted: bool = True) Tuple[Tensor, Tensor | Sequence[Tensor], Dim][source]

top_k. see top_k()

static random(*, dims: Sequence[Dim], dtype: str, device: str | None = None, sparse_dim: Dim | None = None, feature_dim: Dim | None = None, distribution: str, mean: int | float | Tensor | None = None, stddev: int | float | Tensor | None = None, bound: int | float | Tensor | None = None, minval: int | float | Tensor | None = None, maxval: int | float | Tensor | None = None, seed: int | Sequence[int] | ndarray | None = None, algorithm: str | None = None, explicit_state: Tensor | None = None, auto_update_state: bool | None = None, static: bool | None = None, out: Tensor | None = None) Tensor[source]

random. See rf.random for details.

static masked_select(tensor: Tensor, *, mask: Tensor, dims: Sequence[Dim], out_dim: Dim | None = None) Tuple[Tensor, Dim][source]
Parameters:
  • tensor

  • mask

  • dims – the order of the dims defines the format. those dims should be exactly the dims of the mask.

  • out_dim

Returns:

Tensor where all dims in mask/dims are removed and replaced by a new dim; the new dim is also returned. If mask==True for all elements, the returned tensor is simply the flattened input tensor.

static masked_scatter(source: Tensor, *, mask: Tensor, dims: Sequence[Dim], in_dim: Dim) Tensor[source]

The inverse of masked_select().

Parameters:
  • source – [in_dim, F…]

  • mask – [dims…] -> bool (e.g. [B,T])

  • dims – the order of the dims defines the format. those dims should be exactly the dims of the mask.

  • in_dim – the dim of the source which should be scattered into the mask.

Returns:

[dims…, F…]
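A sketch showing masked_select and masked_scatter as inverses, assuming frontend wrappers rf.masked_select / rf.masked_scatter with the same arguments and a bool seq mask over [batch_dim, time_dim]:

    import returnn.frontend as rf

    # x: [batch_dim, time_dim, feat_dim]; mask: [batch_dim, time_dim] -> bool
    packed, packed_dim = rf.masked_select(x, mask=mask, dims=[batch_dim, time_dim])
    # packed: [packed_dim, feat_dim], only the frames where mask is True

    restored = rf.masked_scatter(packed, mask=mask, dims=[batch_dim, time_dim], in_dim=packed_dim)
    # restored: [batch_dim, time_dim, feat_dim]; the values from packed are placed where mask is True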

static batch_norm(source: Tensor, *, in_dim: Dim | Sequence[Dim], running_mean: Tensor, running_variance: Tensor, gamma: Tensor | None, beta: Tensor | None, epsilon: float, momentum: float, affine: bool, use_mask: bool) Tensor[source]
Parameters:
  • source

  • in_dim

  • running_mean

  • running_variance

  • gamma

  • beta

  • epsilon

  • momentum

  • affine

  • use_mask

Returns:

batch-normalized tensor

static conv(source: Tensor, *, in_dim: Dim, out_dim: Dim, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None, filter: Tensor, filter_size: Sequence[Dim], padding: str, strides: int | Sequence[int] | None = None, dilation_rate: int | Sequence[int] | None = None, groups: int | None = None, bias: Tensor | None = None) Tuple[Tensor, Sequence[Dim]][source]

convolution

static transposed_conv(source: Tensor, *, in_dim: Dim, out_dim: Dim, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None, filter: Tensor, filter_size: Sequence[Dim], padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, bias: Tensor | None = None) Tuple[Tensor, Sequence[Dim]][source]

transposed convolution

static pool(source: Tensor, *, mode: str, pool_size: Sequence[int], padding: str = 'valid', dilation_rate: Sequence[int] | int = 1, strides: Sequence[int], in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None) Tuple[Tensor, Sequence[Dim]][source]

pooling

static stft(x: Tensor, *, in_spatial_dim: Dim, frame_step: int, frame_length: int, fft_length: int, window_use_frame_length: bool = True, align_window_left: bool = True, window_enforce_even: bool = True, out_spatial_dim: Dim, out_dim: Dim) Tensor[source]

stft. see stft() for details.

static lstm(source: Tensor, *, state_h: Tensor, state_c: Tensor, ff_weight: Tensor, rec_weight: Tensor, bias: Tensor, spatial_dim: Dim, in_dim: Dim, out_dim: Dim) Tuple[Tensor, Tuple[Tensor, Tensor]][source]

Functional LSTM.

Parameters:
  • source – Tensor of shape [*, in_dim].

  • state_c

  • state_h

  • ff_weight – Parameters for the weights of the feed-forward part.

  • rec_weight – Parameters for the weights of the recurrent part.

  • bias – Parameters for the bias.

  • spatial_dim – Dimension in which the LSTM operates.

  • in_dim

  • out_dim

Returns:

output, (state_h, state_c)

TensorArrayType[source]

alias of List[Tensor]

classmethod tensor_array_create() TensorArrayType[source]
Returns:

empty TensorArray

static tensor_array_unstack(tensor: Tensor, *, axis: Dim) TensorArrayType[source]
Parameters:
  • tensor

  • axis

Returns:

list of tensors

static tensor_array_stack(tensor_array: TensorArrayType, *, axis: Dim, tensor_template: Tensor) Tensor[source]
Parameters:
  • tensor_array

  • axis

  • tensor_template – per element shape, excluding axis

Returns:

tensor

classmethod tensor_array_push_back(tensor_array: TensorArrayType, value: Tensor) TensorArrayType[source]
Parameters:
  • tensor_array

  • value

Returns:

tensor_array

classmethod tensor_array_get_item(tensor_array: TensorArrayType, index: int | Tensor) Tensor[source]
Parameters:
  • tensor_array

  • index

Returns:

tensor
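A sketch of the TensorArray primitives, e.g. to accumulate per-step outputs in a loop; the list steps, the Dim time_dim, and obtaining the concrete backend via get_backend_by_tensor() (documented below) are assumptions:

    from returnn.frontend._backend import get_backend_by_tensor

    backend = get_backend_by_tensor(steps[0])       # concrete Backend class for these tensors
    ta = backend.tensor_array_create()              # empty TensorArray
    for x_t in steps:                               # each x_t: Tensor without the time axis
        ta = backend.tensor_array_push_back(ta, x_t)
    stacked = backend.tensor_array_stack(ta, axis=time_dim, tensor_template=steps[0])
    # stacked: per-element shape of steps[0], plus the new time_dim axis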

returnn.frontend._backend.select_backend(name: str)[source]

Select backend by name.

Parameters:

name – “torch”, “tf”, “returnn_layers_tf”, “numpy”
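A minimal usage sketch; selecting “torch” by name should be equivalent to calling select_backend_torch() below:

    from returnn.frontend._backend import select_backend, get_selected_backend

    select_backend("torch")
    assert get_selected_backend() == "torch"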

returnn.frontend._backend.get_selected_backend() str | None[source]
Returns:

the selected backend name, or None if not selected

returnn.frontend._backend.is_executing_eagerly() bool[source]
Returns:

whether the current selected backend is executing eagerly

returnn.frontend._backend.select_backend_tf()[source]

Selects the RETURNN layers backend (based on TF).

returnn.frontend._backend.select_backend_returnn_layers_tf()[source]

Selects the RETURNN layers backend (based on TF).

returnn.frontend._backend.select_backend_torch()[source]

Selects the PyTorch (low-level) backend.

returnn.frontend._backend.get_backend_by_tensor(tensor: Tensor, *, fallback: T2 | None = None) Type[Backend[T]] | T2[source]
Parameters:
  • tensor

  • fallback

returnn.frontend._backend.get_backend_by_raw_tensor_type(tensor_type: Type[T]) Type[Backend[T]][source]
Parameters:

tensor_type

returnn.frontend._backend.register_backend_by_tensor_type(tensor_type: Type[T], backend: Type[Backend[T]])[source]
Parameters:
  • tensor_type

  • backend