Utilities for dimension tags, dimensions, axes.

returnn.frontend.dims.range_over_dim(dim: Dim, *, dtype: str | None = None, device: str | None = None) -> Tensor[T]
  • dim

  • dtype

  • device

Returns: tensor with shape [dim]

returnn.frontend.dims.range_over_dims(dims: Sequence[Dim], *, dtype: str | None = None, device: str | None = None) -> Tensor[T]

Use this when you want to index into a merged dim. Related: rf.merge_dims().

  • dims

  • dtype

  • device


Returns: tensor with shape [dim_0, …, dim_n] whose sparse_dim is merged_dim, where merged_dim = dim_0 * … * dim_n
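To illustrate the indexing, here is a minimal pure-Python sketch that does not use RETURNN. It assumes the merged dim is laid out in row-major order (the last dim varies fastest), which the text above does not state. `range_over_sizes` is a hypothetical helper name, and it takes static sizes rather than Dim objects.

```python
from itertools import product
from math import prod


def range_over_sizes(sizes):
    """Map each multi-index to its flat index in the merged dim.

    Hypothetical stand-in for illustration; assumes row-major order.
    """
    # Stride of axis k = product of the sizes of all later axes.
    strides = [prod(sizes[k + 1:]) for k in range(len(sizes))]
    return {
        idx: sum(i * s for i, s in zip(idx, strides))
        for idx in product(*(range(n) for n in sizes))
    }


flat = range_over_sizes([2, 3])
print(flat[(0, 0)], flat[(0, 2)], flat[(1, 0)], flat[(1, 2)])  # 0 2 3 5
```

Every value lies in the range [0, 2 * 3), which is what makes the result usable as an index into the merged dim.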

returnn.frontend.dims.replace_dim(source: Tensor, *, in_dim: Dim, out_dim: Dim | None = None) -> Tuple[Tensor, Dim]

Also see: rf.merge_dims(), rf.split_dims().

  • source

  • in_dim

  • out_dim


Returns: source with in_dim replaced by out_dim, together with the new out_dim. This does not work for the sparse_dim; see set_sparse_dim() for that case.
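The contract can be sketched with a toy model in plain Python. `ToyDim`, `ToyTensor` and `toy_replace_dim` are hypothetical names for illustration, not RETURNN classes. In particular, creating a fresh dim of the same size when out_dim is None is an assumption about the default behavior.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True, eq=False)  # eq=False: dims compare by identity
class ToyDim:
    name: str
    size: int


@dataclass(frozen=True)
class ToyTensor:
    dims: Tuple[ToyDim, ...]
    sparse_dim: Optional[ToyDim] = None


def toy_replace_dim(source, *, in_dim, out_dim=None):
    """Return (tensor with in_dim swapped for out_dim, out_dim)."""
    if in_dim not in source.dims:
        raise ValueError(f"{in_dim.name} is not a dim of the source tensor")
    if out_dim is None:
        # Assumption: a fresh dim of the same size is created.
        out_dim = ToyDim(in_dim.name + "_new", in_dim.size)
    new_dims = tuple(out_dim if d is in_dim else d for d in source.dims)
    # The sparse_dim is left untouched, matching the note above.
    return ToyTensor(new_dims, source.sparse_dim), out_dim


batch, feat = ToyDim("batch", 4), ToyDim("feat", 8)
x = ToyTensor((batch, feat))
y, new_feat = toy_replace_dim(x, in_dim=feat)
print([d.name for d in y.dims])  # ['batch', 'feat_new']
```

The source tensor is not modified; the sketch returns a new tensor object together with the dim that now takes the place of in_dim.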

returnn.frontend.dims.dim_match_priority_when_needed(dim: Dim, *other_dims: Dim) -> Dim

Returns: possibly a copy of dim with a higher match_priority, if that is needed to distinguish it from other_dims

Why or when is this needed?

For activation values, this should never be needed, and all dims should be unique.

In case of self-attention, the standard way is to create a separate distinct dim to perform the attention reduction over. See SelfAttention.

However, for weight matrices, it is not unusual to have the same dim for both the input and the output, i.e. a square weight matrix. When the reduction is performed in matmul(), we want to match the input feature dim to the dim in the weight matrix with the higher match priority.

So dim_match_priority_when_needed() would be applied on the input feature dim.
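A toy model in plain Python can make the square-matrix case concrete. `ToyDim`, `toy_match_priority_when_needed` and `reduce_axis` are hypothetical stand-ins, not RETURNN code. The rule "copy with match_priority 1 exactly when dim also occurs in other_dims" is an assumption about how the need is detected.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyDim:
    name: str
    size: int
    match_priority: int = 0

    def same_tag(self, other):
        # Two dims denote the same axis if they agree up to match_priority.
        return (self.name, self.size) == (other.name, other.size)


def toy_match_priority_when_needed(dim, *other_dims):
    # Assumption: a higher-priority copy is only needed on a clash.
    if any(dim.same_tag(o) for o in other_dims):
        return replace(dim, match_priority=1)
    return dim


def reduce_axis(weight_dims, in_dim):
    """Pick which weight axis a matmul over in_dim would reduce."""
    candidates = [i for i, d in enumerate(weight_dims) if d.same_tag(in_dim)]
    # Among equal tags, the axis with the highest match_priority wins.
    return max(candidates, key=lambda i: weight_dims[i].match_priority)


model = ToyDim("model", 512)
# Square weight matrix: input and output feature dim are the same dim.
weight_dims = (toy_match_priority_when_needed(model, model), model)
print([d.match_priority for d in weight_dims])  # [1, 0]
print(reduce_axis(weight_dims, model))  # 0
```

Without the raised priority, both axes of the square matrix would match the input feature dim equally well, and the reduction axis would be ambiguous.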

returnn.frontend.dims.num_elements_of_shape(dims: Sequence[Dim]) -> int | Tensor



  • dims

Returns: number of elements of a tensor of shape dims, properly accounting for masking
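What accounting for masking could mean is sketched below in plain Python. This is an illustration under assumptions, not the library's implementation. Each dim is represented either by an int (static size) or by a list with one length per batch entry (a hypothetical encoding of a dynamic dim), and padded positions are not counted. `toy_num_elements` is a hypothetical name.

```python
from math import prod


def toy_num_elements(batch_size, dims):
    """Count the unpadded elements of a tensor of shape [batch, *dims].

    Each entry of dims is an int (static size) or a list holding one
    length per batch entry (hypothetical encoding of a dynamic dim).
    """
    total = 0
    for b in range(batch_size):
        # For batch entry b, a dynamic dim contributes its own length.
        total += prod(d if isinstance(d, int) else d[b] for d in dims)
    return total


seq_lens = [5, 3, 2]  # three sequences, padded to length 5
print(toy_num_elements(3, [seq_lens]))     # 10, not 3 * 5 = 15
print(toy_num_elements(3, [seq_lens, 4]))  # 40
```

With only static dims the count reduces to a plain product of sizes, which is consistent with the `int | Tensor` return type: a dynamic dim makes the result depend on runtime sequence lengths.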