returnn.frontend.reduce

Reduce

returnn.frontend.reduce.reduce(source: Tensor[T], *, mode: str, axis: Dim | Sequence[Dim], use_mask: bool = True) → Tensor[T]

Reduce the tensor along the given axis (or axes) using the given mode.

Parameters:
  • source

  • mode – "sum", "max", "min", "mean", "logsumexp", "any", "all", "argmin", "argmax"

  • axis

  • use_mask – if True (default), use the time mask (part of dim tag) to ignore padding frames

Returns:

tensor with axis removed
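
For illustration, a minimal sketch of a masked reduction over a dynamic time axis (x and time_dim are assumptions here, e.g. coming from the network input):

import returnn.frontend as rf

# Assumed: x is a Tensor with dims [batch, time, feature], where time is
# dynamic (padded per sequence) and time_dim is its Dim.
# With use_mask=True (the default), padding frames are ignored.
y = rf.reduce(x, mode="sum", axis=time_dim)  # dims: [batch, feature]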

returnn.frontend.reduce.reduce_sum(source: Tensor[T], *, axis: Dim | Sequence[Dim], use_mask: bool = True) → Tensor[T]

Reduce the tensor along the given axis by summation.

Parameters:
  • source

  • axis

  • use_mask – if True (default), use the time mask (part of dim tag) to ignore padding frames

Returns:

tensor with axis removed
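
As a sketch, reduce_sum is the convenience form of reduce() with mode="sum" (same assumed x and time_dim as above):

y1 = rf.reduce_sum(x, axis=time_dim)
y2 = rf.reduce(x, mode="sum", axis=time_dim)
# y1 and y2 are equivalent: the time axis is removed, and padding
# frames do not contribute to the sum.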

returnn.frontend.reduce.reduce_max(source: Tensor[T], *, axis: Dim | Sequence[Dim], use_mask: bool = True) → Tensor[T]

Reduce the tensor along the given axis by taking the maximum.

Parameters:
  • source

  • axis

  • use_mask – if True (default), use the time mask (part of dim tag) to ignore padding frames

Returns:

tensor with axis removed

returnn.frontend.reduce.reduce_min(source: Tensor[T], *, axis: Dim | Sequence[Dim], use_mask: bool = True) → Tensor[T]

Reduce the tensor along the given axis by taking the minimum.

Parameters:
  • source

  • axis

  • use_mask – if True (default), use the time mask (part of dim tag) to ignore padding frames

Returns:

tensor with axis removed

returnn.frontend.reduce.reduce_mean(source: Tensor[T], *, axis: Dim | Sequence[Dim], use_mask: bool = True) → Tensor[T]

Reduce the tensor along the given axis by taking the mean.

Parameters:
  • source

  • axis

  • use_mask – if True (default), use the time mask (part of dim tag) to ignore padding frames

Returns:

tensor with axis removed
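
Masking matters most for the mean: with use_mask=True, each sequence is averaged over its own length rather than the padded length. A sketch with the same assumed x and time_dim:

# Each sequence is summed over its valid frames and divided by its own
# sequence length, not by the padded maximum length.
mean_over_time = rf.reduce_mean(x, axis=time_dim)  # dims: [batch, feature]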

returnn.frontend.reduce.reduce_logsumexp(source: Tensor[T], *, axis: Dim | Sequence[Dim], use_mask: bool = True) → Tensor[T]

Reduce the tensor along the given axis via log-sum-exp, i.e. log(sum(exp(source))).

Parameters:
  • source

  • axis

  • use_mask – if True (default), use the time mask (part of dim tag) to ignore padding frames

Returns:

tensor with axis removed
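
log-sum-exp is computed in a numerically stable way and is useful e.g. as the log-partition term when normalizing log-scores. A sketch, where logits and classes_dim are assumptions for illustration:

# Normalize unnormalized log-scores without leaving log space:
# log_probs = logits - log(sum over classes of exp(logits)).
log_z = rf.reduce_logsumexp(logits, axis=classes_dim)  # classes axis removed
log_probs = logits - log_z  # log_z broadcasts back over classes_dim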

returnn.frontend.reduce.reduce_logmeanexp(source: Tensor[T], *, axis: Dim | Sequence[Dim], use_mask: bool = True) → Tensor[T]

Reduce the tensor along the given axis via log-mean-exp, i.e. log(mean(exp(source))).

Parameters:
  • source

  • axis

  • use_mask – if True (default), use the time mask (part of dim tag) to ignore padding frames

Returns:

tensor with axis removed

returnn.frontend.reduce.reduce_any(source: Tensor[T], *, axis: Dim | Sequence[Dim], use_mask: bool = True) → Tensor[T]

Reduce the tensor along the given axis via logical OR (any).

Parameters:
  • source

  • axis

  • use_mask – if True (default), use the time mask (part of dim tag) to ignore padding frames

Returns:

tensor with axis removed

returnn.frontend.reduce.reduce_all(source: Tensor[T], *, axis: Dim | Sequence[Dim], use_mask: bool = True) → Tensor[T]

Reduce the tensor along the given axis via logical AND (all).

Parameters:
  • source

  • axis

  • use_mask – if True (default), use the time mask (part of dim tag) to ignore padding frames

Returns:

tensor with axis removed
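
A sketch covering both reduce_any and reduce_all on a boolean tensor (mask and time_dim are assumptions for illustration):

# mask: boolean Tensor with dims [batch, time], e.g. frames above a threshold.
has_hit = rf.reduce_any(mask, axis=time_dim)  # [batch]: True if any valid frame is set
all_hit = rf.reduce_all(mask, axis=time_dim)  # [batch]: True if all valid frames are set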

returnn.frontend.reduce.reduce_argmin(source: Tensor[T], *, axis: Dim | Sequence[Dim], use_mask: bool = True) → Tensor[T]

Reduce the tensor along the given axis, returning the indices of the minimum.

Parameters:
  • source

  • axis

  • use_mask – if True (default), use the time mask (part of dim tag) to ignore padding frames

Returns:

index tensor (indices into axis), with axis removed

returnn.frontend.reduce.reduce_argmax(source: Tensor[T], *, axis: Dim | Sequence[Dim], use_mask: bool = True) → Tensor[T]

Reduce the tensor along the given axis, returning the indices of the maximum.

Parameters:
  • source

  • axis

  • use_mask – if True (default), use the time mask (part of dim tag) to ignore padding frames

Returns:

index tensor (indices into axis), with axis removed
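
A sketch of a greedy per-frame decision (log_probs and classes_dim as assumed above):

# Indices of the best class per frame. The classes axis is removed;
# the result is an index tensor (indices into classes_dim).
best_class = rf.reduce_argmax(log_probs, axis=classes_dim)  # dims: [batch, time]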

returnn.frontend.reduce.reduce_out(source: Tensor, *, mode: str, num_pieces: int, out_dim: Dim | None = None) → Tensor

Combination of a split of the feature dim (like SplitDimsLayer) and a reduce over the split-off pieces (like ReduceLayer). This can e.g. be used to implement maxout; see the sketch below.

Parameters:
  • source

  • mode – "sum" or "max" or "mean"

  • num_pieces – how many elements to reduce. The output dimension will be input.dim // num_pieces.

  • out_dim

Returns:

out, with feature_dim set to new dim
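
A maxout sketch using reduce_out (h and its projection are assumptions for illustration):

# Assumed: h is the output of a projection with feature dim 2*d
# (e.g. via rf.Linear). Maxout takes the max over groups of 2 pieces.
y = rf.reduce_out(h, mode="max", num_pieces=2)
# y.feature_dim has dimension h.feature_dim.dimension // 2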

class returnn.frontend.reduce.RunningMean(in_dim: Dim | Sequence[Dim], *, alpha: float, dtype: str | None = None, is_prob_distribution: bool | None = None, update_only_in_train: bool = True)

Running mean, computed as an exponential moving average:

# E.g. for some input [B,T,F], reduce to [F], when the mean vector is [F].
new_value = reduce_mean(new_value, axis=[d for d in new_value.dims if d not in mean.dims])

new_mean = alpha * new_value + (1 - alpha) * old_mean
         = old_mean + alpha * (new_value - old_mean)  # more numerically stable

(Like the TF AccumulateMeanLayer; the running mean in BatchNorm is similar.)

Parameters:
  • in_dim – the dim of the mean vector, or the shape.

  • alpha – factor for the new value, also called momentum. 0.0 means no update, 1.0 means the mean is always set to the new value. Common values are 0.1 or smaller, e.g. 0.001.

  • dtype – the dtype of the mean vector

  • is_prob_distribution – if True, will initialize the mean vector with 1/in_dim.

  • update_only_in_train – if True (default), will only update the mean vector in training mode. False means it will always update.
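
A usage sketch (the call and return semantics are assumed here to follow other rf modules; feat_dim is an assumed Dim):

import returnn.frontend as rf

class MyModel(rf.Module):
    def __init__(self, feat_dim):
        super().__init__()
        self.stats = rf.RunningMean(feat_dim, alpha=0.1)

    def __call__(self, x):
        # x: [batch, time, feature]. Updates the running mean from x
        # (only in training mode, by default) and returns the current
        # mean with dims [feature]. Assumed behavior; check the source.
        return self.stats(x)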

returnn.frontend.reduce.top_k(source: Tensor, *, axis: Dim | Sequence[Dim], k: int | Tensor | None = None, k_dim: Dim | None = None, sorted: bool = True) → Tuple[Tensor, Tensor | Sequence[Tensor], Dim]

Basically wraps tf.nn.top_k. Returns the top_k values and the indices.

For an input of shape [B,D] with axis=D, the output values and indices both have shape [B,K].

It’s somewhat similar to reduce() with max and argmax. The axis dim is reduced and then a new dim for K is added.

Axis can also cover multiple axes, such as [beam,classes]. In that case, not a single indices tensor is returned, but a sequence of index tensors, one per axis, in the same order.

All other axes are treated as batch dims.

Parameters:
  • source

  • axis – the axis to do the top_k on, which is reduced, or a sequence of axes

  • k – the number of top entries to select (the "K" in "top-k")

  • k_dim – the new axis dim for K. If not provided, it will be created automatically.

  • sorted – if True (default), the K values are sorted in descending order along the new k_dim

Returns:

values, indices (sequence if axis is a sequence), k_dim
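
A sketch of top-k pruning over a class axis (log_probs and classes_dim as assumed earlier):

# Keep the 4 best classes per frame. The classes axis is replaced by
# the new k_dim of size 4; indices index into classes_dim.
values, indices, k_dim = rf.top_k(log_probs, axis=classes_dim, k=4)
# values: [batch, time, k_dim], sorted descending (sorted=True)
# indices: [batch, time, k_dim], entries indexing into classes_dim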