returnn.frontend.conv

Convolution, transposed convolution, pooling

returnn.frontend.conv.conv(source: Tensor, *, in_dim: Dim, out_dim: Dim, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None, filter: Tensor, filter_size: Sequence[Dim], padding: str, strides: int | Sequence[int] | None = None, dilation_rate: int | Sequence[int] | None = None, groups: int | None = None, bias: Tensor | None = None) → Tuple[Tensor, Sequence[Dim]]

Generic N-D convolution. Returns the output tensor and the output spatial dims.
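To make the roles of strides and dilation_rate concrete, here is a minimal pure-Python sketch of a single-channel 1D convolution with "valid" padding. It is not the RETURNN implementation: the function name conv1d_valid is hypothetical, it ignores in_dim, out_dim, groups and bias, and it assumes the usual deep-learning convention in which "convolution" means cross-correlation (the kernel is not flipped).

```python
from typing import List, Sequence


def conv1d_valid(
    source: Sequence[float],
    kernel: Sequence[float],
    stride: int = 1,
    dilation: int = 1,
) -> List[float]:
    """Single-channel 1D cross-correlation with "valid" padding (illustrative only)."""
    # Number of input positions spanned by the dilated kernel.
    span = (len(kernel) - 1) * dilation + 1
    out = []
    # Keep only the window positions where the whole dilated kernel fits.
    for start in range(0, len(source) - span + 1, stride):
        out.append(sum(w * source[start + i * dilation] for i, w in enumerate(kernel)))
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
edge_kernel = [1.0, 0.0, -1.0]
dense = conv1d_valid(signal, edge_kernel)
strided = conv1d_valid(signal, edge_kernel, stride=2)
dilated = conv1d_valid(signal, edge_kernel, dilation=2)
```

A larger stride skips window positions, while a larger dilation widens the window without adding weights.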

class returnn.frontend.conv.Conv1d(in_dim: Dim, out_dim: Dim, filter_size: int | Dim, *, padding: str, strides: int | None = None, dilation_rate: int | None = None, groups: int | None = None, with_bias: bool = True)

1D convolution

Parameters:
  • in_dim (Dim)

  • out_dim (Dim)

  • filter_size (int|Dim)

  • padding (str) – “same” or “valid”

  • strides (int|None) – stride for the spatial dim

  • dilation_rate (int|None) – dilation for the spatial dim

  • groups (int|None) – grouped convolution

  • with_bias (bool) – if True, will add a bias to the output features
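As a rough illustration of what groups and with_bias change, the sketch below counts trainable parameters of a 1D convolution. It assumes the common grouped-convolution convention (both in_dim and out_dim divisible by groups, each output feature seeing in_dim / groups input features); the helper name conv1d_param_count is hypothetical and not part of RETURNN.

```python
def conv1d_param_count(
    in_dim: int,
    out_dim: int,
    filter_size: int,
    groups: int = 1,
    with_bias: bool = True,
) -> int:
    """Parameter count of a 1D conv under the usual grouped-conv convention."""
    if in_dim % groups != 0 or out_dim % groups != 0:
        raise ValueError("in_dim and out_dim must both be divisible by groups")
    # Each output feature only connects to the input features of its own group.
    weights = out_dim * (in_dim // groups) * filter_size
    # One bias value per output feature, if enabled.
    bias = out_dim if with_bias else 0
    return weights + bias


full = conv1d_param_count(64, 128, 3)
grouped = conv1d_param_count(64, 128, 3, groups=4)
no_bias = conv1d_param_count(64, 128, 3, with_bias=False)
```

Under this convention, setting groups=4 cuts the weight count to a quarter while leaving the bias unchanged.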

nd: int | None = 1

class returnn.frontend.conv.Conv2d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim] | int | Dim, *, padding: str, strides: int | Sequence[int] | None = None, dilation_rate: int | Sequence[int] | None = None, groups: int | None = None, with_bias: bool = True)

2D convolution

Parameters:
  • in_dim (Dim)

  • out_dim (Dim)

  • filter_size (Sequence[int|Dim]|int|Dim) – (width,), (height, width) or (depth, height, width) for 1D/2D/3D conv. The number of spatial dims of the input must match.

  • padding (str) – “same” or “valid”

  • strides (int|Sequence[int]|None) – strides for the spatial dims: either a single int, or a sequence with the same length as filter_size.

  • dilation_rate (int|Sequence[int]|None) – dilation for the spatial dims

  • groups (int|None) – grouped convolution

  • with_bias (bool) – if True, will add a bias to the output features
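The "single int or one value per spatial dim" rule for strides and dilation_rate can be sketched as a small normalization helper. This is only an illustration: the name per_spatial_dim is hypothetical, and treating None as 1 is an assumption taken from the defaults shown in the make_conv_out_spatial_dims signature.

```python
from typing import Optional, Sequence, Tuple, Union


def per_spatial_dim(
    value: Optional[Union[int, Sequence[int]]],
    nd: int,
    default: int = 1,
) -> Tuple[int, ...]:
    """Expand an int (or None) to one value per spatial dim; validate sequences."""
    if value is None:
        # Assumption: an unset value behaves like `default` in every dim.
        return (default,) * nd
    if isinstance(value, int):
        # A single int applies to every spatial dim.
        return (value,) * nd
    value = tuple(value)
    if len(value) != nd:
        raise ValueError(f"expected {nd} values, got {len(value)}")
    return value


strides_2d = per_spatial_dim(2, nd=2)
dilation_2d = per_spatial_dim((1, 2), nd=2)
unset_3d = per_spatial_dim(None, nd=3)
```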

nd: int | None = 2

class returnn.frontend.conv.Conv3d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim] | int | Dim, *, padding: str, strides: int | Sequence[int] | None = None, dilation_rate: int | Sequence[int] | None = None, groups: int | None = None, with_bias: bool = True)

3D convolution

Parameters:
  • in_dim (Dim)

  • out_dim (Dim)

  • filter_size (Sequence[int|Dim]|int|Dim) – (width,), (height, width) or (depth, height, width) for 1D/2D/3D conv. The number of spatial dims of the input must match.

  • padding (str) – “same” or “valid”

  • strides (int|Sequence[int]|None) – strides for the spatial dims: either a single int, or a sequence with the same length as filter_size.

  • dilation_rate (int|Sequence[int]|None) – dilation for the spatial dims

  • groups (int|None) – grouped convolution

  • with_bias (bool) – if True, will add a bias to the output features

nd: int | None = 3

returnn.frontend.conv.transposed_conv(source: Tensor, *, in_dim: Dim, out_dim: Dim, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None, filter: Tensor, filter_size: Sequence[Dim], padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, bias: Tensor | None = None) → Tuple[Tensor, Sequence[Dim]]

Transposed convolution. Returns the output tensor and the output spatial dims.

class returnn.frontend.conv.TransposedConv1d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim], *, padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, with_bias: bool = True)

1D transposed convolution

Parameters:
  • in_dim (Dim)

  • out_dim (Dim)

  • filter_size (Sequence[int|Dim])

  • strides (Sequence[int]|None) – specifies the upscaling. Defaults to filter_size.

  • padding (str) – “same” or “valid”

  • remove_padding (Sequence[int]|int)

  • output_padding (Sequence[int|None]|int|None)

  • with_bias (bool) – whether to add a bias. Enabled by default.
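To illustrate how strides "specifies the upscaling", the sketch below computes the output length of a 1D transposed convolution. It uses the Keras-style length convention as an assumption (RETURNN may differ in details), deliberately does not model remove_padding or output_padding, and the function name transposed_conv_out_len is hypothetical. The only behavior taken from the parameter list above is that strides defaults to filter_size.

```python
from typing import Optional


def transposed_conv_out_len(
    in_len: int,
    filter_size: int,
    padding: str,
    stride: Optional[int] = None,
) -> int:
    """Output length of a 1D transposed conv (Keras-style convention, assumed)."""
    if stride is None:
        # Documented default: strides is the same as filter_size.
        stride = filter_size
    padding = padding.lower()
    if padding == "valid":
        # Equivalent to (in_len - 1) * stride + filter_size when filter_size >= stride.
        return in_len * stride + max(filter_size - stride, 0)
    if padding == "same":
        return in_len * stride
    raise ValueError(f"unsupported padding: {padding!r}")


default_stride = transposed_conv_out_len(10, 3, "valid")
overlapping = transposed_conv_out_len(10, 3, "valid", stride=2)
same_padded = transposed_conv_out_len(10, 3, "same", stride=2)
```

With the default stride the windows do not overlap, so the input is upscaled by exactly filter_size.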

nd: int | None = 1

class returnn.frontend.conv.TransposedConv2d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim], *, padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, with_bias: bool = True)

2D transposed convolution

Parameters:
  • in_dim (Dim)

  • out_dim (Dim)

  • filter_size (Sequence[int|Dim])

  • strides (Sequence[int]|None) – specifies the upscaling. Defaults to filter_size.

  • padding (str) – “same” or “valid”

  • remove_padding (Sequence[int]|int)

  • output_padding (Sequence[int|None]|int|None)

  • with_bias (bool) – whether to add a bias. Enabled by default.

nd: int | None = 2

class returnn.frontend.conv.TransposedConv3d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim], *, padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, with_bias: bool = True)

3D transposed convolution

Parameters:
  • in_dim (Dim)

  • out_dim (Dim)

  • filter_size (Sequence[int|Dim])

  • strides (Sequence[int]|None) – specifies the upscaling. Defaults to filter_size.

  • padding (str) – “same” or “valid”

  • remove_padding (Sequence[int]|int)

  • output_padding (Sequence[int|None]|int|None)

  • with_bias (bool) – whether to add a bias. Enabled by default.

nd: int | None = 3

returnn.frontend.conv.pool(source: Tensor, *, mode: str, pool_size: Sequence[int] | int, padding: str = 'valid', dilation_rate: Sequence[int] | int = 1, strides: int | Sequence[int] | None = None, in_spatial_dims: Sequence[Dim] | Dim, out_spatial_dims: Dim | Sequence[Dim] | None = None, nd: int | None = None) → Tuple[Tensor, Sequence[Dim]]

A generic N-D pooling layer. This would usually be done after a convolution for down-sampling.

Parameters:
  • source (Tensor)

  • nd (int|None) – number of spatial dims

  • mode (str) – “max” or “avg”

  • pool_size (Sequence[int]|int) – shape of the window for each reduction

  • padding (str) – “valid” or “same”

  • dilation_rate (Sequence[int]|int)

  • strides (Sequence[int]|int|None) – in contrast to tf.nn.pool, defaults to pool_size if None

  • in_spatial_dims (Sequence[Dim]|Dim)

  • out_spatial_dims (Sequence[Dim]|Dim|None)

Returns:

output tensor, out_spatial_dims
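The interplay of mode, pool_size and the default for strides can be sketched in plain Python for the 1D, "valid"-padding case. This is an illustration rather than the library code: pool1d_valid is a hypothetical name, it works on a flat list instead of a Tensor, and it ignores dilation_rate. The one behavior taken from the description above is that strides falls back to pool_size when it is None.

```python
from typing import List, Optional, Sequence


def pool1d_valid(
    source: Sequence[float],
    mode: str,
    pool_size: int,
    strides: Optional[int] = None,
) -> List[float]:
    """1D max/avg pooling over a flat sequence with "valid" padding (illustrative)."""
    if strides is None:
        # Documented default: non-overlapping windows of size pool_size.
        strides = pool_size
    if mode == "max":
        reduce_window = max
    elif mode == "avg":
        def reduce_window(window):
            return sum(window) / len(window)
    else:
        raise ValueError(f"unsupported mode: {mode!r}")
    # Only windows that fit completely inside the input are reduced.
    return [
        reduce_window(source[start:start + pool_size])
        for start in range(0, len(source) - pool_size + 1, strides)
    ]


values = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
max_pooled = pool1d_valid(values, "max", pool_size=2)
avg_pooled = pool1d_valid(values, "avg", pool_size=2)
overlapping = pool1d_valid(values, "max", pool_size=2, strides=1)
```

With the default strides the sequence length is reduced by a factor of pool_size, which is the usual down-sampling use after a convolution.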

returnn.frontend.conv.max_pool(source: Tensor, *, pool_size: Sequence[int] | int, padding: str = 'valid', dilation_rate: Sequence[int] | int = 1, strides: int | Sequence[int] | None = None, in_spatial_dims: Sequence[Dim] | Dim, out_spatial_dims: Dim | Sequence[Dim] | None = None) → Tuple[Tensor, Sequence[Dim]]

Max pooling. Takes the same arguments as pool, except mode and nd.

returnn.frontend.conv.max_pool1d(source: Tensor, *, pool_size: int, padding: str = 'valid', dilation_rate: int = 1, strides: int | None = None, in_spatial_dim: Dim, out_spatial_dim: Dim | None = None) → Tuple[Tensor, Dim]

1D max pooling. Takes the same arguments as pool1d, except mode.

returnn.frontend.conv.pool1d(source: Tensor, *, mode: str, pool_size: int, padding: str = 'valid', dilation_rate: int = 1, strides: int | None = None, in_spatial_dim: Dim, out_spatial_dim: Dim | None = None) → Tuple[Tensor, Dim]

1D pooling.

Parameters:
  • source (Tensor)

  • mode (str) – “max” or “avg”

  • pool_size (int) – size of the window for each reduction

  • padding (str) – “valid” or “same”

  • dilation_rate (int)

  • strides (int|None) – in contrast to tf.nn.pool, defaults to pool_size if None

  • in_spatial_dim (Dim)

  • out_spatial_dim (Dim|None)

Returns:

output tensor, out_spatial_dim

returnn.frontend.conv.pool2d(source: Tensor, *, mode: str, pool_size: Sequence[int] | int, padding: str = 'valid', dilation_rate: Sequence[int] | int = 1, strides: int | Sequence[int] | None = None, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None) → Tuple[Tensor, Sequence[Dim]]

2D pooling.

Parameters:
  • source (Tensor)

  • mode (str) – “max” or “avg”

  • pool_size (Sequence[int]|int) – shape of the window for each reduction

  • padding (str) – “valid” or “same”

  • dilation_rate (Sequence[int]|int)

  • strides (Sequence[int]|int|None) – in contrast to tf.nn.pool, defaults to pool_size if None

  • in_spatial_dims (Sequence[Dim])

  • out_spatial_dims (Sequence[Dim]|None)

Returns:

output tensor, out_spatial_dims

returnn.frontend.conv.pool3d(source: Tensor, *, mode: str, pool_size: Sequence[int] | int, padding: str = 'valid', dilation_rate: Sequence[int] | int = 1, strides: int | Sequence[int] | None = None, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None) → Tuple[Tensor, Sequence[Dim]]

3D pooling.

Parameters:
  • source (Tensor)

  • mode (str) – “max” or “avg”

  • pool_size (Sequence[int]|int) – shape of the window for each reduction

  • padding (str) – “valid” or “same”

  • dilation_rate (Sequence[int]|int)

  • strides (Sequence[int]|int|None) – in contrast to tf.nn.pool, defaults to pool_size if None

  • in_spatial_dims (Sequence[Dim])

  • out_spatial_dims (Sequence[Dim]|None)

Returns:

output tensor, out_spatial_dims

returnn.frontend.conv.make_conv_out_spatial_dims(in_spatial_dims: Sequence[Dim], *, filter_size: Sequence[int | Dim] | int | Dim, padding: str, strides: Sequence[int] | int = 1, dilation_rate: Sequence[int] | int = 1, description_prefix: str | None = None) → Sequence[Dim]

Create the output spatial dims from the input spatial dims, given the filter size, padding, strides and dilation rate.
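For a single spatial dim with a static size, the relation between input and output length can be sketched as below. The formulas are the TensorFlow-style "same"/"valid" conventions, used here as an assumption about what the resulting dims represent. The function name conv_out_len is hypothetical, and plain ints stand in for Dim objects, which in practice may be dynamic.

```python
def conv_out_len(
    in_len: int,
    filter_size: int,
    padding: str,
    stride: int = 1,
    dilation: int = 1,
) -> int:
    """Output length of a 1D conv under TF-style "same"/"valid" padding (assumed)."""
    padding = padding.lower()
    if padding == "same":
        # Input is padded so that only the stride shrinks the length: ceil(in_len / stride).
        return -(-in_len // stride)
    if padding == "valid":
        # Number of input positions spanned by the dilated filter.
        span = (filter_size - 1) * dilation + 1
        # ceil((in_len - span + 1) / stride), clamped at zero for too-short inputs.
        return max(0, -(-(in_len - span + 1) // stride))
    raise ValueError(f"unsupported padding: {padding!r}")


valid_len = conv_out_len(10, 3, "valid")
valid_strided = conv_out_len(10, 3, "valid", stride=2)
valid_dilated = conv_out_len(10, 3, "valid", dilation=2)
same_strided = conv_out_len(11, 3, "same", stride=2)
```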