`returnn.frontend.conv`

Convolution, transposed convolution, pooling

returnn.frontend.conv.conv(source: Tensor, *, in_dim: Dim, out_dim: Dim, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None, filter: Tensor, filter_size: Sequence[Dim], padding: str, strides: int | Sequence[int] | None = None, dilation_rate: int | Sequence[int] | None = None, groups: int | None = None, bias: Tensor | None = None) → Tuple[Tensor, Sequence[Dim]]

Convolution.

class returnn.frontend.conv.Conv1d(in_dim: Dim, out_dim: Dim, filter_size: int | Dim, *, padding: str, strides: int | None = None, dilation_rate: int | None = None, groups: int | None = None, with_bias: bool = True)

1D convolution

Parameters:

in_dim (Dim)
out_dim (Dim)
filter_size (int|Dim)
padding (str) – “same” or “valid”
strides (int|None) – stride along the spatial dim, a single int
dilation_rate (int|None) – dilation along the spatial dim
groups (int|None) – number of groups for grouped convolution
with_bias (bool) – if True, will add a bias to the output features

nd: int | None = 1
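To make the roles of padding, strides and dilation_rate concrete, here is a minimal pure-Python sketch of a single-channel 1D convolution. It does not use RETURNN; the function name `conv1d_reference` is hypothetical. The "same" padding split (extra zero on the right) follows the common TensorFlow-style convention, which is an assumption here rather than something this page states.

```python
from typing import List


def conv1d_reference(
    x: List[float],
    w: List[float],
    *,
    padding: str = "valid",
    stride: int = 1,
    dilation: int = 1,
) -> List[float]:
    """Single-channel 1D convolution (cross-correlation, no filter flip)."""
    k = len(w)
    span = (k - 1) * dilation + 1  # effective filter width with dilation
    if padding == "same":
        out_len = -(-len(x) // stride)  # ceil(len(x) / stride)
        total_pad = max((out_len - 1) * stride + span - len(x), 0)
        left = total_pad // 2  # assumption: any odd remainder goes to the right
        x = [0.0] * left + list(x) + [0.0] * (total_pad - left)
    elif padding != "valid":
        raise ValueError(f"unknown padding {padding!r}")
    out = []
    # Slide the (dilated) filter over all positions where it fully fits.
    for start in range(0, len(x) - span + 1, stride):
        out.append(sum(w[j] * x[start + j * dilation] for j in range(k)))
    return out
```

With "valid" padding the output shrinks by the effective filter width minus one; with "same" padding and stride 1 the output keeps the input length.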

class returnn.frontend.conv.Conv2d

2D convolution

Parameters:

in_dim (Dim)
out_dim (Dim)
filter_size – (width,), (height, width) or (depth, height, width) for 1D/2D/3D conv. The number of spatial dims of the input data must match; otherwise, add dimensions via input_expand_dims or input_add_feature_dim. The batch dim is automatically moved to the first axis of the input data.
padding (str) – “same” or “valid”
strides (int|Sequence[int]) – strides for the spatial dims: either a sequence with the same length as filter_size, or a single int
dilation_rate (int|Sequence[int]) – dilation for the spatial dims
groups (int) – grouped convolution
with_bias (bool) – if True, will add a bias to the output features

nd: int | None = 2
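The rule that strides and dilation_rate may be either a single int or one value per spatial dim can be sketched as a small normalization helper. The name `normalize_per_dim` is hypothetical and not part of RETURNN, and treating None as 1 is an assumption about the convolution defaults.

```python
from typing import Optional, Sequence, Tuple, Union


def normalize_per_dim(
    value: Optional[Union[int, Sequence[int]]],
    nd: int,
    *,
    default: int = 1,
    name: str = "value",
) -> Tuple[int, ...]:
    """Expand an int (or None) to one entry per spatial dim, or validate a sequence."""
    if value is None:
        return (default,) * nd  # assumption: None means "use the default"
    if isinstance(value, int):
        return (value,) * nd  # a single int applies to every spatial dim
    value = tuple(value)
    if len(value) != nd:
        raise ValueError(f"{name} has {len(value)} entries, expected {nd}")
    return value
```

For a 2D convolution, `normalize_per_dim(2, 2)` and `normalize_per_dim((2, 2), 2)` describe the same strides.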

class returnn.frontend.conv.Conv3d

3D convolution

Parameters:

in_dim (Dim)
out_dim (Dim)
filter_size – (width,), (height, width) or (depth, height, width) for 1D/2D/3D conv. The number of spatial dims of the input data must match; otherwise, add dimensions via input_expand_dims or input_add_feature_dim. The batch dim is automatically moved to the first axis of the input data.
padding (str) – “same” or “valid”
strides (int|Sequence[int]) – strides for the spatial dims: either a sequence with the same length as filter_size, or a single int
dilation_rate (int|Sequence[int]) – dilation for the spatial dims
groups (int) – grouped convolution
with_bias (bool) – if True, will add a bias to the output features

nd: int | None = 3
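The effect of groups is easiest to see in the parameter count. The sketch below uses the conventional definition of grouped convolution (input and output channels are split into `groups` equal parts that are convolved independently), which is assumed to apply here. The helper `conv_param_count` is hypothetical and works on plain ints rather than Dim objects.

```python
from math import prod
from typing import Sequence


def conv_param_count(
    in_dim: int,
    out_dim: int,
    filter_size: Sequence[int],
    *,
    groups: int = 1,
    with_bias: bool = True,
) -> int:
    """Number of trainable parameters of an N-D (grouped) convolution."""
    if in_dim % groups != 0 or out_dim % groups != 0:
        raise ValueError("in_dim and out_dim must both be divisible by groups")
    # Each output channel only sees in_dim // groups input channels.
    weights = out_dim * (in_dim // groups) * prod(filter_size)
    return weights + (out_dim if with_bias else 0)
```

With 64 input channels, 128 output channels and a 3x3x3 filter, going from groups=1 to groups=64 reduces the weight count by a factor of 64.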

returnn.frontend.conv.transposed_conv(source: Tensor, *, in_dim: Dim, out_dim: Dim, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None, filter: Tensor, filter_size: Sequence[Dim], padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, bias: Tensor | None = None) → Tuple[Tensor, Sequence[Dim]]

Transposed convolution.

class returnn.frontend.conv.TransposedConv1d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim], *, padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, with_bias: bool = True)

1D transposed convolution

Parameters:

in_dim (Dim)
out_dim (Dim)
filter_size (Sequence[int|Dim])
strides (Sequence[int]|None) – specifies the upscaling factor per spatial dim. Defaults to filter_size.
padding (str) – “same” or “valid”
remove_padding (Sequence[int]|int)
output_padding (Sequence[int|None]|int|None)
with_bias (bool) – whether to add a bias. Enabled by default.

nd: int | None = 1
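The statement that strides specifies the upscaling and defaults to filter_size can be illustrated with a pure-Python, single-channel transposed convolution. This is only a sketch of the "valid"-style case: the name `transposed_conv1d_reference` is hypothetical, output_padding is not modelled, and reading remove_padding as a symmetric crop of the output is an assumption.

```python
from typing import List, Optional


def transposed_conv1d_reference(
    x: List[float],
    w: List[float],
    *,
    stride: Optional[int] = None,
    remove_padding: int = 0,
) -> List[float]:
    """Single-channel 1D transposed convolution via scatter-add."""
    k = len(w)
    s = k if stride is None else stride  # default stride is the filter size
    out = [0.0] * ((len(x) - 1) * s + k)
    # Every input frame "paints" a scaled copy of the filter into the output.
    for i, xi in enumerate(x):
        for j, wj in enumerate(w):
            out[i * s + j] += xi * wj
    if remove_padding:
        # Assumption: crop the same number of frames from both ends.
        out = out[remove_padding : len(out) - remove_padding]
    return out
```

With the default stride, the painted copies do not overlap, so an input of length T yields exactly T times filter_size output frames.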

class returnn.frontend.conv.TransposedConv2d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim], *, padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, with_bias: bool = True)

2D transposed convolution

Parameters:

in_dim (Dim)
out_dim (Dim)
filter_size (Sequence[int|Dim])
strides (Sequence[int]|None) – specifies the upscaling factor per spatial dim. Defaults to filter_size.
padding (str) – “same” or “valid”
remove_padding (Sequence[int]|int)
output_padding (Sequence[int|None]|int|None)
with_bias (bool) – whether to add a bias. Enabled by default.

nd: int | None = 2

class returnn.frontend.conv.TransposedConv3d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim], *, padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, with_bias: bool = True)

3D transposed convolution

Parameters:

in_dim (Dim)
out_dim (Dim)
filter_size (Sequence[int|Dim])
strides (Sequence[int]|None) – specifies the upscaling factor per spatial dim. Defaults to filter_size.
padding (str) – “same” or “valid”
remove_padding (Sequence[int]|int)
output_padding (Sequence[int|None]|int|None)
with_bias (bool) – whether to add a bias. Enabled by default.

nd: int | None = 3

returnn.frontend.conv.pool

A generic N-D pooling layer, typically applied after a convolution for down-sampling.

Parameters:

source (Tensor)
nd
mode (str) – “max” or “avg”
pool_size (tuple[int]) – shape of the window of each reduce
padding (str) – “valid” or “same”
dilation_rate (tuple[int]|int)
strides (tuple[int]|int|None) – in contrast to tf.nn.pool, the default (if it is None) will be set to pool_size
in_spatial_dims (Sequence[Dim])
out_spatial_dims (Sequence[Dim]|None)

Returns:

pooled tensor, out_spatial_dims
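The pooling parameters above, in particular the default strides = pool_size, can be illustrated with a pure-Python sketch over a single 1D sequence. The function `pool1d_reference` is hypothetical, covers only "valid" padding, and ignores dilation_rate.

```python
from typing import List, Optional


def pool1d_reference(
    x: List[float],
    *,
    mode: str,
    pool_size: int,
    strides: Optional[int] = None,
) -> List[float]:
    """1D pooling with 'valid' padding over a plain list."""
    if mode not in ("max", "avg"):
        raise ValueError(f"unknown mode {mode!r}")
    if strides is None:
        strides = pool_size  # default: non-overlapping windows
    out = []
    for start in range(0, len(x) - pool_size + 1, strides):
        window = x[start : start + pool_size]
        out.append(max(window) if mode == "max" else sum(window) / pool_size)
    return out
```

Because the windows do not overlap by default, a pool_size of 2 halves the sequence length; passing strides=1 keeps almost the full length instead.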

returnn.frontend.conv.max_pool(source: Tensor, *, pool_size: Sequence[int] | int, padding: str = 'valid', dilation_rate: Sequence[int] | int = 1, strides: int | Sequence[int] | None = None, in_spatial_dims: Sequence[Dim] | Dim, out_spatial_dims: Dim | Sequence[Dim] | None = None) → Tuple[Tensor, Sequence[Dim]]

Max pooling.

returnn.frontend.conv.max_pool1d(source: Tensor, *, pool_size: int, padding: str = 'valid', dilation_rate: int = 1, strides: int | None = None, in_spatial_dim: Dim, out_spatial_dim: Dim | None = None) → Tuple[Tensor, Dim]

1D max pooling.

returnn.frontend.conv.pool1d(source: Tensor, *, mode: str, pool_size: int, padding: str = 'valid', dilation_rate: int = 1, strides: int | None = None, in_spatial_dim: Dim, out_spatial_dim: Dim | None = None) → Tuple[Tensor, Dim]

1D pooling.

Parameters:

source (Tensor)
mode (str) – “max” or “avg”
pool_size (int) – size of the window of each reduce
padding (str) – “valid” or “same”
dilation_rate (int)
strides (int|None) – in contrast to tf.nn.pool, the default (if it is None) is pool_size
in_spatial_dim (Dim)
out_spatial_dim (Dim|None)

Returns:

pooled tensor, out_spatial_dim

returnn.frontend.conv.pool2d(source: Tensor, *, mode: str, pool_size: Sequence[int] | int, padding: str = 'valid', dilation_rate: Sequence[int] | int = 1, strides: int | Sequence[int] | None = None, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None) → Tuple[Tensor, Sequence[Dim]]

2D pooling.

Parameters:

source (Tensor)
mode (str) – “max” or “avg”
pool_size (tuple[int]) – shape of the window of each reduce
padding (str) – “valid” or “same”
dilation_rate (tuple[int]|int)
strides (tuple[int]|int|None) – in contrast to tf.nn.pool, the default (if it is None) will be set to pool_size
in_spatial_dims (Sequence[Dim])
out_spatial_dims (Sequence[Dim]|None)

Returns:

pooled tensor, out_spatial_dims
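For the 2D case, pool_size and strides may each be a single int or one value per spatial dim. The following pure-Python sketch shows that broadcasting on a nested list. `pool2d_reference` is a hypothetical name, and the sketch handles only "valid" padding on a non-empty input, without dilation.

```python
from typing import List, Optional, Sequence, Union

IntOrPair = Union[int, Sequence[int]]


def pool2d_reference(
    x: List[List[float]],
    *,
    mode: str,
    pool_size: IntOrPair,
    strides: Optional[IntOrPair] = None,
) -> List[List[float]]:
    """2D pooling with 'valid' padding over a (height, width) nested list."""
    if mode not in ("max", "avg"):
        raise ValueError(f"unknown mode {mode!r}")
    ph, pw = (pool_size, pool_size) if isinstance(pool_size, int) else pool_size
    if strides is None:
        sh, sw = ph, pw  # default strides equal the pool size
    elif isinstance(strides, int):
        sh, sw = strides, strides
    else:
        sh, sw = strides
    height, width = len(x), len(x[0])

    def reduce(values: List[float]) -> float:
        return max(values) if mode == "max" else sum(values) / len(values)

    return [
        [
            reduce([x[r + i][c + j] for i in range(ph) for j in range(pw)])
            for c in range(0, width - pw + 1, sw)
        ]
        for r in range(0, height - ph + 1, sh)
    ]
```

A pool_size of (1, 2) pools only along the width, which is a common way to down-sample one axis while leaving the other untouched.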

returnn.frontend.conv.pool3d(source: Tensor, *, mode: str, pool_size: Sequence[int] | int, padding: str = 'valid', dilation_rate: Sequence[int] | int = 1, strides: int | Sequence[int] | None = None, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None) → Tuple[Tensor, Sequence[Dim]]

3D pooling.

Parameters:

source (Tensor)
mode (str) – “max” or “avg”
pool_size (tuple[int]) – shape of the window of each reduce
padding (str) – “valid” or “same”
dilation_rate (tuple[int]|int)
strides (tuple[int]|int|None) – in contrast to tf.nn.pool, the default (if it is None) will be set to pool_size
in_spatial_dims (Sequence[Dim])
out_spatial_dims (Sequence[Dim]|None)

Returns:

pooled tensor, out_spatial_dims

returnn.frontend.conv.make_conv_out_spatial_dims(in_spatial_dims: Sequence[Dim], *, filter_size: Sequence[int | Dim] | int | Dim, padding: str, strides: Sequence[int] | int = 1, dilation_rate: Sequence[int] | int = 1, description_prefix: str | None = None) → Sequence[Dim]

Creates the output spatial dims from the input spatial dims.
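How an output spatial size follows from the input size can be sketched on plain ints. The formulas below are the usual TensorFlow-style conventions for "same" and "valid" padding; that the library uses exactly these is an assumption, and `conv_out_length` is a hypothetical helper, not a RETURNN function.

```python
def conv_out_length(
    in_len: int,
    *,
    filter_size: int,
    padding: str,
    stride: int = 1,
    dilation: int = 1,
) -> int:
    """Output length of a convolution along one spatial dim."""
    if padding == "same":
        # Padding is chosen so that only the stride reduces the length.
        return -(-in_len // stride)  # ceil(in_len / stride)
    if padding == "valid":
        span = (filter_size - 1) * dilation + 1  # effective filter width
        # ceil((in_len - span + 1) / stride), clipped at 0 for short inputs.
        return max(-(-(in_len - span + 1) // stride), 0)
    raise ValueError(f"unknown padding {padding!r}")
```

For several spatial dims, the same computation would be applied to each dim independently with its own filter size, stride and dilation.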
