returnn.frontend.conv
Convolution, transposed convolution, pooling
- returnn.frontend.conv.conv(source: Tensor, *, in_dim: Dim, out_dim: Dim, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None, filter: Tensor, filter_size: Sequence[Dim], padding: str, strides: int | Sequence[int] | None = None, dilation_rate: int | Sequence[int] | None = None, groups: int | None = None, bias: Tensor | None = None) → Tuple[Tensor, Sequence[Dim]]
Generic N-D convolution. Returns the output tensor and the output spatial dims.
- class returnn.frontend.conv.Conv1d(in_dim: Dim, out_dim: Dim, filter_size: int | Dim, *, padding: str, strides: int | None = None, dilation_rate: int | None = None, groups: int | None = None, with_bias: bool = True)
1D convolution
- Parameters:
in_dim (Dim)
out_dim (Dim)
filter_size (int|Dim)
padding (str) – “same” or “valid”
strides (int|None) – stride for the spatial dim
dilation_rate (int|None) – dilation for the spatial dims
groups (int) – grouped convolution
with_bias (bool) – if True, will add a bias to the output features
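To make the roles of strides and dilation_rate concrete, here is a minimal pure-Python sketch of a single-channel 1D convolution with "valid" padding. It does not use RETURNN: the helper name `conv1d_valid` is hypothetical, and it assumes the usual deep-learning convention of cross-correlation (the filter is not flipped).

```python
from typing import List


def conv1d_valid(xs: List[float], filt: List[float], stride: int = 1, dilation: int = 1) -> List[float]:
    """Single-channel 1D cross-correlation with "valid" padding.

    Illustrative helper only, not part of the RETURNN API.
    """
    # Number of input frames covered by the dilated filter.
    span = (len(filt) - 1) * dilation + 1
    out = []
    # Visit every window start (in steps of `stride`) where the filter fully fits.
    for start in range(0, len(xs) - span + 1, stride):
        out.append(sum(w * xs[start + k * dilation] for k, w in enumerate(filt)))
    return out


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(conv1d_valid(xs, [1.0, 1.0]))              # [3.0, 5.0, 7.0, 9.0, 11.0]
print(conv1d_valid(xs, [1.0, 1.0], stride=2))    # [3.0, 7.0, 11.0]
print(conv1d_valid(xs, [1.0, 1.0], dilation=2))  # [4.0, 6.0, 8.0, 10.0]
```

A larger stride skips window positions, while a larger dilation widens the window without adding filter weights.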
- class returnn.frontend.conv.Conv2d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim] | int | Dim, *, padding: str, strides: int | Sequence[int] | None = None, dilation_rate: int | Sequence[int] | None = None, groups: int | None = None, with_bias: bool = True)
2D convolution
- Parameters:
in_dim (Dim)
out_dim (Dim)
filter_size (Sequence[int|Dim]|int|Dim) – filter size per spatial dim, here (height, width). The number of spatial dims of the input must match.
padding (str) – “same” or “valid”
strides (int|Sequence[int]) – strides for the spatial dims, i.e. length of this tuple should be the same as filter_size, or a single int.
dilation_rate (int|Sequence[int]) – dilation for the spatial dims
groups (int) – grouped convolution
with_bias (bool) – if True, will add a bias to the output features
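The effect of groups on the filter size can be sketched with a small parameter count. This is a minimal sketch, assuming the common grouped-convolution convention in which each group connects in_dim / groups input features to out_dim / groups output features. The helper name `conv_num_params` is hypothetical and not part of RETURNN.

```python
from math import prod
from typing import Sequence


def conv_num_params(
    in_dim: int, out_dim: int, filter_size: Sequence[int], groups: int = 1, with_bias: bool = True
) -> int:
    """Number of trainable parameters of an N-D (grouped) convolution.

    Illustrative helper only; assumes the standard grouped-conv filter layout.
    """
    if in_dim % groups or out_dim % groups:
        raise ValueError("in_dim and out_dim must both be divisible by groups")
    # Each output feature only sees in_dim // groups input features.
    weights = out_dim * (in_dim // groups) * prod(filter_size)
    return weights + (out_dim if with_bias else 0)


print(conv_num_params(32, 64, (3, 3)))                               # 18496
print(conv_num_params(32, 64, (3, 3), groups=4))                     # 4672
print(conv_num_params(32, 32, (3, 3), groups=32, with_bias=False))   # 288
```

The last call corresponds to a depthwise convolution, where groups equals in_dim.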
- class returnn.frontend.conv.Conv3d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim] | int | Dim, *, padding: str, strides: int | Sequence[int] | None = None, dilation_rate: int | Sequence[int] | None = None, groups: int | None = None, with_bias: bool = True)
3D convolution
- Parameters:
in_dim (Dim)
out_dim (Dim)
filter_size (Sequence[int|Dim]|int|Dim) – filter size per spatial dim, here (depth, height, width). The number of spatial dims of the input must match.
padding (str) – “same” or “valid”
strides (int|Sequence[int]) – strides for the spatial dims, i.e. length of this tuple should be the same as filter_size, or a single int.
dilation_rate (int|Sequence[int]) – dilation for the spatial dims
groups (int) – grouped convolution
with_bias (bool) – if True, will add a bias to the output features
- returnn.frontend.conv.transposed_conv(source: Tensor, *, in_dim: Dim, out_dim: Dim, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None, filter: Tensor, filter_size: Sequence[Dim], padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, bias: Tensor | None = None) → Tuple[Tensor, Sequence[Dim]]
Generic N-D transposed convolution. Returns the output tensor and the output spatial dims.
- class returnn.frontend.conv.TransposedConv1d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim], *, padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, with_bias: bool = True)
1D transposed convolution
- Parameters:
in_dim (Dim)
out_dim (Dim)
filter_size (Sequence[int|Dim])
strides (list[int]|None) – specifies the upscaling. by default, same as filter_size
padding (str) – “same” or “valid”
remove_padding (list[int]|int)
output_padding (list[int|None]|int|None)
with_bias (bool) – whether to add a bias. enabled by default
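How strides controls the upscaling can be sketched with an output-length calculation. This is a minimal sketch, assuming the Keras-style length convention for transposed convolutions; whether RETURNN follows exactly this formula is an assumption. It deliberately ignores remove_padding and output_padding, and the helper name `transposed_conv_out_len` is hypothetical.

```python
from typing import Optional


def transposed_conv_out_len(in_len: int, filter_size: int, padding: str, stride: Optional[int] = None) -> int:
    """Output length of a 1D transposed convolution (Keras-style convention, assumed).

    Illustrative helper only; remove_padding and output_padding are not modelled.
    """
    if stride is None:
        # Documented default: strides is the same as filter_size.
        stride = filter_size
    padding = padding.lower()
    if padding == "same":
        return in_len * stride
    if padding == "valid":
        # A filter wider than the stride adds (filter_size - stride) extra frames.
        return in_len * stride + max(filter_size - stride, 0)
    raise ValueError(f"unknown padding {padding!r}")


print(transposed_conv_out_len(10, 3, "valid"))            # 30
print(transposed_conv_out_len(10, 3, "valid", stride=2))  # 21
print(transposed_conv_out_len(10, 3, "same", stride=2))   # 20
```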
- class returnn.frontend.conv.TransposedConv2d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim], *, padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, with_bias: bool = True)
2D transposed convolution
- Parameters:
in_dim (Dim)
out_dim (Dim)
filter_size (Sequence[int|Dim])
strides (list[int]|None) – specifies the upscaling. by default, same as filter_size
padding (str) – “same” or “valid”
remove_padding (list[int]|int)
output_padding (list[int|None]|int|None)
with_bias (bool) – whether to add a bias. enabled by default
- class returnn.frontend.conv.TransposedConv3d(in_dim: Dim, out_dim: Dim, filter_size: Sequence[int | Dim], *, padding: str, remove_padding: Sequence[int] | int = 0, output_padding: Sequence[int | None] | int | None = None, strides: Sequence[int] | None = None, with_bias: bool = True)
3D transposed convolution
- Parameters:
in_dim (Dim)
out_dim (Dim)
filter_size (Sequence[int|Dim])
strides (list[int]|None) – specifies the upscaling. by default, same as filter_size
padding (str) – “same” or “valid”
remove_padding (list[int]|int)
output_padding (list[int|None]|int|None)
with_bias (bool) – whether to add a bias. enabled by default
- returnn.frontend.conv.pool(source: Tensor, *, mode: str, pool_size: Sequence[int] | int, padding: str = 'valid', dilation_rate: Sequence[int] | int = 1, strides: int | Sequence[int] | None = None, in_spatial_dims: Sequence[Dim] | Dim, out_spatial_dims: Dim | Sequence[Dim] | None = None, nd: int | None = None) → Tuple[Tensor, Sequence[Dim]]
A generic N-D pooling layer. This would usually be done after a convolution for down-sampling.
- Parameters:
source (Tensor)
nd (int|None)
mode (str) – “max” or “avg”
pool_size (tuple[int]) – shape of the window of each reduce
padding (str) – “valid” or “same”
dilation_rate (tuple[int]|int)
strides (tuple[int]|int|None) – in contrast to tf.nn.pool, the default (if it is None) will be set to pool_size
in_spatial_dims (Sequence[Dim])
out_spatial_dims (Sequence[Dim]|None)
- Returns:
output tensor, out_spatial_dims
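The interplay of mode, pool_size and the strides default can be sketched in plain Python. This is a minimal 1D sketch with "valid" padding and no dilation; it does not use RETURNN, and the helper name `pool1d_valid` is hypothetical.

```python
from typing import List, Optional


def pool1d_valid(xs: List[float], mode: str, pool_size: int, stride: Optional[int] = None) -> List[float]:
    """1D pooling with "valid" padding (illustrative helper, not RETURNN API)."""
    if stride is None:
        # Documented default: strides is set to pool_size, so windows do not overlap.
        stride = pool_size
    reduce_fn = {"max": max, "avg": lambda w: sum(w) / len(w)}[mode]
    # Reduce every full window; incomplete windows at the end are dropped.
    return [reduce_fn(xs[i : i + pool_size]) for i in range(0, len(xs) - pool_size + 1, stride)]


xs = [1.0, 3.0, 2.0, 5.0, 4.0]
print(pool1d_valid(xs, "max", 2))            # [3.0, 5.0]
print(pool1d_valid(xs, "avg", 2))            # [2.0, 3.5]
print(pool1d_valid(xs, "max", 2, stride=1))  # [3.0, 3.0, 5.0, 5.0]
```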
- returnn.frontend.conv.max_pool(source: Tensor, *, pool_size: Sequence[int] | int, padding: str = 'valid', dilation_rate: Sequence[int] | int = 1, strides: int | Sequence[int] | None = None, in_spatial_dims: Sequence[Dim] | Dim, out_spatial_dims: Dim | Sequence[Dim] | None = None) → Tuple[Tensor, Sequence[Dim]]
N-D max pooling. Takes the same arguments as pool, except for mode and nd.
- returnn.frontend.conv.max_pool1d(source: Tensor, *, pool_size: int, padding: str = 'valid', dilation_rate: int = 1, strides: int | None = None, in_spatial_dim: Dim, out_spatial_dim: Dim | None = None) → Tuple[Tensor, Dim]
1D max pooling. Takes the same arguments as pool1d, except for mode.
- returnn.frontend.conv.pool1d(source: Tensor, *, mode: str, pool_size: int, padding: str = 'valid', dilation_rate: int = 1, strides: int | None = None, in_spatial_dim: Dim, out_spatial_dim: Dim | None = None) → Tuple[Tensor, Dim]
1D pooling.
- Parameters:
source (Tensor)
mode (str) – “max” or “avg”
pool_size (int) – size of the window of each reduce
padding (str) – “valid” or “same”
dilation_rate (int)
strides (int|None) – in contrast to tf.nn.pool, the default (if it is None) will be set to pool_size
in_spatial_dim (Dim)
out_spatial_dim (Dim|None)
- Returns:
output tensor, out_spatial_dim
- returnn.frontend.conv.pool2d(source: Tensor, *, mode: str, pool_size: Sequence[int] | int, padding: str = 'valid', dilation_rate: Sequence[int] | int = 1, strides: int | Sequence[int] | None = None, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None) → Tuple[Tensor, Sequence[Dim]]
2D pooling.
- Parameters:
source (Tensor)
mode (str) – “max” or “avg”
pool_size (tuple[int]) – shape of the window of each reduce
padding (str) – “valid” or “same”
dilation_rate (tuple[int]|int)
strides (tuple[int]|int|None) – in contrast to tf.nn.pool, the default (if it is None) will be set to pool_size
in_spatial_dims (Sequence[Dim])
out_spatial_dims (Sequence[Dim]|None)
- Returns:
output tensor, out_spatial_dims
- returnn.frontend.conv.pool3d(source: Tensor, *, mode: str, pool_size: Sequence[int] | int, padding: str = 'valid', dilation_rate: Sequence[int] | int = 1, strides: int | Sequence[int] | None = None, in_spatial_dims: Sequence[Dim], out_spatial_dims: Sequence[Dim] | None = None) → Tuple[Tensor, Sequence[Dim]]
3D pooling.
- Parameters:
source (Tensor)
mode (str) – “max” or “avg”
pool_size (tuple[int]) – shape of the window of each reduce
padding (str) – “valid” or “same”
dilation_rate (tuple[int]|int)
strides (tuple[int]|int|None) – in contrast to tf.nn.pool, the default (if it is None) will be set to pool_size
in_spatial_dims (Sequence[Dim])
out_spatial_dims (Sequence[Dim]|None)
- Returns:
output tensor, out_spatial_dims
- returnn.frontend.conv.make_conv_out_spatial_dims(in_spatial_dims: Sequence[Dim], *, filter_size: Sequence[int | Dim] | int | Dim, padding: str, strides: Sequence[int] | int = 1, dilation_rate: Sequence[int] | int = 1, description_prefix: str | None = None) → Sequence[Dim]
Creates the output spatial dims from the input spatial dims, given the filter size, padding, strides and dilation rate.
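The size arithmetic behind such output dims can be sketched for a single spatial axis. This is a minimal sketch, assuming the TensorFlow-style convention for "same" and "valid" padding; that RETURNN uses exactly these formulas is an assumption. It works on plain integers rather than Dim objects, and the helper name `conv_out_len` is hypothetical.

```python
def conv_out_len(in_len: int, filter_size: int, padding: str, stride: int = 1, dilation: int = 1) -> int:
    """Output length of a convolution along one spatial axis.

    Illustrative helper only; assumes TensorFlow-style "same"/"valid" semantics.
    """

    def ceildiv(a: int, b: int) -> int:
        return -(-a // b)

    padding = padding.lower()
    if padding == "same":
        # Input is padded so that only the stride reduces the length.
        return ceildiv(in_len, stride)
    if padding == "valid":
        # The dilated filter must fit entirely inside the input.
        return max(0, ceildiv(in_len - (filter_size - 1) * dilation, stride))
    raise ValueError(f"unknown padding {padding!r}")


print(conv_out_len(100, 3, "valid"))              # 98
print(conv_out_len(100, 3, "valid", stride=2))    # 49
print(conv_out_len(100, 3, "same", stride=2))     # 50
print(conv_out_len(100, 3, "valid", dilation=4))  # 92
```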