Softmax over Spatial Layer¶
SoftmaxOverSpatialLayer(axis=None, energy_factor=None, start=None, window_start=None, window_size=None, use_time_mask=None, log_space=False, **kwargs)¶
This applies a softmax over a spatial axis (currently only the time axis is supported). E.g. when the input is of shape (B,T,dim), the output will also be of shape (B,T,dim), with the softmax taken over the time axis. Frames outside the sequence, as defined by the sequence lengths, are automatically masked out. In contrast to
SoftmaxLayer, this will not do a linear transformation. See
SeqLenMaskLayer if you just want to apply a masking.
- axis (str|None) – which axis to do the softmax over
- energy_factor (float|None) – the energy will be scaled by this factor. This is like a temperature for the softmax. In Attention-is-all-you-need, this is set to 1/sqrt(base_ctx.dim).
- start (LayerBase|None) – Layer with output of shape (B,) indicating the start frame
- window_start (LayerBase|int|None) – Layer with output of shape (B,) or (constant) int value indicating the window start.
- window_size (LayerBase|int|None) – Layer with output of shape (B,) or (constant) int value indicating the window size.
- use_time_mask (bool) – if True, assumes a dynamic seq len and uses it for masking. By default, if a dynamic seq len exists, it is used.
- log_space (bool) – if True, returns in log space (i.e. uses log_softmax)
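To illustrate the semantics of the masked softmax described above, here is a minimal standalone NumPy sketch (a hypothetical reimplementation for illustration, not RETURNN's actual code) covering the energy_factor and log_space options:

```python
import numpy as np

def softmax_over_spatial(energy, seq_lens, energy_factor=None, log_space=False):
    """Masked softmax over the time axis of a (B, T, dim) array.

    Illustrative sketch only; `energy` plays the role of the layer input,
    `seq_lens` the per-batch dynamic sequence lengths.
    """
    energy = np.asarray(energy, dtype=np.float64)
    if energy_factor is not None:
        # Temperature-like scaling, e.g. 1/sqrt(dim) as in Attention-is-all-you-need.
        energy = energy * energy_factor
    B, T, _ = energy.shape
    # Mask frames beyond each sequence's length so they receive zero weight.
    time_mask = np.arange(T)[None, :] < np.asarray(seq_lens)[:, None]  # (B, T)
    masked = np.where(time_mask[:, :, None], energy, -np.inf)
    # Numerically stable log-softmax over the time axis (axis 1).
    m = masked.max(axis=1, keepdims=True)
    log_z = m + np.log(np.exp(masked - m).sum(axis=1, keepdims=True))
    log_sm = masked - log_z
    return log_sm if log_space else np.exp(log_sm)
```

Note that the masked frames come out as exactly 0 (or -inf in log space), and the weights of each sequence sum to 1 over its valid frames only.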
get_out_data_from_opts(name, sources, axis=None, start=None, window_start=None, window_size=None, **kwargs)¶
- name (str) –
- sources (list[LayerBase]) –
- axis (str|None) –
- start (LayerBase|None) –
- window_start (LayerBase|None) –
- window_size (LayerBase|int|None) –
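For context, a typical use of this layer is to turn attention energies into attention weights inside a network dict. The following fragment is a hypothetical sketch (layer names and the surrounding network are illustrative assumptions), using the layer class name "softmax_over_spatial":

```python
# Hypothetical fragment of a RETURNN network dict: the "energy" layer is assumed
# to produce a tensor with a time axis; "att_weights" normalizes it over time,
# with masking applied automatically from the dynamic sequence lengths.
network = {
    "energy": {"class": "linear", "activation": None, "n_out": 1, "from": "att_in"},
    "att_weights": {
        "class": "softmax_over_spatial",
        "from": "energy",
        "energy_factor": 0.125,  # e.g. 1/sqrt(64) for a 64-dim key space
    },
}
```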