Alignment Layers#

Forced Alignment Layer#

class returnn.tf.layers.basic.ForcedAlignmentLayer(align_target, topology, input_type, blank_idx=-1, blank_included=False, **kwargs)[source]#

Calculates a forced alignment via the Viterbi algorithm.

Parameters:
  • align_target (LayerBase) –

  • topology (str) – e.g. “ctc” or “rna” (RNA is CTC without the label loop)

  • input_type (str) – “log_prob” or “prob”

  • blank_idx (int) – vocab index of the blank symbol

  • blank_included (bool) – whether the blank token of the align target is included in the vocabulary
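
For orientation, a minimal network-config sketch using this layer. This is a hedged example, not part of the API reference: all layer names, dimensions, and the “data:classes” target key are illustrative assumptions; the vocabulary size 1032 includes the blank label at the last index (blank_included=False is the default).

    # Hypothetical RETURNN network config: Viterbi forced alignment over a CTC
    # topology. All layer names and dimensions are placeholders.
    network = {
        "encoder": {"class": "linear", "activation": "tanh", "n_out": 512, "from": "data"},
        # Log-space label posteriors over vocab + blank (blank at the last index).
        "ctc_log_prob": {"class": "linear", "activation": "log_softmax",
                         "n_out": 1032, "from": "encoder"},
        # Hard (Viterbi) alignment of the target labels to the input frames.
        "ctc_align": {"class": "forced_align",         # -> ForcedAlignmentLayer
                      "from": "ctc_log_prob",
                      "align_target": "data:classes",  # target label sequence
                      "topology": "ctc",
                      "input_type": "log_prob",
                      "blank_idx": 1031},              # blank index (assumed last)
    }

The layer output is then the frame-wise alignment, i.e. one label index (including blank) per input frame.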

layer_class: Optional[str] = 'forced_align'[source]#
classmethod get_sub_layer_out_data_from_opts(layer_name, parent_layer_kwargs)[source]#
Parameters:
  • layer_name (str) – sub layer name

  • parent_layer_kwargs (dict[str]) –

Returns:

Data template, class type of sub-layer, layer opts (transformed)

Return type:

(Data, type, dict[str])|None

get_sub_layer(layer_name)[source]#
Parameters:

layer_name (str) –

Return type:

LayerBase|None

classmethod get_available_sub_layer_names(parent_layer_kwargs)[source]#
Parameters:

parent_layer_kwargs (dict[str]) –

Return type:

list[str]
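
Sub-layers are addressed as “<parent>/<sub>” in a config. A small hedged sketch, assuming the layer exposes a “scores” sub-layer (as suggested by the sub-layer methods above) and reusing the “ctc_align” layer from the earlier sketch:

    # Expose the Viterbi alignment scores of the forced_align layer above.
    # Assumption: "scores" is among get_available_sub_layer_names().
    network["align_scores"] = {"class": "copy", "from": "ctc_align/scores"}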

get_dep_layers()[source]#
Return type:

list[LayerBase]

classmethod transform_config_dict(d, network, get_layer)[source]#
Parameters:
  • d (dict[str]) – will modify inplace

  • network (returnn.tf.network.TFNetwork) –

  • get_layer (((str) -> LayerBase)) – function to get or construct another layer

classmethod get_out_data_from_opts(name, sources, **kwargs)[source]#
Parameters:
  • name (str) –

  • sources (list[LayerBase]) –

Return type:

Data

kwargs: Optional[Dict[str]][source]#
output_before_activation: Optional[OutputWithActivation][source]#
output_loss: Optional[tf.Tensor][source]#
rec_vars_outputs: Dict[str, tf.Tensor][source]#
search_choices: Optional[SearchChoices][source]#
params: Dict[str, tf.Variable][source]#
saveable_param_replace: Dict[tf.Variable, Union['tensorflow.python.training.saver.BaseSaverBuilder.SaveableObject', None]][source]#
stats: Dict[str, tf.Tensor][source]#
input_data: Optional[Data][source]#

Fast Baum-Welch Layer#

class returnn.tf.layers.basic.FastBaumWelchLayer(align_target, align_target_key=None, ctc_opts=None, sprint_opts=None, input_type='log_prob', tdp_scale=1.0, am_scale=1.0, min_prob=0.0, staircase_seq_len_source=None, **kwargs)[source]#

Calls fast_baum_welch() or fast_baum_welch_by_sprint_automata(). We expect that our input is given as +log scores, e.g. use log-softmax.

Parameters:
  • align_target (str) – e.g. “sprint”, “ctc” or “staircase”

  • align_target_key (str|None) – e.g. “classes”; used e.g. for align_target “ctc”

  • ctc_opts (dict[str]) – used for align_target “ctc”

  • sprint_opts (dict[str]) – used for Sprint (RASR) for align_target “sprint”

  • input_type (str) – “log_prob” or “prob”

  • tdp_scale (float) –

  • am_scale (float) –

  • min_prob (float) – clips the minimum prob (value in [0,1])

  • staircase_seq_len_source (LayerBase|None) –
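
For orientation, a hedged config sketch of full-sum training with this layer: the soft (Baum-Welch) alignment serves as the error signal of a “via_layer” loss on the softmax output. Layer names, dimensions, and the exact loss wiring are illustrative assumptions, not prescribed by this reference:

    # Hypothetical RETURNN network config: Baum-Welch soft alignment over a
    # CTC automaton built from the targets. All names/dims are placeholders.
    network = {
        "encoder": {"class": "linear", "activation": "tanh", "n_out": 512, "from": "data"},
        # Softmax output; its gradient comes from the fast_bw soft alignment.
        "output": {"class": "softmax", "from": "encoder", "n_out": 1032,
                   "loss": "via_layer",
                   "loss_opts": {"align_layer": "fast_bw",
                                 "loss_wrt_to_act_in": "softmax"}},
        # Soft alignment (state posteriors) from the forward-backward algorithm.
        "fast_bw": {"class": "fast_bw",          # -> FastBaumWelchLayer
                    "from": "output",
                    "align_target": "ctc",
                    "align_target_key": "classes",
                    "input_type": "prob",        # "output" yields probs, not log-probs
                    "tdp_scale": 0.0},
    }

The circular wiring (“output” ↔ “fast_bw”) is intentional: the loss of “output” reads its error signal from “fast_bw”, which in turn reads the label scores from “output”.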

layer_class: Optional[str] = 'fast_bw'[source]#
recurrent = True[source]#
output_loss: Optional[tf.Tensor][source]#
classmethod transform_config_dict(d, network, get_layer)[source]#
Parameters:
  • d (dict[str]) – will modify inplace

  • network (returnn.tf.network.TFNetwork) –

  • get_layer (((str) -> LayerBase)) – function to get or construct another layer

classmethod get_out_data_from_opts(name, sources, **kwargs)[source]#
Parameters:
  • name (str) –

  • sources (list[LayerBase]) –

Return type:

Data

kwargs: Optional[Dict[str]][source]#
output_before_activation: Optional[OutputWithActivation][source]#
rec_vars_outputs: Dict[str, tf.Tensor][source]#
search_choices: Optional[SearchChoices][source]#
params: Dict[str, tf.Variable][source]#
saveable_param_replace: Dict[tf.Variable, Union['tensorflow.python.training.saver.BaseSaverBuilder.SaveableObject', None]][source]#
stats: Dict[str, tf.Tensor][source]#
input_data: Optional[Data][source]#