TFNetworkSigProcLayer

class TFNetworkSigProcLayer.AlternatingRealToComplexLayer(**kwargs)[source]

This layer converts a real-valued input tensor into a complex-valued output tensor. Features at even and odd indices are treated as the real and imaginary parts of one complex number, respectively.

layer_class = 'alternating_real_to_complex'[source]
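
The pairing described above can be sketched in plain Python. This is only an illustration on a single feature vector, not the layer's TensorFlow implementation; the function name is hypothetical.

```python
def alternating_real_to_complex(features):
    """Pair even-indexed (real) and odd-indexed (imaginary) features.

    Hypothetical helper for illustration; works on one flat feature vector.
    """
    if len(features) % 2 != 0:
        raise ValueError("feature dimension must be even")
    return [complex(re, im) for re, im in zip(features[0::2], features[1::2])]


frame = [1.0, 2.0, 3.0, 4.0]
complex_frame = alternating_real_to_complex(frame)
print(complex_frame)  # [(1+2j), (3+4j)]
```

Under this reading, the feature dimension of the output is half that of the input.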
classmethod get_out_data_from_opts(name, sources, n_out=None, **kwargs)[source]

Gets a Data template (i.e. shape etc is set but not the placeholder) for our __init__ args. The purpose of having this as a separate classmethod is to be able to infer the shape information without having to construct the layer. This function should not create any nodes in the computation graph.

Parameters:kwargs – all the same kwargs as for self.__init__()
Returns:Data template (placeholder not set)
Return type:Data
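
A rough sketch of what "inferring the shape without constructing the layer" can look like. `DataTemplate` and `out_template_real_to_complex` are hypothetical stand-ins, not RETURNN classes; the halved feature dimension and the `complex64` dtype are assumptions derived from the layer description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DataTemplate:
    """Hypothetical stand-in for a Data template: shape info only, no tensor."""
    name: str
    shape: Tuple[Optional[int], ...]  # None marks a dynamic (time) axis
    dtype: str


def out_template_real_to_complex(name, source):
    """Infer the output template from the source template alone."""
    *leading, feature_dim = source.shape
    if feature_dim is None or feature_dim % 2 != 0:
        raise ValueError("need a static, even feature dimension")
    return DataTemplate(
        name="%s_output" % name,
        shape=(*leading, feature_dim // 2),  # two real values per complex value
        dtype="complex64",  # assumed output dtype
    )


src = DataTemplate(name="stft_real", shape=(None, 514), dtype="float32")
out = out_template_real_to_complex("to_complex", src)
print(out.shape, out.dtype)  # (None, 257) complex64
```

No tensor or graph node is created here; only metadata is computed, which is the point of keeping this logic in a classmethod.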
class TFNetworkSigProcLayer.BatchMedianPoolingLayer(pool_size=1, **kwargs)[source]

This layer pools batches together by taking their median value, so the batch size is divided by pool_size. The stride is hard-coded to be equal to the pool size.

Parameters:pool_size (int) – size of the pool to take the median of (also used as the stride)
layer_class = 'batch_median_pooling'[source]
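
A minimal sketch of median pooling over the batch axis, assuming that consecutive batch entries are grouped and that each entry is a flat feature vector. The real layer operates on tensors; how it treats a batch size that is not divisible by pool_size is not stated here, so the sketch simply rejects that case.

```python
from statistics import median


def batch_median_pooling(batch, pool_size=1):
    """Element-wise median over groups of pool_size consecutive batch entries."""
    if len(batch) % pool_size != 0:
        raise ValueError("batch size must be divisible by pool_size")
    pooled = []
    for start in range(0, len(batch), pool_size):  # stride equals pool_size
        group = batch[start:start + pool_size]
        pooled.append([median(column) for column in zip(*group)])
    return pooled


batch = [[1, 10], [2, 20], [9, 90], [4, 40], [5, 50], [6, 60]]
print(batch_median_pooling(batch, pool_size=3))  # [[2, 20], [5, 50]]
```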
classmethod get_out_data_from_opts(name, sources, pool_size, n_out=None, **kwargs)[source]

Gets a Data template (i.e. shape etc is set but not the placeholder) for our __init__ args. The purpose of having this as a separate classmethod is to be able to infer the shape information without having to construct the layer. This function should not create any nodes in the computation graph.

Parameters:kwargs – all the same kwargs as for self.__init__()
Returns:Data template (placeholder not set)
Return type:Data
class TFNetworkSigProcLayer.ComplexLinearProjectionLayer(nr_of_filters, clp_weights_init='glorot_uniform', **kwargs)[source]
layer_class = 'complex_linear_projection'[source]
classmethod get_out_data_from_opts(nr_of_filters, **kwargs)[source]

Gets a Data template (i.e. shape etc is set but not the placeholder) for our __init__ args. The purpose of having this as a separate classmethod is to be able to infer the shape information without having to construct the layer. This function should not create any nodes in the computation graph.

Parameters:kwargs – all the same kwargs as for self.__init__()
Returns:Data template (placeholder not set)
Return type:Data
class TFNetworkSigProcLayer.MaskBasedGevBeamformingLayer(nr_of_channels=1, postfilter_id=0, qralgorithm_steps=None, output_nan_filter=False, **kwargs)[source]

This layer applies GEV beamforming to a multichannel signal. The different channels are assumed to be concatenated in the input feature vector. The first source to the layer must contain the complex spectrograms of the individual channels, and the second source must contain the noise and speech masks.

Parameters:
  • nr_of_channels (int) – number of input channels to beamforming (needed to split the feature vector)
  • postfilter_id (int) – ID specifying which post filter to apply in GEV beamforming. For more information see tfSi6Proc.audioProcessing.enhancement.beamforming.TfMaskBasedGevBeamformer
  • qralgorithm_steps (int|None) – number of steps of the QR algorithm used to compute the eigenvector for beamforming
  • output_nan_filter (bool) – if set to true, NaN values in the beamforming output are replaced by zero
layer_class = 'mask_based_gevbeamforming'[source]
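
A hedged sketch of how this layer might appear in a network config dict. The layer names `complex_spectrogram`, `masks` and `gev_beamformer` are hypothetical; only the `class` string and the parameter names come from this page. Further options (for example `n_out`) may be needed in a real setup.

```python
NR_OF_CHANNELS = 2  # illustrative value

network = {
    "gev_beamformer": {
        "class": "mask_based_gevbeamforming",
        # Order matters: first the complex spectrograms, then the masks.
        # Both layer names are placeholders for layers defined elsewhere.
        "from": ["complex_spectrogram", "masks"],
        "nr_of_channels": NR_OF_CHANNELS,
        "postfilter_id": 0,
        "qralgorithm_steps": None,
        "output_nan_filter": True,  # replace NaN outputs by zero
    },
}
```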
classmethod get_out_data_from_opts(out_type={}, n_out=None, **kwargs)[source]

Gets a Data template (i.e. shape etc is set but not the placeholder) for our __init__ args. The purpose of having this as a separate classmethod is to be able to infer the shape information without having to construct the layer. This function should not create any nodes in the computation graph.

Parameters:kwargs – all the same kwargs as for self.__init__()
Returns:Data template (placeholder not set)
Return type:Data
class TFNetworkSigProcLayer.MaskBasedMvdrBeamformingWithDiagLoadingLayer(nr_of_channels=1, diag_loading_coeff=0, qralgorithm_steps=None, output_nan_filter=False, **kwargs)[source]

This layer applies MVDR beamforming with diagonal loading to a multichannel signal. The different channels are assumed to be concatenated in the input feature vector. The first source to the layer must contain the complex spectrograms of the individual channels, and the second source must contain the noise and speech masks.

Parameters:
  • nr_of_channels (int) – number of input channels to beamforming (needed to split the feature vector)
  • diag_loading_coeff (int) – weighting coefficient for diagonal loading.
  • qralgorithm_steps (int|None) – number of steps of the QR algorithm used to compute the eigenvector for beamforming
  • output_nan_filter (bool) – if set to true, NaN values in the beamforming output are replaced by zero
layer_class = 'mask_based_mvdrbeamforming'[source]
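
Diagonal loading in its most common textbook form adds a scaled identity matrix to a covariance matrix before inversion. The sketch below shows that form on a small real-valued matrix; whether the layer scales `diag_loading_coeff` further (for example by the matrix trace) is not stated on this page, so treat this as an assumption.

```python
def diagonal_loading(matrix, coeff):
    """Return matrix + coeff * I for a square matrix given as nested lists."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square")
    return [
        [value + (coeff if i == j else 0) for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


noise_covariance = [[2.0, 0.5], [0.5, 1.0]]  # illustrative values
loaded = diagonal_loading(noise_covariance, 0.5)
print(loaded)  # [[2.5, 0.5], [0.5, 1.5]]
```

Loading the diagonal keeps the matrix well conditioned, which makes the subsequent inversion more robust.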
classmethod get_out_data_from_opts(out_type={}, n_out=None, **kwargs)[source]

Gets a Data template (i.e. shape etc is set but not the placeholder) for our __init__ args. The purpose of having this as a separate classmethod is to be able to infer the shape information without having to construct the layer. This function should not create any nodes in the computation graph.

Parameters:kwargs – all the same kwargs as for self.__init__()
Returns:Data template (placeholder not set)
Return type:Data
class TFNetworkSigProcLayer.MelFilterbankLayer(sampling_rate=16000, fft_size=1024, nr_of_filters=80, **kwargs)[source]

This layer applies the log Mel filterbank to the input.

Parameters:
  • sampling_rate (int) – sampling rate of the signal from which the input originates
  • fft_size (int) – FFT size with which the time signal was transformed into the input
  • nr_of_filters (int) – number of output filter bins
layer_class = 'mel_filterbank'[source]
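
To illustrate how the three parameters interact, the sketch below places filter centre frequencies evenly on the mel scale and maps them to FFT bins. It uses the widely used formula mel = 2595 * log10(1 + f / 700); the layer may use a different mel variant or filter shape, so this is an assumption, and all function names are hypothetical.

```python
import math


def hz_to_mel(hz):
    """Convert Hz to mel (common 2595 * log10 variant)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(sampling_rate=16000, nr_of_filters=80):
    """Centre frequencies (Hz) of filters spaced evenly on the mel scale."""
    max_mel = hz_to_mel(sampling_rate / 2.0)  # Nyquist frequency
    step = max_mel / (nr_of_filters + 1)
    return [mel_to_hz(step * (k + 1)) for k in range(nr_of_filters)]


def fft_bin_of(frequency, sampling_rate=16000, fft_size=1024):
    """Nearest FFT bin index for a frequency in Hz."""
    return int(round(frequency * fft_size / sampling_rate))


centers = mel_center_frequencies()
bins = [fft_bin_of(f) for f in centers]
print(len(centers), bins[0], bins[-1])
```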
classmethod get_out_data_from_opts(name, sources, n_out=None, **kwargs)[source]

Gets a Data template (i.e. shape etc is set but not the placeholder) for our __init__ args. The purpose of having this as a separate classmethod is to be able to infer the shape information without having to construct the layer. This function should not create any nodes in the computation graph.

Parameters:kwargs – all the same kwargs as for self.__init__()
Returns:Data template (placeholder not set)
Return type:Data
class TFNetworkSigProcLayer.MultiChannelStftLayer(frame_shift, frame_size, fft_size, window='hanning', use_rfft=True, nr_of_channels=1, pad_last_frame=False, **kwargs)[source]

The layer applies an STFT to every channel separately and concatenates the frequency-domain vectors for every frame.

Parameters:
  • frame_shift (int) – frame shift for stft in samples
  • frame_size (int) – frame size for stft in samples
  • fft_size (int) – fft size in samples
  • window (str) – id of the windowing function used. Possible options are: hanning
  • use_rfft (bool) – if set to true, a real input signal is expected and only the significant half of the FFT bins is returned
  • nr_of_channels (int) – number of input channels
  • pad_last_frame (bool) – if set to true, the last frame is padded with zeros; otherwise it is discarded
layer_class = 'multichannel_stft_layer'[source]
recurrent = True[source]
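
The parameters above determine the output shape. The helper below is a sketch under common STFT conventions: `fft_size // 2 + 1` bins for a real FFT, and rounding the frame count up when the last frame is padded. The function name and the exact rounding rule are assumptions, not taken from the layer's source.

```python
import math


def stft_output_shape(nr_of_samples, frame_shift, frame_size, fft_size,
                      use_rfft=True, nr_of_channels=1, pad_last_frame=False):
    """Return (number of frames, feature dimension) of the concatenated STFT."""
    bins = fft_size // 2 + 1 if use_rfft else fft_size
    rounding = math.ceil if pad_last_frame else math.floor
    nr_of_frames = int(rounding((nr_of_samples - frame_size) / frame_shift)) + 1
    # Frequency vectors of all channels are concatenated per frame.
    return nr_of_frames, bins * nr_of_channels


# One second of 16 kHz audio, 25 ms frames, 10 ms shift, two channels.
print(stft_output_shape(16000, 160, 400, 512, nr_of_channels=2))  # (98, 514)
```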
classmethod get_out_data_from_opts(fft_size, use_rfft=True, nr_of_channels=1, **kwargs)[source]

Gets a Data template (i.e. shape etc is set but not the placeholder) for our __init__ args. The purpose of having this as a separate classmethod is to be able to infer the shape information without having to construct the layer. This function should not create any nodes in the computation graph.

Parameters:kwargs – all the same kwargs as for self.__init__()
Returns:Data template (placeholder not set)
Return type:Data
class TFNetworkSigProcLayer.NoiseEstimationByFirstTFramesLayer(nr_of_frames, **kwargs)[source]
Parameters:nr_of_frames (int) – the first nr_of_frames frames are used for averaging; if nr_of_frames is -1, all frames are used
layer_class = 'first_t_frames_noise_estimator'[source]
recurrent = True[source]
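
A minimal sketch of the averaging rule, assuming each frame is a flat feature vector and a plain arithmetic mean is taken. The quantity the layer actually averages (for example magnitudes or powers) is not stated here, and the function name is hypothetical.

```python
def noise_estimate_from_first_frames(frames, nr_of_frames):
    """Average the first nr_of_frames frames; -1 means use all frames."""
    selected = frames if nr_of_frames == -1 else frames[:nr_of_frames]
    if not selected:
        raise ValueError("no frames to average")
    return [sum(column) / len(selected) for column in zip(*selected)]


frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(noise_estimate_from_first_frames(frames, 2))   # [2.0, 3.0]
print(noise_estimate_from_first_frames(frames, -1))  # [3.0, 4.0]
```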
class TFNetworkSigProcLayer.ParametricWienerFilterLayer(l_overwrite=None, p_overwrite=None, q_overwrite=None, filter_input=None, parameters=None, noise_estimation=None, average_parameters=False, **kwargs)[source]
Parameters:
  • l_overwrite (float|None) – if given, overwrites the l value of the parametric Wiener filter with the given constant
  • p_overwrite (float|None) – if given, overwrites the p value of the parametric Wiener filter with the given constant
  • q_overwrite (float|None) – if given, overwrites the q value of the parametric Wiener filter with the given constant
  • filter_input (LayerBase|None) – name of layer containing the input for the Wiener filter
  • parameters (LayerBase|None) – name of layer containing the parameters for the Wiener filter
  • noise_estimation (LayerBase|None) – name of layer containing the noise estimate for the Wiener filter
  • average_parameters (bool) – if set to true, the parameters l, p and q are averaged over the time axis
layer_class = 'parametric_wiener_filter'[source]
classmethod get_out_data_from_opts(**kwargs)[source]

Gets a Data template (i.e. shape etc is set but not the placeholder) for our __init__ args. The purpose of having this as a separate classmethod is to be able to infer the shape information without having to construct the layer. This function should not create any nodes in the computation graph.

Parameters:kwargs – all the same kwargs as for self.__init__()
Returns:Data template (placeholder not set)
Return type:Data
classmethod transform_config_dict(d, network, get_layer)[source]
Parameters:
  • d (dict[str]) – will modify inplace
  • network (TFNetwork.TFNetwork) –
  • get_layer ((str) -> LayerBase) – function to get or construct another layer. The name get_layer might be misleading, as this should return an existing layer, or construct it if it does not exist yet. network.get_layer would just return an existing layer.

Will modify d inplace such that it becomes the kwargs for self.__init__(). Mostly leaves d as-is. This is used by TFNetwork.construct_from_dict(). It resolves certain arguments, e.g. it resolves the “from” argument which is a list of strings, to make it the “sources” argument in kwargs, with a list of LayerBase instances. Subclasses can extend/overwrite this. Usually the only reason to overwrite this is when some argument might be a reference to a layer which should be resolved.
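
The resolution step described above can be sketched as follows. This is a toy model, not RETURNN's code: `StubLayer`, `make_get_layer` and the standalone `transform_config_dict` function are hypothetical, and the set of layer-valued arguments is taken from this class's parameter list.

```python
class StubLayer:
    """Hypothetical placeholder for a constructed layer."""

    def __init__(self, name):
        self.name = name


def make_get_layer(cache):
    """Build a get_layer function that returns or constructs layers."""
    def get_layer(name):
        if name not in cache:  # construct the layer on first request
            cache[name] = StubLayer(name)
        return cache[name]
    return get_layer


def transform_config_dict(d, get_layer,
                          layer_args=("filter_input", "parameters", "noise_estimation")):
    """Resolve layer names in d in place so d can serve as __init__ kwargs."""
    d["sources"] = [get_layer(name) for name in d.pop("from", [])]
    for key in layer_args:
        if d.get(key) is not None:
            d[key] = get_layer(d[key])


layers = {}
config = {"from": ["stft"], "filter_input": "stft",
          "noise_estimation": "noise", "parameters": None}
transform_config_dict(config, make_get_layer(layers))
print(sorted(config))  # ['filter_input', 'noise_estimation', 'parameters', 'sources']
```

After the call, every string that named a layer has been replaced by a layer object, and arguments left as None stay untouched.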

class TFNetworkSigProcLayer.SignalMaskingLayer(signal, mask, **kwargs)[source]
Parameters:
  • signal (LayerBase) – name of layer containing the signal to be masked
  • mask (LayerBase) – name of layer containing the mask
layer_class = 'signal_masking'[source]
classmethod transform_config_dict(d, network, get_layer)[source]
Parameters:
  • d (dict[str]) – will modify inplace
  • network (TFNetwork.TFNetwork) –
  • get_layer ((str) -> LayerBase) – function to get or construct another layer. The name get_layer might be misleading, as this should return an existing layer, or construct it if it does not exist yet. network.get_layer would just return an existing layer.

Will modify d inplace such that it becomes the kwargs for self.__init__(). Mostly leaves d as-is. This is used by TFNetwork.construct_from_dict(). It resolves certain arguments, e.g. it resolves the “from” argument which is a list of strings, to make it the “sources” argument in kwargs, with a list of LayerBase instances. Subclasses can extend/overwrite this. Usually the only reason to overwrite this is when some argument might be a reference to a layer which should be resolved.

class TFNetworkSigProcLayer.SplitConcatMultiChannel(nr_of_channels=1, **kwargs)[source]

This layer assumes the feature vector to be a concatenation of the features of multiple channels (all of the same size). It splits the feature dimension into nr_of_channels equally sized parts and stacks them in the batch dimension. Thus the batch size is multiplied by the number of channels and the feature size is divided by the number of channels. The channels of one signal have consecutive batch indices, meaning the signal of the original batch index n is split and can now be found in batch indices (n * nr_of_channels) to ((n+1) * nr_of_channels - 1).

Parameters:nr_of_channels (int) – the number of concatenated channels in the feature dimension
layer_class = 'split_concatenated_multichannel'[source]
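
The index mapping described above can be checked with a small plain-Python sketch. Each batch entry is modelled as a flat feature vector; the function name is hypothetical and the real layer works on tensors.

```python
def split_concatenated_multichannel(batch, nr_of_channels=1):
    """Move channels from the feature axis to the batch axis."""
    out = []
    for features in batch:
        if len(features) % nr_of_channels != 0:
            raise ValueError("feature size must be divisible by nr_of_channels")
        size = len(features) // nr_of_channels
        for c in range(nr_of_channels):
            # Channel c of batch entry n lands at index n * nr_of_channels + c.
            out.append(features[c * size:(c + 1) * size])
    return out


batch = [[1, 2, 3, 4], [5, 6, 7, 8]]  # 2 entries, 2 channels of size 2
print(split_concatenated_multichannel(batch, nr_of_channels=2))
# [[1, 2], [3, 4], [5, 6], [7, 8]]
```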
classmethod get_out_data_from_opts(name, sources, nr_of_channels, n_out=None, **kwargs)[source]

Gets a Data template (i.e. shape etc is set but not the placeholder) for our __init__ args. The purpose of having this as a separate classmethod is to be able to infer the shape information without having to construct the layer. This function should not create any nodes in the computation graph.

Parameters:kwargs – all the same kwargs as for self.__init__()
Returns:Data template (placeholder not set)
Return type:Data
class TFNetworkSigProcLayer.TileFeaturesLayer(repetitions=1, **kwargs)[source]

This layer tiles the features with the given number of repetitions.

Parameters:repetitions (int) – number of tiling repetitions in the feature domain
layer_class = 'tile_features'[source]
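
A small sketch of tiling on one feature vector, assuming the whole vector is repeated back to back (as tiling usually means) rather than each element being repeated individually. The function name is hypothetical.

```python
def tile_features(features, repetitions=1):
    """Repeat the whole feature vector `repetitions` times."""
    return list(features) * repetitions


print(tile_features([1, 2], repetitions=3))  # [1, 2, 1, 2, 1, 2]
```

Under this assumption the output feature dimension is the input dimension multiplied by repetitions.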
classmethod get_out_data_from_opts(name, sources, repetitions, n_out=None, **kwargs)[source]

Gets a Data template (i.e. shape etc is set but not the placeholder) for our __init__ args. The purpose of having this as a separate classmethod is to be able to infer the shape information without having to construct the layer. This function should not create any nodes in the computation graph.

Parameters:kwargs – all the same kwargs as for self.__init__()
Returns:Data template (placeholder not set)
Return type:Data