Defines the TFNetwork and ExternData.

class TFNetwork.ExternData(data=None, default_input='data', default_target='classes')[source]

This holds Data instances for every data-key of external data from the dataset, i.e. the description such as shape and sparsity, etc.

Parameters:data (None|dict[str,dict[str]]) – optional init kwargs for Data
init_from_config(self, config)[source]
Parameters:config (Config.Config) –
classmethod data_kwargs_from_dataset_key(dataset, key)[source]
Return type:


init_from_dataset(self, dataset)[source]
Parameters:dataset (Dataset.Dataset) –
check_matched_dataset(self, dataset, used_data_keys=None)[source]

Returns:nothing, will assert the check

register_data_from_dict(self, data)[source]
Parameters:data (dict[str,dict[str]]) – init kwargs for Data
register_data(self, data)[source]
Parameters:data (Data) – will use as the key
has_data(self, name)[source]
Parameters:name (str) –
Return type:bool
get_data(self, name)[source]
Parameters:name (str) –
Return type:Data
Return type:Data
Return type:Data
Returns:str describing the data
Return type:str
get_queue_args(self, with_batch_dim, fixed_batch_dim=None)[source]
  • with_batch_dim (bool) –
  • fixed_batch_dim (int|None) –

Returns:kwargs for tf.Queue.__init__

Return type:


Return type:list[(str,Data)]
get_all_dimension_tags(self, allow_same_feature_dim=False)[source]
Parameters:allow_same_feature_dim (bool) –
Return type:list[DimensionTag]
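
The registration and lookup methods above can be illustrated with a small stand-in. This is only a sketch, not the real class: MiniData and MiniExternData are hypothetical names, the "shape" and "sparse" kwargs are assumed examples of the description mentioned above, and keying by the name of the data object is an assumption as well.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class MiniData:
    """Hypothetical stand-in for Data: a name, a shape and a sparsity flag."""
    name: str
    shape: Tuple[Optional[int], ...] = ()
    sparse: bool = False


class MiniExternData:
    """Hypothetical stand-in for ExternData; not the real class."""

    def __init__(self, data=None, default_input="data", default_target="classes"):
        self.data: Dict[str, MiniData] = {}
        self.default_input = default_input
        self.default_target = default_target
        if data:
            self.register_data_from_dict(data)

    def register_data_from_dict(self, data):
        # data maps each data-key to the init kwargs of its description.
        for key, kwargs in data.items():
            self.register_data(MiniData(name=key, **kwargs))

    def register_data(self, data):
        # Assumption: the name of the data object serves as the key.
        self.data[data.name] = data

    def has_data(self, name):
        return name in self.data

    def get_data(self, name):
        return self.data[name]


extern_data = MiniExternData({
    "data": {"shape": (None, 40)},                  # (time, feature), time is dynamic
    "classes": {"shape": (None,), "sparse": True},  # one class index per frame
})
print(extern_data.has_data("classes"), extern_data.get_data("data").shape)
```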
class TFNetwork.TFNetwork(config=None, extern_data=None, rnd_seed=None, train_flag=False, eval_flag=False, search_flag=False, parent_layer=None, parent_net=None, extra_parent_net=None, is_inside_rec_layer=None, name=None)[source]

The main neural network, i.e. collection of interconnected layers, i.e. computation graph with trainable params.

  • config (Config.Config) – only needed to init extern_data if not specified explicitly
  • extern_data (ExternData|None) –
  • rnd_seed (int|None) –
  • train_flag (bool|tf.Tensor) – True if we want to use this model in training, False if in eval, or dynamic
  • eval_flag (bool) – whether to calculate losses. if train_flag is not False, this will be set to True
  • search_flag (bool) – whether we perform a beam-search. see usage
  • parent_layer (TFNetworkLayer.LayerBase|None) –
  • parent_net (TFNetwork|None) –
  • extra_parent_net (TFNetwork|None) –
  • is_inside_rec_layer (bool) – at template construction, use this
  • name (str) – only for debugging
Return type:TFNetwork
Returns:TF scope name, always with “/” at the end, or “”
Return type:str
Returns:name, always with “/” at the end, or “”
Return type:str
construct_from(self, list_or_dict)[source]
Parameters:list_or_dict (list[dict[str]] | dict[str,dict[str]]) –
construct_from_list(self, net_list)[source]
Parameters:net_list (list[dict[str]]) – list of layer descriptions
construct_from_dict(self, net_dict)[source]
Parameters:net_dict (dict[str,dict[str]]) –
construct_extra_net(self, net_dict, layer_list, search_flag=None, dep_layers_in_extra=False, net_name=None, prefix_name=None)[source]

The purpose is to create another net like self but with different flags, e.g. with search_flag = True. That extra_net can have different losses, which will be added. Layers in layer_list will be explicitly re-created in the extra net. Other layers are taken from self.

The creation of the extra net and layers in the extra net can be triggered explicitly by referring to another layer as e.g. "". When done this way, all the dependencies of it are created in self again; unless you explicitly have called another layer like "". See test_extra_search() for an example.

  • net_dict (dict[str,dict[str]]) –
  • layer_list (list[str]) –
  • search_flag (bool|None) –
  • dep_layers_in_extra (bool) – layers not in layer_list, but which are not yet created, will be part of the extra net, not self.
  • net_name (str|None) –
  • prefix_name (str|None) – e.g. “”, such that layers would be called like “”

Returns:the layers created via layer_list (all in extra net)

Return type:


construct_layer(self, net_dict, name, get_layer=None, add_layer=None, check_existing=True)[source]
  • net_dict (dict[str,dict[str]]) –
  • name (str) – layer name
  • get_layer (((str) -> LayerBase)|None) – optional, for source layers, for transform_config_dict. By default, this wraps self.construct_layer(). I.e. the name might be misleading, as this should return an existing layer, or construct it if it does not exist yet.
  • add_layer (((str, LayerBase, dict) -> LayerBase) | None) – by default self.add_layer
  • check_existing (bool) – check self.get_layer. (self.layers will be checked in any case)
Return type:


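The get_layer behaviour described above (return an existing layer, or construct it if it does not exist yet) can be sketched in plain Python. This is a simplified illustration, not the actual implementation: construct_all, DependencyLoopError and the "class" / "from" keys of the layer descriptions are assumptions made for the example.

```python
class DependencyLoopError(Exception):
    """Hypothetical stand-in for NetworkConstructionDependencyLoopException."""


def construct_all(net_dict):
    """Builds every layer of net_dict, constructing source layers on demand."""
    layers = {}        # finished layers, name -> record of what was built
    constructing = []  # names currently under construction, in call order

    def get_layer(name):
        # Returns an existing layer, or constructs it if it does not exist yet.
        if name in layers:
            return layers[name]
        if name in constructing:
            raise DependencyLoopError(
                "loop: " + " -> ".join(constructing + [name]))
        constructing.append(name)
        try:
            desc = net_dict[name]
            sources = [get_layer(src) for src in desc.get("from", [])]
            layers[name] = {"name": name, "class": desc["class"],
                            "sources": [s["name"] for s in sources]}
        finally:
            constructing.pop()
        return layers[name]

    for layer_name in net_dict:
        get_layer(layer_name)
    return layers


net_dict = {
    "output": {"class": "softmax", "from": ["hidden"]},
    "hidden": {"class": "linear", "from": ["input"]},
    "input": {"class": "copy", "from": []},
}
layers = construct_all(net_dict)
print(list(layers))  # sources are finished before the layers that use them
```

In this sketch, a description that refers back to a layer still under construction raises the stand-in exception instead of recursing forever.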
add_layer(self, name, layer_class, **layer_desc)[source]

This will construct the layer given the layer_desc arguments, and add it to the network.

  • name (str) –
  • layer_class ((()->LayerBase)|LayerBase) –
  • layer_desc – contains the kwargs for the layer class. the args should have been transformed via layer_class.transform_config_dict before (see construct_layer). must not contain “name” and “network”, which will be automatically added here. should not contain “output”, which will be initialized to layer_class.get_out_data_from_opts. the layer_class will usually then define the layer.output and its placeholder. there is one notable exception: the InternalLayer, where you predefine the output.
get_extern_data(self, key, mark_data_key_as_used=True)[source]

Returns Data and adds the key to self.used_data_keys if mark_data_key_as_used.

  • key (str) – e.g. “data” or “classes”
  • mark_data_key_as_used (bool) –
Return type:Data

get_used_data_keys(self, exclude_extra_added=True)[source]
Parameters:exclude_extra_added (bool) –
Return type:set[str]
get_seq_tags(self, mark_data_key_as_used=True)[source]
Parameters:mark_data_key_as_used (bool) – for extern_data
Returns:tensor of shape (batch,) of dtype string, via extern_data
Return type:tf.Tensor
get_losses_initialized(self, reduce_func=None, with_total=False)[source]
  • reduce_func (((tf.Tensor)->tf.Tensor)|None) – as in get_losses. e.g. TFUtil.identity
  • with_total (bool) – whether to return total loss / constraints

Returns:loss name (e.g. “output” or “rec_layer/output” or so) -> LossHolder (initialized, i.e. layer set), and optionally total loss and total constraints (if with_total)

Return type:

(dict[str,LossHolder], tf.Tensor|int|None, tf.Tensor|int|None)


Construct self.total_object.

Return type:int|tf.Tensor
Returns:0 if no loss, or tf.Tensor, scalar. loss + constraints. will be used for the updater.
Return type:int|tf.Tensor
Returns:0 if no loss, or tf.Tensor, scalar. without constraints. will be used for the updater
Return type:int|tf.Tensor
Returns:0 if no constraints, or tf.Tensor, scalar. will be used for the updater
get_fetches_dict(self, config=None, should_train=None, should_eval=None, with_summary=False, with_size=False)[source]
  • config (Config.Config|None) –
  • should_train (bool|None) –
  • should_eval (bool|None) –
  • with_summary (bool) –
  • with_size (bool) –

Returns:values and actions which should be calculated and executed by the TF session for each step

Return type:


Returns:sorted list of targets
Return type:list[str]
Returns:e.g. “classes”
Return type:str
Return type:list[LayerBase]
Return type:str|None
Returns:default output layer name if there is one, or None
get_default_output_layer(self, must_exist=True)[source]
Parameters:must_exist (bool) – if it does not exist, will raise an exception
Return type:LayerBase|None
Returns:the default output layer
get_layer(self, layer_name)[source]

Normally just self.layers[layer_name] but with some extra logic added, such as resolving “base:” prefix to the parent network. Raises LayerNotFound if the layer is not found.

Parameters:layer_name (str) –
Return type:LayerBase
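
As a rough illustration of the lookup described above, the following sketch resolves a “base:” prefix by delegating to the parent network and raises an exception when nothing is found. MiniNet is a hypothetical class and the string values only stand in for layer objects; the real method has more logic than shown here.

```python
class LayerNotFound(Exception):
    """Stand-in for the exception of the same name documented in this module."""


class MiniNet:
    """Hypothetical, simplified network: a dict of layers plus an optional parent."""

    def __init__(self, layers, parent_net=None):
        self.layers = dict(layers)
        self.parent_net = parent_net

    def get_layer(self, layer_name):
        if layer_name.startswith("base:"):
            if self.parent_net is None:
                raise LayerNotFound("no parent network for %r" % layer_name)
            # Strip the prefix and continue the lookup in the parent network.
            return self.parent_net.get_layer(layer_name[len("base:"):])
        if layer_name not in self.layers:
            raise LayerNotFound("layer %r not found" % layer_name)
        return self.layers[layer_name]


parent = MiniNet({"encoder": "<encoder layer>"})
subnet = MiniNet({"output": "<output layer>"}, parent_net=parent)
print(subnet.get_layer("base:encoder"))
```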
Returns:list of model variables, i.e. from all the layers, excluding auxiliary vars like global_step
Return type:list[tf.Variable]
Returns:params and saveable_param_replace resolved, union of all layers
Return type:dict[tf.Variable,]
Returns:list of model variables or SaveableObject, to save/restore
Return type:list[tf.Variable|]
Returns:list of variables
Return type:list[tf.Variable]
declare_train_params(self, hidden_layer_selection=None, with_output=None)[source]
  • hidden_layer_selection (list[str]|None) –
  • with_output (bool|None) –
Returns:number of model parameters, i.e. total dimension
Return type:int
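
To make “total dimension” concrete: assuming it means the sum of the element counts of all parameter tensors, it could be computed from the shapes as below. The helper count_params, the layer names and the parameter names "W" and "b" are made up for this sketch.

```python
import math


def count_params(param_shapes):
    """Sums the element counts over layer_name -> param_name -> shape."""
    return sum(
        math.prod(shape)
        for params in param_shapes.values()
        for shape in params.values()
    )


shapes = {
    "hidden": {"W": (40, 128), "b": (128,)},
    "output": {"W": (128, 10), "b": (10,)},
}
print(count_params(shapes))  # 40*128 + 128 + 128*10 + 10 = 6538
```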
initialize_params(self, session)[source]
Parameters:session (tf.Session) –

Note: This adds a new node to the graph on each call, and it also overwrites variables that are already initialized. So you should call this only once, after network construction and before you load any params from external sources. If you know that you will load all params explicitly, you do not need to call this function.

get_var_assigner(self, var)[source]
Parameters:var (tf.Variable) –
get_param_values_dict(self, session)[source]
Parameters:session (tf.Session) –
Returns:dict: layer_name -> param_name -> variable numpy array
Return type:dict[str,dict[str,numpy.ndarray]]

Note that this excludes auxiliary params.

set_param_values_by_dict(self, values_dict, ignore_non_existing=False, **kwargs)[source]
  • values_dict (dict[str,dict[str,numpy.ndarray]]) –
  • ignore_non_existing (bool) –
  • kwargs – passed to LayerBase.set_param_values_by_dict()

Note that this excludes auxiliary params.

Return type:list[tf.Variable]
get_params_serialized(self, session)[source]
Parameters:session (tf.Session) –
Return type:TFNetworkParamsSerialized
set_params_by_serialized(self, serialized, session, **kwargs)[source]
set_global_train_step(self, step, session)[source]
  • step (int) –
  • session (tf.Session) –
get_global_train_step(self, session)[source]
Parameters:session (tf.Session) –
Return type:int
Return type:tf.Tensor

Resets the tf.train.Saver object which will be used for load_params_from_file() and save_params_to_file(). Warning: Don’t repeat that too often as it will always create new ops in the computation graph.

save_params_to_file(self, filename, session)[source]

Will save the model parameters to the filename. Note that the model parameters live inside the current TF session.

  • filename (str) –
  • session (tf.Session) –
load_params_from_file(self, filename, session)[source]

Will load the model parameters from the filename. Note that the model parameters live inside the current TF session.

  • filename (str) –
  • session (tf.Session) –
print_network_info(self, name='Network')[source]
Parameters:name (str) –
Returns:nothing, prints very brief net topology on log
cond_on_train(self, fn_train, fn_eval)[source]

Uses fn_train() or fn_eval() based on self.train_flag. It will be a branched evaluation.

  • fn_train (()->tf.Tensor) –
  • fn_eval (()->tf.Tensor) –

Returns:fn_train() if self.train_flag else fn_eval()

Return type:


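The branching can be sketched for the simplest case, where the flag is a plain Python bool. The dynamic case (a tensor-valued flag) is deliberately left out, and the function and helper names below are only illustrative.

```python
def cond_on_train(train_flag, fn_train, fn_eval):
    """Static-flag sketch: calls exactly one of the two functions."""
    if not isinstance(train_flag, bool):
        raise TypeError("this sketch only handles a Python bool train_flag")
    return fn_train() if train_flag else fn_eval()


calls = []


def with_dropout():
    calls.append("train")
    return "output with dropout"


def without_dropout():
    calls.append("eval")
    return "output without dropout"


result = cond_on_train(False, with_dropout, without_dropout)
print(result, calls)
```

Only the selected function runs, so the other branch creates nothing.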
get_search_choices(self, sources=None, src=None, base_search_choice=None, _visited=None, debug_stream=None)[source]

Recursively searches through all sources, and if there is a ChoiceLayer / any layer with search_choices, returns it. Could also go to the parent network. If there are multiple, it assumes they are on the same search-sequence in the search-tree and it will return the last one.

  • src (LayerBase|None) –
  • base_search_choice (LayerBase|None) –
  • sources (list[LayerBase]|None) –
  • _visited (dict[LayerBase]|None) – keeps track of visited layers in case there are circular deps
  • debug_stream (typing.TextIO|None) – if given, will print additional debug info into it

Returns:(direct or indirect) source LayerBase which has search_choices, or None

Return type:


debug_search_choices(self, base_search_choice)[source]
Parameters:base_search_choice (LayerBase) –
Returns:nothing, by intention, such that constructs like assert …, debug_search_choices(…) or (…) work
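
One reading of the note above is the following Python idiom: the message expression of an assert is only evaluated when the condition fails, and a helper that returns None can be chained with "or" in front of the actual message. The helper debug_dump and the state dict are hypothetical and only serve this sketch.

```python
import io


def debug_dump(state, stream):
    """Writes debug info to stream and deliberately returns None."""
    stream.write("state: %r\n" % (state,))


def check(state, stream):
    # The message expression is only evaluated when the condition is false.
    # Because debug_dump() returns None, `None or "..."` yields the message.
    assert state["ok"], debug_dump(state, stream) or "unexpected state"


log = io.StringIO()
check({"ok": True}, log)  # passes, nothing is written to the log

try:
    check({"ok": False}, log)
except AssertionError as exc:
    message = str(exc)

print(message)
print(log.getvalue(), end="")
```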

Get the batch-dim size, i.e. amount of sequences in the current batch. Consider that the data tensor is usually of shape [batch, time, dim], this would return shape(data)[0].

The code currently assumes that the batch-dim can be taken from the extern data. If it does not have that available for some reason (e.g. some subnetwork), it will try some alternative sources and assumes that they have the correct batch-dim.

Note that the batch-dim usually stays the same across the whole network, and every individual batch sequence stays related. One notable exception is the choice layer, where the batch-dim gets expanded by the beam search if search is used, as well as in all following layers, until there is a decide layer.

Returns:int scalar tensor which states the batch-dim
Return type:int|tf.Tensor
set_rec_step_info(self, i, end_flag=None, end_flag_source=None, seq_lens=None)[source]

Used by _SubnetworkRecCell.

  • i (tf.Tensor) – scalar, int32, current step (time)
  • end_flag (tf.Tensor|None) – (batch,), bool, says that the current sequence has ended
  • end_flag_source (LayerBase|None) –
  • seq_lens (tf.Tensor|None) – (batch,) int32, seq lens
Returns:whether we are inside a RecLayer. see get_rec_parent_layer()
Return type:bool
Returns:if we are a subnet of a RecLayer, will return the RecLayer instance
Return type:TFNetworkRecLayer.RecLayer|None
Return type:bool
get_rec_step_info(self, must_exist=True)[source]
Parameters:must_exist (bool) – if True, will throw exception if not available
Return type:TFNetworkRecLayer.RecStepInfoLayer|None

Assumes that have_rec_step_info is True.

Return type:tf.Tensor
Returns:scalar, int32
get_config(self, consider_global_config=True, fallback_dummy_config=True)[source]
  • consider_global_config (bool) – if no config is set, check for global config
  • fallback_dummy_config (bool) – if no config, return a new empty Config, otherwise return None
Return type:


static register_post_control_dependencies(deps)[source]

Registers the control dependencies globally for a session run on this network. This can e.g. be called inside self.post_init. We use UPDATE_OPS, as that is also e.g. used by batchnorm. See:

Parameters:deps (list[tf.Tensor|tf.Operation]) –
static get_post_control_dependencies()[source]
Return type:list[tf.Operation]
classmethod get_network_stack()[source]
Return type:list[TFNetwork]
classmethod get_current_network(must_exist=True)[source]
Parameters:must_exist (bool) –
Return type:TFNetwork|None

Registers a ref to this network inside the current TF computation graph.

class TFNetwork.TFNetworkParamsSerialized(values_dict, global_train_step)[source]

Holds all the params as numpy arrays, including auxiliary params.

  • values_dict (dict[str,dict[str,numpy.ndarray]]) – dict: layer_name -> param_name -> variable numpy array
  • global_train_step (int) –
class TFNetwork.LossHolder(name, loss, layer_output, reduce_func=None, layer=None, loss_value=None, error_value=None, norm_factor=None, only_on_eval=None, network=None)[source]

This object just keeps a reference to the loss/error value, and does the necessary logic to collect it, and also the normalization logic. Every new computation (nodes in the computation graph) must be constructed on demand, so that all possible losses can first be collected without calculating them, and then calculated in the right context (e.g. inside a while_loop, or so).

After construction, you should call init() before usage, in case you do not provide layer here.

  • name (str) – The name uniquely identifies the loss. Earlier, this was the same as the layer name. This is still true for simple cases, but for losses coming from a subnetwork or other extended losses, it can be something else. It could look like “output”, or “output/sublayer”.
  • layer (LayerBase) – We can always point to a layer where this comes from (either in the subnet, or the parent layer).
  • layer_output (Data) – template describing the layer output
  • network (TFNetwork) – for which network to create this LossHolder. might be different from
  • loss (TFNetworkLayer.Loss) –
  • reduce_func (((tf.Tensor)->tf.Tensor)|None) – if given, will overwrite the reduce func for the loss. By default, every loss_value and error_value is a scalar (sum or average over the batches, and over the frames for frame-wise losses). However, if you provide reduce_func = TFUtil.identity, you can get the unreduced tensor.
  • loss_value (tf.Tensor|None) –
  • error_value (tf.Tensor|None) –
  • norm_factor (tf.Tensor) –
  • only_on_eval (bool) –
init(self, layer)[source]

It will just set the layer. The LossHolder is initialized if the layer is set.

Parameters:layer (LayerBase) –
Return type:LossHolder
Returns:layer. assumes that it is set
Return type:LayerBase
Returns:only_on_eval flag. assumes that it is set
Return type:bool
Returns:name which can be used for a TF op, thus contains no “/” or other special chars
Return type:str
Returns:loss value. scalar
Return type:tf.Tensor|None
Returns:loss value for fetch. scalar. same as loss_value, but maybe with additional checks
Return type:tf.Tensor|None
Returns:loss value for objective. scalar. might be scaled (scale) and/or normalized (use_normalized_loss)
Return type:tf.Tensor|None
Returns:error value for fetch. scalar
Return type:tf.Tensor|None
Returns:norm factor for loss and error. scalar
Return type:tf.Tensor
get_normalized_loss_value_per_seq(self, per_pos=False)[source]
Parameters:per_pos (bool) – one value per time position
Returns:if per_pos return (batch,time) else (batch,) or None if loss is None
Return type:tf.Tensor|None
get_normalized_error_value_per_seq(self, per_pos=False)[source]
Parameters:per_pos (bool) – one value per time position
Returns:if per_pos return (batch,time) else (batch,) or None if error is None
Return type:tf.Tensor|None
copy_new_base(self, name=None, layer=None, network=None, reduce_func=None)[source]
  • layer (LayerBase) –
  • network (TFNetwork) –
  • name (str) –
  • reduce_func (((tf.Tensor)->tf.Tensor)|None) –

Returns:new copy of LossHolder

Return type:


exception TFNetwork.NetworkConstructionDependencyLoopException(network, layer_name, constructing_layers, net_dict)[source]

This is raised when there is a dependency loop in the network construction.

  • network (TFNetwork) –
  • layer_name (str) –
  • constructing_layers (list[str]) –
  • net_dict (dict[str,dict[str]]) –
exception TFNetwork.LayerNotFound[source]

Via TFNetwork.get_layer().

TFNetwork.help_on_tf_exception(exception, fetches, feed_dict, meta_step_info, extern_data, file=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]
  • exception (tf.errors.OpError|BaseException) –
  • fetches (tf.Tensor|list[tf.Tensor]|dict[str,tf.Tensor]|None) –
  • feed_dict (dict[tf.Tensor,numpy.ndarray]) –
  • meta_step_info (dict[str]) –
  • extern_data (ExternData) –
  • file (typing.IO[str]) –
class TFNetwork.CustomCheckpointLoader(filename, saveable_params, params_prefix='', load_if_prefix='', ignore_missing=False, network=None)[source]

This uses tf.train.NewCheckpointReader.

It performs automatic conversions if needed, e.g. between different LSTM implementations. However, be careful: some LSTM implementations have an additional forget_bias option, a scalar which gets added (not to the param, but to the forget value directly). When we convert the parameters, this is ignored, and you must take care of that explicitly to make sure you get the same results.

It tries to automatically resolve renames, similar to this:

Also see:

  • filename (str) – filepattern for NewCheckpointReader
  • saveable_params (list[tf.Variable|]) –
  • params_prefix (str) – expect that all vars in saveable_params have this prefix, and remove it
  • load_if_prefix (str) – if given, only load variables with a name containing this string. the variables in the file are expected to have the same name but without this string.
  • ignore_missing (bool) – any vars in the model which are not found in the checkpoint will be ignored. however, if not a single var is found in the checkpoint, this is still an error.
  • network (TFNetwork) –
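
How params_prefix, load_if_prefix and ignore_missing might interact can be sketched with plain strings. This is a reading of the parameter descriptions above, not the loader's code: resolve_checkpoint_names is a hypothetical helper, the variable names are invented, and removing only the first occurrence of load_if_prefix is an assumption.

```python
def resolve_checkpoint_names(var_names, checkpoint_names, params_prefix="",
                             load_if_prefix="", ignore_missing=False):
    """Maps model variable names to names in a checkpoint (plain strings only)."""
    mapping = {}
    missing = []
    for full_name in var_names:
        if not full_name.startswith(params_prefix):
            raise ValueError("%r lacks prefix %r" % (full_name, params_prefix))
        name = full_name[len(params_prefix):]
        if load_if_prefix:
            if load_if_prefix not in name:
                continue  # this variable is not selected for loading
            # The checkpoint is expected to use the name without this string.
            name = name.replace(load_if_prefix, "", 1)
        if name in checkpoint_names:
            mapping[full_name] = name
        else:
            missing.append(full_name)
    # Missing vars are an error unless ignored; finding nothing is always an error.
    if missing and (not ignore_missing or not mapping):
        raise KeyError("not in checkpoint: %s" % ", ".join(missing))
    return mapping


mapping = resolve_checkpoint_names(
    ["net/pretrained/enc/W", "net/pretrained/enc/b", "net/dec/W"],
    checkpoint_names={"enc/W", "enc/b"},
    params_prefix="net/",
    load_if_prefix="pretrained/",
)
print(mapping)
```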
class CustomParamImporter(layer, checkpoint_loader)[source]

Helper class for custom param loading.

assign_var(self, var, session)[source]
  • var (tf.Variable) –
  • session (tf.Session) –
class VariableValue(value=None, custom_param_importer=None)[source]

Helper to assign some variable.

assign_var(self, var, session)[source]
  • var (tf.Variable) –
  • session (tf.Session) –
Returns:var -> numpy array
Return type:dict[tf.Variable,CustomCheckpointLoader.VariableValue]
load_now(self, session)[source]
Parameters:session (tf.Session) –
Returns:nothing, will assign the variables in the session

Make sure that this loader is used during initialization.

TFNetwork.set_custom_post_init(var, func)[source]

It registers the provided func such that it gets called for this variable in TFNetwork.initialize_params().

  • var (tf.Variable) –
  • func ((tf.Session)->None) –
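
The registration pattern can be sketched without TensorFlow. Everything below is a stand-in: variables are identified by name, a plain dict plays the role of the session, and the registry custom_post_init is an assumption about how such a hook could be stored, not a description of the actual mechanism.

```python
custom_post_init = {}  # hypothetical registry: variable name -> function


def set_custom_post_init(var_name, func):
    """Registers func to run for this variable during initialization."""
    custom_post_init[var_name] = func


def initialize_params(var_names, session):
    """Default-initializes every variable, then runs the registered functions."""
    for name in var_names:
        session[name] = 0.0
        func = custom_post_init.get(name)
        if func is not None:
            func(session)


def load_pretrained_w(session):
    session["W"] = 1.5  # stands in for assigning a value loaded elsewhere


set_custom_post_init("W", load_pretrained_w)
session = {}  # a plain dict stands in for the TF session holding variable values
initialize_params(["W", "b"], session)
print(session)
```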