returnn.tf.network¶
Defines the TFNetwork and ExternData.
- exception returnn.tf.network.DataNotFound[source]¶
Raised when accessing a non-existing data key in ExternData (e.g. extern_data).
- class returnn.tf.network.ExternData(data=None, default_input='data', default_target='classes')[source]¶
This holds Data instances for every data-key of external data from the dataset, i.e. the description such as shape and sparsity, etc. It is usually defined by a user config. See init_from_config().
- Parameters:
data (None|dict[str,dict[str]]) – optional init kwargs for Data
- init_from_config(config, auto_create_placeholders=False, reset_batch=True)[source]¶
It reads extern_data from the config, which defines the Data instance options to be created.
- Parameters:
config (returnn.config.Config)
auto_create_placeholders (bool)
reset_batch (bool)
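A minimal usage sketch (the extern_data keys and dimensions here are made-up examples, and Config is assumed to accept the settings dict directly): the config defines extern_data, and init_from_config() creates the corresponding Data descriptions.

    from returnn.config import Config
    from returnn.tf.network import ExternData

    # Hypothetical config: dense 40-dim input features, sparse targets with 10 classes.
    config = Config({
        "extern_data": {
            "data": {"dim": 40},                     # default input
            "classes": {"dim": 10, "sparse": True},  # default target
        },
    })

    extern_data = ExternData()
    extern_data.init_from_config(config)  # creates Data templates for "data" and "classes"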
- init_from_dataset(dataset, auto_create_placeholders=True)[source]¶
- Parameters:
dataset (returnn.datasets.Dataset)
auto_create_placeholders (bool)
- init_batch_info()[source]¶
Initializes and sets the batch info on the extern data, i.e. sets Data.batch. See BatchInfo.
- check_matched_dataset(dataset, used_data_keys=None)[source]¶
- Parameters:
dataset (Dataset.Dataset)
used_data_keys (set[str]|list[str])
- Returns:
nothing, will assert the check
- get_all_dimension_tags(allow_same_feature_dim=False)[source]¶
- Parameters:
allow_same_feature_dim (bool)
- Return type:
list[Dim]
- set_batch_info(batch_info, *, init_batch_info: bool = True)[source]¶
- Parameters:
batch_info (returnn.tf.util.data.BatchInfo)
init_batch_info – calls init_batch_info(), which might further initialize/modify the batch info
- class returnn.tf.network.TFNetwork(config=None, extern_data=None, rnd_seed=None, train_flag=None, eval_flag=None, search_flag=None, parent_layer=None, parent_net=None, extra_parent_net=None, extra_name_prefix=None, inside_rec_time_dim=None, over_rec_time_dim=None, over_rec_time_dim_subs=None, control_flow_ctx=None, absolute_name_prefix=None, name='')[source]¶
The main neural network, i.e. collection of interconnected layers, i.e. computation graph with trainable params.
- Parameters:
config (returnn.config.Config) – only needed to init extern_data if not specified explicitly
extern_data (ExternData|None)
rnd_seed (int|None)
train_flag (bool|tf.Tensor) – True if we want to use this model in training, False if in eval, or dynamic
eval_flag (bool) – whether to calculate losses. if train_flag is not False, this will be set to True
search_flag (bool) – whether we perform a beam-search. see usage
parent_layer (returnn.tf.layers.base.LayerBase|None)
parent_net (TFNetwork|None)
extra_parent_net (TFNetwork|None) – we are on the same level (not really a child), but an “extra” net of extra_parent_net
extra_name_prefix (str|None)
inside_rec_time_dim (Dim|None) – dim tag of outer rec layer, when run inside the loop (not optimized)
over_rec_time_dim (Dim|None) – dim tag of outer rec layer, when optimized out of the loop
over_rec_time_dim_subs (set[Dim]|None) – outer rec layer, out of loop, potentially shorter
control_flow_ctx (returnn.tf.util.data.ControlFlowContext)
absolute_name_prefix (str|None) – this is for representation
name (str) – only for debugging
- get_root_ctx_network()[source]¶
- Returns:
in contrast to get_root_network(), stop where we have is_root_in_ctx set, and return that network, together with the prefix
- Return type:
(TFNetwork, str)
- get_absolute_name_scope_prefix()[source]¶
- Returns:
TF scope name, always with “/” at the end, or “”
- Return type:
str
- get_absolute_name_prefix()[source]¶
- Returns:
name, always with “/” at the end, or “”. This is for representation. See also get_absolute_name_scope_prefix().
- Return type:
str
- construct_from_dict(net_dict: Dict[str, Dict[str, Any]], get_layer=None)[source]¶
- Parameters:
net_dict
get_layer (GetLayer|((str)->LayerBase)|None)
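A minimal sketch of constructing a network from a net dict (the layer names and options below are made-up examples):

    net = TFNetwork(extern_data=extern_data, train_flag=True)
    net.construct_from_dict({
        "hidden": {"class": "linear", "activation": "relu", "n_out": 100, "from": "data"},
        "output": {"class": "softmax", "from": "hidden", "loss": "ce", "target": "classes"},
    })
    output_layer = net.get_default_output_layer()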
- make_extra_net(prefix_name, net_name=None, only_template=False, boundary=False)[source]¶
- With boundary=False, it is accessible from outside via the “extra…:” layer name prefix, and registered in main_net.extra_nets.
- With boundary=True, it is not accessible from outside, and not registered in main_net.extra_nets.
- Parameters:
prefix_name (str) – “extra.Whatever”
net_name (str|None)
only_template (bool)
boundary (bool)
- Return type:
- construct_extra_net(net_dict, layer_list, search_flag=None, dep_layers_in_extra=False, check_existing=False, net_name=None, prefix_name=None, base_get_layer=None, base_add_layer=None)[source]¶
The purpose is to create another net like self but with different flags, e.g. with search_flag = True. That extra_net can have different losses, which will be added. Layers in layer_list will be explicitly re-created in the extra net. Other layers are taken from self. An extra net is like an overlay over the main net.
The creation of the extra net and layers in the extra net can be triggered explicitly by referring to another layer as e.g. "extra.search:layer". When done this way, all its dependencies are created in self again, unless you explicitly have called another layer like "extra.search:dep". See test_extra_search() for an example.
- Parameters:
net_dict (dict[str,dict[str]])
layer_list (list[str])
search_flag (bool|None)
dep_layers_in_extra (bool) – layers not in layer_list, but which are not yet created, will be part of the extra net, not self.
check_existing (bool)
net_name (str|None)
prefix_name (str|None) – e.g. “extra.search”, such that layers would be called like “extra.search:layer”
base_get_layer – like in construct_layer
base_add_layer – like in construct_layer
- Returns:
the layers created via layer_list (all in extra net)
- Return type:
list[LayerBase]
- construct_layer(net_dict: Dict[str, Dict[str, Any]], name: str, get_layer=None, add_layer=None, check_existing: bool = True) LayerBase [source]¶
This triggers the construction of the layer name if it is not constructed yet. Every construction trigger corresponds to an add_layer call (which by default does the actual construction). This can recursively also get/construct other layers (via get_layer).
- Parameters:
net_dict
name – layer name
get_layer (GetLayer|((str)->LayerBase)|None) –
optional, for source layers, for transform_config_dict. By default, this wraps self.construct_layer(). I.e. the name might be misleading, as this should return an existing layer, or construct it if it does not exist yet.
- Note on custom nested/wrapped get_layer:
This is tricky. When an outer get_layer calls an inner get_layer, then the inner get_layer might construct the layer, and this construction can never get back to the outer get_layer again. This is fine when this is anyway not allowed (e.g. for “base:…”, where the base net is not allowed to access this parent net). But otherwise, this is not an option!
add_layer (((str, LayerBase, dict) -> LayerBase) | None) – by default self.add_layer
check_existing – check self.get_layer. (self.layers will be checked in any case)
- Returns:
layer
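A sketch of a custom get_layer wrapper (net_dict and the layer names are assumed to exist); by default such a wrapper should fall back to construct_layer() so that missing dependencies still get built:

    def wrapped_get_layer(layer_name):
        # Custom redirection could go here, e.g. mapping some source names to other layers.
        # Otherwise fall back to the default: return an existing layer, or construct it.
        return net.construct_layer(net_dict, layer_name, get_layer=wrapped_get_layer)

    layer = net.construct_layer(net_dict, "output", get_layer=wrapped_get_layer)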
- add_layer(name, layer_class, **layer_desc)[source]¶
This will construct the layer given the layer_desc arguments, and add it to the network.
- Parameters:
name (str)
layer_class ((()->LayerBase)|LayerBase)
layer_desc – contains the kwargs for the layer class. The args should have been transformed via layer_class.transform_config_dict before (see construct_layer). It must not contain “name” and “network”, which will be added automatically here. It should not contain “output”, which will be initialized to layer_class.get_out_data_from_opts. The layer_class will usually then define the layer.output and its placeholder. There is one notable exception: the InternalLayer, where you predefine the output.
- get_extern_data(key, mark_data_key_as_used=True)[source]¶
Returns the Data object and adds the key to self.used_data_keys if mark_data_key_as_used.
- Parameters:
key (str) – e.g. “data” or “classes”
mark_data_key_as_used (bool)
- Return type:
Data
- get_used_data_keys(exclude_extra_added=True)[source]¶
- Parameters:
exclude_extra_added (bool)
- Return type:
set[str]
- get_seq_tags(mark_data_key_as_used=True, beam=None)[source]¶
- Parameters:
mark_data_key_as_used (bool) – for extern_data
beam (returnn.tf.util.data.SearchBeam|None)
- Returns:
tensor of shape (batch,) of dtype string, via extern_data
- Return type:
tf.Tensor
- get_losses_initialized(reduce_func=None, with_total=False)[source]¶
- Parameters:
reduce_func (((tf.Tensor)->tf.Tensor)|None) – as in get_losses. e.g. TFUtil.identity
with_total (bool) – whether to return total loss / constraints
- Returns:
loss name (e.g. “output” or “rec_layer/output” or so) -> LossHolder (initialized, i.e. layer set), and optionally total loss and total constraints (if with_total)
- Return type:
(dict[str,LossHolder], tf.Tensor|int|None, tf.Tensor|int|None)
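A small sketch of inspecting the collected losses (assuming a constructed network with at least one loss):

    losses, total_loss, total_constraints = net.get_losses_initialized(with_total=True)
    for name, holder in losses.items():
        # Each LossHolder lazily creates the actual loss value in the right context.
        print(name, holder.get_loss_value_for_fetch())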
- get_objective()[source]¶
- Return type:
int|tf.Tensor
- Returns:
0 if no loss, or tf.Tensor, scalar. loss + constraints. will be used for the updater.
- get_total_loss()[source]¶
- Return type:
int|tf.Tensor
- Returns:
0 if no loss, or tf.Tensor, scalar. without constraints. will be used for the updater
- get_total_constraints()[source]¶
- Return type:
int|tf.Tensor
- Returns:
0 if no constraints, or tf.Tensor, scalar. will be used for the updater
- get_fetches_dict(config=None, should_train=None, should_eval=None, with_summary=False, with_size=False, horovod_collected_reduce_inputs=None)[source]¶
- Parameters:
config (returnn.config.Config|None)
should_train (bool|None)
should_eval (bool|None)
with_summary (bool)
with_size (bool)
horovod_collected_reduce_inputs (dict[str,(tf.Tensor,tf.Tensor)]|None) – will write into. see below
- Returns:
values and actions which should be calculated and executed in self.run() by the TF session for each step
- Return type:
dict[str,tf.Tensor|tf.Operation]
- get_default_output_layer_name()[source]¶
- Return type:
str|None
- Returns:
default output layer name if there is one, or None
- get_default_output_layer(must_exist=True)[source]¶
- Parameters:
must_exist (bool) – if it does not exist, will raise an exception
- Return type:
LayerBase|None
- Returns:
the default output layer
- get_layer(layer_name)[source]¶
Normally just self.layers[layer_name] but with some extra logic added, such as resolving “base:” prefix to the parent network. Raises LayerNotFound if the layer is not found.
- Parameters:
layer_name (str)
- Return type:
- get_all_layers_shallow()[source]¶
- Returns:
layers, including extra net, not including sub layers
- Return type:
list[LayerBase]
- get_all_layers_deep()[source]¶
- Returns:
all layers, including extra net, including sub layers. duplicates are made unique. It might exclude internal layers. We ensure that layers are unique by their absolute name.
- Return type:
list[LayerBase]
- get_params_list()[source]¶
- Returns:
list of model variables, i.e. from all the layers, excluding auxiliary vars like global_step
- Return type:
list[tf.Variable]
- get_saveable_param_replace_dict()[source]¶
- Returns:
params and saveable_param_replace resolved, union of all layers
- Return type:
dict[tf.Variable,tensorflow.python.training.saver.BaseSaverBuilder.SaveableObject]
- get_saveable_params_list()[source]¶
- Returns:
list of model variables or SaveableObject, to save/restore
- Return type:
list[tf.Variable|tensorflow.python.training.saver.BaseSaverBuilder.SaveableObject]
- declare_train_params(hidden_layer_selection=None, with_output=None, global_trainable=None)[source]¶
- Parameters:
hidden_layer_selection (list[str]|None)
with_output (bool|None)
global_trainable (bool|None)
- get_num_params()[source]¶
- Returns:
number of model parameters, i.e. total dimension
- Return type:
int
- initialize_params(session)[source]¶
- Parameters:
session (tf.compat.v1.Session)
Note: This will add new nodes to the graph for each call! It will also overwrite already initialized variables. So you should call this only once after network construction and before you maybe load some of the params from external sources. If you know that you will load all params explicitly, you do not need to call this function.
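A minimal sketch of the intended call order (the checkpoint filename is hypothetical):

    import tensorflow as tf

    with tf.compat.v1.Session() as session:
        net.initialize_params(session)  # call once, right after network construction
        # Optionally overwrite (some of) the params afterwards, e.g. from a checkpoint:
        net.load_params_from_file("prev-model.080", session)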
- get_param_values_dict(session)[source]¶
- Parameters:
session (tf.compat.v1.Session)
- Returns:
dict: layer_name -> param_name -> variable numpy array
- Return type:
dict[str,dict[str,numpy.ndarray]]
Note that this excludes auxiliary params.
- set_param_values_by_dict(values_dict, ignore_non_existing=False, **kwargs)[source]¶
- Parameters:
values_dict (dict[str,dict[str,numpy.ndarray]])
ignore_non_existing (bool)
kwargs – passed to LayerBase.set_param_values_by_dict()
Note that this excludes auxiliary params.
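A sketch of a roundtrip through numpy (session and network are assumed to be set up; the session is forwarded via the kwargs to the per-layer setter):

    values = net.get_param_values_dict(session)  # layer_name -> param_name -> numpy array
    # ... e.g. store them, average them with another model, or modify them ...
    net.set_param_values_by_dict(values, session=session)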
- set_params_by_serialized(serialized, session, **kwargs)[source]¶
- Parameters:
serialized (TFNetworkParamsSerialized)
session (tf.compat.v1.Session)
kwargs – passed to set_param_values_by_dict()
- reset_saver()[source]¶
Resets the tf.train.Saver object which will be used for load_params_from_file() and save_params_to_file(). Warning: Don’t repeat that too often as it will always create new ops in the computation graph.
- save_params_to_file(filename, session)[source]¶
Will save the model parameters to the filename. Note that the model parameters live inside the current TF session.
- Parameters:
filename (str)
session (tf.compat.v1.Session)
- load_params_from_file(filename, session)[source]¶
Will load the model parameters from the filename. Note that the model parameters live inside the current TF session.
- Parameters:
filename (str)
session (tf.compat.v1.Session)
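A sketch of a save/load roundtrip (the filename is arbitrary); both calls operate on the variables inside the given TF session:

    net.save_params_to_file("net-params.001", session)
    # ... later, with a network built from the same definition ...
    net.load_params_from_file("net-params.001", session)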
- print_network_info(name='Network')[source]¶
- Parameters:
name (str)
- Returns:
nothing, prints very brief net topology on log
- cond_on_train(fn_train, fn_eval)[source]¶
Uses fn_train() or fn_eval() based on self.train_flag. It will be a branched evaluation.
- Parameters:
fn_train (()->(tf.Tensor|T))
fn_eval (()->(tf.Tensor|T))
- Returns:
fn_train() if self.train_flag else fn_eval()
- Return type:
tf.Tensor|T
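A sketch of branching on the train flag (x is a hypothetical tensor, tf is TensorFlow); both branches are built, and which one is evaluated depends on self.train_flag:

    y = net.cond_on_train(
        fn_train=lambda: tf.nn.dropout(x, rate=0.1),  # applied only in training
        fn_eval=lambda: x,                            # identity in eval
    )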
- get_search_choices(sources=None, src=None, base_search_choice=None, _layer_to_search_choices=None, debug_stream=None)[source]¶
Recursively searches through all sources, and if there is a ChoiceLayer / any layer with search_choices, returns it. Could also go to the parent network. If there are multiple, it assumes they are on the same search-sequence in the search-tree and it will return the last one.
- Parameters:
- Returns:
(direct or indirect) source LayerBase which has search_choices, or None
- Return type:
LayerBase|None
- debug_search_choices(base_search_choice)[source]¶
- Parameters:
base_search_choice (LayerBase)
- Returns:
nothing, by intention, such that constructs like assert …, debug_search_choices(…) or (…) work
- get_data_batch_dim()[source]¶
Get the batch-dim size, i.e. the number of sequences in the current batch. Given that the data tensor is usually of shape [batch, time, dim], this would return shape(data)[0].
The code currently assumes that the batch-dim can be taken from the extern data. If it does not have that available for some reason (e.g. some subnetwork), it will try some alternative sources and assumes that they have the correct batch-dim.
Note that the batch-dim usually stays the same across the whole network, and every individual batch sequence will stay related. One notable exception is the choice layer, where the batch-dim will get expanded by the beam search if search is used, as well as in all following layers, until there is a decide layer.
- Returns:
int scalar tensor which states the batch-dim
- Return type:
int|tf.Tensor
- get_global_batch_info()[source]¶
- Returns:
global batch info from root network from extern data
- Return type:
- set_rec_step_info(i, prev_end_flag=None, prev_end_layer=None, seq_lens=None)[source]¶
Used by _SubnetworkRecCell.
- Parameters:
i (tf.Tensor) – scalar, int32, current step (time)
prev_end_flag (tf.Tensor|None) – (batch,), bool, says that the current sequence has ended. This is about the last frame, not the current!
prev_end_layer (LayerBase|None)
seq_lens (tf.Tensor|None) – (batch,) int32, seq lens
- is_inside_rec_layer(inside_loop=True)[source]¶
- Parameters:
inside_loop (bool) – only True if we are inside the loop of the most recent rec layer
- Returns:
whether we are inside a RecLayer (with inside_loop: and not optimized out of the loop). At template construction inside a rec layer, this is always true, but the rec layer itself does not exist yet.
- Return type:
bool
Also see get_inside_rec_time_dim() and get_rec_parent_layer().
- get_inside_rec_time_dim(inside_loop=True)[source]¶
- Parameters:
inside_loop (bool) – only True if we are inside the loop of the most recent rec layer
- Returns:
when the net is inside a rec loop (RecLayer and not optimized out of the loop), this returns the dim tag the rec layer iterates over
- Return type:
Dim|None
- get_all_rec_time_dims()[source]¶
- Returns:
all rec time dims, moved out or not, including all parents
- Return type:
set[Dim]
- get_rec_parent_layer(inside_loop=True)[source]¶
- Parameters:
inside_loop (bool) – only return if the network is constructed within the loop (not moved out) of the most recent parent rec layer
- Returns:
if we are a subnet of a RecLayer, will return the RecLayer instance. At template construction time, this is always None.
- Return type:
- get_rec_step_info(must_exist=True)[source]¶
- Parameters:
must_exist (bool) – if True, will throw exception if not available
- Return type:
- get_rec_step_index()[source]¶
Assumes that have_rec_step_info is True.
- Return type:
tf.Tensor
- Returns:
scalar, int32
- get_config(consider_global_config=True, fallback_dummy_config=True)[source]¶
- Parameters:
consider_global_config (bool) – if no config is set, check for global config
fallback_dummy_config (bool) – if no config, return a new empty Config, otherwise return None
- Return type:
- static register_post_control_dependencies(deps)[source]¶
Will register the control dependencies globally for a session run on this network. This can e.g. be called inside self.post_init. We use UPDATE_OPS, as that is also e.g. used by batchnorm.
- Parameters:
deps (list[tf.Tensor|tf.Operation])
- Returns:
nothing
- register_graph_reset_callback(cb)[source]¶
Note: These callbacks are not called automatically. You explicitly have to call call_graph_reset_callbacks().
Note: We don’t store this in the graph itself (e.g. via tf.get_collection), as we don’t want to serialize this (which would also lead to an error, because it cannot be serialized).
Note: Currently these callbacks might get called multiple times, so make sure that this is not a problem. Also make sure that the network/session is still in a valid state after this has been called, e.g. such that further session runs would still work correctly.
Note: These callbacks will only be called if there was not any error.
- Parameters:
cb (function|()->None)
- call_graph_reset_callbacks()[source]¶
Calls any callbacks registered via register_graph_reset_callback().
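A small sketch: register a cleanup function, and invoke all registered callbacks explicitly when you reset or rebuild the TF graph yourself:

    net.register_graph_reset_callback(lambda: print("graph reset, cleaning up external state"))
    # ... later, when the graph is about to be reset ...
    net.call_graph_reset_callbacks()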
- set_run_opts(epoch, dataset_name)[source]¶
The run options are valid during one loop over some dataset.
Contrary to epoch_step, train_flag, etc, we do not provide these as TF placeholders, for convenience, because it is not needed right now. If it is needed, it probably is easier to introduce auxiliary TF variables (on CPU) instead and just set them once here.
- Parameters:
epoch (int)
dataset_name (str|None)
- set_run_finished(error_occurred=False)[source]¶
Maybe calls any callbacks registered via register_run_finished_callback() (if no error occurred) and cleans up the run opts.
- Parameters:
error_occurred (bool)
- classmethod get_current_network(must_exist=True)[source]¶
- Parameters:
must_exist (bool)
- Return type:
TFNetwork|None
- register_network_scope()[source]¶
Registers a ref to this network inside the current TF computation graph.
- get_search_choices_from_beam(beam)[source]¶
Currently we have somewhat redundant information in returnn.tf.util.data.SearchBeam (which is totally independent from other things in RETURNN, which is good) and returnn.tf.layers.base.SearchChoices (which is more dependent on the RETURNN layers, and has some more info).
The Data (which is also independent from other things in RETURNN, which is also good) only knows about returnn.tf.util.data.SearchBeam but not about returnn.tf.layers.base.SearchChoices. Thus there are situations where we only have a ref to the former, but would like to get a ref to the latter.
Note that this might (hopefully) get cleaned up at some point…
- Parameters:
- Return type:
- register_search_choices_for_beam(beam, search_choices)[source]¶
- Parameters:
search_choices (returnn.tf.layers.base.SearchChoices)
- class returnn.tf.network.Subnetwork(parent_net, name, opts=None)[source]¶
Represents a subnetwork.
Despite the different namespace, optionally some variable sharing, and optionally some custom input data, layers behave just as in the root network, with the same dependency resolution (both ways). I.e. a layer outside can depend only on a single sub layer and not the whole subnetwork (in contrast to LayerBase.get_sub_layer()).
This is usually used with SubnetworkLayer, via LayerBase.cls_get_sub_network().
This works for custom calls on TFNetwork.construct_layer() with custom get_layer or add_layer, e.g. in template construction from the RecLayer subnetwork, and doesn’t require extra logic for this.
This also has a mode to start its own template construction, for the case that this layer is embedded in another layer (e.g. CondLayer or MaskedComputationLayer, in contrast to SubnetworkLayer). This is triggered by a special type of extra parent network with extra_only_template set. This implies that the parent (non-extra) network cannot directly access the sub network, which is important for the template construction here (see _construct_template_subnet()).
A special extra parent can also have the extra_boundary flag set, which triggers that we have our own construction code (not using templates, but constructing the real layers). This is also used for the embedded case (e.g. MaskedComputationLayer). This is needed when the parent (non-extra) network cannot directly access this sub network.
- Parameters:
parent_net (TFNetwork)
name (str)
opts (dict[str]|None)
- construct_layer(name, parent_get_layer=None)[source]¶
With the default parent_get_layer, this will not trigger recursive constructions in the parent net, but it will trigger any recursive construction in this subnet.
- construct_all(parent_get_layer=None)[source]¶
Trigger the standard construction of all layers in the net dict.
- Parameters:
parent_get_layer (GetLayer|((str)->LayerBase)|None)
- class returnn.tf.network.TFNetworkParamsSerialized(values_dict, global_train_step)[source]¶
Holds all the params as numpy arrays, including auxiliary params.
- Parameters:
values_dict (dict[str,dict[str,numpy.ndarray]]) – dict: layer_name -> param_name -> variable numpy array
global_train_step (int)
- class returnn.tf.network.GetLayer(network, net_dict=None, subnetwork=None, add_layer_func=None, parent_get_layer=None)[source]¶
Helper object which represents the get_layer function and also triggers layer construction. This is implemented to better handle subnetworks and to avoid a deep stack of get_layer functions. Instead of defining another wrapped get_layer function, any subnetwork can create a new instance of this object. https://github.com/rwth-i6/returnn/issues/993
- Parameters:
network (TFNetwork)
net_dict (dict[str]|None)
subnetwork (Subnetwork|None)
add_layer_func (((str,LayerBase,dict)->LayerBase)|None) – by default TFNetwork.add_layer
parent_get_layer (GetLayer|((str)->LayerBase)|None)
- class returnn.tf.network.LossHolder(name, loss, layer_output, reduce_func=None, layer=None, loss_value=None, error_value=None, norm_factor=None, only_on_eval=None, network=None)[source]¶
This object just keeps a reference to the loss/error value, and does the necessary logic to collect it, and also the normalization logic. Every new computation (nodes in the computation graph) must be constructed on demand, to allow first to collect all possible losses without calculating them, and then calculating them in the right context (e.g. inside a while_loop, or so).
After construction, you should call init() before usage, in case you do not provide layer here.
- Parameters:
name (str) – The name uniquely identifies the loss. Earlier, this was the same as the layer name. This is still true for simple cases, but for losses coming from a subnetwork or other extended losses, it can be something else. It could look like “output”, or “output/sublayer”.
layer (LayerBase) – We can always point to a layer where this comes from (either in the subnet, or the parent layer).
layer_output (Data) – template describing the layer output
network (TFNetwork) – for which network to create this LossHolder. might be different from layer.network
loss (returnn.tf.layers.base.Loss)
reduce_func (((tf.Tensor)->tf.Tensor)|None) – if given, will overwrite the reduce func for the loss. By default, every loss_value and error_value is a scalar (sum or average over the batches, and over the frames for frame-wise losses). However, if you provide reduce_func = TFUtil.identity, you can get the unreduced tensor.
loss_value (tf.Tensor|None)
error_value (tf.Tensor|None)
norm_factor (tf.Tensor)
only_on_eval (bool)
- init(layer)[source]¶
It will just set the layer. The LossHolder is initialized if the layer is set.
- Parameters:
layer (LayerBase)
- Returns:
self
- Return type:
- get_tf_name()[source]¶
- Returns:
name which can be used for a TF op, thus contains no “/” or other special chars
- Return type:
str
- get_loss_value_for_fetch()[source]¶
- Returns:
loss value for fetch. scalar. same as loss_value, but maybe with additional checks
- Return type:
tf.Tensor|None
- get_loss_value_for_objective()[source]¶
- Returns:
loss value for objective. scalar. might be scaled (scale) and/or normalized (use_normalized_loss)
- Return type:
tf.Tensor|None
- get_normalized_loss_value_per_seq()[source]¶
- Returns:
(batch,) or None if loss is None
- Return type:
tf.Tensor|None
- get_normalized_error_value_per_seq()[source]¶
- Returns:
(batch,) or None if error is None
- Return type:
tf.Tensor|None
- get_loss_value_per_pos()[source]¶
- Returns:
(batch,time) or None if loss is None
- Return type:
tf.Tensor|None
- get_error_value_per_pos()[source]¶
- Returns:
(batch,time) or None if error is None
- Return type:
tf.Tensor|None
- exception returnn.tf.network.NetworkLayerException(message, layer_name, network, net_dict=None)[source]¶
Some exception by the network, e.g. during construction.
- Parameters:
message (str)
layer_name (str)
network (TFNetwork)
net_dict (dict[str]|None)
- exception returnn.tf.network.NetworkConstructionDependencyLoopException(network, layer_name, constructing_layers, net_dict)[source]¶
This is raised when there is a dependency loop in the network construction.
- Parameters:
network (TFNetwork)
layer_name (str)
constructing_layers (list[str])
net_dict (dict[str,dict[str]])
- exception returnn.tf.network.LayerNotFound(message, layer_name, network, net_dict=None)[source]¶
Raised when a requested layer is not found; see get_layer().
- Parameters:
message (str)
layer_name (str)
network (TFNetwork)
net_dict (dict[str]|None)
- returnn.tf.network.help_on_tf_exception(session, exception, fetches, feed_dict=None, meta_step_info=None, extern_data=None, file=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]¶
Generic debugging helper, on any TF exception (or even any other exception as well). Will try to provide as much helpful context information as possible. (This is not in TFUtil because it depends on ExternData, which is only defined here.)
- Parameters:
session (tf.compat.v1.Session)
exception (tf.errors.OpError|BaseException)
fetches (tf.Tensor|list[tf.Tensor]|dict[str,tf.Tensor]|object|None)
feed_dict (dict[tf.Tensor,numpy.ndarray]|None)
meta_step_info (dict[str]|None)
extern_data (ExternData|None)
file (IO[str]|io.TextIOBase|io.StringIO)
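A sketch of wrapping a session run with this helper (fetches and feed_dict are whatever you would pass to session.run):

    try:
        results = session.run(fetches, feed_dict=feed_dict)
    except tf.errors.OpError as exc:
        help_on_tf_exception(
            session=session, exception=exc, fetches=fetches,
            feed_dict=feed_dict, extern_data=net.extern_data)
        raise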
- class returnn.tf.network.CustomCheckpointLoader(filename, saveable_params, params_prefix='', load_if_prefix='', ignore_missing=False, ignore_params=(), ignore_params_prefixes=(), var_name_mapping=None, network=None, custom_missing_load_func: CustomLoadParamFunc | None = None)[source]¶
This uses tf.train.NewCheckpointReader.
It would do automatic conversions if needed, e.g. between different LSTM implementations. However, be careful: for some LSTM implementations, there is an additional forget_bias option, which is an additional scalar that gets added (not to the param, but to the forget value directly). When we convert the parameters, this is ignored, and you must take care of that explicitly to make sure you get the same results.
It tries to automatically resolve renames.
- Parameters:
filename (str) – filepattern for NewCheckpointReader or .index/.meta file path
saveable_params (list[tf.Variable|tensorflow.python.training.saver.BaseSaverBuilder.SaveableObject])
params_prefix (str) – expect that all vars in saveable_params have this prefix, and remove it
load_if_prefix (str) – if given, only load variables with a name containing this string. the variables in the file are expected to have the same name but without this string.
ignore_missing (bool) – any vars in the model, which are not found in the checkpoint, will be ignored. however, if there is no single var in the checkpoint, this is still an error.
ignore_params (Container[str]) – these param (by name) will not be loaded
ignore_params_prefixes (Iterable[str]) – these param (by prefix name) will not be loaded
var_name_mapping (dict[str,str]) – defines a custom mapping (new_name -> name_in_checkpoint) for renamed vars in the checkpoint
network (TFNetwork)
custom_missing_load_func
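A sketch of loading an older checkpoint into the current model (the filename and the rename mapping are made up; load_now() is assumed here as the entry point that performs the assignments in the given session):

    loader = CustomCheckpointLoader(
        filename="prev-model.080",                      # checkpoint prefix / .index file
        saveable_params=net.get_saveable_params_list(),
        ignore_missing=True,                            # tolerate model vars missing in the checkpoint
        var_name_mapping={"output/W": "out/W"},         # hypothetical rename: new name -> name in checkpoint
    )
    loader.load_now(session=session)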
- class CustomParamImporter(layer, checkpoint_loader)[source]¶
Helper class for custom param loading.
- Parameters:
layer (LayerBase)
checkpoint_loader (CustomCheckpointLoader)
- class VariableValue(value=None, custom_param_importer=None)[source]¶
Helper to assign some variable.
- Parameters:
value (numpy.ndarray|None)
custom_param_importer (CustomCheckpointLoader.CustomParamImporter|None)
- get_variable_value_map()[source]¶
- Returns:
var -> numpy array
- Return type:
dict[tf.Variable,CustomCheckpointLoader.VariableValue]
- class returnn.tf.network.CustomLoadParamFunc(*args, **kwargs)[source]¶
This is a custom param importer function.
- returnn.tf.network.set_custom_post_init(var, func)[source]¶
It registers the provided func such that it gets called for this variable in TFNetwork.initialize_params().
- Parameters:
var (tf.Variable)
func ((tf.compat.v1.Session)->None)
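A sketch (the variable var and the numpy value are placeholders): the registered function runs inside TFNetwork.initialize_params() and can assign the variable however it likes:

    def my_post_init(session):
        # e.g. load an externally computed value into the variable
        var.load(external_np_value, session=session)

    set_custom_post_init(var=var, func=my_post_init)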
- returnn.tf.network.have_custom_post_init(var)[source]¶
- Parameters:
var (tf.Variable)
- Returns:
whether set_custom_post_init() was called on this var, i.e. we have custom init
- Return type:
bool