TFUtil

class TFUtil.Condition(lock=None, name='Condition')[source]

A pure TensorFlow implementation of a condition.

init()[source]
signal()[source]

Must be called with the lock held. Emits one signal.

signal_all()[source]

Must be called with the lock held. Emits as many signals as there are waiters.

wait()[source]

Must be called with the lock held, will unlock while waiting for a signal.

wait_counter()[source]
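The intended usage pattern of Condition mirrors the classic monitor idiom, which Python's threading.Condition also implements. A pure-Python analogy (not the TF graph ops themselves) of the lock/wait/signal contract described above:

```python
import threading

# signal() and wait() must both be called with the lock held, and wait()
# releases the lock while blocked -- exactly threading.Condition semantics.
cond = threading.Condition()
queue = []
out = []

def consumer():
    with cond:                # lock held, as required for wait()
        while not queue:
            cond.wait()       # unlocks while waiting, re-locks on signal
        out.append(queue.pop())

def producer():
    with cond:                # lock held, as required for signal()
        queue.append(42)
        cond.notify()         # emits one signal; notify_all() ~ signal_all()

t = threading.Thread(target=consumer)
t.start()
producer()
t.join()
print(out)  # [42]
```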
class TFUtil.CudaEnv[source]
get_compiler_bin()[source]
get_compiler_opts()[source]
classmethod get_instance()[source]
Return type:CudaEnv
is_available()[source]
verbose_find_cuda = False[source]
class TFUtil.CustomGradient[source]
Defun(*input_types, **kwargs)[source]
Parameters:
  • grad_op ((tf.Operation, tf.Tensor) -> tf.Tensor) –
  • input_types (list[tf.DType]) –
  • kwargs (dict[str]) – passed to self.register()
Returns:

function decorator

Return type:

((tf.Tensor) -> tf.Tensor) -> ((tf.Tensor) -> tf.Tensor)

register(input_types, op, grad_op, name=None)[source]
Parameters:
  • input_types (list[tf.DType]) –
  • op ((tf.Tensor) -> tf.Tensor) –
  • grad_op ((tf.Operation, tf.Tensor) -> tf.Tensor) – args are (op, out_grad) and it must return in_grad
  • name (str) – optional func_name
Returns:

op

Return type:

(tf.Tensor) -> tf.Tensor

register_loss_and_error_signal(loss, x, grad_x, name=None)[source]

Wrapper around self.register(). Expects that loss = loss(x), and grad_x = partial loss / partial x.

Parameters:
  • loss (tf.Tensor) –
  • x (tf.Tensor) –
  • grad_x (tf.Tensor) –
  • name (str) – optional func_name
Returns:

loss but with the gradient for x

Return type:

tf.Tensor
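The semantics of register_loss_and_error_signal() can be sketched without TF (a hand-rolled analogy, not the actual API): the forward pass simply returns the precomputed loss, while backprop substitutes the externally supplied error signal grad_x in place of an automatically derived gradient.

```python
# Sketch: forward returns the given loss; backward applies the chain rule
# with the externally provided grad_x = d loss / d x.
def make_loss_with_custom_grad(loss, grad_x):
    def forward(x):
        return loss
    def backward(out_grad):
        # incoming gradient scaled by the supplied error signal
        return out_grad * grad_x
    return forward, backward

# hypothetical example: loss = x**2 at x = 3.0, so grad_x = 2 * x = 6.0
fwd, bwd = make_loss_with_custom_grad(loss=9.0, grad_x=6.0)
print(fwd(3.0))  # 9.0
print(bwd(1.0))  # 6.0
```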

class TFUtil.CustomUpdate[source]
set_on_var(var)[source]
Parameters:var (tf.Variable) – variable to update. this will be recognized by TFUpdater.Updater
update_var(var)[source]
Parameters:var (tf.Variable) – variable to update
Returns:operation which updates the variable, e.g. tf.assign_add(var, something)
Return type:tf.Operation
class TFUtil.CustomUpdateExpAverage(average, alpha)[source]

exponential moving average

Parameters:
  • average (tf.Tensor) –
  • alpha (float) –
update_var(var)[source]
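The exponential moving average rule can be written out in plain Python; the exact in-place update form used by CustomUpdateExpAverage is an assumption here, but the standard EMA recurrence is var += alpha * (average - var):

```python
# Minimal EMA sketch: on each update the variable moves a fraction alpha
# toward the current target value.
def ema_update(var, average, alpha):
    return var + alpha * (average - var)

v = 0.0
for value in [1.0, 1.0, 1.0, 1.0]:
    v = ema_update(v, value, alpha=0.5)
print(v)  # 0.9375
```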
class TFUtil.Data(name, shape=None, dtype=None, placeholder=None, sparse=None, dim=None, size_placeholder=None, batch_dim_axis=0, time_dim_axis=<class 'Util.NotSpecified'>, available_for_inference=True, auto_create_placeholders=False, beam_size=None)[source]

This class describes a tensor, i.e. its shape and properties, e.g. whether we should consider it as sparse data (i.e. it represents indices). This is used in TFNetwork to describe the dataset extern data as well as every layer output.

Parameters:
  • name (str) –
  • shape (tuple[int|None]|list[int|None]) – including time-dim (can be None). excluding batch-dim. e.g. (time,feat)=(None,128)
  • dtype (str) – e.g. “float32” or “int64”
  • placeholder (tf.Tensor|None) – with added batch-dim
  • sparse (bool) – whether to treat the value as an index. do not confuse with tf.SparseTensor
  • dim (None|int) – feature dimension, shape[-1] if not sparse, otherwise like num_classes
  • batch_dim_axis (int|None) – where we add the batch-dim. e.g. shape=(time,...), 0 -> (batch,time,...), 1 -> (time,batch,...). This is normally always set, and a lot of code expects this. However, you can set it to None if this Data does not have a batch-dim.
  • time_dim_axis (int|None) – where we have the time dim axis, after we added the batch-dim. this is often 1. however, can be None if there is no time-dim.
  • size_placeholder (dict[int,tf.Tensor]) – for every None in shape, this will describe the size. The size is always a tensor of shape (batch,), i.e. the size can be different for each sequence in a batch.
  • available_for_inference (bool) – e.g. the extern data “classes” is usually not available for inference
  • beam_size (int|None) – the batch-dim could be extended by a beam-size, such that it represents the merged dims [batch, beam_size].
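The relation between shape (without batch dim) and the full batch shape can be sketched in plain Python; this mirrors the batch_shape attribute below (a sketch of the rule, not the Data implementation):

```python
# shape excludes the batch dim; the full batch shape inserts None (dynamic
# batch size) at batch_dim_axis.
def batch_shape(shape, batch_dim_axis):
    s = list(shape)
    s.insert(batch_dim_axis, None)  # batch dim has dynamic size
    return tuple(s)

# e.g. (time, feat) = (None, 128), batch-major:
print(batch_shape((None, 128), batch_dim_axis=0))  # (None, None, 128)
```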
SpecialAxesNames = ('batch_dim_axis', 'time_dim_axis', 'feature_dim_axis')[source]
batch_ndim[source]
Return type:int
Returns:ndim counted with batch-dim
batch_shape[source]
Returns:shape with added batch-dim. e.g. (batch,time,feat) = (None,None,128)
Return type:tuple[int|None]
copy()[source]
Returns:copy of myself, using self.get_kwargs(), and with placeholder and size_placeholder
Return type:Data
copy_as_batch_major()[source]
Returns:copy of myself with batch_dim_axis == 0
Return type:Data
copy_as_time_major()[source]
Returns:copy of myself with time_dim_axis == 0
Return type:Data
copy_extend_with_beam(beam_size)[source]
Parameters:beam_size (int) –
Returns:copy of myself where the batch-dim is extended/multiplied by beam_size, using tile_transposed
Return type:Data
copy_template(name=None)[source]
Returns:copy of myself, using self.get_kwargs(), without placeholder
Return type:Data
copy_template_adding_time_dim(name=None, time_dim_axis=0)[source]
Parameters:
  • name (str|None) – if set, this will be the new name
  • time_dim_axis (int) – the new time-dim-axis index
Returns:

copy of myself adding the time-dimension without placeholder

Return type:

Data

copy_template_excluding_time_dim(name=None)[source]
Parameters:name (str|None) – if set, this will be the new name
Returns:copy of myself excluding the time-dimension without placeholder
Return type:Data
copy_time_flattened()[source]
Returns:copy of myself where the time-axis is flattened away into the batch-dim-axis. See get_placeholder_time_flattened() and flatten_with_seq_len_mask() for more details.
Return type:Data
copy_with_batch_dim_axis(batch_dim_axis)[source]
Parameters:batch_dim_axis (int) –
Returns:copy of myself with specific batch_dim_axis
Return type:Data
feature_dim_axis[source]
get_axes(exclude_time=False, exclude_batch=False)[source]
Parameters:
  • exclude_time (bool) – will filter out the time-axis
  • exclude_batch (bool) – will filter out the batch-axis
Returns:

list of axes, like range(len(self.shape)), calculated with batch dim.

Return type:

list[int]

get_axes_from_description(axes)[source]
Parameters:axes (int|list[int]|str|list[str]) – one axis or multiple axes. This is counted with batch-dim, which by default is axis 0 (see enforce_batch_dim_axis). It also accepts the special tokens “B”|”batch”, “spatial”, “spatial_except_time”, or “F”|”feature”, and more (see the code).
Returns:list of axes, counted with batch-dim
Return type:list[int]
get_axes_with_size()[source]
Returns:list of axes which can vary in size for each entry of the batch-dim, e.g. the time-dim-axis. The axis index is counted without the batch-dim.
Return type:list[int]
get_axis_from_description(axis)[source]
Parameters:axis (int|str) –
Returns:axis, counted with batch-dim
Return type:int
get_batch_axis(axis)[source]
Parameters:axis (int) – counted without batch-dim
Returns:axis counted with batch-dim
Return type:int
get_batch_axis_excluding_batch(axis)[source]
Parameters:axis (int) – counted with batch-dim
Returns:axis counted without batch-dim
Return type:int
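The axis bookkeeping behind get_batch_axis() and get_batch_axis_excluding_batch() is simple index shifting; a pure-Python sketch of the rule, assuming a single batch dim:

```python
# An axis counted without the batch dim shifts by one once it lies at or
# past batch_dim_axis; the inverse mapping shifts back (the batch axis
# itself has no counterpart without the batch dim).
def get_batch_axis(axis, batch_dim_axis):
    return axis + 1 if axis >= batch_dim_axis else axis

def get_batch_axis_excluding_batch(axis, batch_dim_axis):
    if axis == batch_dim_axis:
        return None
    return axis - 1 if axis > batch_dim_axis else axis

# batch-major data (batch, time, feat): time is axis 0 without batch, 1 with
print(get_batch_axis(0, batch_dim_axis=0))                  # 1
print(get_batch_axis_excluding_batch(1, batch_dim_axis=0))  # 0
```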
get_bc_spatial_batch_shape()[source]
Returns:shape which will broadcast along all spatial dimensions and time/batch dim
Return type:tuple[int]
get_description(with_name=True, with_placeholder=False)[source]
get_feature_axes()[source]
Return type:list[int]
Returns:list of axes which are feature axes, counted without batch-dim.
get_feature_batch_axes()[source]
Return type:list[int]
Returns:list of axes which are feature axes, counted with batch-dim. currently there is only one or zero such axis.
get_kwargs()[source]
get_placeholder_as_batch_major()[source]
get_placeholder_as_time_major()[source]
get_placeholder_flattened(keep_dims=False)[source]
Parameters:keep_dims (bool) – if set, it will add broadcast dimensions after the flattening behind the first axis
Return type:tf.Tensor
Returns:placeholder where all dynamic axes are flattened into a single axis. e.g. for the usual case (batch, time, dim), it becomes (batch’|time’, dim), or (batch, time, height, dim) will also become (batch’|time’, dim). with keep_dims, (batch, time, height, dim) will become (batch’|time’, 1, 1, dim).
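What the flattening does for the usual (batch, time, dim) case can be shown with NumPy (a sketch of the effect, cf. flatten_with_seq_len_mask; not the TF implementation): entries beyond each sequence length are masked out and the batch and time axes merge into one.

```python
import numpy as np

x = np.arange(2 * 3 * 4).reshape(2, 3, 4)          # (batch=2, time=3, dim=4)
seq_lens = np.array([3, 2])                        # per-sequence lengths
mask = np.arange(3)[None, :] < seq_lens[:, None]   # (batch, time) bool mask
flat = x[mask]                                     # (sum(seq_lens), dim)
print(flat.shape)  # (5, 4)
```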
get_placeholder_kwargs(with_batch=True)[source]
get_placeholder_time_flattened()[source]
get_placeholder_with_specific_batch_dim_axis(batch_dim_axis)[source]
get_sequence_lengths()[source]
Returns:seq lens tensor of shape (batch,) of dtype int32
Return type:tf.Tensor
get_size_placeholder_kwargs(axis, with_batch=True)[source]
get_spatial_axes()[source]
Return type:list[int]
Returns:list of axes which are not feature and batch axes, counted without batch-dim.
get_spatial_batch_axes()[source]
Return type:list[int]
Returns:list of axes which are not feature and batch axes, counted with batch-dim.
get_special_axes_dict(counted_with_batch_dim=True, include_batch_dim_axis=False, only_available=False)[source]
Parameters:
  • counted_with_batch_dim (bool) –
  • include_batch_dim_axis (bool) –
  • only_available (bool) –
Returns:

dict axis-name -> axis

Return type:

dict[str,int]

have_batch_axis()[source]
have_time_axis()[source]
is_batch_major[source]
Returns:whether this is in batch-major format, i.e. (batch,...)
Return type:bool
is_time_major[source]
Returns:whether this is in time-major format, i.e. (time,batch,...)
Return type:bool
matches_var_dim_pattern(other)[source]
Parameters:other (Data) –
Returns:whether the variable-dims pattern matches, i.e. same variable dims (get_variable_dim_pattern), same time dim, excluding batch-dim. i.e. the size_placeholder should be compatible.
Return type:bool
ndim[source]
Return type:int
Returns:ndim counted without batch-dim
ndim_dense[source]
Return type:int
Returns:ndim counted without batch-dim, added by 1 if we are sparse
shape_dense[source]
size_dtype = 'int32'[source]
time_dim_axis_excluding_batch[source]
time_dimension()[source]
Returns:shape(placeholder)[time_dim_axis], int scalar
Return type:tf.Tensor
class TFUtil.ExplicitRandomShuffleQueue(capacity, min_after_dequeue=0, dtypes=None, shapes=None, names=None, seed=None, shared_name=None, name='explicit_random_shuffle_queue')[source]

This is intended to behave much like tf.RandomShuffleQueue, except that it is implemented with other native TF ops / data structures, and you can change min_after_dequeue at runtime. This means that if you have your own logic about when to end, you can set min_after_dequeue=0, dequeue all remaining entries from the queue, and later increase min_after_dequeue again. You can also start with a small min_after_dequeue and increase it steadily. Closing the original tf.RandomShuffleQueue effectively resets min_after_dequeue to 0, but there is no way to reopen a closed queue. That is the whole reason this implementation exists.

One difference of this implementation is that you must call the init() op once before usage.

One way to implement this is in pure TF. We need some TF container type which supports entries of different shapes (where the shape can differ wherever we specified None). We also need some TF container which we can access by index. tf.TensorArray can handle both.

Another way to implement this is by multiple stateful tf.py_func which all reference this instance.

Parameters:
  • capacity (int) –
  • min_after_dequeue (int|tf.Tensor) –
  • dtypes (list[str|tf.DType]) –
  • shapes (list[tuple[int|tf.Tensor|None]]) –
  • names (list[str]|None) –
  • seed (int) –
  • shared_name (str|None) –
  • name (str) –
dequeue()[source]
enqueue(v)[source]
Parameters:v (list[tf.Tensor]|dict[str,tf.Tensor]|tf.Tensor) –
Return type:tf.Operation
init()[source]
Return type:tf.Operation
min_after_dequeue_assign(min_after_dequeue)[source]
Parameters:min_after_dequeue (tf.Tensor) –
Return type:tf.Operation
min_after_dequeue_read()[source]
size()[source]
Return type:tf.Tensor
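The contract described above can be sketched in pure Python (an analogy, not the TF implementation): dequeue returns a random buffered element, but only while more than min_after_dequeue entries remain, and min_after_dequeue can be lowered at runtime to drain the rest.

```python
import random

class ShuffleQueueSketch:
    def __init__(self, min_after_dequeue=0, seed=None):
        self.min_after_dequeue = min_after_dequeue
        self._rng = random.Random(seed)
        self._items = []

    def enqueue(self, v):
        self._items.append(v)

    def dequeue(self):
        # a real queue would block here; we just assert
        assert len(self._items) > self.min_after_dequeue, "would block"
        return self._items.pop(self._rng.randrange(len(self._items)))

q = ShuffleQueueSketch(min_after_dequeue=2, seed=1)
for v in range(4):
    q.enqueue(v)
got = [q.dequeue(), q.dequeue()]   # 2 entries left; more calls would block
q.min_after_dequeue = 0            # runtime change: now drain the rest
got += [q.dequeue(), q.dequeue()]
print(sorted(got))  # [0, 1, 2, 3]
```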
class TFUtil.FlipGradientBuilder[source]

Gradient Reversal Layer.

Code from here:
https://github.com/pumpikano/tf-dann/blob/master/flip_gradient.py
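Gradient reversal in a nutshell, as a framework-free sketch (not the TF op itself): the forward pass is the identity, while the backward pass negates (and optionally scales) the incoming gradient, which is what makes adversarial/domain-adaptation branches train against the feature extractor.

```python
def flip_gradient_forward(x):
    return x                   # identity in the forward pass

def flip_gradient_backward(out_grad, scale=1.0):
    return -scale * out_grad   # reversed gradient for the adversarial branch

print(flip_gradient_forward(3.0))   # 3.0
print(flip_gradient_backward(2.0))  # -2.0
```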
class TFUtil.GlobalTensorArrayOpMaker[source]

Creates a TensorArray which does not use the per-run (“per-step”) resource manager container but uses the standard container which persists across runs. This TensorArray resource handle is then just a standard TensorArray resource handle which can be used with all TensorArray related functions/ops.

Note: This whole implementation currently does not work because tensor_array.h is not available. See https://github.com/tensorflow/tensorflow/issues/10527 and test_GlobalTensorArray().

An alternative to this might be the MapStagingArea (https://github.com/tensorflow/tensorflow/pull/9686), which should get into TF 1.2.2.

code = '\n #include "tensorflow/core/framework/op_kernel.h"\n #include "tensorflow/core/framework/register_types.h"\n #include "tensorflow/core/framework/resource_mgr.h"\n #include "tensorflow/core/framework/tensor.h"\n #include "tensorflow/core/framework/tensor_shape.h"\n #include "tensorflow/core/framework/types.h"\n #include "tensorflow/core/kernels/bounds_check.h"\n #include "tensorflow/core/kernels/tensor_array.h"\n #include "tensorflow/core/lib/core/errors.h"\n #include "tensorflow/core/lib/core/refcount.h"\n #include "tensorflow/core/lib/strings/strcat.h"\n #include "tensorflow/core/platform/dynamic_annotations.h"\n #include "tensorflow/core/platform/logging.h"\n #include "tensorflow/core/platform/thread_annotations.h"\n #include "tensorflow/core/platform/types.h"\n\n using namespace tensorflow;\n \n // Adopted from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/ops/data_flow_ops.cc.\n REGISTER_OP("GlobalTensorArray")\n .Input("size: int32")\n .Attr("container: string = \'\'")\n .Attr("shared_name: string = \'\'")\n .Attr("dtype: type")\n .Attr("element_shape: shape = { unknown_rank: true }")\n .Attr("dynamic_size: bool = false")\n .Attr("clear_after_read: bool = true")\n .Attr("tensor_array_name: string = \'\'")\n .Output("handle: resource")\n .Output("flow: float")\n .SetIsStateful()\n .SetShapeFn([](InferenceContext* c) {\n ShapeHandle unused;\n TF_RETURN_IF_ERROR(c->WithRank(c->input(0), 0, &unused));\n c->set_output(0, c->Vector(2));\n c->set_output(1, c->Scalar());\n return Status::OK();\n })\n .Doc("GlobalTensorArray, persistent across runs");\n \n // Copied from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/tensor_array_ops.cc,\n // and https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/resource_op_kernel.h.\n // The original TensorArrayOp used the per-run ("per-step") resource manager container\n // but we use the standard container which persists across runs.\n class 
GlobalTensorArrayOp : public OpKernel {\n public:\n explicit GlobalTensorArrayOp(OpKernelConstruction* context)\n : OpKernel(context), device_type_(context->device_type()) {\n OP_REQUIRES_OK(context, context->GetAttr("dtype", &dtype_));\n OP_REQUIRES_OK(context, context->GetAttr("element_shape", &element_shape_));\n OP_REQUIRES_OK(context, context->GetAttr("dynamic_size", &dynamic_size_));\n OP_REQUIRES_OK(context,\n context->GetAttr("clear_after_read", &clear_after_read_));\n OP_REQUIRES_OK(context,\n context->GetAttr("tensor_array_name", &tensor_array_name_));\n if (tensor_array_name_.empty()) tensor_array_name_ = name();\n\n AllocatorAttributes alloc_attr;\n alloc_attr.set_on_host(true);\n OP_REQUIRES_OK(context, context->allocate_persistent(\n tensorflow::DT_STRING, tensorflow::TensorShape({2}),\n &handle_, alloc_attr));\n }\n \n ~GlobalTensorArrayOp() {\n if (resource_ != nullptr) {\n resource_->Unref();\n if (cinfo_.resource_is_private_to_kernel()) {\n if (!cinfo_.resource_manager()\n ->template Delete<T>(cinfo_.container(), cinfo_.name())\n .ok()) {\n // Do nothing; the resource can have been deleted by session resets.\n }\n }\n }\n }\n \n void Compute(OpKernelContext* ctx) override {\n mutex_lock l(mu_);\n if (resource_ == nullptr) {\n ResourceMgr* mgr = ctx->resource_manager();\n OP_REQUIRES(ctx, mgr != nullptr, errors::Internal("No resource manager."));\n OP_REQUIRES_OK(ctx, cinfo_.Init(mgr, def()));\n auto h = handle_.AccessTensor(ctx)->template flat<string>();\n h(0) = cinfo_.container();\n h(1) = cinfo_.name();\n OP_REQUIRES_OK(ctx, CreateTensorArray(ctx, rm, &handle_, &resource_));\n }\n\n Tensor* handle;\n OP_REQUIRES_OK(ctx, ctx->allocate_output(0, TensorShape({}), &handle));\n handle->flat<ResourceHandle>()(0) =\n resource_->resource_handle(ctx); \n if (ctx->num_outputs() == 2) {\n // Create the flow output.\n Tensor* flow;\n OP_REQUIRES_OK(ctx, ctx->allocate_output(1, TensorShape({}), &flow));\n if (device_type_ == DEVICE_CPU) {\n // Value 
doesn\'t matter, but this makes msan not complaint about\n // copying an uninitialized value. To do this on GPU would require\n // a kernel launch or a host->device memcpy, so we avoid that.\n flow->flat<float>()(0) = 0;\n }\n }\n }\n \n private:\n Status CreateTensorArray(OpKernelContext* ctx, ResourceMgr* rm,\n Tensor* tensor_array_output_handle,\n TensorArray** output_tensor_array) EXCLUSIVE_LOCKS_REQUIRED(mu_) {\n const Tensor* tensor_size;\n TF_RETURN_IF_ERROR(ctx->input("size", &tensor_size));\n \n if (!TensorShapeUtils::IsScalar(tensor_size->shape())) {\n return errors::InvalidArgument(\n "TensorArray size must be scalar, but had shape: ",\n tensor_size->shape().DebugString());\n }\n const int32 size = tensor_size->scalar<int32>()();\n if (size < 0) {\n return errors::InvalidArgument("Size should be >= 0.");\n }\n \n TensorArray* tensor_array = new TensorArray(\n cinfo_.name(), dtype_, *tensor_array_output_handle, size, element_shape_,\n dynamic_size_, false /* multiple_writes_aggregate */,\n false /* is_grad */, -1 /* marked_size */, clear_after_read_);\n \n // TODO: could use LookupOrCreate instead...\n TF_RETURN_IF_ERROR(\n rm->Create(cinfo_.container(), cinfo_.name(), tensor_array));\n \n *output_tensor_array = tensor_array;\n \n return Status::OK();\n }\n\n mutex mu_;\n ContainerInfo cinfo_ GUARDED_BY(mu_);\n PersistentTensor handle_ GUARDED_BY(mu_);\n TensorArray* resource_ GUARDED_BY(mu_) = nullptr;\n \n const DeviceType device_type_;\n DataType dtype_;\n PartialTensorShape element_shape_;\n bool dynamic_size_;\n bool clear_after_read_;\n string tensor_array_name_; // The name used to create the TensorArray.\n \n TF_DISALLOW_COPY_AND_ASSIGN(GlobalTensorArrayOp);\n };\n \n REGISTER_KERNEL_BUILDER(Name("GlobalTensorArray").Device(DEVICE_CPU), GlobalTensorArrayOp);\n\n '[source]
get_op()[source]
class TFUtil.Lock(name='Lock')[source]

A pure TensorFlow implementation of a mutex / lock.

init()[source]
lock()[source]

On first call, just returns. Any further call will block until unlock() is called.

unlock()[source]

Must be called after lock().

class TFUtil.OpCodeCompiler(base_name, code_version, code, c_macro_defines=None, ld_flags=None, include_deps=None, static_version_name=None, should_cleanup_old_all=True, should_cleanup_old_mydir=False, use_cuda_if_available=True, verbose=False)[source]

Helper class to compile TF ops on-the-fly, similar to Theano. https://www.tensorflow.org/versions/master/how_tos/adding_an_op/

Parameters:
  • base_name (str) – base name for the module, e.g. “zero_out”
  • code_version (int|tuple[int]) – check for the cache whether to reuse
  • code (str) – the source code itself
  • c_macro_defines (dict[str,str|int]|None) – e.g. {“TENSORFLOW”: 1}
  • ld_flags (list[str]|None) – e.g. [“-lblas”]
  • include_deps (list[str]|None) – if provided and an existing lib file, we will check if any dependency is newer and we need to recompile. we could also do it automatically via -MD but that seems overkill and too slow.
  • static_version_name (str|None) – normally, we use .../base_name/hash as the dir but this would use .../base_name/static_version_name.
  • should_cleanup_old_all (bool) – whether we should look in the cache dir and check all ops if we can delete some old ones which are older than some limit (self._cleanup_time_limit_days)
  • should_cleanup_old_mydir (bool) – whether we should delete our op dir before we compile there.
  • verbose (bool) – be slightly more verbose
load_module()[source]
class TFUtil.OutputWithActivation(x, act_func=None)[source]
Parameters:
  • x (tf.Tensor) –
  • act_func (None|(tf.Tensor)->tf.Tensor) –
get_logits()[source]
Return type:tf.Tensor
Returns:logits. logits are (not necessarily normalized) log probabilities, i.e. the input of softmax.

This call assumes that self.y is in probability space.

is_softmax_act_func()[source]
class TFUtil.TFArrayContainer(dtype, handle=None, container=None, shared_name=None, name='array_container')[source]

Array container, like std::vector, with random index access.

Currently does not work. See https://github.com/tensorflow/tensorflow/issues/10950, and test_TFArrayContainer(). Bug #10950 is fixed upstream, should be in TF 1.2.2.

An alternative to this could be GlobalTensorArrayOpMaker and MapStagingArea, which should get into TF 1.2.2.

Parameters:
  • dtype (tf.DType) –
  • container (str) –
  • shared_name (str) –
  • name (str) –
  • handle (tf.resource) – existing handle to reuse. otherwise we will create a new one
code = '\n #include <vector>\n\n // For Eigen::ThreadPoolDevice.\n #define EIGEN_USE_THREADS 1\n\n #include "tensorflow/core/framework/op.h"\n #include "tensorflow/core/framework/shape_inference.h"\n #include "tensorflow/core/framework/op_kernel.h"\n #include "tensorflow/core/framework/resource_mgr.h"\n #include "tensorflow/core/framework/resource_op_kernel.h"\n #include "tensorflow/core/framework/tensor.h"\n #include "tensorflow/core/framework/tensor_shape.h"\n #include "tensorflow/core/framework/types.h"\n #include "tensorflow/core/platform/macros.h"\n #include "tensorflow/core/platform/mutex.h"\n #include "tensorflow/core/platform/types.h"\n\n using namespace tensorflow;\n\n REGISTER_OP("ArrayContainerCreate")\n .Attr("T: type")\n .Attr("container: string = \'\'")\n .Attr("shared_name: string = \'\'")\n .Output("resource: resource")\n .SetIsStateful()\n .SetShapeFn(shape_inference::ScalarShape)\n .Doc(R"doc(Array container, random index access)doc");\n\n REGISTER_OP("ArrayContainerGetSize")\n .Input("handle: resource")\n .Output("out: int32")\n .SetShapeFn(shape_inference::ScalarShape)\n ;\n\n REGISTER_OP("ArrayContainerSetSize")\n .Attr("T: type")\n .Input("handle: resource")\n .Input("size: int32")\n ;\n\n REGISTER_OP("ArrayContainerGet")\n .Attr("T: type")\n .Input("handle: resource")\n .Input("index: int32")\n .Output("out: T")\n ;\n\n REGISTER_OP("ArrayContainerSet")\n .Attr("T: type")\n .Input("handle: resource")\n .Input("index: int32")\n .Input("value: T")\n ;\n\n // https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/resource_mgr.h\n struct ArrayContainer : public ResourceBase {\n ArrayContainer(const DataType& dtype) : dtype_(dtype) {}\n\n string DebugString() override { return "ArrayContainer"; }\n int64 MemoryUsed() const override { return 0; };\n\n mutex mu_;\n const DataType dtype_;\n std::vector<PersistentTensor> data_ GUARDED_BY(mu_);\n\n int32 get_size() {\n mutex_lock l(mu_);\n return (int32) data_.size();\n }\n\n 
Status set_size(int32 size) {\n if(size < 0)\n return errors::InvalidArgument("size ", size, " must be >= 0");\n mutex_lock l(mu_);\n data_.resize((size_t) size);\n return Status::OK();\n }\n\n Status get(OpKernelContext* ctx, int32 idx, PersistentTensor* v) {\n mutex_lock l(mu_);\n if(idx < 0)\n return errors::InvalidArgument("idx ", idx, " must be >= 0");\n if((size_t)idx >= data_.size())\n return errors::InvalidArgument("idx ", idx, " must be < size ", data_.size());\n PersistentTensor& t = data_[(size_t)idx];\n if(!t.IsInitialized())\n return errors::InvalidArgument("tensor at idx ", idx, " must have been set before");\n *v = t;\n return Status::OK();\n }\n\n Status set(OpKernelContext* ctx, int32 idx, const Tensor& v) {\n mutex_lock l(mu_);\n if(idx < 0)\n return errors::InvalidArgument("idx ", idx, " must be >= 0");\n if((size_t)idx >= data_.size())\n return errors::InvalidArgument("idx ", idx, " must be < size ", data_.size());\n data_[idx] = PersistentTensor(v);\n return Status::OK();\n }\n\n };\n\n ResourceHandle OwnMakeResourceHandle(OpKernelContext* ctx, const string& container,\n const string& name,\n const TypeIndex& type_index) {\n ResourceHandle result;\n result.set_device(ctx->device()->attributes().name());\n printf("make dev %s\\n", result.device().c_str());\n string actual_container;\n if (!container.empty()) {\n actual_container = container;\n } else {\n actual_container = ctx->resource_manager()->default_container();\n }\n result.set_container(actual_container);\n result.set_name(name);\n result.set_hash_code(type_index.hash_code());\n result.set_maybe_type_name(type_index.name());\n printf("make dev %s\\n", result.device().c_str());\n return result;\n }\n\n // https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/resource_op_kernel.h\n class ArrayContainerCreateOp : public ResourceOpKernel<ArrayContainer> {\n public:\n explicit ArrayContainerCreateOp(OpKernelConstruction* context) : ResourceOpKernel(context) {\n 
OP_REQUIRES_OK(context, context->GetAttr("T", &dtype_));\n }\n\n void Compute(OpKernelContext* context) override {\n ResourceOpKernel<ArrayContainer>::Compute(context);\n mutex_lock l(mu_);\n ResourceHandle rhandle = OwnMakeResourceHandle(context, cinfo_.container(), cinfo_.name(), MakeTypeIndex<ArrayContainer>());\n printf("created. device: %s\\n", rhandle.device().c_str());\n printf("container: %s\\n", rhandle.container().c_str());\n printf("name: %s\\n", rhandle.name().c_str());\n printf("actual device: %s\\n", context->device()->attributes().name().c_str());\n printf("actual name: %s\\n", cinfo_.name().c_str());\n rhandle.set_device("foo");\n printf("now device: %s\\n", rhandle.device().c_str());\n ResourceHandle cpy = rhandle;\n printf("cpy device: %s\\n", cpy.device().c_str());\n }\n \n private:\n virtual bool IsCancellable() const { return false; }\n virtual void Cancel() {}\n\n Status CreateResource(ArrayContainer** ret) override EXCLUSIVE_LOCKS_REQUIRED(mu_) {\n *ret = new ArrayContainer(dtype_);\n if(*ret == nullptr)\n return errors::ResourceExhausted("Failed to allocate");\n return Status::OK();\n }\n\n Status VerifyResource(ArrayContainer* ar) override {\n if(ar->dtype_ != dtype_)\n return errors::InvalidArgument("Data type mismatch: expected ", DataTypeString(dtype_),\n " but got ", DataTypeString(ar->dtype_), ".");\n return Status::OK();\n }\n \n DataType dtype_;\n };\n REGISTER_KERNEL_BUILDER(Name("ArrayContainerCreate").Device(DEVICE_CPU), ArrayContainerCreateOp);\n\n class ArrayContainerGetSizeOp : public OpKernel {\n public:\n using OpKernel::OpKernel;\n\n void Compute(OpKernelContext* context) override {\n ArrayContainer* ar;\n \n const Tensor* handle;\n OP_REQUIRES_OK(context, context->input("handle", &handle));\n const ResourceHandle& rhandle = handle->scalar<ResourceHandle>()();\n printf("device: %s\\n", rhandle.device().c_str());\n printf("container: %s\\n", rhandle.container().c_str());\n printf("name: %s\\n", rhandle.name().c_str());\n \n 
OP_REQUIRES_OK(context, GetResourceFromContext(context, "handle", &ar));\n core::ScopedUnref unref(ar);\n\n int32 size = ar->get_size();\n Tensor* tensor_size = nullptr;\n OP_REQUIRES_OK(context, context->allocate_output(0, TensorShape({}), &tensor_size));\n tensor_size->flat<int32>().setConstant(size);\n }\n };\n REGISTER_KERNEL_BUILDER(Name("ArrayContainerGetSize").Device(DEVICE_CPU), ArrayContainerGetSizeOp);\n\n class ArrayContainerSetSizeOp : public OpKernel {\n public:\n using OpKernel::OpKernel;\n\n void Compute(OpKernelContext* context) override {\n ArrayContainer* ar;\n OP_REQUIRES_OK(context, GetResourceFromContext(context, "handle", &ar));\n core::ScopedUnref unref(ar);\n\n const Tensor* tensor_size;\n OP_REQUIRES_OK(context, context->input("size", &tensor_size));\n OP_REQUIRES(context, TensorShapeUtils::IsScalar(tensor_size->shape()),\n errors::InvalidArgument(\n "TensorArray index must be scalar, but had shape: ",\n tensor_size->shape().DebugString()));\n const int32 size = tensor_size->scalar<int32>()();\n OP_REQUIRES_OK(context, ar->set_size(size));\n }\n };\n REGISTER_KERNEL_BUILDER(Name("ArrayContainerSetSize").Device(DEVICE_CPU), ArrayContainerSetSizeOp);\n\n class ArrayContainerGetOp : public OpKernel {\n public:\n explicit ArrayContainerGetOp(OpKernelConstruction* context) : OpKernel(context) {\n OP_REQUIRES_OK(context, context->GetAttr("T", &dtype_));\n }\n\n void Compute(OpKernelContext* context) override {\n ArrayContainer* ar;\n OP_REQUIRES_OK(context, GetResourceFromContext(context, "handle", &ar));\n core::ScopedUnref unref(ar);\n\n const Tensor* tensor_index;\n OP_REQUIRES_OK(context, context->input("index", &tensor_index));\n OP_REQUIRES(context, TensorShapeUtils::IsScalar(tensor_index->shape()),\n errors::InvalidArgument(\n "TensorArray index must be scalar, but had shape: ",\n tensor_index->shape().DebugString()));\n const int32 index = tensor_index->scalar<int32>()();\n\n PersistentTensor value;\n OP_REQUIRES_OK(context, 
ar->get(context, index, &value));\n context->set_output(0, *value.AccessTensor(context));\n }\n\n private:\n DataType dtype_;\n };\n REGISTER_KERNEL_BUILDER(Name("ArrayContainerGet").Device(DEVICE_CPU), ArrayContainerGetOp);\n\n class ArrayContainerSetOp : public OpKernel {\n public:\n explicit ArrayContainerSetOp(OpKernelConstruction* context) : OpKernel(context) {\n OP_REQUIRES_OK(context, context->GetAttr("T", &dtype_));\n }\n\n void Compute(OpKernelContext* context) override {\n ArrayContainer* ar;\n OP_REQUIRES_OK(context, GetResourceFromContext(context, "handle", &ar));\n core::ScopedUnref unref(ar);\n\n const Tensor* tensor_index;\n const Tensor* tensor_value;\n OP_REQUIRES_OK(context, context->input("index", &tensor_index));\n OP_REQUIRES_OK(context, context->input("value", &tensor_value));\n \n OP_REQUIRES(context, TensorShapeUtils::IsScalar(tensor_index->shape()),\n errors::InvalidArgument(\n "index must be scalar, but had shape: ",\n tensor_index->shape().DebugString()));\n const int32 index = tensor_index->scalar<int32>()();\n OP_REQUIRES(context, tensor_value->IsInitialized(), errors::InvalidArgument("value must be initialized"));\n\n OP_REQUIRES_OK(context, ar->set(context, index, *tensor_value));\n }\n\n private:\n DataType dtype_;\n };\n REGISTER_KERNEL_BUILDER(Name("ArrayContainerSet").Device(DEVICE_CPU), ArrayContainerSetOp);\n '[source]
get(index)[source]
Parameters:index (tf.Tensor) – >= 0 and < size
Returns:tensor at that index
Return type:tf.Tensor
get_size()[source]
Returns:size int32 scalar
Return type:tf.Tensor
set(index, value)[source]
Parameters:
  • index (tf.Tensor) – >= 0 and < size
  • value (tf.Tensor) –
Returns:

operation

Return type:

tf.Operation

set_size(size)[source]
Parameters:size (tf.Tensor) –
Returns:operation
Return type:tf.Operation
class TFUtil.VariableAssigner(var)[source]
Parameters:var (tf.Variable) –
assign(value, session)[source]
Parameters:
  • value (numpy.ndarray|int|float) –
  • session (tf.Session) –
TFUtil.add_scaled_noise_to_gradients(grads_and_vars, gradient_noise_scale)[source]

Adds scaled noise from a 0-mean normal distribution to gradients. Adapted from tf.contrib.layers.optimizers.

Parameters:
  • grads_and_vars (list[(tf.Tensor, tf.Variable)]) –
  • gradient_noise_scale (float) – used as stddev for tf.truncated_normal().
Returns:

adapted grads_and_vars

Return type:

list[(tf.Tensor, tf.Variable)]
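The effect can be sketched with NumPy (the real function uses tf.truncated_normal; plain normal noise is used here for simplicity, and the variable names are illustrative):

```python
import numpy as np

# Each gradient gets 0-mean noise with stddev gradient_noise_scale;
# the variables are passed through unchanged.
def add_scaled_noise(grads_and_vars, noise_scale, rng):
    return [(g + rng.normal(0.0, noise_scale, size=np.shape(g)), v)
            for g, v in grads_and_vars]

rng = np.random.default_rng(0)
grads_and_vars = [(np.zeros(3), "var0")]
noisy = add_scaled_noise(grads_and_vars, noise_scale=0.01, rng=rng)
print(noisy[0][0].shape)  # (3,)
```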

TFUtil.assert_min_tf_version(version, reason)[source]
Parameters:
  • version (tuple[int]) – e.g. (1,2,0) or (1,2)
  • reason (str) –
TFUtil.auto_init_var(v)[source]
Parameters:v (tf.Variable) –
Returns:a reference to the var via tf.identity
Return type:tf.Tensor
TFUtil.batched_uniq(x, seq_lens)[source]
Parameters:
  • x (tf.Tensor) – shape (batch,time) -> index, some int type
  • seq_lens (tf.Tensor|None) – shape (batch,) of int32|int64
Returns:

tuple (z, new_seq_lens), where z is of shape (batch, max_new_time), max_new_time = max(new_seq_lens), and new_seq_lens is of shape (batch,).

Return type:

(tf.Tensor, tf.Tensor)

TFUtil.check_dim_equal(x, x_axis, y, y_axis)[source]
Parameters:
  • x (tf.Tensor) –
  • x_axis (int) – which axis to check
  • y (tf.Tensor) –
  • y_axis (int) – which axis to check
Returns:

x with check added that shape(x)[x_axis] == shape(y)[y_axis]

Return type:

tf.Tensor

TFUtil.check_initial_tf_thread_pool_init()[source]
TFUtil.check_input_dim(x, axis, dim)[source]
Parameters:
  • x (tf.Tensor) –
  • axis (int) – which axis to check
  • dim (int|tf.Tensor) –
Returns:

x with check added

Return type:

tf.Tensor

TFUtil.check_input_ndim(x, ndim)[source]
Parameters:
  • x (tf.Tensor) –
  • ndim (int) –
Returns:

x with check added

Return type:

tf.Tensor

TFUtil.check_input_ndim_equal_offset(x, y, y_ndim_offset=0)[source]
Parameters:
  • x (tf.Tensor) –
  • y (tf.Tensor) –
  • y_ndim_offset (int) –
Returns:

x with check added such that ndim(x) == ndim(y) + y_ndim_offset

Return type:

tf.Tensor

TFUtil.check_shape_equal(x, y)[source]
Parameters:
  • x (tf.Tensor) –
  • y (tf.Tensor) –
Returns:

x with check added that shape(x) == shape(y)

Return type:

tf.Tensor

TFUtil.circular_pad(x, paddings, axes=None)[source]
Parameters:
  • x (tf.Tensor) – shape (..., height, width)
  • paddings (int|((int,int),(int,int))|tf.Tensor) – how much to add ((top,bottom),(left,right))
Returns:

tensor with shape (..., top + height + bottom, left + width + right)

Return type:

tf.Tensor
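
The circular-padding semantics can be sketched with NumPy (an assumption-level sketch; np.pad with mode="wrap" reuses values from the opposite edge, applied here to the last two axes):

```python
import numpy as np

def circular_pad_np(x, paddings):
    # paddings = ((top, bottom), (left, right)), applied to the last two axes.
    pad = [(0, 0)] * (x.ndim - 2) + list(paddings)
    return np.pad(x, pad, mode="wrap")

x = np.arange(12).reshape(3, 4)
y = circular_pad_np(x, ((1, 1), (2, 2)))  # shape (5, 8)
```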

TFUtil.cond(pred, fn1, fn2, name=None)[source]

This is a wrapper around tf.control_flow_ops.cond(). This is a branched execution, i.e. either fn1() or fn2() will be executed, or at least the resulting graph will be evaluated. If pred is constant at call time, only the corresponding fn will be called. This is similar to the TF-internal _smart_cond().

Parameters:
  • pred (tf.Tensor|bool) –
  • fn1 (()->(tf.Tensor|list[tf.Tensor])) –
  • fn2 (()->(tf.Tensor|list[tf.Tensor])) –
  • name (str) –
Returns:

fn1() if pred else fn2()

Return type:

tf.Tensor|list[tf.Tensor]

TFUtil.constant_with_shape(x, shape, dtype=None, name='constant_with_shape')[source]
Parameters:
  • x (tf.Tensor|float|int|bool) – scalar
  • shape (list[tf.Tensor|int]|tuple[tf.Tensor|int]|tf.Tensor) –
  • dtype (tf.DType) –
  • name (str) –
Returns:

x of the specified shape

Return type:

tf.Tensor

TFUtil.debugRegisterBetterRepr()[source]

Some types don’t have good __repr__ implementations by default (for the current TF version). For debugging, it can be helpful to give some more info. This monkey-patches clazz.__repr__ of some TF classes if they are object.__repr__.

TFUtil.dimshuffle(x, axes, name='dimshuffle')[source]

Like Theano's dimshuffle. Combines tf.transpose, tf.expand_dims and tf.squeeze.

Parameters:
  • x (tf.Tensor) –
  • axes (list[int|str]|tuple[int|str]) –
  • name (str) – scope name
Return type:

tf.Tensor
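
The Theano-style axes spec can be sketched in NumPy (a sketch under the assumption that ints permute existing axes and the string 'x' inserts a new broadcast axis of size 1):

```python
import numpy as np

def dimshuffle_np(x, axes):
    # First permute the existing axes, then insert size-1 axes where 'x' appears.
    perm = [a for a in axes if a != 'x']
    y = np.transpose(x, perm)
    for i, a in enumerate(axes):
        if a == 'x':
            y = np.expand_dims(y, i)
    return y

x = np.zeros((2, 3))
y = dimshuffle_np(x, (1, 'x', 0))  # shape (3, 1, 2)
```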

TFUtil.directed(x, direction)[source]

If direction == 1 or direction is None, returns just x. If direction == -1, returns reversed(x).

Parameters:
  • x (tf.Tensor) –
  • direction (int|None) – -1 or 1 (or None)
Return type:

tf.Tensor

TFUtil.dot(a, b)[source]
Parameters:
  • a (tf.Tensor) – shape [...da...,d]
  • b (tf.Tensor) – shape [d,...db...]
Returns:

tensor of shape [...da...,...db...] (the shared axis d is contracted)

Return type:

tf.Tensor
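
The contraction described above matches np.tensordot over the last axis of a and the first axis of b (a NumPy sketch, not the TFUtil implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(2, 3, 4))  # [...da..., d] with d == 4
b = rng.normal(size=(4, 5))     # [d, ...db...]
c = np.tensordot(a, b, axes=([-1], [0]))  # shape (2, 3, 5)
```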

TFUtil.dropout(x, keep_prob, noise_shape=None, seed=None, name=None)[source]

Computes dropout. Like tf.nn.dropout() but avoid tf.div() if possible.

Parameters:
  • x (tf.Tensor) –
  • keep_prob (float|tf.Tensor) –
  • noise_shape (tf.Tensor|tuple[int]) –
  • seed (int) –
  • name (str) –
TFUtil.encode_raw(x, axis=-1, seq_lens=None)[source]

The inverse function of tf.decode_raw(). Also see: https://stackoverflow.com/questions/43403147/how-to-create-a-encode-raw-tensorflow-function

Parameters:
  • x (tf.Tensor) – of integer types [0,255], will get casted to uint8
  • axis (int) – the axis to reduce-join the string. decode_raw has added it at the end
  • seq_lens (tf.Tensor|None) – must have same shape as x after reduce-joining. Note that using seq_lens will make our output not compatible with tf.decode_raw() anymore because tf.decode_raw() requires all strings to be of the same length.
Returns:

string tensor

Return type:

tf.Tensor

TFUtil.enforce_copy(x)[source]
Parameters:x (tf.Tensor|tf.Variable) –
Returns:copy of input, i.e. enforces that this is not a ref
TFUtil.expand_dims_unbroadcast(x, axis, dim, name='expand_dims_unbroadcast')[source]
Parameters:
  • x (tf.Tensor) –
  • axis (int|tf.Tensor) – new axis
  • dim (int|tf.Tensor) – dimension for axis
  • name (str) – scope name
Returns:

if x is of shape (a,b,c) and axis=0, then we return (dim,a,b,c)

Return type:

tf.Tensor
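
In NumPy terms, this is expand_dims followed by an explicit tile, so the new axis has real size dim instead of a broadcastable size-1 axis (a sketch; the helper name is hypothetical):

```python
import numpy as np

def expand_dims_unbroadcast_np(x, axis, dim):
    y = np.expand_dims(x, axis)
    reps = [1] * y.ndim
    reps[axis] = dim
    return np.tile(y, reps)

x = np.arange(6).reshape(2, 3)
y = expand_dims_unbroadcast_np(x, axis=0, dim=4)  # shape (4, 2, 3)
```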

TFUtil.expand_multiple_dims(x, axes, name='expand_multiple_dims')[source]
Parameters:
  • x (tf.Tensor) –
  • axes (list[int]|tuple[int]) – after completion, tf.shape(y)[axis] == 1 for axis in axes
  • name (str) – scope name
Returns:

y where we have a new broadcast axis for each axis in axes

Return type:

tf.Tensor

TFUtil.filter_grad(x, threshold, axis)[source]
Parameters:
  • x (tf.Tensor) –
  • threshold (float) – all grads going through x which max(grad**2) is over the threshold are removed
  • axis (int|list[int]) – max(grad**2) will be reduced over this axis
Returns:

identity(x) with custom gradient

Return type:

tf.Tensor

TFUtil.flatten_with_seq_len_mask(x, seq_lens, time_major=False)[source]
Parameters:
  • x (tf.Tensor) – shape (batch,time,...s...) with time_major=False or otherwise shape (time,batch,...s....)
  • seq_lens (tf.Tensor) – shape (batch,) of int32
  • time_major (bool) – if the time-dim is the first dimension in x
Returns:

tensor of shape (time’, ...s...) where time’ = sum(seq_len) <= batch*time

Return type:

tf.Tensor
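
The batch-major case can be sketched in NumPy: keep only the first seq_lens[b] frames of each sequence and concatenate them along the new time' axis (a sketch, not the TF implementation, which uses boolean masking):

```python
import numpy as np

def flatten_with_seq_len_mask_np(x, seq_lens):
    # x: (batch, time, ...s...); result: (sum(seq_lens), ...s...).
    return np.concatenate([x[b, :n] for b, n in enumerate(seq_lens)], axis=0)

x = np.arange(2 * 4 * 3).reshape(2, 4, 3)  # (batch, time, feature)
y = flatten_with_seq_len_mask_np(x, [2, 3])  # shape (5, 3), since 2 + 3 == 5
```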

TFUtil.get_activation_function(s)[source]
Parameters:s (str|None) –
Return type:(tf.Tensor) -> tf.Tensor
TFUtil.get_base_name(x)[source]
Parameters:x (tf.Tensor) – has name e.g. “layer0/rec/W:0”
Returns:return the base name, e.g. “W”, without the output index
TFUtil.get_current_name_scope()[source]
Returns:current absolute name scope, via tf.name_scope
Return type:str

http://stackoverflow.com/questions/40907769/how-to-get-current-tensorflow-name-scope

Note that this is a private member and might break at some point. Note also that this does not need to be the same as get_current_var_scope_name().

TFUtil.get_current_var_scope_name()[source]
Returns:current absolute variable scope name, via tf.variable_scope
Return type:str
TFUtil.get_global_train_flag_placeholder()[source]
Returns:bool scalar tensor
Return type:tf.Tensor
TFUtil.get_initializer(s, seed=None, eval_local_ns=None)[source]
Parameters:
  • s (str|dict[str]|float) – e.g. “glorot_uniform” or “truncated_normal” or “orthogonal”, or config dict with “class”, or string to be `eval`ed if it contains “(”. constant if a float is given.
  • seed (int|tf.Tensor) –
  • eval_local_ns (dict[str]|None) –
Returns:

(function (shape) -> tf.Tensor) | tf.Initializer

Return type:

((tuple[int]) -> tf.Tensor) | tf.Initializer

TFUtil.get_name_scope_of_tensor(x)[source]
Parameters:x (tf.Tensor) – has name e.g. “layer0/rec/W:0”
Returns:the name scope of x, e.g. “layer0/rec”
Return type:str
TFUtil.get_ndim(x)[source]
Parameters:x (tf.Tensor) –
Returns:x.ndim either as a static int or otherwise as an expression
Return type:int|tf.Tensor
TFUtil.get_range(start, stop=<class 'Util.NotSpecified'>)[source]
Parameters:
  • start (int|tf.Tensor|None) –
  • stop (int|tf.Tensor|None) –
Returns:

either tuple(range(start, stop)) or the same as a symbolic expression

Return type:

tuple[int]|tf.Tensor

TFUtil.get_shape(x)[source]
Parameters:x (tf.Tensor) –
Returns:list of scalars, which are either int if known statically, or otherwise expressions
Return type:list[int|tf.Tensor]
TFUtil.get_shape_dim(x, axis, name='shape_dim')[source]
Parameters:
  • x (tf.Tensor) –
  • axis (int) – which axis
  • name (str) –
Returns:

x.shape[axis] either as a static int or otherwise as an expression

Return type:

int|tf.Tensor

TFUtil.global_queue(name, queue_type, capacity, dtypes, shapes=None, names=None)[source]
Parameters:
  • queue_type ((args)->tf.QueueBase) – some function which creates a queue
  • name (str) – global name
  • dtypes (list[tf.DType|str]) –
  • shapes (list[tf.TensorShape|tuple[int|None]]|None) –
  • names (list[str]|None) –
Return type:

tf.QueueBase

TFUtil.global_tensor(f, name)[source]

This creates a global accessible tensor in the graph to be reused later, i.e. on the second call given a unique name, it will not create a new tensor but return the previously created tensor. This is for the current graph, i.e. if there is a new graph, it will recreate the tensor.

Parameters:
  • f (() -> tf.Tensor) – callable which creates the tensor
  • name (str) – global reference name for the tensor
Returns:

the tensor

Return type:

tf.Tensor

TFUtil.identity(x)[source]
Parameters:x (tf.Tensor) –
Return type:tf.Tensor
TFUtil.identity_op_nested(x, name='identity')[source]
Parameters:
  • x (tf.Tensor|list[tf.Tensor]|dict[str,tf.Tensor]) –
  • name (str) –

Return type:tf.Tensor|list[tf.Tensor]|dict[str,tf.Tensor]

TFUtil.identity_with_ops(x, ops)[source]
Parameters:
  • x (tf.Tensor) –
  • ops (() -> list[tf.Operation|tf.Tensor]) –
Returns:

x with all ops executed

Return type:

tf.Tensor

TFUtil.init_variable_if_needed(v)[source]
Parameters:v (tf.Variable) –
Return type:tf.Operation
TFUtil.is_gpu_available()[source]

Returns whether TensorFlow can access a GPU.

TFUtil.make_var_tuple(v)[source]
Parameters:v (tf.Tensor|list[tf.Tensor]|tuple[tf.Tensor]) –
Returns:tuple of tensors
Return type:tuple[tf.Tensor]
TFUtil.move_axis(x, old_axis, new_axis)[source]
Parameters:
  • x (tf.Tensor) –
  • old_axis (int) –
  • new_axis (int) –
TFUtil.nan_to_num(x, nan_num=0, inf_num=1e+30)[source]

Like numpy.nan_to_num().

Parameters:
  • x (tf.Tensor) –
  • nan_num (float|tf.Tensor) –
  • inf_num (float|tf.Tensor) –
Returns:

x with replaced nan and inf
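
NumPy's counterpart shows the intended replacement semantics, matching the defaults nan_num=0 and inf_num=1e30:

```python
import numpy as np

x = np.array([np.nan, np.inf, -np.inf, 1.5])
# nan -> nan_num, +inf -> inf_num, -inf -> -inf_num.
y = np.nan_to_num(x, nan=0.0, posinf=1e30, neginf=-1e30)
```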

TFUtil.nd_indices(indices, batch_axis=0)[source]
Parameters:indices (tf.Tensor) – e.g. (batch, ...) -> index
Returns:extended indices with batch-idx which can be used for tf.gather_nd, i.e. in the example of shape (batch, ..., 2) where the 2-tuple represents (batch_idx, index).
Return type:tf.Tensor
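
The extension can be sketched in NumPy: pair every index with its batch index, giving the (batch_idx, index) tuples that tf.gather_nd expects (the helper name is hypothetical):

```python
import numpy as np

def nd_indices_np(indices):
    # indices: shape (batch, time) -> index.
    batch, time = indices.shape
    batch_idx = np.broadcast_to(np.arange(batch)[:, None], (batch, time))
    return np.stack([batch_idx, indices], axis=-1)  # shape (batch, time, 2)

idx = np.array([[3, 1], [0, 2]])
nd = nd_indices_np(idx)
```
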
TFUtil.optional_add(*args)[source]
Parameters:args (list[tf.Tensor|None]|tf.Tensor) –
Return type:tf.Tensor|None
Returns:sums all non-None values, or returns None if there are none
TFUtil.pad_zeros_in_axis(x, before=0, after=0, axis=0)[source]
Parameters:
  • x (tf.Tensor) –
  • before (int|tf.Tensor) –
  • after (int|tf.Tensor) –
  • axis (int) –
Returns:

x with the given number of zeros padded before and after in the given axis

TFUtil.post_control_dependencies(x, updates)[source]
Parameters:
  • x (tf.Tensor|list[tf.Tensor]|dict[str,tf.Tensor]) –
  • updates (list[tf.Operation]) –
Returns:

identity(x) with control_dependencies(updates)

Return type:

tf.Tensor|list[tf.Tensor]|dict[str,tf.Tensor]

TFUtil.print_available_devices()[source]
TFUtil.raise_OutOfRangeError()[source]
Returns:an op which raises an OutOfRangeError
Return type:tf.Operation
TFUtil.random_uniform_abs_initializer(limit, **kwargs)[source]
TFUtil.reuse_name_scope(*args, **kwds)[source]
Parameters:
  • name (str|tf.VariableScope) – relative name scope (absolute if absolute=True or if tf.VariableScope)
  • absolute (bool) – if True it will be absolute

We try to both set the variable scope and the name scope.

TFUtil.reuse_name_scope_of_tensor(*args, **kwds)[source]
Parameters:
  • x (tf.Tensor) – has name e.g. “layer0/rec/W:0”
  • prefix (str) –
  • postfix (str) –
Returns:

reuse the name scope of x, e.g. “layer0/rec”, yields scope

TFUtil.reversed(x)[source]

Just returns x[::-1]. It will cache the value inside the passed object so that we don’t recompute it multiple times.

Parameters:x (tf.Tensor) –
Return type:tf.Tensor
TFUtil.sequence_mask(lengths, **kwargs)[source]

Wraps around tf.sequence_mask(). It will cache the value inside the passed object so that we don’t recompute it multiple times.

Parameters:
  • lengths (tf.Tensor) – shape (batch,)
  • kwargs (dict[str]) – passed on to tf.sequence_mask
Returns:

tensor mask of shape (batch,maxlen/time). default dtype is bool unless you specify something else

Return type:

tf.Tensor
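
The mask semantics can be sketched in NumPy: position t of sequence b is valid iff t < lengths[b] (a sketch of the semantics, not the TF implementation):

```python
import numpy as np

def sequence_mask_np(lengths, maxlen=None):
    lengths = np.asarray(lengths)
    if maxlen is None:
        maxlen = int(lengths.max())
    # Broadcast compare: shape (batch, maxlen), dtype bool.
    return np.arange(maxlen)[None, :] < lengths[:, None]

m = sequence_mask_np([1, 3, 2])  # shape (3, 3)
```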

TFUtil.sequence_mask_time_major(lengths, **kwargs)[source]

Wraps around tf.transpose(tf.sequence_mask(), (1,0)). It will cache the value inside the passed object so that we don’t recompute it multiple times.

Parameters:
  • lengths (tf.Tensor) – shape (batch,)
  • kwargs (dict[str]) – passed on to tf.sequence_mask
Returns:

mask of shape (maxlen/time,batch)

TFUtil.sequential_control_dependencies(*args, **kwds)[source]

tf.control_dependencies but each operation will be created such that it is executed after the ones coming before in the list, i.e. l[0] is executed first, l[-1] is executed last.

Parameters:l (list[()->(tf.Operation|tf.Tensor)]) –
TFUtil.setup_tf_thread_pools(num_threads=None, log_file=None)[source]

See here for documentation of intra_op_parallelism_threads and inter_op_parallelism_threads: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/protobuf/config.proto

intra_op_parallelism_threads is used for the LocalDevice::EigenThreadPoolInfo, which is always global. https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/common_runtime/local_device.cc

inter_op_parallelism_threads is used for the (global if not use_per_session_threads) session thread pool. https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/common_runtime/direct_session.cc

TF will setup the thread pools on first usage. That can happen quite early, especially for intra_op_parallelism_threads. E.g. list_local_devices() will trigger this, i.e. any call to is_gpu_available() or print_available_devices(). For debugging, you can set the env-var TF_CPP_MIN_VLOG_LEVEL=1 and then check for these messages:

Local device intra op parallelism threads: 4
Direct session inter op parallelism threads: 4

Thus, call this function as early as possible with your preferred number of threads, used for both thread pools. It will create a dummy session and directly close it again, but if you use the global thread pools, those settings will remain for further sessions. This function will only execute on the first call.

Parameters:
  • num_threads (int) – used for both intra and inter parallelism thread pools
  • log_file (stream|None) –
TFUtil.single_strided_slice(x, axis, begin=None, end=None, step=None)[source]
Parameters:
  • x (tf.Tensor) –
  • axis (int|tf.Tensor) –
  • begin (int|tf.Tensor|None) –
  • end (int|tf.Tensor|None) –
  • step (int|tf.Tensor|None) –
Returns:

e.g. if axis == 0, returns x[begin:end:step], if axis == 1, returns x[:, begin:end:step], etc.

Return type:

tf.Tensor

TFUtil.slice_pad_zeros(x, begin, end, axis=0)[source]
Parameters:
  • x (tf.Tensor) – of shape (..., time, ...)
  • begin (int|tf.Tensor) –
  • end (int|tf.Tensor) –
  • axis (int) –
Returns:

basically x[begin:end] (with axis==0) but if begin < 0 or end > x.shape[0], it will not discard these frames but pad zeros, such that the resulting shape[0] == end - begin.

Return type:

tf.Tensor
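
The axis==0 case can be sketched in NumPy: out-of-range frames become zeros, so the result always has length end - begin (a sketch; the helper name is hypothetical):

```python
import numpy as np

def slice_pad_zeros_np(x, begin, end):
    n = x.shape[0]
    core = x[max(0, begin):min(n, end)]
    # Pad zeros for the part of [begin, end) that lies outside [0, n).
    pad = [(max(0, -begin), max(0, end - n))] + [(0, 0)] * (x.ndim - 1)
    return np.pad(core, pad)

x = np.arange(1, 5)               # [1, 2, 3, 4]
y = slice_pad_zeros_np(x, -2, 6)  # [0, 0, 1, 2, 3, 4, 0, 0]
```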

TFUtil.sparse_labels(x, seq_lens, dtype=tf.int32, collapse_repeated=False)[source]
Parameters:
  • x (tf.Tensor) – shape (batch,time) -> index, some int type
  • seq_lens (tf.Tensor|None) – shape (batch,) of int32|int64
  • dtype (tf.DType|None) – if given, will cast the x values to this type. ctc_loss() wants int32
  • collapse_repeated (bool) – like uniq() behavior
Returns:

SparseTensor, e.g. input for tf.nn.ctc_loss()

Return type:

tf.SparseTensor

TFUtil.sparse_labels_with_seq_lens(x, seq_lens, dtype=tf.int32, collapse_repeated=False)[source]
Parameters:
  • x (tf.Tensor) – shape (batch,time) -> index, some int type
  • seq_lens (tf.Tensor|None) – shape (batch,) of int32|int64
  • dtype (tf.DType|None) – if given, will cast the x values to this type. ctc_loss() wants int32
  • collapse_repeated (bool) – like uniq() behavior
Returns:

SparseTensor, e.g. input for tf.nn.ctc_loss(), and seq_lens of shape (batch,)

Return type:

(tf.SparseTensor, tf.Tensor)

TFUtil.spatial_smoothing_energy(x, dim, use_circular_conv=True)[source]
Parameters:
  • x (tf.Tensor) – shape (..., dim)
  • dim (int) – last dimension of x
  • use_circular_conv (bool) – whether to use circular convolution, via circular_pad
Return type:

tf.Tensor

Returns:

energy of shape (...)

Via Achieving Human Parity in Conversational Speech Recognition, Microsoft, 2017. Interpret the last dimension as 2D (w, h) and apply some high-pass filter on it.

TFUtil.stop_event_writer_thread(event_writer)[source]

There is a bug in TensorFlow (at least 1.1.0) (https://github.com/tensorflow/tensorflow/issues/4820) that the event writer thread is never stopped. This will try to stop it. Only do it if you don’t use the event writer anymore.

Parameters:event_writer (tensorflow.python.summary.writer.event_file_writer.EventFileWriter) –
TFUtil.swapaxes(x, axis1, axis2)[source]
Parameters:
  • x (tf.Tensor) –
  • axis1 (tf.Tensor|int) –
  • axis2 (tf.Tensor|int) –
Returns:

tensor with swapped axes, like numpy.swapaxes

Return type:

tf.Tensor

TFUtil.tf_version_tuple()[source]
Returns:version tuple, e.g. (1, 1, 0), parsed from tf.__version__
Return type:tuple[int]
TFUtil.tile_transposed(x, axis, multiples)[source]

Example: x with shape (D,), tf.tile(x, [N]) can be reshaped into (N,D), while tile_transposed(x, axis=0, multiples=N) can be reshaped into (D,N).

Parameters:
  • x (tf.Tensor) –
  • axis (int) –
  • multiples (int|tf.Tensor) –
Returns:

tensor with shape[axis] == x.shape[axis] * multiples

Return type:

tf.Tensor
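
The example above can be sketched with NumPy: np.repeat gives the "transposed tiling" (each entry repeated in place), while np.tile repeats the whole array:

```python
import numpy as np

x = np.array([1, 2, 3])
tiled = np.tile(x, 2)       # [1, 2, 3, 1, 2, 3] -> reshapes to (2, 3)
repeated = np.repeat(x, 2)  # [1, 1, 2, 2, 3, 3] -> reshapes to (3, 2)
```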

TFUtil.true_once()[source]
Returns:tensor which will be True once and then always False. Internally, this creates a non-trainable variable as a helper.
Return type:tf.Tensor
TFUtil.uniq(x)[source]
Parameters:x (tf.Tensor) – 1D shape (time,) -> index, some int type
Returns:x with repeated consecutive entries collapsed, unlike tf.unique, which never repeats any entry.

Example: uniq([0, 0, 1, 1, 0, 0]) == [0, 1, 0], tf.unique([0, 0, 1, 1, 0, 0]) == [0, 1]. For a batched variant, see batched_uniq, or sparse_labels() with option collapse_repeated.
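
The semantics can be sketched in NumPy: keep an entry iff it differs from its predecessor (a sketch of the semantics; the helper name is hypothetical):

```python
import numpy as np

def uniq_np(x):
    x = np.asarray(x)
    # First entry is always kept; later entries only if they changed.
    keep = np.concatenate([[True], x[1:] != x[:-1]])
    return x[keep]

y = uniq_np([0, 0, 1, 1, 0, 0])  # [0, 1, 0]
```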

TFUtil.var_creation_scope(*args, **kwds)[source]
If you create a variable inside of a while-loop, you might get the following error:
InvalidArgumentError: The node ‘while/w/Assign’ has inputs from different frames. The input ‘while/j’ is in frame ‘while/while/’. The input ‘while/w’ is in frame ‘’.

Also see tests/test_TFUtil.py:test_loop_var_creation(). Related TF bugs:

The solution is to reset the current frame. Resetting all control dependencies has this effect.

TFUtil.variable_scalar_summaries_dict(x, name=None)[source]

Collects all interesting information about x, such as min/max/mean, etc. (all scalars). This is used by variable_summaries().

Parameters:
  • x (tf.Tensor|tf.Variable) –
  • name (str) –
Returns:

dict with key -> scalar info, e.g. with “%s_mean” % name -> tf.reduce_mean(x)

Return type:

dict[str,tf.Tensor]

TFUtil.variable_summaries(var, name=None, with_histogram=False)[source]

Attach a lot of summaries to a Tensor (for TensorBoard visualization). Also see variable_scalar_summaries_dict().

Parameters:
  • var (tf.Tensor|tf.Variable) –
  • name (str) –
  • with_histogram (bool) – adds histogram. note that this can add noticeable overhead
Returns:

nothing, use tf.summary.merge_all() to collect the summaries

TFUtil.view_as(x, dtype)[source]

Does the numpy.view equivalent. Note that the current implementation is inefficient (uses tf.py_func) and CPU-only.

Parameters:
  • x (tf.Tensor) –
  • dtype (tf.DType) –
Returns:

x.view(dtype) equivalent (see numpy.view)

TFUtil.windowed_nd(source, window, padding='same', time_axis=1, new_window_axis=2)[source]
Parameters:
  • source (tf.Tensor) – N-D tensor of shape (..., n_time, ...)
  • window (int|tf.Tensor) – window size
  • padding (str) – “same” or “valid”
  • time_axis (int) –
  • new_window_axis (int) –
Returns:

tensor of shape (..., n_time, ..., window, ...)

Return type:

tf.Tensor
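
A NumPy sketch of the "same"-padding case (assumptions: zero padding, window roughly centered on each frame; requires NumPy >= 1.20 for sliding_window_view):

```python
import numpy as np

def windowed_nd_np(source, window, time_axis=1, new_window_axis=2):
    pad = [(0, 0)] * source.ndim
    left = (window - 1) // 2
    pad[time_axis] = (left, window - 1 - left)
    padded = np.pad(source, pad)
    # sliding_window_view appends the window axis at the end; move it into place.
    win = np.lib.stride_tricks.sliding_window_view(padded, window, axis=time_axis)
    return np.moveaxis(win, -1, new_window_axis)

x = np.zeros((2, 5, 4))           # (batch, n_time, feature)
y = windowed_nd_np(x, window=3)   # shape (2, 5, 3, 4)
```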

TFUtil.xavier_initializer(uniform=True, seed=None, dtype=tf.float32)[source]

Alias for tf.glorot_uniform_initializer or tf.glorot_normal_initializer.

Parameters:
  • uniform (bool) – uniform or normal distribution
  • seed (int) –
  • dtype (tf.DType) –
Returns:

((tuple[int]) -> tf.Tensor) | tf.Initializer