returnn.torch.util.array_¶
Array (Tensor) functions
- returnn.torch.util.array_.masked_select(input: Tensor, mask: Tensor, *, mask_len: int | Tensor | None = None)[source]¶
Like torch.masked_select(), but much more efficient, both in terms of memory and computation time, on CPU as well as GPU.

See here for the issues with torch.masked_select():
https://github.com/rwth-i6/returnn/issues/1584
https://github.com/pytorch/pytorch/issues/30246
https://github.com/pytorch/pytorch/issues/56896

- Parameters:
input – [mask_dims…, remaining_dims…]
mask – [mask_dims…], binary mask to index with. If it has fewer dims than input, the remaining dims are broadcasted.
mask_len – if given, the number of entries selected by the mask, i.e. the output length (cf. the return shape). This avoids a CUDA synchronization.
- Returns:
selected elements, shape [mask_len, remaining_dims…]
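A minimal usage sketch (assuming RETURNN is installed), selecting the non-padded frames from a batch of padded sequences; the tensor shapes and sequence lengths here are illustrative:

```python
import torch
from returnn.torch.util.array_ import masked_select

batch, time, feat = 3, 7, 5
x = torch.randn(batch, time, feat)  # [B, T, F]
seq_lens = torch.tensor([7, 4, 2])  # illustrative sequence lengths
# mask [B, T]: True for valid (non-padded) frames
mask = torch.arange(time)[None, :] < seq_lens[:, None]
# mask covers the leading [B, T] dims of x; the remaining [F] dim is kept.
# Passing mask_len (here as a tensor) avoids a CUDA synchronization on GPU.
packed = masked_select(x, mask, mask_len=seq_lens.sum())
print(packed.shape)  # torch.Size([13, 5]), since 13 = 7 + 4 + 2
```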
- returnn.torch.util.array_.nonzero(mask: Tensor, *, out_len: int | Tensor) → Tensor[source]¶
This has the advantage over torch.nonzero() that we do not need to perform a CUDA synchronization, which we can avoid when we know the output length in advance.

However, in my benchmarks, this seems to be slower than torch.nonzero():
https://github.com/rwth-i6/returnn/pull/1593
https://github.com/pytorch/pytorch/issues/131256

- Parameters:
mask – flattened (dim() == 1) mask, bool
out_len – the output length, i.e. the number of True entries in the mask, known in advance
- Returns:
indices of True elements, shape [out_len], like mask.nonzero().flatten()
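A minimal usage sketch (assuming RETURNN is installed); the mask values are illustrative:

```python
import torch
from returnn.torch.util.array_ import nonzero

mask = torch.tensor([True, False, True, True, False])
# out_len must be the number of True entries, known in advance,
# so that no CUDA synchronization is needed to determine the output size.
idx = nonzero(mask, out_len=3)
print(idx)  # tensor([0, 2, 3])
assert torch.equal(idx, mask.nonzero().flatten())
```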