returnn.util.debug¶
Some generic debugging utilities.
- returnn.util.debug.auto_exclude_all_new_threads(func)[source]¶
- Parameters:
func (T)
- Returns:
func wrapped
- Return type:
T
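A minimal usage sketch (hypothetical helper name; it assumes, as the function name and the exclude_thread_ids parameter of dump_all_thread_tracebacks() suggest, that threads spawned inside the wrapped call are excluded from later dumps):

    import threading
    from returnn.util.debug import auto_exclude_all_new_threads, dump_all_thread_tracebacks

    @auto_exclude_all_new_threads
    def start_workers():  # hypothetical helper
        threading.Thread(target=lambda: None, daemon=True).start()

    start_workers()
    dump_all_thread_tracebacks()  # assumption: threads spawned above are excluded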
- returnn.util.debug.dump_all_thread_tracebacks(*, exclude_thread_ids: Collection[int] | None = None, exclude_self: bool = False, file: TextIO | None = None)[source]¶
- Parameters:
exclude_thread_ids
exclude_self
file
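For example, to write the dump of all other threads to stderr:

    import sys
    from returnn.util.debug import dump_all_thread_tracebacks

    # dump every thread except the calling one to stderr
    dump_all_thread_tracebacks(exclude_self=True, file=sys.stderr)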
- returnn.util.debug.setup_warn_with_traceback()[source]¶
Installs a hook for warnings.showwarning, so that warnings are printed together with a stack traceback.
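A short sketch of the intended use:

    import warnings
    from returnn.util.debug import setup_warn_with_traceback

    setup_warn_with_traceback()
    warnings.warn("something looks off")  # now also shows where the warning came from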
- returnn.util.debug.init_better_exchook()[source]¶
Installs our own sys.excepthook, which uses better_exchook, but adds some special handling for the main thread.
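Typically called once at program startup, e.g.:

    from returnn.util.debug import init_better_exchook

    init_better_exchook()
    # any uncaught exception is now reported via better_exchook
    # (extended traceback including relevant local variables)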
- returnn.util.debug.format_signum(signum)[source]¶
- Parameters:
signum (int)
- Returns:
string “signum (signame)”
- Return type:
str
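For example (the exact number is platform-dependent):

    import signal
    from returnn.util.debug import format_signum

    print(format_signum(signal.SIGUSR1))  # e.g. "10 (SIGUSR1)" on Linux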
- returnn.util.debug.signal_handler(signum, frame)[source]¶
Prints a message on stdout and dumps all thread stacks.
- Parameters:
signum (int) – e.g. signal.SIGUSR1
frame – ignored, will dump all threads
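This matches the standard signal.signal() handler signature, so it can be registered directly:

    import signal
    from returnn.util.debug import signal_handler

    # now `kill -USR1 <pid>` prints a message and dumps all thread stacks
    signal.signal(signal.SIGUSR1, signal_handler)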
- returnn.util.debug.install_signal_handler_if_default(signum, exceptions_are_fatal=False)[source]¶
- Parameters:
signum (int) – e.g. signal.SIGUSR1
exceptions_are_fatal (bool) – if True, reraise any exceptions; if False, just print a message
- Returns:
True iff no exception occurred, False otherwise; not necessarily that we registered our own handler
- Return type:
bool
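A small sketch:

    import signal
    from returnn.util.debug import install_signal_handler_if_default

    # installs our handler only if SIGUSR1 still has the default disposition
    ok = install_signal_handler_if_default(signal.SIGUSR1)
    # ok is True iff no exception occurred (not necessarily that we installed anything)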
- returnn.util.debug.install_native_signal_handler(*, reraise_exceptions: bool = False)[source]¶
Installs our own custom native (C) signal handler.
- returnn.util.debug.init_faulthandler(sigusr1_chain=False)[source]¶
Maybe installs signal handlers for SIGUSR1, SIGUSR2 and others. If no signal handlers are installed yet for SIGUSR1/2, we try to install our own Python handler. This also tries to install the handler from the faulthandler module, especially for SIGSEGV and others.
- Parameters:
sigusr1_chain (bool) – whether the default SIGUSR1 handler should also be called.
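Typical setup call:

    from returnn.util.debug import init_faulthandler

    # also keep calling the previously installed SIGUSR1 handler, if any
    init_faulthandler(sigusr1_chain=True)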
- returnn.util.debug.init_cuda_not_in_main_proc_check()[source]¶
Installs some hook into Theano which checks that CUDA is only used in the main process.
- returnn.util.debug.debug_shell(user_ns: Dict[str, Any] | None = None, user_global_ns: Dict[str, Any] | None = None, exit_afterwards: bool = True)[source]¶
Provides some interactive Python shell. Uses IPython if possible. Wraps better_exchook.debug_shell.
- Parameters:
user_ns
user_global_ns
exit_afterwards – will do sys.exit(1) at the end
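A hedged sketch (assuming, as with better_exchook.debug_shell, that user_ns is the local and user_global_ns the global namespace to expose in the shell):

    from returnn.util.debug import debug_shell

    def inspect_here():  # hypothetical helper
        x = 42  # visible inside the shell
        debug_shell(user_ns=locals(), user_global_ns=globals(), exit_afterwards=False)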
- class returnn.util.debug.PyTracer(funcs_to_trace_list: Sequence[Callable | LambdaType], capture_type: type | Tuple[type, ...])[source]¶
Trace Python function execution to get intermediate outputs from the local variables.
E.g. for PyTorch code, when comparing results, it can be useful to see the intermediate tensors.
Example:

    with PyTracer([my_func], torch.Tensor) as trace_my_impl:
        ...

    with PyTracer([reference_func], torch.Tensor) as trace_ref_impl:
        ...
Or another example:

    from returnn.tensor import Tensor

    with PyTracer([my_func], Tensor) as trace_my_impl:
        ...

    with PyTracer([reference_func], torch.Tensor) as trace_ref_impl:
        ...

    check_py_traces_rf_to_pt_equal(
        trace_my_impl.captured_locals, trace_ref_impl.captured_locals, [...]
    )
See also check_py_traces_rf_to_pt_equal() to compare the traces.
This class uses the Python sys.settrace() mechanism to trace the locals. It accesses frame.f_locals to get the local variables. Note that this behavior is slightly buggy in CPython versions < 3.13, see for example:
https://github.com/python/cpython/issues/113939
https://github.com/python/cpython/issues/74929
Thus the behavior might differ depending on the Python version. In Python >= 3.13, you likely get a few more locals than before.
- Parameters:
funcs_to_trace_list – list of functions to trace the locals of; only those functions will be traced.
capture_type – only capture variables of this type, e.g. torch.Tensor.
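A self-contained sketch, assuming captured_locals has the shape suggested by the signature of check_py_traces_rf_to_pt_equal() below (func -> call index -> variable name -> list of captured values):

    import torch
    from returnn.util.debug import PyTracer

    def my_func(x: torch.Tensor) -> torch.Tensor:
        hidden = x * 2  # an intermediate tensor we want to inspect
        return hidden + 1

    with PyTracer([my_func], torch.Tensor) as trace:
        my_func(torch.ones(3))

    # assumption: last captured value of `hidden` in the first traced call
    print(trace.captured_locals[my_func][0]["hidden"][-1])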
- returnn.util.debug.check_py_traces_rf_to_pt_equal(trace_rf: Dict[Callable, List[Dict[str, List[Tensor]]]], trace_pt: Dict[Callable, List[Dict[str, List[torch.Tensor]]]], checks: List[Tuple[Tuple[Callable, int, str, int], Tuple[Callable, int, str, int], Tuple[Dim | str, ...] | Callable[[torch.Tensor], Tensor]]])[source]¶
Compares traces from some RETURNN-frontend (RF) based implementation with some pure PyTorch (PT) based implementation.
- Parameters:
trace_rf – RETURNN-frontend trace, from PyTracer
trace_pt – pure PyTorch trace, from PyTracer
checks –
list of checks to perform. Each check is a tuple of:
- RF trace entry, e.g. (func, i, name, j)
- PT trace entry, e.g. (func, i, name, j)
- PT dims, e.g. (batch_dim, other_dim, …)
Instead of a Dim, you can also use a string, which will be resolved from the RF trace (then you also need Dim in capture_type of the PyTracer). If the dims entry is instead a callable, it gets the PyTorch tensor and should return the RETURNN tensor. Sometimes you might want to perform some reshaping, slicing, or similar, and then use rf.convert_to_tensor.
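A hedged sketch of a checks list (my_rf_func, my_pt_func, the traces and the time dim are hypothetical; the entry indices follow the (func, call index, variable name, capture index) pattern from above):

    from returnn.tensor import Dim, batch_dim
    from returnn.util.debug import check_py_traces_rf_to_pt_equal

    time_dim = Dim(None, name="time")  # hypothetical dynamic dim

    check_py_traces_rf_to_pt_equal(
        trace_my_impl.captured_locals,  # from PyTracer over the RF implementation
        trace_ref_impl.captured_locals,  # from PyTracer over the PT implementation
        [
            # compare capture 0 of `hidden` in call 0 of each implementation;
            # the PyTorch tensor dims are declared as (batch, time)
            ((my_rf_func, 0, "hidden", 0), (my_pt_func, 0, "hidden", 0), (batch_dim, time_dim)),
        ],
    )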