returnn.util.debug

Some generic debugging utilities.

returnn.util.debug.auto_exclude_all_new_threads(func)[source]
Parameters:

func (T)

Returns:

func wrapped

Return type:

T

returnn.util.debug.dump_all_thread_tracebacks(*, exclude_thread_ids: Collection[int] | None = None, exclude_self: bool = False, file: TextIO | None = None)[source]

Dumps the stack traceback of all threads, optionally excluding some.

Parameters:
  • exclude_thread_ids

  • exclude_self

  • file
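
As an illustration of the mechanism (not RETURNN's actual implementation), dumping all thread stacks can be sketched with the stdlib via sys._current_frames(), which maps each thread id to its current frame:

```python
import sys
import threading
import traceback
from io import StringIO


def dump_threads_sketch(file=None, exclude_thread_ids=None):
    # Iterate over the current frame of every thread and print its stack.
    file = file if file is not None else sys.stdout
    exclude = set(exclude_thread_ids or ())
    for thread_id, frame in sys._current_frames().items():
        if thread_id in exclude:
            continue
        print(f"Thread {thread_id}:", file=file)
        traceback.print_stack(frame, file=file)
        print(file=file)


buf = StringIO()
dump_threads_sketch(file=buf)
```

The names here (dump_threads_sketch) are illustrative only; the real function additionally resolves thread names and handles the exclude_self option.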

returnn.util.debug.setup_warn_with_traceback()[source]

Installs a hook for warnings.showwarning which prints warnings together with a traceback.

returnn.util.debug.init_better_exchook()[source]

Installs our own sys.excepthook, which uses better_exchook, but adds some special handling for the main thread.
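
The general pattern of replacing sys.excepthook can be sketched as follows (illustrative only; better_exchook additionally pretty-prints local variables, and RETURNN adds main-thread handling on top):

```python
import sys
import traceback


def custom_excepthook(exc_type, exc_value, exc_tb):
    # Hypothetical replacement hook: print a short header, then the traceback.
    print(f"Uncaught exception: {exc_type.__name__}", file=sys.stderr)
    traceback.print_exception(exc_type, exc_value, exc_tb)


prev_hook = sys.excepthook  # keep a reference in case we want to restore it
sys.excepthook = custom_excepthook
```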

returnn.util.debug.format_signum(signum)[source]
Parameters:

signum (int)

Returns:

string “signum (signame)”

Return type:

str
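
A plausible implementation of this formatting (a sketch, not necessarily RETURNN's exact code) resolves the signal name via the stdlib signal module:

```python
import signal


def format_signum_sketch(signum: int) -> str:
    # Resolve the symbolic name for the signal number, if it has one.
    try:
        signame = signal.Signals(signum).name
    except ValueError:
        signame = "unknown"
    return f"{signum} ({signame})"
```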

returnn.util.debug.signal_handler(signum, frame)[source]

Prints a message on stdout and dumps all thread stacks.

Parameters:
  • signum (int) – e.g. signal.SIGUSR1

  • frame – ignored, will dump all threads

returnn.util.debug.install_signal_handler_if_default(signum, exceptions_are_fatal=False)[source]
Parameters:
  • signum (int) – e.g. signal.SIGUSR1

  • exceptions_are_fatal (bool) – if True, will reraise any exceptions. if False, will just print a message

Returns:

True iff no exception occurred, False otherwise. This does not necessarily mean that we registered our own handler.

Return type:

bool
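
The "install only if default" pattern can be sketched like this (an illustrative helper with a hypothetical name, not RETURNN's exact code):

```python
import signal


def install_if_default(signum, handler, exceptions_are_fatal=False):
    # Install `handler` for `signum` only if the current handler is the
    # OS default, i.e. nothing else has claimed this signal yet.
    try:
        if signal.getsignal(signum) is signal.SIG_DFL:
            signal.signal(signum, handler)
        return True
    except Exception as exc:
        if exceptions_are_fatal:
            raise
        print(f"could not install handler for signal {signum}: {exc}")
        return False
```

Note that, matching the documented return value, True only means no exception occurred; if another handler was already installed, nothing is changed.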

returnn.util.debug.install_native_signal_handler(*, reraise_exceptions: bool = False)[source]

Installs our own custom native (C) signal handler.

returnn.util.debug.install_lib_sig_segfault()[source]

Installs libSegFault (common on Unix/Linux).

returnn.util.debug.init_faulthandler(sigusr1_chain=False)[source]

Maybe installs signal handlers for SIGUSR1, SIGUSR2 and others. If no signal handlers are installed yet for SIGUSR1/2, we try to install our own Python handler. This also tries to install the handler from the faulthandler module, especially for SIGSEGV and others.

Parameters:

sigusr1_chain (bool) – whether the default SIGUSR1 handler should also be called.
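
The stdlib faulthandler part of this can be sketched as follows (an illustrative snippet showing the underlying API, not RETURNN's actual code):

```python
import faulthandler
import signal

# Enable faulthandler for fatal signals (SIGSEGV, SIGFPE, SIGABRT, ...),
# so a Python traceback is dumped even on a hard crash.
faulthandler.enable()

# Additionally register SIGUSR1 to dump all thread tracebacks on demand.
# chain=True also invokes any previously installed handler, which mirrors
# the sigusr1_chain option above.
if hasattr(signal, "SIGUSR1"):  # SIGUSR1 is not available on Windows
    faulthandler.register(signal.SIGUSR1, all_threads=True, chain=True)
```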

returnn.util.debug.init_ipython_kernel(*args, **kwargs)[source]
Parameters:
  • args

  • kwargs

returnn.util.debug.init_cuda_not_in_main_proc_check()[source]

Installs some hook to Theano which checks that CUDA is only used in the main proc.

returnn.util.debug.debug_shell(user_ns: Dict[str, Any] | None = None, user_global_ns: Dict[str, Any] | None = None, exit_afterwards: bool = True)[source]

Provides an interactive Python shell. Uses IPython if possible. Wraps better_exchook.debug_shell.

Parameters:
  • user_ns

  • user_global_ns

  • exit_afterwards – will do sys.exit(1) at the end

class returnn.util.debug.PyTracer(funcs_to_trace_list: Sequence[Callable | LambdaType], capture_type: type | Tuple[type, ...])[source]

Trace Python function execution to get intermediate outputs from the local variables.

E.g. for PyTorch code, when comparing results, it can be useful to see the intermediate tensors.

Example:

with PyTracer([my_func], torch.Tensor) as trace_my_impl:
    ...

with PyTracer([reference_func], torch.Tensor) as trace_ref_impl:
    ...

Or another example:

from returnn.tensor import Tensor

with PyTracer([my_func], Tensor) as trace_my_impl:
    ...

with PyTracer([reference_func], torch.Tensor) as trace_ref_impl:
    ...

check_py_traces_rf_to_pt_equal(trace_my_impl.captured_locals, trace_ref_impl.captured_locals, [...])

See also check_py_traces_rf_to_pt_equal() to compare the traces.

This class uses the Python sys.settrace() mechanism to trace the locals. It accesses frame.f_locals to get the local variables. Note that this behavior is slightly buggy in CPython versions <3.13, see for example:

https://github.com/python/cpython/issues/113939
https://github.com/python/cpython/issues/74929

Thus the behavior might differ depending on the Python version. In Python >=3.13, you will likely get a few more locals than before.

Parameters:
  • funcs_to_trace_list – list of functions to trace the locals. only those functions will be traced.

  • capture_type – only capture variables of this type, e.g. torch.Tensor.
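
As an illustration of the mechanism, here is a minimal stdlib-only sketch of this kind of tracing (not the actual PyTracer implementation): run a function under sys.settrace() and collect its locals of a given type, keyed by variable name:

```python
import sys
from typing import Any, Dict, List


def trace_locals(func, capture_type, *args, **kwargs):
    # Run `func(*args, **kwargs)` and capture its locals that are
    # instances of `capture_type`, in order of appearance.
    captured: Dict[str, List[Any]] = {}
    target_code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not target_code:
            return None  # do not trace other functions
        if event in ("line", "return"):
            for name, value in frame.f_locals.items():
                if isinstance(value, capture_type):
                    values = captured.setdefault(name, [])
                    # Only record a value once (identity check avoids
                    # duplicates from repeated snapshots of the same local).
                    if not values or values[-1] is not value:
                        values.append(value)
        return tracer  # keep local tracing enabled for this frame

    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(None)
    return result, captured
```

In real use, capture_type would be e.g. torch.Tensor, and the actual PyTracer also records one locals dict per call and handles multiple traced functions.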

returnn.util.debug.check_py_traces_rf_to_pt_equal(trace_rf: Dict[Callable, List[Dict[str, List[Tensor]]]], trace_pt: Dict[Callable, List[Dict[str, List[torch.Tensor]]]], checks: List[Tuple[Tuple[Callable, int, str, int], Tuple[Callable, int, str, int], Tuple[Dim | str, ...] | Callable[[torch.Tensor], Tensor]]])[source]

Compares traces from some RETURNN-frontend (RF) based implementation with some pure PyTorch (PT) based implementation.

Parameters:
  • trace_rf – RETURNN-frontend trace, from PyTracer

  • trace_pt – pure PyTorch trace, from PyTracer

  • checks

    list of checks to perform. Each check is a tuple of:

      • RF trace entry, e.g. (func, i, name, j)

      • PT trace entry, e.g. (func, i, name, j)

      • PT dims, e.g. (batch_dim, other_dim, …)

    Instead of Dim, you can also use a string, which will be resolved from the RF trace (then you also need Dim in capture_type of the PyTracer). If callable, it gets the PyTorch tensor and should return the RETURNN tensor. Sometimes you might want to perform some reshaping, slicing, or similar, and then use rf.convert_to_tensor.
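
To make the (func, i, name, j) indexing concrete, here is a toy illustration using plain floats in place of tensors (hypothetical placeholder functions and names; the real checks additionally handle Dim resolution and tensor comparison):

```python
# Traces mimic the PyTracer captured_locals structure:
# {func: [call_index -> {var_name: [captured values]}]}.
def my_func():  # placeholder, only used as a dict key here
    pass


def reference_func():  # placeholder
    pass


trace_rf = {my_func: [{"hidden": [1.5, 2.5]}]}
trace_pt = {reference_func: [{"h": [1.5, 2.5]}]}

# Each entry addresses one captured value:
# (function, call index, variable name, value index).
checks = [
    ((my_func, 0, "hidden", 1), (reference_func, 0, "h", 1)),
]

for (rf_func, rf_i, rf_name, rf_j), (pt_func, pt_i, pt_name, pt_j) in checks:
    rf_value = trace_rf[rf_func][rf_i][rf_name][rf_j]
    pt_value = trace_pt[pt_func][pt_i][pt_name][pt_j]
    assert rf_value == pt_value
```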