returnn.frontend.init#
Common parameter initialization functions.
https://github.com/rwth-i6/returnn/wiki/Parameter-initialization
- class returnn.frontend.init.Normal(stddev: float, *, truncated: bool = True, dtype: str | None = None)[source]#
Initialization by a normal distribution (truncated by default), independent of the dimensions (fan in/out).
See VarianceScaling and derivatives for variants which depend on fan in/out.
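A minimal usage sketch, assuming the usual pattern of assigning an initializer object to a parameter's initial attribute; the dims and the stddev value here are illustrative only:

```python
import returnn.frontend as rf
from returnn.tensor import Dim

in_dim = Dim(128, name="in")
out_dim = Dim(256, name="out")

# Weight drawn from a truncated normal with stddev 0.1; the spread does not
# depend on fan-in/fan-out, unlike the VarianceScaling family below.
weight = rf.Parameter((in_dim, out_dim))
weight.initial = rf.init.Normal(stddev=0.1)
```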
- class returnn.frontend.init.VarianceScaling(scale: float | None = None, mode: str | None = None, distribution: str | None = None, dtype: str | None = None)[source]#
Provides a generalized way to initialize weights; the common initialization methods, such as Xavier Glorot and Kaiming He, are special cases.
Code adapted from TensorFlow's VarianceScaling.
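For illustration, the Glorot and He classes below correspond roughly to the following VarianceScaling settings; this is a sketch based on the defaults listed on this page, and the exact argument values are assumptions:

```python
import returnn.frontend as rf

# Glorot/Xavier: unit scale, averaged fan-in/fan-out, uniform samples.
glorot_like = rf.init.VarianceScaling(scale=1.0, mode="fan_avg", distribution="uniform")

# He/Kaiming: scale 2, fan-in only, normal samples.
he_like = rf.init.VarianceScaling(scale=2.0, mode="fan_in", distribution="normal")
```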
- class returnn.frontend.init.Glorot(scale: float | None = None, mode: str | None = None, distribution: str | None = None, dtype: str | None = None)[source]#
Xavier Glorot initialization (http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf). Defaults: scale 1, mode fan_avg, uniform distribution.
- class returnn.frontend.init.He(scale: float | None = None, mode: str | None = None, distribution: str | None = None, dtype: str | None = None)[source]#
Kaiming He initialization (https://arxiv.org/pdf/1502.01852.pdf). Defaults: scale 2, mode fan_in, normal distribution.
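A short sketch of choosing an initializer per layer, assuming rf.Linear exposes its weight parameter as .weight with an assignable initial (the dims are illustrative):

```python
import returnn.frontend as rf
from returnn.tensor import Dim

in_dim = Dim(80, name="in")
hidden_dim = Dim(512, name="hidden")

# For a layer feeding into a ReLU, He init (scale 2, fan_in) is the common
# heuristic, so override the layer's default weight initialization.
hidden = rf.Linear(in_dim, hidden_dim)
hidden.weight.initial = rf.init.He()
```

Glorot is the usual default for layers followed by tanh/sigmoid-like activations, while He's scale of 2 compensates for the variance halved by ReLU.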