Enables automatic gradient norm clipping. This can help stabilize training in some situations by capping the norm of the gradients, which limits the size of each parameter update. The implementation is inspired by the paper "AutoClip: Adaptive Gradient Clipping for Source Separation Networks" (Seetharaman et al., 2020).
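
As a rough illustration of the idea in the AutoClip paper, the sketch below keeps a history of observed gradient norms and clips each new gradient to a chosen percentile of that history. It is a minimal pure-Python sketch, not this feature's actual implementation: the class name `AutoClipper`, the `percentile` parameter, and the default of 10 are assumptions made for the example, and gradients are represented as a flat list of floats rather than framework tensors.

```python
import math


class AutoClipper:
    """Hypothetical helper: clip gradients to a percentile of past gradient norms."""

    def __init__(self, percentile=10.0):
        if not 0.0 <= percentile <= 100.0:
            raise ValueError("percentile must be in [0, 100]")
        self.percentile = percentile
        self.history = []  # every gradient norm observed so far

    def _threshold(self):
        # Percentile of the norm history, linearly interpolated between ranks.
        ordered = sorted(self.history)
        rank = (len(ordered) - 1) * self.percentile / 100.0
        lo, hi = math.floor(rank), math.ceil(rank)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)

    def clip(self, grads):
        # Record the current (unclipped) L2 norm, then derive the threshold
        # from the history that now includes it.
        norm = math.sqrt(sum(g * g for g in grads))
        self.history.append(norm)
        threshold = self._threshold()
        if norm > threshold and norm > 0.0:
            scale = threshold / norm  # rescale so the new norm equals the threshold
            return [g * scale for g in grads]
        return list(grads)


clipper = AutoClipper(percentile=50.0)
first = clipper.clip([3.0, 4.0])   # norm 5.0 equals the threshold, left unchanged
second = clipper.clip([6.0, 8.0])  # norm 10.0, threshold 7.5, scaled by 0.75
```

Because the threshold is derived from the norms seen during training, no fixed clipping value has to be tuned by hand; only the percentile is chosen. In a real training loop the same logic would be applied to the global norm of the model's gradients before the optimizer step.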