In Haiku there is an Exponential Moving Average (EMA) layer that wraps an NDArray or pytree and performs an EMA update on the input. It's quite handy to use, because most of the "state" variables in deep learning models are moving averages.
Aside from the obvious BatchNorm, it would be very useful for implementing Vector Quantization, Momentum Contrast, Bootstrap Your Own Latent, etc.
With this layer, most users would not need to handle variables inside modules themselves.
Would this be a useful addition to Flax? I have a port of the Haiku module in Flax and could try to put it in a PR if deemed useful.
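
For concreteness, here is a minimal sketch of what such a module could look like in Flax Linen (not the port mentioned above). The class name `ExponentialMovingAverage`, the `decay` field, and the `"ema"` collection name are illustrative assumptions rather than an existing Flax API, and unlike Haiku's module this sketch omits extras such as zero-debiasing.

```python
# Minimal sketch only: `ExponentialMovingAverage`, the `decay` field, and the
# "ema" collection name are illustrative assumptions, not an existing Flax API.
import jax
import jax.numpy as jnp
import flax.linen as nn


class ExponentialMovingAverage(nn.Module):
    """Keeps an exponential moving average of the values it is called with."""
    decay: float = 0.99

    @nn.compact
    def __call__(self, value):
        # False only during `init`, before the variable has been created.
        initialized = self.has_variable("ema", "average")
        average = self.variable("ema", "average", lambda: jnp.zeros_like(value))
        if initialized:
            average.value = self.decay * average.value + (1.0 - self.decay) * value
        return average.value


# Usage: the "ema" collection must be marked mutable so `apply` can update it.
ema = ExponentialMovingAverage(decay=0.9)
x = jnp.ones((4,))
variables = ema.init(jax.random.PRNGKey(0), x)
smoothed, updated_state = ema.apply(variables, x, mutable=["ema"])
```

Keeping the running average in its own `"ema"` collection (rather than `"params"`) means the optimizer never touches it, mirroring how BatchNorm keeps its running statistics in a separate collection.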
Replies: 1 comment 2 replies
-
I can see how such an abstraction could be useful for implementing the layers you mention, and since it would also simplify our BatchNorm implementation somewhat, I am fine with adding it. I'd also like to hear @jheek's thoughts, since he mainly implemented our BatchNorm layer.