From c4222e01b07444734082fe605f10d0bc0acab6a4 Mon Sep 17 00:00:00 2001
From: Marc van Zee
Date: Fri, 14 May 2021 15:59:41 +0000
Subject: [PATCH 1/5] Flax version 0.3.4

---
 CHANGELOG.md    | 44 +++++++++++++++++++++++++++++++++++---------
 README.md       |  2 +-
 flax/version.py |  2 +-
 3 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index e603f6bdbf..0a60314d96 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,28 +7,54 @@ vNext
 (Add your change to a random empty line to avoid merge conflicts)
 -
 -
- - Added an NLP text classification example (SST-2 sentiment) to examples/sst2.
-   that uses a birectional LSTM (BiLSTM) to encode the input text.
- - Added flax.training.train_state to simplifying using Optax optimizers.
- - Rewrote ImageNet example to use Optax instead of flax.optim for optimizers.
 -
- - `mutable` argument is now available on `Module.init` and `Module.init_with_outputs`
- - When calling `init` the 'intermediates' collection is no longer mutable
-   Therefore, intermediates will no longer be returned from initialization by default.
 -
 -
- - Bug Fix: Correclty handle non-default parameters of Linen Modules with nested inheritance.
 -
 -
 -
 -
- - `BatchNorm` instances will behave correctly during init when called multiple times.
 -
 -
 -
 -
 -
+ -
+ -
+ -
+ -
+ -
+ -
+ -
+ -
+ -
+
+0.3.4
+------
+
+Possibly breaking changes:
+ - When calling `init` the 'intermediates' collection is no longer mutable.
+   Therefore, intermediates will no longer be returned from initialization by default.
+ - Don't update batch statistics during initialization.
+ - Attention: require deterministic only if using dropout
+
+Other changes:
+ - Rewrote various examples to use Optax instead of Flax optimizers (e.g., Imagenet, SST2)
+ - Added an NLP text classification example (SST-2 sentiment) to examples/sst2.
+   that uses a birectional LSTM (BiLSTM) to encode the input text.
+ - Added flax.training.train_state to simplifying using Optax optimizers.
+ - `mutable` argument is now available on `Module.init` and `Module.init_with_outputs`
+ - Bug Fix: Correctly handle non-default parameters of Linen Modules with nested inheritance.
+ - Expose dot_product_attention_weights, allowing access to attention weights.
+ - `BatchNorm` instances will behave correctly during init when called multiple times.
+ - Added a more extensive "how to contribute" guide in `contributing.md`.
+ - Add proper cache behavior for lift.jit, fixing cache misses.
+ - Fix bug in Embed layer: make sure it behaves correctly when embedding is np.array.
+ - Fix linen.Module for deep inheritance chains.
+ - Fix bug in DenseGeneral: correctly expand bias to account for batch & noncontracting dimensions.
+ - Allow Flax lifted transforms to work on partially applied Modules.
+ - Make MultiOptimizer use apply_gradient instead of apply_param_gradient
 
 0.3.3
 ------

diff --git a/README.md b/README.md
index ee3ebba287..eb442526e6 100644
--- a/README.md
+++ b/README.md
@@ -153,7 +153,7 @@ To cite this repository:
   author = {Jonathan Heek and Anselm Levskaya and Avital Oliver and Marvin Ritter and Bertrand Rondepierre and Andreas Steiner and Marc van {Z}ee},
   title = {{F}lax: A neural network library and ecosystem for {JAX}},
   url = {http://github.com/google/flax},
-  version = {0.3.3},
+  version = {0.3.4},
   year = {2020},
 }
 ```

diff --git a/flax/version.py b/flax/version.py
index 34dcfec74e..4ca93680b0 100644
--- a/flax/version.py
+++ b/flax/version.py
@@ -12,5 +12,5 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-__version__ = "0.3.3"
+__version__ = "0.3.4"
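The headline addition in this release is `flax.training.train_state`. The following is a minimal sketch of the intended usage, not code from the patch itself; the `MLP` module, loss function, and dummy data are hypothetical stand-ins for illustration:

```python
# Minimal sketch of flax.training.train_state with an Optax optimizer.
# The MLP module, loss, and data below are illustrative, not from the patch.
import jax
import jax.numpy as jnp
import optax
from flax import linen as nn
from flax.training import train_state

class MLP(nn.Module):
    @nn.compact
    def __call__(self, x):
        return nn.Dense(features=1)(x)

model = MLP()
params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 4)))['params']

# TrainState bundles the apply function, parameters, and optimizer,
# replacing the flax.optim-based setup used by the old examples:
state = train_state.TrainState.create(
    apply_fn=model.apply,
    params=params,
    tx=optax.sgd(learning_rate=0.01),
)

def loss_fn(params, x, y):
    pred = state.apply_fn({'params': params}, x)
    return jnp.mean((pred - y) ** 2)

grads = jax.grad(loss_fn)(state.params, jnp.ones((1, 4)), jnp.ones((1, 1)))
state = state.apply_gradients(grads=grads)  # one optimizer step
```

This is roughly the pattern the rewritten ImageNet and SST-2 examples follow: the optimizer lives in the `tx` field rather than in a `flax.optim` wrapper around the parameters.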
From 7788281d0a1378d76e700354cde3aa8e4158b91e Mon Sep 17 00:00:00 2001
From: Marc van Zee
Date: Tue, 18 May 2021 12:02:11 +0200
Subject: [PATCH 2/5] Update CHANGELOG.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>
---
 CHANGELOG.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 0a60314d96..8509b748ee 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -41,8 +41,9 @@ Possibly breaking changes:
 
 Other changes:
  - Rewrote various examples to use Optax instead of Flax optimizers (e.g., Imagenet, SST2)
- - Added an NLP text classification example (SST-2 sentiment) to examples/sst2.
-   that uses a birectional LSTM (BiLSTM) to encode the input text.
+ - Added an NLP text classification example (on the SST-2 dataset) to
+   [`examples/sst2`](https://github.com/google/flax/tree/master/examples/sst2)
+   that uses a bidirectional LSTM (BiLSTM) to encode the input text.
  - Added flax.training.train_state to simplifying using Optax optimizers.
  - `mutable` argument is now available on `Module.init` and `Module.init_with_outputs`

From e54f8e704140a647331a0acb23e76cd761b824ad Mon Sep 17 00:00:00 2001
From: Marc van Zee
Date: Tue, 18 May 2021 12:03:32 +0200
Subject: [PATCH 3/5] Apply suggestions from code review

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>
---
 CHANGELOG.md | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8509b748ee..346f0c2bd3 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -44,18 +44,19 @@ Other changes:
  - Added an NLP text classification example (on the SST-2 dataset) to
    [`examples/sst2`](https://github.com/google/flax/tree/master/examples/sst2)
    that uses a bidirectional LSTM (BiLSTM) to encode the input text.
- - Added flax.training.train_state to simplifying using Optax optimizers.
+ - Added `flax.training.train_state` to simplify using Optax optimizers.
  - `mutable` argument is now available on `Module.init` and `Module.init_with_outputs`
- - Bug Fix: Correctly handle non-default parameters of Linen Modules with nested inheritance.
- - Expose dot_product_attention_weights, allowing access to attention weights.
+ - Bug fix: Correctly handle non-default parameters of Linen Modules with nested inheritance.
+ - Expose `dot_product_attention_weights`, allowing access to attention weights.
  - `BatchNorm` instances will behave correctly during init when called multiple times.
  - Added a more extensive "how to contribute" guide in `contributing.md`.
- - Add proper cache behavior for lift.jit, fixing cache misses.
+ - Add proper cache behavior for [`lift.jit`](https://flax.readthedocs.io/en/latest/_autosummary/flax.linen.jit.html#flax.linen.jit),
+   fixing cache misses.
  - Fix bug in Embed layer: make sure it behaves correctly when embedding is np.array.
- - Fix linen.Module for deep inheritance chains.
+ - Fix `linen.Module` for deep inheritance chains.
  - Fix bug in DenseGeneral: correctly expand bias to account for batch & noncontracting dimensions.
  - Allow Flax lifted transforms to work on partially applied Modules.
- - Make MultiOptimizer use apply_gradient instead of apply_param_gradient
+ - Make `MultiOptimizer` use `apply_gradient` instead of `apply_param_gradient`.
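Two related entries above fit together: the 'intermediates' collection is no longer mutable during `init`, and the new `mutable` argument on `Module.init` opts back in. A minimal sketch, assuming the 0.3.4 Linen API; the `Sower` module is a hypothetical example, not from the patch:

```python
# Sketch of the init/'intermediates' change. The Sower module is hypothetical.
import jax
import jax.numpy as jnp
from flax import linen as nn

class Sower(nn.Module):
    @nn.compact
    def __call__(self, x):
        h = nn.Dense(features=2)(x)
        self.sow('intermediates', 'hidden', h)  # only stored if mutable
        return h

x = jnp.ones((1, 3))
rng = jax.random.PRNGKey(0)

# After this release, plain init no longer treats 'intermediates' as
# mutable, so the returned variables contain only 'params':
variables = Sower().init(rng, x)

# Passing the new `mutable` argument restores the old behavior:
variables = Sower().init(rng, x, mutable=['params', 'intermediates'])
hidden = variables['intermediates']['hidden']
```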
From 13f16c18da6a5c9f8ab1998b92e84a04bbc72b3e Mon Sep 17 00:00:00 2001
From: Marc van Zee
Date: Tue, 18 May 2021 12:08:54 +0200
Subject: [PATCH 4/5] Update CHANGELOG.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>
---
 CHANGELOG.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 346f0c2bd3..cf1d9de345 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -36,7 +36,7 @@ Possibly breaking changes:
  - When calling `init` the 'intermediates' collection is no longer mutable.
    Therefore, intermediates will no longer be returned from initialization by default.
  - Don't update batch statistics during initialization.
- - Attention: require deterministic only if using dropout
+ - When not using any non-determinism (e.g., dropout), it is no longer necessary to specify the `deterministic` argument in `MultiHeadDotProductAttention`.
 
 Other changes:

From 3f2775ee6ccbeb238981431c7400fc537033a601 Mon Sep 17 00:00:00 2001
From: Marc van Zee
Date: Tue, 18 May 2021 12:09:41 +0200
Subject: [PATCH 5/5] Update CHANGELOG.md

---
 CHANGELOG.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index cf1d9de345..7b2151d164 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -40,7 +40,7 @@ Possibly breaking changes:
 
 Other changes:
 
- - Rewrote various examples to use Optax instead of Flax optimizers (e.g., Imagenet, SST2)
+ - Rewrote various examples to use Optax instead of Flax optimizers (e.g., Imagenet, SST2).
  - Added an NLP text classification example (on the SST-2 dataset) to
    [`examples/sst2`](https://github.com/google/flax/tree/master/examples/sst2)
    that uses a bidirectional LSTM (BiLSTM) to encode the input text.
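Regarding the attention change in patch 4/5: a rough sketch of the behavior the changelog describes, assuming the 0.3.4 call signature of `MultiHeadDotProductAttention`; shapes and module settings are illustrative:

```python
# Sketch of the MultiHeadDotProductAttention change from patch 4/5.
import jax
import jax.numpy as jnp
from flax import linen as nn

x = jnp.ones((1, 8, 16))   # (batch, length, features); illustrative shape
rng = jax.random.PRNGKey(0)

# Without dropout, `deterministic` can now simply be omitted:
attn = nn.MultiHeadDotProductAttention(num_heads=4)
variables = attn.init(rng, x, x)  # inputs_q, inputs_kv

# With dropout enabled, `deterministic` is still required (or a
# 'dropout' RNG must be supplied for the non-deterministic path):
attn_drop = nn.MultiHeadDotProductAttention(num_heads=4, dropout_rate=0.1)
variables = attn_drop.init(rng, x, x, deterministic=True)
```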