L1/L2 regularization of network weights with NNX #4160
Answered by cgarciae
pushkar5586 asked this question in Q&A
Hi all! Could you please share sample code or an example showing how to apply L1/L2 regularization on network weights using the NNX API? Many thanks!
Answered by cgarciae on Sep 5, 2024
Hey @pushkar5586, sorry for the delay. To apply global regularization you could use nnx.state to extract the Params and then follow the recipe from #1654. Here is the basic example from the landing page with L2 regularization added:

from flax import nnx
import jax
import optax
class Model(nnx.Module):
    def __init__(self, din, dmid, dout, rngs: nnx.Rngs):
        self.linear = nnx.Linear(din, dmid, rngs=rngs)
        self.bn = nnx.BatchNorm(dmid, rngs=rngs)
        self.dropout = nnx.Dropout(0.2, rngs=rngs)
        self.linear_out = nnx.Linear(dmid, dout, rngs=rngs)

    def __call__(self, x):
        x = nnx.relu(self.dropout(self.bn(self.linear(x))))
        return self.linear_out(x)

model = Model(2, 64, 3, rngs=nnx.Rngs(0))  # eager initialization
optimizer = nnx.Optimizer(model, optax.adam(1e-3))  # reference sharing
def l2_loss(x, alpha):
    return alpha * (x ** 2).sum()

@nnx.jit  # automatic state management
def train_step(model, optimizer, x, y):
    def loss_fn(model):
        y_pred = model(x)
        loss = ((y_pred - y) ** 2).mean()  # model loss
        loss += sum(  # L2 penalty over every Param leaf in the model
            l2_loss(w, alpha=0.001)
            for w in jax.tree_util.tree_leaves(nnx.state(model, nnx.Param))
        )
        return loss

    loss, grads = nnx.value_and_grad(loss_fn)(model)
    optimizer.update(grads)  # in-place updates
    return loss
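The example above only covers the L2 term; since the question also asks about L1, here is a minimal sketch of the same pattern with an absolute-value penalty (the l1_loss helper below is an assumption, not part of the original answer):

import jax
import jax.numpy as jnp
from flax import nnx

def l1_loss(x, alpha):
    # L1 penalty: alpha times the sum of absolute values
    return alpha * jnp.abs(x).sum()

def loss_fn(model, x, y):
    y_pred = model(x)
    loss = ((y_pred - y) ** 2).mean()  # model loss
    loss += sum(  # L1 penalty over every Param leaf, same recipe as above
        l1_loss(w, alpha=0.001)
        for w in jax.tree_util.tree_leaves(nnx.state(model, nnx.Param))
    )
    return loss

Note that nnx.state(model, nnx.Param) collects every Param, including biases and the BatchNorm scale/bias, so if you only want to penalize the kernel matrices you would need to filter the extracted state first.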
Answer selected by pushkar5586