Preferred pattern for freezing layers/params #1176
Hello everyone! How would I go about freezing some of my layers/params (for transfer learning)? Let's say in the following example from the docs, how would I freeze only the kernel weight?

```python
from typing import Callable

import flax.linen as nn
from jax import lax


class SimpleDense(nn.Module):
  features: int
  kernel_init: Callable = nn.initializers.lecun_normal()
  bias_init: Callable = nn.initializers.zeros

  @nn.compact
  def __call__(self, inputs):
    kernel = self.param('kernel',
                        self.kernel_init,
                        (inputs.shape[-1], self.features))
    y = lax.dot_general(inputs, kernel,
                        (((inputs.ndim - 1,), (0,)), ((), ())),)
    bias = self.param('bias', self.bias_init, (self.features,))
    y = y + bias
    return y
```

And then in this simple network, how would I freeze only the first layer?

```python
class SimpleMLP(nn.Module):
  features1: int
  features2: int

  @nn.compact
  def __call__(self, x):
    x = nn.Dense(self.features1, name='dense1')(x)
    x = nn.Dense(self.features2, name='dense2')(x)
    return x
```

Thanks!
Answered by marcvanzee, Mar 23, 2021
You can do that using a [MultiOptimizer](https://flax.readthedocs.io/en/latest/flax.optim.html#flax.optim.MultiOptimizer). Our documentation contains some examples, please let me know if it is still unclear!
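For reference, here is a minimal sketch of what that can look like with the `flax.optim` API linked above (note that `flax.optim` has since been deprecated in favor of `optax`). The learning rate, input shape, PRNG key, and the choice to freeze `dense1` are illustrative assumptions, not part of the original answer; the key point is that any parameter not matched by a `MultiOptimizer` traversal simply never gets updated:

```python
import jax
import jax.numpy as jnp
from flax import optim  # deprecated in newer Flax releases

# SimpleMLP as defined in the question above.
model = SimpleMLP(features1=16, features2=4)
params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 8)))['params']

# Select only the parameters we want to train. Parameters not matched by any
# traversal (here: everything under 'dense1') are left untouched, i.e. frozen.
trainable = optim.ModelParamTraversal(lambda path, _: 'dense1' not in path)

optimizer_def = optim.MultiOptimizer((trainable, optim.Adam(learning_rate=1e-3)))
optimizer = optimizer_def.create(params)

# A toy loss just to illustrate the update step.
def loss_fn(p):
  y = model.apply({'params': p}, jnp.ones((1, 8)))
  return jnp.mean(y ** 2)

# apply_gradient only updates the 'dense2' parameters; 'dense1' keeps its
# initial (e.g. pretrained) values.
grads = jax.grad(loss_fn)(optimizer.target)
optimizer = optimizer.apply_gradient(grads)
```

The same idea covers the `SimpleDense` question: a traversal such as `lambda path, _: 'kernel' not in path` would train only the bias and leave the kernel frozen.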