-
I've been experimenting with JAX via Flax Linen and Objax. I have a Flax Linen EfficientNet model definition working now, and the next step was to run basic validation on the model. I got stuck figuring out a reasonable approach for loading the weights. There are examples for initializing the model, training, etc., and I feel I have a handle on the general concepts around params, pytrees, etc. But I'm not seeing a quick and easy path to load pretrained weights into a variables collection's params/batch_stats just for running inference. I've got the model weights in a numpy dict; it's flat like a PyTorch state dict ('module.module.conv.weight'). I can separate batch_stats and params easily enough. Are there any helpers that would map an existing variables struct into an assignable form?
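For concreteness, the flat dict looks roughly like this (layer names are illustrative), and splitting out the batch stats is just a filter on key names:

```python
import numpy as np

# Roughly the shape of what I have (layer names are illustrative):
weights = {
    'module.stem.conv.weight': np.zeros((3, 3, 3, 32), np.float32),
    'module.stem.bn.weight': np.ones((32,), np.float32),
    'module.stem.bn.running_mean': np.zeros((32,), np.float32),
    'module.stem.bn.running_var': np.ones((32,), np.float32),
}

# Separating the BatchNorm statistics from the trainable params
# is just a filter on the key names.
is_stat = lambda k: k.endswith(('running_mean', 'running_var'))
batch_stats = {k: v for k, v in weights.items() if is_stat(k)}
params = {k: v for k, v in weights.items() if not is_stat(k)}
```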
-
I think the easiest way to do what you describe is by using flax.traverse_util.unflatten_dict. You pass it the flattened dictionary structure, so you can just write a small conversion function. Do you think that works? We don't have a utility for traversing keys in creation order, and I'm also not sure that would be flexible enough: if one of our internal layers created, say, a bias before a kernel (or the other way around), then that approach would stop working, I guess.
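A minimal sketch of what I mean (untested; it assumes the flat keys use '.' as a separator, that batch_stats have already been split out, and that the dotted names already match the Flax module/param names):

```python
from flax import traverse_util
from flax.core import freeze

def to_variables(flat_params, flat_batch_stats):
    # unflatten_dict expects tuple keys, so split the dotted names first.
    def nest(flat):
        return traverse_util.unflatten_dict(
            {tuple(k.split('.')): v for k, v in flat.items()})
    # Assemble the two collections into a variables-style structure.
    return freeze({'params': nest(flat_params),
                   'batch_stats': nest(flat_batch_stats)})
```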
-
@jheek thanks, the flatten/unflatten utils were helpful. I wrote my own that essentially did the same thing, but there's no sense in keeping that... they do actually traverse in the correct order. My current hack is roughly the following (a sketch; `model` is the EfficientNet definition and `pretrained` is the flat numpy dict, which I'm assuming enumerates in matching creation order):
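```python
import jax
import jax.numpy as jnp
from flax import traverse_util
from flax.core import freeze, unfreeze

# Initialize to get the target structure, then overwrite the leaves
# with the pretrained values, relying on matching traversal order.
variables = model.init(jax.random.PRNGKey(0), jnp.ones((1, 224, 224, 3)))
flat_target = traverse_util.flatten_dict(unfreeze(variables))
assert len(flat_target) == len(pretrained)
flat_loaded = {k: jnp.asarray(v)
               for k, v in zip(flat_target, pretrained.values())}
# (Real code also has to fix up layouts, e.g. transpose PyTorch OIHW
# conv weights to Flax HWIO, before assigning.)
variables = freeze(traverse_util.unflatten_dict(flat_loaded))
```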