Pjit of train step using Flax' train_state object #1792
mattiasmar asked this question in Q&A · Unanswered
Replies: 0 comments
Background: Flax's train_state object holds the model params and optimizer state whose parallelization is controlled by pjit, and it has already been used successfully by the pjit'ed optimizer.init (as reviewed in discussion #1789). The last cell of this colab (the cell that begins with "# Pjit of training step") demonstrates a failing attempt to reuse those very same PartitionSpecs for the in_axis_resources and out_axis_resources of the pjit'ed train_epoch method of the Flax MNIST example.

Could you tell me how to pjit the train_epoch method of the Flax MNIST example without breaking the train_state object? (My end goal is a generic way of pjit'ing models in Flax.)

Current error:
IndexError: Array slice indices must have static start/stop/step to be used with NumPy indexing syntax. Found slice(None, Traced<ShapedArray(int32[])>with<DynamicJaxprTrace(level=1/0)>, None). To index a statically sized array at a dynamic position, try lax.dynamic_slice/dynamic_update_slice (JAX does not support dynamically sized arrays within JIT compiled functions).
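For what it's worth, the IndexError above is not specific to pjit: it is raised whenever an array is sliced with a traced (dynamic) index inside a jit-compiled function, which is what happens when train_epoch selects batches with indices computed from a permutation inside the compiled loop. A minimal reproduction and the lax.dynamic_slice workaround the message suggests (toy shapes, not the colab's actual data):

```python
import jax
import jax.numpy as jnp

data = jnp.arange(12.0).reshape(6, 2)  # toy "dataset": 6 examples, 2 features

@jax.jit
def take_batch(data, start):
    # data[start:start + 2] would raise the IndexError from the question,
    # because `start` is a tracer inside the jit'ed function.
    # lax.dynamic_slice accepts a dynamic start as long as the slice
    # *size* is static:
    return jax.lax.dynamic_slice(data, (start, 0), (2, 2))

batch = take_batch(data, 2)
print(batch)  # rows 2 and 3 of `data`
```

The key constraint is that the slice sizes `(2, 2)` are compile-time constants; only the start indices may be traced values.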
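One commonly suggested pattern for this situation is to pjit a single train_step and keep the Python-level batch loop of train_epoch outside the compiled function, so no dynamic slicing happens under jit. The sketch below is an illustrative assumption, not the colab's code: the toy loss, the one-axis mesh, and the PartitionSpecs are all made up, and it uses today's API, where pjit has been merged into jax.jit (in_shardings/out_shardings), rather than the jax.experimental.pjit in_axis_resources/out_axis_resources API from the time of this question:

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# One mesh axis, 'data', spanning all available devices.
mesh = Mesh(np.array(jax.devices()), ('data',))
replicated = NamedSharding(mesh, P())                 # params on every device
data_parallel = NamedSharding(mesh, P('data', None))  # batch split along axis 0

def train_step(params, batch):
    # Toy SGD step on a stand-in quadratic loss; the real example would
    # thread a Flax train_state through here instead of bare params.
    grads = jax.grad(lambda p: jnp.mean((batch @ p) ** 2))(params)
    return params - 0.1 * grads

p_train_step = jax.jit(
    train_step,
    in_shardings=(replicated, data_parallel),
    out_shardings=replicated,
)

params = jnp.ones((4,))
batch = jnp.ones((8, 4))
new_params = p_train_step(params, batch)
print(new_params.shape)  # (4,)
```

With this split, the epoch loop (shuffling, batching) stays in plain Python, and only the shape-static per-batch computation is compiled and sharded.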