Can't get loss from a jitted function #1735
Replies: 6 comments 3 replies
-
If you want to output the loss, you can return it from the jitted function and output it afterwards. This is done in all our examples; I would recommend taking a look at the MNIST example and the Annotated MNIST, which contains more explanation.
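For reference, a minimal sketch of that pattern (the `TinyModel`, data, and MSE loss are made up for illustration, not taken from the question): the jitted train step returns the loss together with the updated state, and the caller prints it outside the jitted call.

```python
import jax
import jax.numpy as jnp
import optax
from flax import linen as nn
from flax.training import train_state

# Hypothetical tiny model, just to keep the sketch self-contained.
class TinyModel(nn.Module):
    @nn.compact
    def __call__(self, x):
        return nn.Dense(1)(x)

@jax.jit
def train_step(state, x, y):
    def loss_fn(params):
        preds = state.apply_fn({"params": params}, x)
        return jnp.mean((preds - y) ** 2)

    # Compute loss and grads inside the jitted function ...
    loss, grads = jax.value_and_grad(loss_fn)(state.params)
    # ... and return the loss alongside the updated state so the caller can log it.
    return state.apply_gradients(grads=grads), loss

rng = jax.random.PRNGKey(0)
x = jnp.ones((4, 3))
y = jnp.zeros((4, 1))
state = train_state.TrainState.create(
    apply_fn=TinyModel().apply,
    params=TinyModel().init(rng, x)["params"],
    tx=optax.sgd(0.01),
)
state, loss = train_step(state, x, y)
print(float(loss))  # concrete value, printable outside the jitted call
```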
-
I looked over the Flax examples, but with my code:

```python
def loss_fn(params):
    data = dataset[0]
    #data = dataset[0][0]
    rec, out = Autoencoder().apply({"params": params}, data)
    #globals()["p"] = rec, out
    loss = compute_metrics(rec, out)["loss"]
    return jnp.array(loss)

grads = jax.grad(loss_fn)(state.params)
```

I never actually return the loss, so I don't have a way to get the loss itself. Am I using the wrong JAX/Flax functions?
-
If you have taken a look at the examples, then you must have noticed that they don't use `jax.grad` but `jax.value_and_grad`, which returns the loss together with the gradients.
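For example, with a toy loss (illustrative only, not the autoencoder from the question):

```python
import jax
import jax.numpy as jnp

def loss_fn(params):
    return jnp.sum(params ** 2)

params = jnp.arange(3.0)

grads = jax.grad(loss_fn)(params)                   # gradients only
loss, grads = jax.value_and_grad(loss_fn)(params)   # loss and gradients
print(loss, grads)  # 5.0 [0. 2. 4.]
```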
-
Thanks, that seems to get the loss and the grad. However, this code

```python
state = train_state.TrainState.create(
    apply_fn=Autoencoder().apply,
    params=Autoencoder().init(rng, dataset[0])["params"],
    tx=optax.sgd(0.01)
)
```

which was in the main code (not in a function), was causing an OOM, so I wrapped that in a jitted `main`. However, now I can't print out the loss because I'm still inside jitted code. I tried creating the state in a jitted function and passing the value to a non-jitted function, but that still causes OOMs. Am I going about this the wrong way, or is there just no way to do this?
-
I tried moving various things in and out of jit, but it seems the actual OOM comes from when this code isn't jitted: `loss, grads = jax.value_and_grad(loss_fn)(state.params)`. Because of this, I can't get the actual value of the loss (because it needs to be jitted).
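One way to see the distinction, sketched with a toy loss (illustrative only, not the thread's code): the `value_and_grad` call itself can be wrapped in `jax.jit`, and the loss it returns is still a concrete, printable value on the outside.

```python
import jax
import jax.numpy as jnp

def loss_fn(params):
    return jnp.sum(params ** 2)

# The compiled function computes loss and grads together,
# but what it returns are ordinary concrete arrays.
value_and_grad_fn = jax.jit(jax.value_and_grad(loss_fn))

loss, grads = value_and_grad_fn(jnp.arange(3.0))
print(float(loss), grads)  # 5.0 [0. 2. 4.]
```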
-
Like this (this isn't all the code, just the relevant part)?

```python
@jit
def update_model(state, grads):
    #loss, grads = jax.value_and_grad(loss_fn)(state.params)
    return state.apply_gradients(grads=grads)

@jit
def train_step(state):
    def loss_fn(params):
        data = dataset[0]
        rec, out = Autoencoder().apply({"params": params}, data)
        loss = compute_metrics(rec, out)["loss"]
        return jnp.array(loss)
    loss, grads = jax.value_and_grad(loss_fn)(state.params)
    return loss, update_model(state, grads)

optimizer = optim.sgd.GradientDescent(learning_rate=0.01)
EPOCHS = 10

@jit
def create_state():
    return train_state.TrainState.create(
        apply_fn=Autoencoder().apply,
        params=Autoencoder().init(rng, dataset[0])["params"],
        tx=optax.sgd(0.01)
    )

def main():
    state = create_state()
    for epoch in range(EPOCHS):
        bar = tqdm(range(len(dataset)))
        for i in bar:
            loss, state = train_step(state)
            print(loss)
            bar.set_description(f"Epoch {epoch}")

main()
```

I jitted as much as possible, but the OOM is still coming from the …
-
I jitted all the big code blocks in my code and it runs fine. However, when I do this, I also want to track the loss after each batch. The issue is that with jit, I can't coerce the Tracers to floats, so I can't print them. Is there a way I can do this?
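A sketch of the distinction involved here (toy loss, purely illustrative): values are only Tracers while the jitted function is being traced; whatever the function returns comes back as a concrete array and can be coerced to a float.

```python
import jax
import jax.numpy as jnp

@jax.jit
def train_step(x):
    loss = jnp.mean(x ** 2)
    # Inside the traced function, `loss` is a Tracer: float(loss) here would
    # raise a concretization error, and print(loss) only shows the abstract value.
    return loss

loss = train_step(jnp.arange(4.0))
# Outside the jitted call, the result is a concrete array again.
print(float(loss))  # 3.5
```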