Jit'ing the grad function #1958

Answered by marcvanzee
srush asked this question in Q&A
Mar 3, 2022 · 1 comment · 1 reply
(I am not sure if I fully understand your question, so please clarify if my answer below is not what you were asking.)

Do you mean you are calling grad inside your Module's apply function? If you jit something inside another jit block, then the inner jit should be a no-op. We usually jit the entire train function, which calls the grad function, so the grad computation is compiled as part of that larger block. We actually prefer jitting bigger blocks because it gives XLA more opportunity to optimize, at the cost of longer compile time.
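
As a rough illustration of jitting the whole train function rather than the grad call alone, here is a minimal sketch in plain JAX. The names `apply_fn`, `loss_fn`, `train_step`, and `LEARNING_RATE` are hypothetical and not from the original thread; `apply_fn` only stands in for a Flax `model.apply(params, x)` call, and the update rule is plain SGD chosen for brevity.

```python
import jax
import jax.numpy as jnp

LEARNING_RATE = 0.1  # hypothetical value, for illustration only


def apply_fn(params, x):
    # Hypothetical stand-in for a Flax `model.apply(params, x)` call.
    return x @ params["w"] + params["b"]


def loss_fn(params, x, y):
    # Mean squared error between the model output and the targets.
    pred = apply_fn(params, x)
    return jnp.mean((pred - y) ** 2)


@jax.jit  # jit the whole step; the grad computation is traced inside it
def train_step(params, x, y):
    loss, grads = jax.value_and_grad(loss_fn)(params, x, y)
    # Plain SGD update applied to every leaf of the parameter pytree.
    new_params = jax.tree_util.tree_map(
        lambda p, g: p - LEARNING_RATE * g, params, grads
    )
    return new_params, loss


params = {"w": jnp.zeros((3, 1)), "b": jnp.zeros((1,))}
x = jnp.ones((4, 3))
y = jnp.ones((4, 1))

losses = []
for _ in range(5):
    # The training loop calls the already-jitted step; no jit call happens here.
    params, loss = train_step(params, x, y)
    losses.append(float(loss))
```

In this sketch the only `jax.jit` is on `train_step`, so the forward pass, the gradient, and the parameter update are all handed to XLA as one block.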

> Since we call .apply each time, I would need to run jit in the inner loop, which seems bad.

The jitted function will be compiled only once for each shape it is…
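
One way to observe the compile-once-per-shape behavior is to count how often JAX traces a jitted function. The sketch below is an assumption-laden illustration, not code from the thread: `grad_step`, `loss_fn`, and `trace_count` are hypothetical names, and the counter relies on the fact that Python side effects inside a jitted function run only during tracing.

```python
import jax
import jax.numpy as jnp

trace_count = 0  # hypothetical counter, incremented only while JAX traces


def loss_fn(w, x):
    return jnp.sum((x @ w) ** 2)


@jax.jit
def grad_step(w, x):
    global trace_count
    trace_count += 1  # Python side effect: executes at trace time, not on cached calls
    return jax.grad(loss_fn)(w, x)


w = jnp.ones((3,))

# Repeated calls with identical input shapes reuse the cached compilation.
for _ in range(3):
    grad_step(w, jnp.ones((4, 3)))
traces_same_shape = trace_count

# A new batch shape triggers one additional trace and compile.
grad_step(w, jnp.ones((8, 3)))
traces_after_new_shape = trace_count
```

Calling the jitted function repeatedly inside a loop therefore costs a cache lookup rather than a recompile, as long as the input shapes and dtypes stay the same.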

Answer selected by srush