Improve torch_xla.compile documentation #9194


Open

sdasgup3 wants to merge 1 commit into master from gh-8859
Conversation

sdasgup3 (Collaborator):

fixes #8859

@sdasgup3 sdasgup3 force-pushed the gh-8859 branch 2 times, most recently from 69266ed to 890336c Compare May 19, 2025 19:08
@qihqi qihqi requested a review from mikegre-google May 20, 2025 18:12

## Basic Usage
- **Recompilation Overhead**: Non-core operations (e.g., data preprocessing)
could leak into the graph, triggering expensive recompilations.
Collaborator:

... can leak ....

- `torch_xla.compile` is optimized for PyTorch/XLA training workflows. Designed
to work efficiently with the XLA backend for iterative training, it's the
recommended API for compiling training loops due to its observed performance
advantages. Best practice dictates enclosing the complete training step—forward
Collaborator:

The best practice is to enclose the complete training step-forward pass, ...

Collaborator (Author):

PTAL.
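
For reference, here is a minimal sketch of the recommended pattern: the complete training step (forward pass, loss computation, backward pass, and optimizer update) is enclosed in `torch_xla.compile`. `MyModel` and `loader` are placeholder names, and the calls assume a recent PyTorch/XLA release.

```python
import torch
import torch_xla

device = torch_xla.device()
model = MyModel().to(device)                     # hypothetical model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

def train_step(batch, labels):
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(batch), labels)
    loss.backward()
    optimizer.step()
    return loss

# Compile the whole step so the forward pass, backward pass, and optimizer
# update are captured and optimized as a single XLA graph.
train_step = torch_xla.compile(train_step)

for batch, labels in loader:                     # hypothetical data loader
    loss = train_step(batch.to(device), labels.to(device))
```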


The benchmark results unequivocally demonstrate that eager mode combined with
`torch_xla.compile` achieves performance parity with the traditional LazyTensor
tracing mode, both yielding `147` tokens/s. This empirically validates the claim
Collaborator:

"... This shows that the new API provides a better user experience without a performance penalty. " I'm trying to keep sentence length down to ease readibility. I don't think the last phrase ("no-regret ...") is needed.

Collaborator (Author):

I did some wordsmithing. PTAL.


## Basic Usage
- **Recompilation Overhead**: Non-core operations (e.g., data preprocessing) can
Collaborator:

I think it could be clearer to explain that these non-core operations are all recorded. Maybe this phrasing:

- **Recompilation Overhead**: Whenever any part of the captured graph changes, `torch_xla.sync()` will
recompile the whole graph. Changes in non-core operations (e.g., data preprocessing) thus trigger expensive
recompilations.
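
To make the suggested phrasing concrete, here is a rough sketch of how a shape-changing preprocessing step ends up in the captured graph under lazy tracing; `loader`, `model`, `optimizer`, and `pad_to_batch_max` are hypothetical names.

```python
import torch
import torch_xla

device = torch_xla.device()

for batch, labels in loader:                     # hypothetical data loader
    batch = batch.to(device)
    # Hypothetical preprocessing whose output shape differs per batch,
    # e.g. padding to this batch's longest sequence; under lazy tracing it
    # is recorded into the same graph as the training computation.
    batch = pad_to_batch_max(batch)              # hypothetical helper
    loss = torch.nn.functional.cross_entropy(model(batch), labels.to(device))
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    # Because part of the captured graph changed, torch_xla.sync()
    # recompiles the whole graph instead of reusing the previous one.
    torch_xla.sync()
```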


To address these issues, PyTorch/XLA introduces an experimental eager mode
(enabled via `torch_xla.experimental.eager_mode(True)`) and the
`torch_xla.compile` API. **This shift aligns PyTorch/XLA more closely with
Collaborator:

Did you mean to emphasize this sentence? Seems a bit jarring to me but just a minor opinion.

performance**. Eager mode is likely to become the default in future releases.
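
As an illustration of that combination, here is a sketch (placeholder `MyModel`, `preprocess`, and `loader`; assumes a recent PyTorch/XLA release) in which eager mode is enabled globally and only the core computation is handed to `torch_xla.compile`:

```python
import torch
import torch_xla

# Operations outside compiled functions now execute immediately on the
# XLA device instead of being traced into a lazy graph.
torch_xla.experimental.eager_mode(True)

device = torch_xla.device()
model = MyModel().to(device)                     # hypothetical model

def forward(batch):
    return model(batch)

# Only this function is traced and compiled by XLA.
forward = torch_xla.compile(forward)

for batch in loader:                             # hypothetical data loader
    batch = preprocess(batch).to(device)         # hypothetical; runs eagerly
    logits = forward(batch)
```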

- **Eager Mode**: Executes operations immediately, enhancing flexibility and
debugging but at a performance cost for core tasks.
Collaborator:

I think "core tasks" is vague. Could simply remove: "[...] but at a performance cost."

This varying overhead means pure eager mode is not intended for main training or
inference loops. Its utility lies in non-core tasks like data preprocessing,
random number generation, custom utilities, or debugging, where immediate
execution is prioritized over throughput.
Collaborator:

We also need to call out that torch_xla.compile is independently useful, even when not in eager mode. That's why torchprime does this: https://github.com/AI-Hypercomputer/torchprime/blob/31c450e82c6273f50f9815351f6fbebb42903f58/torchprime/torch_xla_models/train.py#L330

Wrapping the training loop in torch_xla.compile provides a few benefits:

  • There's no mark_step anywhere
  • Dataloading operations don't leak into the training loop graph. This benefit is similar to if eager mode was turned on. The only difference is that the dataloading operations are captured into a separate graph as opposed to running eagerly.
  • torch_xla.compile(full_graph=True) will catch accidental graph breaks
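
A rough sketch of that pattern (placeholder names, not taken from the linked file): the training step is wrapped once with `full_graph=True`, dataloading stays outside the compiled function, and no `mark_step`/`sync` call appears in the loop.

```python
import torch
import torch_xla

device = torch_xla.device()
model = MyModel().to(device)                     # hypothetical model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

def train_step(batch, labels):
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(batch), labels)
    loss.backward()
    optimizer.step()
    return loss

# full_graph=True makes torch_xla.compile report an error if the step
# cannot be captured as a single graph, surfacing accidental graph breaks.
train_step = torch_xla.compile(train_step, full_graph=True)

for batch, labels in loader:                     # hypothetical data loader
    # Host-side dataloading is not part of the training-step graph, and no
    # explicit mark_step/sync is needed inside the loop.
    loss = train_step(batch.to(device), labels.to(device))
```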

Collaborator:

Since torch_xla.compile isn't really tied to eager mode, I think it could be clearer to rename the document heading as such. Alternatively, we could move the contents about torch_xla.compile to a separate compile.md, and then talk about its interaction with eager mode in this markdown.
