### Describe the improvement or the new tutorial

It was initially challenging for me to grasp why `requires_grad` is set *after* `weights` is created, but *in the same line* as `bias`, under https://docs.pytorch.org/tutorials/beginner/nn_tutorial.html#neural-net-from-scratch-without-torch-nn

At first glance, the code looks inconsistent:

1. `weights` initialization is split across two lines.
2. `bias` initialization is done in one line.

**The Logic Gap**

The tutorial currently explains *that* we do this, but not exactly *why* the distinction exists between these two specific variables.

* **The Bias** is created using a factory function (`torch.zeros`) with no subsequent mathematical operations. It is born as a "Leaf Node" (a source parameter).
* **The Weights** involve a mathematical operation (`/ math.sqrt(...)`). If we set `requires_grad=True` inside `torch.randn()`, PyTorch records the division as a computational step. The resulting `weights` variable becomes a **non-leaf node** (a calculated outcome), which cannot be updated as a parameter during training.

**Proposed Improvement**

I propose modifying the comment block to explicitly mention that `requires_grad` must be deferred until *after* the initialization math is complete, so that the tensor is preserved as a trainable parameter (a Leaf Node). A minimal sketch contrasting the two orderings is included at the end of this issue.

### Existing tutorials on this topic

* https://docs.pytorch.org/tutorials/beginner/nn_tutorial.html
* https://docs.pytorch.org/tutorials/beginner/nn_tutorial.html#neural-net-from-scratch-without-torch-nn

### Additional context

<img width="1375" height="560" alt="Image" src="https://github.com/user-attachments/assets/07bb6dfb-2b18-4665-97e8-85682ba6c2bd" />
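For reference, here is a minimal sketch contrasting the two orderings. It reuses the tutorial's MNIST shapes (784 × 10); the `w_bad` variable is hypothetical and added here only to show what happens when `requires_grad=True` is passed to the factory call before the scaling division:

```python
import math
import torch

# Tutorial's approach: build the tensor first, then flag it in place.
weights = torch.randn(784, 10) / math.sqrt(784)
weights.requires_grad_()           # in-place flag; weights remains a leaf node
print(weights.is_leaf)             # True -> backward() accumulates gradients in weights.grad

# Hypothetical alternative: flag inside the factory call, then divide.
w_bad = torch.randn(784, 10, requires_grad=True) / math.sqrt(784)
print(w_bad.is_leaf)               # False -> autograd records the division,
                                   # so w_bad is a computed result, not a source parameter,
                                   # and backward() does not populate w_bad.grad

# bias needs no follow-up math, so the flag can be passed directly.
bias = torch.zeros(10, requires_grad=True)
print(bias.is_leaf)                # True
```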