Create rule S6986: "optimizer.zero_grad()" should be used in conjunction with "optimizer.step()" and "loss.backward()"
Showing 2 changed files with 73 additions and 27 deletions.

This rule raises an issue when the PyTorch `optimizer.step()` and `loss.backward()` methods are used without `optimizer.zero_grad()`.

== Why is this an issue?

In PyTorch, the training loop of a neural network consists of several steps:

* Forward pass, to pass the data through the model and output predictions
* Loss computation, to compute the loss based on the predictions and the actual data
* Backward pass, to compute the gradients of the loss with the `loss.backward()` method
* Weights update, to update the model weights with the `optimizer.step()` method
* Gradient zeroing, to prevent the gradients from accumulating, with the `optimizer.zero_grad()` method

When training a model, it is important to reset the gradients at each iteration of the training loop. Failing to do so skews the results, as the model's parameters are updated with the gradients accumulated from previous iterations.
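
For illustration, here is a minimal, self-contained sketch (not part of the rule's code examples; the tensor and its values are arbitrary) of how PyTorch accumulates gradients in the `.grad` attribute across successive `backward()` calls until they are explicitly zeroed:

[source,python]
----
import torch

x = torch.ones(3, requires_grad=True)

loss = (2 * x).sum()
loss.backward()
print(x.grad)  # tensor([2., 2., 2.])

loss = (2 * x).sum()
loss.backward()
print(x.grad)  # tensor([4., 4., 4.]) -- the new gradients were added to the previous ones

x.grad.zero_()  # reset the accumulated gradients, as optimizer.zero_grad() does for the parameters it manages
loss = (2 * x).sum()
loss.backward()
print(x.grad)  # tensor([2., 2., 2.]) -- the expected gradients after zeroing
----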

== How to fix it

To fix this issue, call the `optimizer.zero_grad()` method in the training loop.

=== Code examples

==== Noncompliant code example

[source,python,diff-id=1,diff-type=noncompliant]
----
import torch
from my_data import data, labels

loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):
    for i in range(len(data)):
        output = model(data[i])
        loss = loss_fn(output, labels[i])
        loss.backward()
        optimizer.step()  # Noncompliant: optimizer.zero_grad() was not called in the training loop
----

==== Compliant solution

[source,python,diff-id=1,diff-type=compliant]
----
import torch
from my_data import data, labels

loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):
    for i in range(len(data)):
        optimizer.zero_grad()
        output = model(data[i])
        loss = loss_fn(output, labels[i])
        loss.backward()
        optimizer.step()  # Compliant
----

== Resources
=== Documentation

* PyTorch Documentation - https://pytorch.org/tutorials/beginner/introyt/trainingyt.html#the-training-loop[The Training Loop]
* PyTorch Documentation - https://pytorch.org/tutorials/recipes/recipes/zeroing_out_gradients.html#zeroing-out-gradients-in-pytorch[Zeroing out gradients in PyTorch]
* PyTorch Documentation - https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.zero_grad.html#torch-optim-optimizer-zero-grad[torch.optim.Optimizer.zero_grad - reference]
* PyTorch Documentation - https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.step.html#torch-optim-optimizer-step[torch.optim.Optimizer.step - reference]
* PyTorch Documentation - https://pytorch.org/docs/stable/generated/torch.Tensor.backward.html#torch-tensor-backward[torch.Tensor.backward - reference]

ifdef::env-github,rspecator-view[]

(visible only on this page)

== Implementation specification

The issue shall be raised only when both `optimizer.step()` and `loss.backward()` are called inside a loop.
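
As an illustration of that condition, here is a hedged sketch of the two loop shapes it is meant to separate (the model, data, and variable names below are invented for the example and are not part of the specification):

[source,python]
----
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()
data = [torch.randn(4) for _ in range(3)]
labels = [torch.randn(2) for _ in range(3)]

# Expected to raise: loss.backward() and optimizer.step() are both called in the
# loop, while optimizer.zero_grad() never is.
for i in range(len(data)):
    loss = loss_fn(model(data[i]), labels[i])
    loss.backward()
    optimizer.step()

# Not expected to raise: optimizer.zero_grad() is called in the same loop.
for i in range(len(data)):
    optimizer.zero_grad()
    loss = loss_fn(model(data[i]), labels[i])
    loss.backward()
    optimizer.step()
----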

=== Message

Primary: Call the {optimizer name}.zero_grad() method

=== Issue location

Primary: The {optimizer name}.step() method

=== Quickfix

No

endif::env-github,rspecator-view[]