Web Workflow Access Causes Program Pause And Board Freeze #9171
I see a similar situation: code pauses when a URL is fetched from the web workflow, to the point of completely freezing the board, losing USB and everything. This happens easily when the board's code is particularly busy. The freeze seems to be triggered by certain web workflow accesses, including the single scan done by the web workflow home page of another board and the recurring scan by the discotool manager (both retrieve some information from the board after detecting it over mDNS). When connected over USB, the frozen board does not respond to ctrl-C. The USB drive usually still works, but sometimes it errors out and unmounts after a little while without coming back. The web workflow might keep working while the code is frozen, and using the web workflow can apparently make the code run again for a bit. On a board with more complex code, including a NeoPixel strip, a DotStar strip and a web server, the freeze happens easily during normal web workflow use, which makes the board quite difficult to use. I usually don't see the board recover after it freezes. Repro on:
Repro: latest, 9.2.4, 9.0.0
Here is some simple code that helps visualize the board freezing:
import board
import time
import neopixel
status = neopixel.NeoPixel(board.NEOPIXEL, 1)
# Alternate the status pixel between two colors; a frozen board stops alternating.
while True:
    for color in [0x200020, 0x002020]:
        status.fill(color)
        time.sleep(0.25)
Here is some Python code that connects to the board's web workflow in a loop to force the freeze:
import requests, sys, time
from datetime import datetime as d
ADDRESS = "192.168.1.38"
# Optionally take the board's address from the command line.
if len(sys.argv) > 1: ADDRESS = sys.argv[1]
url = f"http://{ADDRESS}/cp/version.json"
was_ok = None
t0 = d.now()
try:
    while True:
        is_ok = True
        try:
            with requests.get(url, timeout=1) as response:
                is_ok &= True
        except (requests.exceptions.ReadTimeout, requests.exceptions.ConnectionError):
            is_ok &= False
        # Overwrite the status line in place (elapsed time plus ok/ERROR).
        print((f"{str(d.now()-t0)[:7]} " + ("ERROR","ok")[is_ok]).ljust(60), "\033[1G\033[1A")
        # Start a new line whenever the status changes, so transitions stay visible.
        if is_ok != was_ok:
            print()
        was_ok = is_ok
        time.sleep(0.1)
except KeyboardInterrupt:
    print()
On a QTPY S2, this usually triggers the issue after approximately 30 seconds.
Web workflow responses are currently blocking. So, if they take a while, then everything else will be starved. I think the easiest way to fix this will be switching to Zephyr (or another RTOS). That way the web workflow can run in a separate thread and yield as it waits for sockets.
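To make the starvation concrete, here is a minimal desktop-Python sketch of the difference between serving a slow response inline and serving it from a separate thread. It is an analogy only, not CircuitPython or Zephyr internals: `slow_response`, `run_blocking` and `run_threaded` are hypothetical names, and `time.sleep` stands in for waiting on a socket.

```python
import threading
import time


def slow_response(duration):
    """Hypothetical stand-in for a web workflow response that waits on a socket."""
    time.sleep(duration)


def run_blocking(duration):
    """Serve the response inline and return how often user code ran meanwhile."""
    ticks = 0
    slow_response(duration)  # user code cannot run until this returns
    return ticks


def run_threaded(duration, tick_interval=0.01):
    """Serve the response in a worker thread while user code keeps ticking."""
    ticks = 0
    worker = threading.Thread(target=slow_response, args=(duration,))
    worker.start()
    while worker.is_alive():
        ticks += 1  # e.g. update the NeoPixel here
        time.sleep(tick_interval)
    worker.join()
    return ticks


if __name__ == "__main__":
    print("ticks while blocking:", run_blocking(0.2))
    print("ticks while threaded:", run_threaded(0.2))
```

In the blocking case the user loop gets no iterations for the whole duration of the response, which is the situation described above. In the threaded case the loop keeps running because the worker gives up the processor while it waits.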
Is blocking new to CP9? On 9.x latest, the test starts failing within 30 seconds. In fact it's quite regular: the first error after a reset happens after 140 to 150 requests, regardless of the sleep duration in the test script (tested 10ms, 50ms, 100ms).
import board
import time
import neopixel
import adafruit_dotstar
pixel = neopixel.NeoPixel(board.NEOPIXEL, 90)
pidots = adafruit_dotstar.DotStar(board.SCK, board.MISO, 90)
while True:
    for color in [0x200020, 0x002020]:
        pixel.fill(color)
        pidots.fill(color)
        time.sleep(0.5)
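The "140 to 150 requests" observation could be checked with a small counter like the sketch below. It is an illustration under assumptions, not the script used in this thread: `fetch_version` and `requests_until_first_failure` are hypothetical helper names, and the standard-library `urllib` is used in place of `requests` so the sketch has no dependencies.

```python
import time
import urllib.request


def fetch_version(url, timeout=1.0):
    """Return True if the board answers, False on a timeout or connection error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
        return True
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return False


def requests_until_first_failure(fetch, delay=0.1, max_requests=1000):
    """Count successful fetch() calls before the first failure.

    Returns None if no failure occurs within max_requests calls.
    """
    for count in range(max_requests):
        if not fetch():
            return count
        time.sleep(delay)
    return None
```

Usage would look like `requests_until_first_failure(lambda: fetch_version("http://192.168.1.38/cp/version.json"), delay=0.05)`, run right after a board reset and repeated with different `delay` values to see whether the count stays in the same range.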
No, it isn't new to CP9. CP9 did upgrade to IDF 5 though. It was a big step and the "wake circuitpython up from socket activity" is complicated.
The easiest way to hunt this down may be a git bisect. It'll be time consuming but also enlightening.
CircuitPython version
Code/REPL
Behavior
Accessing the Welcome page of the web workflow can cause the executing program to pause. With the above code, the LED stops flashing. Clicking on the Full Code Editor link causes the program to resume.
Description
This issue does not seem to happen in 8.2.10.
Additional information
With the above code, if the pause happens and you wait more than a minute, then resuming by entering the Full Code Editor leads to an immediate watchdog exception.
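Rather than watching the LED, the length of a pause could be measured by recording a timestamp on every loop iteration and looking for gaps. The following is a minimal sketch under assumptions: `find_stalls` is a hypothetical helper, and the timestamps are assumed to come from something like `time.monotonic()` called once per iteration of the main loop.

```python
def find_stalls(timestamps, period, tolerance=0.5):
    """Return (start, gap) for each gap between consecutive timestamps
    that exceeds the expected loop period by more than the tolerance."""
    limit = period * (1 + tolerance)
    stalls = []
    for previous, current in zip(timestamps, timestamps[1:]):
        gap = current - previous
        if gap > limit:
            stalls.append((previous, gap))
    return stalls


if __name__ == "__main__":
    # Loop expected to run every 0.25 s, with one long pause after t = 0.5 s.
    print(find_stalls([0.0, 0.25, 0.5, 3.0, 3.25], period=0.25))
```

A gap longer than a minute in such a log would correspond to the case above where resuming leads to the watchdog exception.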