Is your feature request related to a problem? Please describe.
When there is a problem in the backend, the connection to the server is lost, but the service (WebUI) keeps running.
The pod therefore stays in the Running state, and there is no way to detect that something is wrong and that it needs at least a restart.
Example:
Due to an internal timeout error:
raise WorkflowTimeoutError(msg)
llama_index.core.workflow.errors.WorkflowTimeoutError: Operation timed out after 240 seconds
an exception was thrown in the ASGI application, which in turn raised KeyError: 'Session is disconnected'.
Because of that, the server was disconnected, the application could no longer be used, and a manual restart was needed to get it operating again.
Describe the solution you'd like
It would be good to have an internal process, or some other way, of checking the health of the server and whether it is still connected.
I have tried the already implemented health check via an API call, but it returns 200 because the WebUI is working, even though the server is not.
The idea would be either to implement an additional endpoint that returns the status of the server, or some way for the app to check the server itself, e.g. an async process running in parallel to the app; a rough sketch of that idea follows below.
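What follows is a minimal, hypothetical sketch of the async-watchdog variant, not an existing Chainlit feature. Everything here is an assumption: `backend_is_healthy` stands in for whatever proves the backend is actually usable (e.g. a cheap llama_index call or a database ping), and the marker-file path is arbitrary.

```python
import asyncio
import logging
import os

logger = logging.getLogger("backend-watchdog")

HEALTH_FILE = "/tmp/backend-healthy"  # hypothetical marker a liveness probe could test

async def backend_is_healthy() -> bool:
    """Placeholder: replace with a real probe of the backend,
    e.g. a trivial workflow/LLM call or a database ping."""
    return True

async def watchdog(interval: float = 30.0) -> None:
    """Runs in parallel to the app: touches a marker file while the
    backend responds and removes it once it stops responding, so an
    external liveness check (e.g. a Kubernetes exec probe testing the
    file's freshness) can restart the pod."""
    while True:
        try:
            healthy = await asyncio.wait_for(backend_is_healthy(), timeout=10)
        except Exception:
            logger.exception("Health probe failed")
            healthy = False
        if healthy:
            with open(HEALTH_FILE, "w"):
                pass  # touch the marker
        else:
            try:
                os.remove(HEALTH_FILE)
            except FileNotFoundError:
                pass
        await asyncio.sleep(interval)

# How the task gets started depends on the deployment; once the app's
# event loop is running, asyncio.create_task(watchdog()) is enough.
```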
Additional context
This would be necessary for any deployment using Chainlit to avoid downtime.
Currently, there isn't specific information available about a built-in mechanism in Chainlit for checking the backend server's health independently from the WebUI. However, implementing a custom health check endpoint or an asynchronous process to monitor the server's status could be a viable solution.
You could consider creating an additional endpoint that performs specific checks on the backend server's functionality, such as verifying database connections or other critical services, and returns a status code based on these checks. Alternatively, implementing an asynchronous task within the application that periodically checks the server's health and logs or alerts when issues are detected could help in identifying problems early.
These approaches would help detect backend issues even while the WebUI is still operational, allowing more proactive management of server health and reducing downtime; a sketch of the endpoint variant follows below.
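For the endpoint variant, a minimal sketch could look like the following. It assumes Chainlit's underlying FastAPI app is importable as `chainlit.server.app` (verify this against your Chainlit version), and `check_backend` is a hypothetical placeholder for the actual dependency check:

```python
from fastapi import Response

from chainlit.server import app  # assumption: the FastAPI app Chainlit serves

async def check_backend() -> bool:
    """Hypothetical deep check: replace with whatever proves the
    backend is usable (a cheap LLM call, a DB ping, ...)."""
    return True

@app.get("/healthz/deep")
async def deep_health() -> Response:
    # Return 200 only when the backend itself responds,
    # not merely when the WebUI process is alive.
    return Response(status_code=200 if await check_backend() else 503)
```

A liveness probe pointed at such an endpoint, instead of the default WebUI route, would then restart the pod when the backend is gone while the WebUI still answers.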
I am guessing you are managing multiple instances of your Chainlit app? In the latest release, I fixed some issues with sticky sessions and auto-scaling. This documentation might also help: https://docs.chainlit.io/deploy/overview#tips-and-tricks