state of the server #1591

Open
MarouaneZhani opened this issue Dec 17, 2024 · 3 comments
Labels
enhancement New feature or request needs-triage

Comments

@MarouaneZhani

Is your feature request related to a problem? Please describe.

When the backend fails, the connection to the server is lost, but the service (WebUI) keeps running.
The pod therefore stays in the Running state, so nothing detects that there is a problem and that the pod needs, at the very least, to be restarted.

Example:
Due to an internal timeout error:
raise WorkflowTimeoutError(msg)
llama_index.core.workflow.errors.WorkflowTimeoutError: Operation timed out after 240 seconds

an exception was thrown in the ASGI application, followed by KeyError: 'Session is disconnected'.
As a result, the server was disconnected, and the application could not be used again until it was manually restarted.

Describe the solution you'd like
It would be good to have an internal process, or some other way, of checking the health of the server and whether it is still connected.
I tried the existing health check via an API call, but it returns 200 because the WebUI is working, even though the server is not.
The idea would be either to implement an additional endpoint that returns the status of the server, or to check the server from within the app itself, for example with an async process running in parallel to the app.
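
To make the "async process running in parallel" idea concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: `BackendWatchdog`, its parameters, and the `probe` coroutine are hypothetical names, not part of Chainlit, and how to expose the result on a Chainlit route is left open.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional, Tuple


class BackendWatchdog:
    """Hypothetical helper: tracks whether a backend probe succeeded recently."""

    def __init__(
        self,
        probe: Callable[[], Awaitable[None]],
        interval: float = 10.0,   # seconds between probes
        timeout: float = 5.0,     # per-probe time limit
        max_age: float = 30.0,    # how long a success stays valid
    ) -> None:
        self._probe = probe
        self._interval = interval
        self._timeout = timeout
        self._max_age = max_age
        self._last_ok: Optional[float] = None
        self._last_error: Optional[str] = None

    async def check_once(self) -> bool:
        """Run the probe once, bounded by a timeout, and record the outcome."""
        try:
            await asyncio.wait_for(self._probe(), timeout=self._timeout)
        except Exception as exc:  # probe errors and timeouts both count as failures
            self._last_error = f"{type(exc).__name__}: {exc}"
            return False
        self._last_ok = time.monotonic()
        self._last_error = None
        return True

    async def run_forever(self) -> None:
        """Background loop: probe, sleep, repeat."""
        while True:
            await self.check_once()
            await asyncio.sleep(self._interval)

    def status(self) -> Tuple[int, dict]:
        """Return an HTTP-style status code and body for a health endpoint."""
        healthy = (
            self._last_ok is not None
            and time.monotonic() - self._last_ok <= self._max_age
        )
        if healthy:
            return 200, {"status": "ok"}
        return 503, {"status": "unhealthy", "error": self._last_error}
```

The loop would be started once at application startup, for example with `asyncio.create_task(watchdog.run_forever())`, and a health route would return `watchdog.status()`. Because health is based on the age of the last success, a watchdog loop that itself dies also shows up as 503 after `max_age` seconds, which a pod liveness probe could then act on.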

Additional context
This would be necessary for any deployment using Chainlit to avoid downtime.

@dosubot dosubot bot added the enhancement New feature or request label Dec 17, 2024
@MarouaneZhani
Author

@dosu what do you think?

dosubot bot commented Dec 17, 2024

Currently, there isn't specific information available about a built-in mechanism in Chainlit for checking the backend server's health independently from the WebUI. However, implementing a custom health check endpoint or an asynchronous process to monitor the server's status could be a viable solution.

You could consider creating an additional endpoint that performs specific checks on the backend server's functionality, such as verifying database connections or other critical services, and returns a status code based on these checks. Alternatively, implementing an asynchronous task within the application that periodically checks the server's health and logs or alerts when issues are detected could help in identifying problems early.

These approaches would help in detecting backend issues even when the WebUI is still operational, thus allowing for more proactive management of server health and reducing downtime.
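
As a rough illustration of the "additional endpoint that performs specific checks" suggestion, the sketch below aggregates several async checks into a single status code. It uses only the standard library; `run_checks` and the check names are hypothetical, and the checks themselves (database, LLM backend, and so on) are assumptions to be replaced by whatever the deployment actually depends on.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Tuple

# A check is any coroutine function that raises on failure (hypothetical convention).
Check = Callable[[], Awaitable[None]]


async def run_checks(
    checks: Dict[str, Check], timeout: float = 5.0
) -> Tuple[int, Dict[str, str]]:
    """Run all checks concurrently; return 200 only if every check passes."""

    async def run_one(check: Check) -> str:
        try:
            await asyncio.wait_for(check(), timeout=timeout)
            return "ok"
        except Exception as exc:  # errors and timeouts are reported per check
            return f"failed: {type(exc).__name__}"

    names = list(checks)
    results = await asyncio.gather(*(run_one(checks[name]) for name in names))
    report = dict(zip(names, results))
    status = 200 if all(result == "ok" for result in results) else 503
    return status, report
```

A health route could call `await run_checks({...})` and return the status code and report as its response, so an orchestrator sees 503 when a dependency is down even though the WebUI still answers.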


@willydouhard
Collaborator

I am guessing you are managing multiple instances of your chainlit app? In the latest release, I fixed some issues with sticky sessions and auto scaling. This documentation might also help https://docs.chainlit.io/deploy/overview#tips-and-tricks
