Enhancements for PGMQueue Python Library #216

Open
3 of 6 tasks
tavallaie opened this issue May 16, 2024 · 8 comments
Labels
enhancement New feature or request

Comments

@tavallaie
Contributor

tavallaie commented May 16, 2024

I am considering adding new functionalities to the PGMQueue library to enhance its capabilities. Below is a list of potential functions. I am seeking input from other developers on which functions should be prioritized, and any additional suggestions.

Potential New Functions

  • 1. Queue Status/Length:

    • get_queue_length: Get the current number of messages in the queue.
  • 2. Queue Monitoring:

    • get_queue_stats: Retrieve various statistics about the queue, such as the number of messages enqueued, processed, failed, etc.
  • 3. Message Management:

    • requeue_message: Requeue a specific message to process it again.
    • update_message: Update the content of a specific message in the queue.
  • 4. Batch Operations:

    • delete_batch: Delete a batch of messages by their IDs.
    • archive_batch: Archive a batch of messages by their IDs.
  • 5. Advanced Queue Operations:

    • pause_queue: Temporarily pause processing of messages in the queue. (Requires schema changes)
    • resume_queue: Resume processing of messages in a paused queue. (Requires schema changes)
    • delete_queue: Delete an entire queue and all its messages. (May require schema changes)
  • 6. Message Retrieval:

    • peek_message: Retrieve a message without removing it from the queue.
    • peek_batch: Retrieve a batch of messages without removing them from the queue.

How You Can Help

Please provide feedback on the following:

  1. Which functions do you think are most critical to add?
  2. Are there any additional functions you believe would be beneficial?
  3. Any other suggestions or comments?
@ChuckHend
Member

Great ideas, @tavallaie. Here are some thoughts:

For all of these features that already have a corresponding SQL function, I think it makes a lot of sense to just implement the function in the Python lib. pgmq.metrics() and pgmq.metrics_all() report queue length and some other stats about the queue. pgmq.delete() and pgmq.archive() both support batches, but this is not implemented in the Python lib yet.
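To make the "thin wrapper over the SQL function" idea concrete, here is a minimal sketch, not the library's actual code. It assumes a DB-API style cursor (for example one from psycopg) connected to a database with the pgmq extension installed. The Python names `get_queue_length` and `delete_batch` come from the proposal above and are hypothetical; only `pgmq.metrics()` and the array form of `pgmq.delete()` are taken from the SQL API.

```python
from typing import Any, List, Sequence


def get_queue_length(cur: Any, queue: str) -> int:
    """Return the queue_length column reported by pgmq.metrics().

    `cur` is assumed to be a DB-API cursor; this helper is a hypothetical
    wrapper, not an existing method of the Python library.
    """
    cur.execute("SELECT queue_length FROM pgmq.metrics(%s);", (queue,))
    row = cur.fetchone()
    if row is None:
        raise LookupError(f"no metrics returned for queue {queue!r}")
    return int(row[0])


def delete_batch(cur: Any, queue: str, msg_ids: Sequence[int]) -> List[int]:
    """Delete several messages with one call to the array form of pgmq.delete().

    Returns the IDs that the database reports as deleted.
    """
    cur.execute(
        "SELECT * FROM pgmq.delete(%s, %s::bigint[]);",
        (queue, list(msg_ids)),
    )
    return [int(r[0]) for r in cur.fetchall()]
```

An `archive_batch` wrapper would presumably look the same with `pgmq.archive` in place of `pgmq.delete`.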

I like the requeue idea, and we can probably implement that using pgmq.set_vt(). Updating an existing message would also be useful. A separate feature people have asked for is something like pgmq.delete_where(), which would delete all messages in a queue whose body contains some value.
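A requeue built on `pgmq.set_vt()` could look roughly like the sketch below. It assumes a DB-API style cursor and the `set_vt(queue_name, msg_id, vt_offset)` form of the SQL function, where an offset of 0 seconds makes the message visible again right away. The name `requeue_message` and its `delay_seconds` parameter are hypothetical.

```python
from typing import Any


def requeue_message(cur: Any, queue: str, msg_id: int, delay_seconds: int = 0) -> bool:
    """Make a message visible again after `delay_seconds` via pgmq.set_vt().

    Returns True if the message was found and updated, False if set_vt()
    returned no row (for example because the message no longer exists).
    Hypothetical helper; `cur` is assumed to be a DB-API cursor.
    """
    if delay_seconds < 0:
        raise ValueError("delay_seconds must be non-negative")
    cur.execute(
        "SELECT msg_id FROM pgmq.set_vt(%s, %s::bigint, %s::integer);",
        (queue, msg_id, delay_seconds),
    )
    return cur.fetchone() is not None
```

The explicit casts are only there so the call resolves to the intended overload regardless of how the driver types the integer parameters.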

Pause and resume would be nice features too. Are you thinking this would prevent new messages from being able to reach the queue, or just prevent existing messages from being consumed?

@tavallaie
Contributor Author

I'm sorry for being late :)
I was occupied with PR #222.
I'll add the metrics soon, and then we can decide what's next and evaluate its profitability.

@markbalazon

markbalazon commented May 29, 2024

I think the above are great ideas as well. I love the requeue/set_vt() idea and would benefit from a corresponding Python implementation!

@tavallaie
Contributor Author

I am working on these features, but lately I have been busy with my day job, so I don't have enough time to implement all of them. Any contributions are more than welcome.

ChuckHend added the enhancement label on Jun 6, 2024
@tavallaie
Contributor Author

> Pause and resume would be nice features too. Are you thinking this would prevent new messages from being able to reach the queue, or just prevent existing messages from being consumed?

I think having both options would be very useful:

  1. Pausing new messages from entering the queue.
  2. Pausing the consumption of existing messages.

This will be very helpful for machine learning or other heavy tasks.

By stopping new messages from entering the queue, we can avoid overloading the system. For example, during system maintenance or upgrades, we can pause new messages to ensure no new data interferes with the process.

By pausing the consumption of existing messages, we can manage the processing load better. This is useful when we need to allocate resources to more urgent tasks or when performing intensive computations. These features will help keep the system stable and efficient.

@v0idpwn
Collaborator

v0idpwn commented Jun 7, 2024

I think both concerns are probably of the application using the library, not of the library itself.

@ChuckHend
Member

> probably of the application using the library, not of the library itself.

I think I agree. But, @tavallaie, could you explain a bit more about the conditions or scenario where something sets the queue as "paused" for both concerns? In the case of consumers that are executing heavy tasks and are getting overloaded, I typically see designs where the workers are told to stop consuming, rather than the queue being told to stop accepting or distributing messages. A very common use case of a queue is to buffer messages for the workers, so that the workers can decide when to pull in new work rather than being force-fed work regardless of their ability to take it on.
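As a rough illustration of the "workers are told to stop consuming" design, the sketch below keeps the pause flag in the application rather than in the queue. Everything here is hypothetical: `read_batch` stands in for whatever call fetches messages (for example a wrapper around a queue read), `handle` is the application's processing function, and the two `threading.Event` objects are just one way to signal a worker.

```python
import threading
from typing import Any, Callable, Iterable


def run_worker(
    read_batch: Callable[[], Iterable[Any]],
    handle: Callable[[Any], None],
    paused: threading.Event,
    stop: threading.Event,
    idle_seconds: float = 1.0,
) -> int:
    """Pull work only while not paused; return the number of handled messages.

    While `paused` is set, the worker does not read at all, so messages simply
    stay buffered in the queue until consumption resumes.
    """
    processed = 0
    while not stop.is_set():
        if paused.is_set():
            # Paused by the application: wait without touching the queue.
            stop.wait(idle_seconds)
            continue
        batch = list(read_batch())
        if not batch:
            # Nothing visible right now: back off before polling again.
            stop.wait(idle_seconds)
            continue
        for msg in batch:
            handle(msg)
            processed += 1
    return processed
```

With this shape, "pausing" is just `paused.set()` in the consuming application, and the queue itself needs no schema change.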

Although as an admin functionality, I could see having the ability to set the visibility of all existing messages at once, like a pgmq.set_vt_all() or something.

> By stopping new messages from entering the queue, we can avoid overloading the system. For example, during system maintenance or upgrades, we can pause new messages to ensure no new data interferes with the process.

@tavallaie, in this scenario, what do you think the producers of the messages would need to do during "paused" periods? Would they receive errors when publishing? To producers, it seems a "paused" queue would be no different from the database being down.
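One way to picture the producer side of that question: if publishing to a paused queue raised an error, a producer would likely handle it with the same retry path it already needs for a database outage. The sketch below is purely illustrative. `QueuePaused` is a hypothetical exception that does not exist in pgmq or the Python library, and `send` stands in for whatever publish call the producer uses.

```python
import time
from typing import Any, Callable


class QueuePaused(Exception):
    """Hypothetical error a paused queue might raise on publish."""


def send_with_retry(
    send: Callable[[Any], Any],
    message: Any,
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Publish with exponential backoff, treating 'paused' like an outage.

    Retries on QueuePaused and ConnectionError alike; re-raises the last
    error once `attempts` tries have been used.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return send(message)
        except (QueuePaused, ConnectionError):
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Both failure modes go through the same `except` clause, which is the sense in which a paused queue looks like an unavailable database from the producer's point of view.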

@tavallaie
Contributor Author

tavallaie commented Jun 9, 2024

@ChuckHend, I will test different scenarios to understand these features better. This will show us if they are useful and if they should be in the SDK or done by the user.

Maybe it is better to open a separate issue for this discussion, so that the items already resolved in this issue can be closed.

4 participants