-
Notifications
You must be signed in to change notification settings - Fork 410
Add modular threads #1795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add modular threads #1795
Conversation
| GetMinThreads() int | ||
| ThreadActivatedNotification(threadId int) | ||
| ThreadDrainNotification(threadId int) | ||
| ThreadDeactivatedNotification(threadId int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a specific use case for these notifications? Otherwise it would allow moving this code to the ext module, which removes a lot of coupling.
If tasks are sent via request, then the workers or even frankenphp don't really need to know about 'external workers', they just need to handle the requests.
It would even be cleaner to expose hooks for thread start/stop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like these worker options:
func WithWorkerOnThreadActivation(hook func(threadId int)) WorkerOption
func WithWorkerOnThreadDrain(hook func(threadId int)) WorkerOption
func WithWorkerOnThreadDeactivation(hook func(threadId int)) WorkerOptionThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a specific use case for these notifications? Otherwise it would allow moving this code to the ext module, which removes a lot of coupling.
It can be quite handy to know how many worker threads you have running (especially with autoscaling). That being said, it isn’t necessary. Good call about the decoupling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe can we provide a default implementation as an embeddable struct, but this can be done in a follow up PR.
|
I'm playing with this, I would like to show something using it at API Platform Con, it's almost working but I've a weird issue: the request sent to the worker by ProvidedRequest seems to always be empty PHP-side. I've something like this: func (w *worker) ProvideRequest() *frankenphp.WorkerRequest {
msg := <-w.messages
caddy.Log().Info("Provided request", zap.String("request", string(msg)))
u, _ := url.Parse("https://example.com")
u.Query().Add("request", string(msg))
return &frankenphp.WorkerRequest{
Request: &http.Request{
URL: u,
Body: io.NopCloser(bytes.NewReader(msg)),
},
}
}and both Do you have an idea of what's going on @withinboredom? Another thing that I notice while playing with this: it would be nice to be able to pass PHP values directly to the worker script. In my case, I can compute PHP arrays directly from the Go-side, but here I've to encode the data in JSON Go-side, and decode it PHP-side. This seems unnecessary. We would think to a more low level API allowing to pass extra parameters to the closure in |
| @@ -0,0 +1,108 @@ | |||
| package frankenphp | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
customworker.go?
| // Note: External workers receive the lowest priority when determining thread allocations. If GetMinThreads cannot be | ||
| // allocated, then frankenphp will panic and provide this information to the user (who will need to allocate more | ||
| // total threads). Don't be greedy. | ||
| type WorkerExtension interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| type WorkerExtension interface { | |
| type CustomWorker interface { |
?
threadFramework.go
Outdated
| rq = externalWorker.ProvideRequest() | ||
| }() | ||
|
|
||
| if rq == nil || rq.Request == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a race condition here: the goroutine could not have set rq yet.
threadFramework.go
Outdated
|
|
||
| // startExternalWorkerPipe creates a pipe from an external worker to the main worker. | ||
| func startExternalWorkerPipe(w *worker, externalWorker WorkerExtension, thread *phpThread) { | ||
| go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe could we start startExternalWorkerPipe() using the go keyword on the caller side to simplify the code?
threadFramework.go
Outdated
| func startExternalWorkerPipe(w *worker, externalWorker WorkerExtension, thread *phpThread) { | ||
| go func() { | ||
| defer func() { | ||
| if r := recover(); r != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we let the program panicking instead? To me such errors are unexpected and shouldn't be handled on our side.
threadFramework.go
Outdated
| var rq *WorkerRequest | ||
| func() { | ||
| defer func() { | ||
| if r := recover(); r != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
|
How are you creating your body @dunglas? Here's the gist of what I'm doing to create requests: https://github.com/bottledcode/durable-php/pull/160/files#diff-67975c8380131293fe23decd46264638b7225b03f08297dbd46d1cd581b09ed7R150-R222 If you're looking for an example to show, I have durable php I could show off -- no more Doctrine, no more databases... just regular php code that magically survives requests.
I'm handling this by providing a global object that is magically given context of the current situation. It's not ideal, but I didn't feel like fighting this (yet! I generally follow the principle of 'get it working' before I focus on 'get it working well' and then finally move on to 'get it working fast'). But basically you just do |
|
Actually, the URL is correctly passed to the worker, it was my bad. However the body is empty. Here is my code: func (w *worker) ProvideRequest() *frankenphp.WorkerRequest {
msg := <-w.messages
p := url.Values{}
p.Set("request", string(msg))
u := url.URL{RawQuery: p.Encode()} // This works
req, err := http.NewRequest("POST", u.String(), bytes.NewReader(msg)) // The body is empty
if err != nil {
panic(err)
}
return &frankenphp.WorkerRequest{
Request: req,
}
} |
|
Are you sure that msg actually has bytes after calling |
|
Yes: b, _ := io.ReadAll(req.Body)
caddy.Log().Info("body", zap.String("request", string(b)))It logs the correct string, even in |
|
I'd check in worker.go that the body is still set when it sends the request to php. Otherwise, I can't imagine it where it would go between Go and PHP. |
|
In |
|
Another question: is there an easy way to execute some code when we get the response from PHP? Using a custom implementation of |
|
This is my worker script for this branch: https://github.com/bottledcode/durable-php/blob/61bd04de5367f9729a84a56eb1184a50585b8308/src/Glue/worker.php and it eventually calls this to get the body: https://github.com/bottledcode/durable-php/blob/61bd04de5367f9729a84a56eb1184a50585b8308/src/Glue/Glue.php#L106 I'll have to run it to verify it gets a payload; I'll check it in a bit. |
This is pretty much the minimalist ResponseWriter. We could make this quite a bit simpler. |
|
Ok I figured out the problem with request bodies. For the request body to be available, a ResponseWriter needs to be passed to Line 508 in 960dd20
This works as expected: &frankenphp.WorkerRequest{
Request: req,
Response: httptest.NewRecorder(),
} |
|
Regarding the response writer, I may be missing something but I seen no easy way to be notified when the full body has been received (except playing with the |
|
Ah :) yes. The good ole 'lets just stream a response' problem... Basically PHP won't write to the response writer unless and until you call |
|
(In my case, I was streaming a response one line at a time to send multiple messages. So, this was a bit simpler. Echo a line and then flush.) |
|
If you want to get more experimental, you can also add a new type of thread and pass readonly PHP objects directly via a Basically, a stripped down worker without the whole request/response cycle ( |
|
We don't have handy-dandy (go-side constructed) objects in the generator, yet. Then you get into "types" and there are no generics ... yet. And then you may as well pass back a PSR Request object ... |
457c2cf to
2a7dc7f
Compare
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
* Simplify * remove unused variable * log thread index
2a7dc7f to
2bfac92
Compare
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
Signed-off-by: Robert Landers <[email protected]>
|
@dunglas: fixed a couple of segfaults for when there is no parameter or return value. |
| GetMinThreads() int | ||
| ThreadActivatedNotification(threadId int) | ||
| ThreadDrainNotification(threadId int) | ||
| ThreadDeactivatedNotification(threadId int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe can we provide a default implementation as an embeddable struct, but this can be done in a follow up PR.
| // The request for your worker script to handle | ||
| Request *http.Request | ||
| // Response is a response writer that provides the output of the provided request, it must not be nil to access the request body | ||
| Response http.ResponseWriter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make sure that these fields are optional (they are useless for many use cases, and we could do some optimisations when they are nil).
Could be done later too.
Co-authored-by: Kévin Dunglas <[email protected]>
|
Hmm I might create my task-worker PR. It's not refined yet and the timing is unfortunate, but those types of workers would probably be better suited for simple message queues since they don't include the request/response cycle. |
|
Having the request itself is actually pretty great. For example, headers can pass metadata out of band with the payload that >90% of the time you don't need, but when you do need it, it is there. Queue messages, GRPC, jobs, etc., all have headers/metadata, like observability tokens that libraries know how to emit spans from out-of-the-box without any changes required. |
This allows for extensions to register as external workers. For example, someone could create an in-process, multi-threaded queue that can be used as a Symfony Messenger:
The external worker may then send "requests" to the registered PHP file that can package them up as whatever object is required. Any responses are sent back to the extension if a ResponseWriter is provided, otherwise it is sent to stdout.