Skip to content

[RFC] Add PSR7 Support, decouple this lib from a http client implementation #455

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

joelwurtz
Copy link
Contributor

@joelwurtz joelwurtz commented Aug 22, 2016

Hey,

Since now the endpoint are better decoupled from the transport, this is a PR in order to fully support PSR7 in this library by using HTTPlug abstraction. By using HTTPlug a lot of code can be removed and the maintenance will be more focus on the endpoint and not the http transport layer (as you can see the diff, a lot of code is no longer necessary).

Advantages

However it's an opiniated vision and i would like to add some arguments:

  • End user choice: The end user will have the choice of the http client implementation, which will be, for a majority of user, the guzzle6 library. However some others librairies can be chosen, like Zend/Http or CakePHP one which are already available in their respective framework.

    It, also, offers a good support for people using reactphp, so this is possible to use the event loop of reactphp and async system without hack from the user. List of existing adapters / clients

  • Less maintenance on this library: HTTPlug offer a large feature set (with the usage of the client-common package) which does not need to be rewrite in this library: Retry failed requests, Load Balancing (Client Pool), Failure detection, etc ...

  • Better integration with frameworks (Symfony ATM, laravel in progress): HTTPlug offer an integration to existing frameworks for library using this layer. In example, for Symfony all requests will be logged and traced when in dev mode (time, memory, diff at each modification, etc ....).

  • Shared work: Many libraries already implement HTTPlug or are on the way, this mean, we can better profit for feature offered / wanted by other libraries, i.e. the load balancing client was directly inspired from elasticsearch-php, but it's a awesome addition for others API who works in a cluster mode.

Counterpart

The counterpart of this benefit, is that it needs more work for the end user to setup his elasticsearch-php client, as he need to choose an client and a psr7 implementation.

However, we are fully aware of this counterpart and despite being a design decision (as we want to give the choice to the user) we have many guidelines and packages for api builder to reduce this to the strict minimum:

  • Discovery: There is a discovery package that this lib is bound to, which will automaticaly pick any existant PSR7 and HTTPlug implementation if available
  • Documentation: We recommend to separate the setup documentation in two parts, one for the "quick and run" usage, by giving an one liner composer install (with a default client and psr7 implementation, at your conveniance) and one for the advanced usage which explain how to choose a client and psr7 implementation and give more informations about the feature set of each library available.

Changes on this lib

  • No more guzzle/ring adapter and FutureArrayInterface, it's a BC break, however for common usage of this library it should change nothing as the end user api on the endpoint is still the same.
  • Less configurations options: As it's using an abstract layer for the HTTP Client, all specific options bounds to the client does not make any sense now, as each implementation has his own configuration, it's the user who have the responsability to inject the correct configuration.

Things to be done

This PR, although mainly in a advanced stage, is not complete and there is still works to do:

  • Rewrite documentation on the async part
  • Rewrite the "sniff / ping" feature by extending the pool system (i did not look at this ATM, but it should be easy to re add this support)
  • Better documentation on the quick / advanced installation as explained earlier
  • Combined mode: Not sure if necessary, but we can write a combined client (on httplug) which use a specific http implementation for sync calls and another one for async calls (like it was done in the 2.X) branch, it can make sense in other library than elasticsearch so we would be happy to support this;
  • GuzzleRing Adapter, not sure if necessary but we can write an adapter for this library so it reuses the same implementation as the 2.X branch

I also want to do some benchmarks by updating https://github.com/jolicode/elasticsearch-php-benchmark so we would be able to give the best implementation in term of speed for sync and async calls.

return new CurlHandler();
} else {
throw new \RuntimeException('CurlSingle handler requires cURL.');
if (!is_subclass_of($clientPoolStrategy, HttpClientPool::class)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if-statement could be joined with the one above.

 if (!is_string($clientPoolStrategy) || !is_subclass_of($clientPoolStrategy, HttpClientPool::class)) {

@polyfractal
Copy link
Contributor

Hi @joelwurtz, thanks for the PR. :)

Before anything else, this won't make it for 5.0. It's too large of a change, and I don't feel there is enough time for it to stabilize (and for users to account for the bwc breaks) before 5.0 is out. If we decide to move forward with this PR, I'll branch 5.x off master and this can be targeted at 6.x

With that said, I have some reservations. Large wall of text incoming. I'm not opposed to HTTPlug in theory, but practically speaking I'm having a hard time rationalizing the large breaks this would generate with the utility gained.

Adding PSR-7 support to the client is something that I support, and would like to do sooner than later. But I'm unclear how adding HTTPlug on top of PSR-7 support helps.

  • The increased user-friction is not good. People want to use this client to talk to an Elasticsearch cluster, they generally don't care what underlying transport mechanism is used. Forcing the user to decide these little details up-front is unnecessary friction for 95% of the users.
    • Therefore, we would need to either "suggest" a concrete implementation and provide helper methods to construct a "default" client, along with instructions on how the user can include the concrete implementation in their application. This is not ideal, complicates setup, and forces the user to manage another dependency
    • Alternatively, we could just include a default concrete implementation... but then why bother with HTTPlug in the first place? Transport flexibility can be accomplished simply by supporting PSR-7 and allowing the user to inject their own compliant client.
  • It feels like an abstraction for the sake of abstraction. Depending on HTTPlug doesn't protect us from dependency changes. We just have to respond to changes in HTTPlug instead of changes in the concrete implementation (e.g. Guzzle). Worse, there are now two layers of upstream bugs
  • If the user needs custom/special functionality, they still need to be implementation-aware to
    configure the concrete implementation (four different, unique configurations). That seems to defeat the point of abstracting over concrete implementations in the first place? If I can't get a unified API that works with all clients, why bother?
  • To be brutally honest, 95% of users do not care about the underlying transport mechanism. Everything in PHP-land uses curl (majority) or PHP streams (minority). Everything else is just sugar on top of those two. And imo, no one cares if two different HTTP clients are used in one application (e.g. one by ES-PHP, another by their code or framework). Bloats the /vendor/ directory a bit I guess?
  • The various ConnectionPools need to be reimplemented, as well as the concept of Selectors. E.g. Selectors can choose connections from the Pool based on node information, like allocation tags or hot/cold setup. Cluster state sniffing needs reimplementing
  • fewer dependencies > more dependencies imo.
  • The backwards compat breaks are quite large. Any custom code that users are injecting will now be broken, as well as all exceptions, and Futures/async behavior
  • ES-PHP is already "implementation agnostic" by way of RingPHP handlers. If we were to replace it with something, I'd want it to be a standard (PSR-7) rather than something that sits on top of PSR-7 but is not standardized.
  • There are already various adapters/modules/plugins for popular frameworks based on this library, so I'm not sure the framework support added by HTTPlug is a new benefit?

Also related, comments in these PRs I generally agree with too:

Sorry, I know that was a huge wall of text. Like I said, I'm not opposed to the idea in theory, but in practice I just don't see much benefit. It's a big break for minor gain imo.

@joelwurtz
Copy link
Contributor Author

joelwurtz commented Aug 25, 2016

My turn to do a huge wall of text :D

The increased user-friction is not good. People want to use this client to talk to an Elasticsearch cluster, they generally don't care what underlying transport mechanism is used. Forcing the user to decide these little details up-front is unnecessary friction for 95% of the users.

That's the thing i talk in Counterpart and for that i agree with you (even if 95% seems to be a little high).
However there is more solutions to the problem that the one you exposed:

We could propose this in the installation documentation:

# Quick and run installation
composer require elastic/elasticsearch-php php-http/socket-client

(Or other client, don't matter)

And have a better explanation for advanced usage, so by default user will have all the needed librairies.

There is also the possibility to separate this library in two:

  • All the endpoints and abstractions in a elasticsearch-php-core library
  • All the easy to get started stuff in a elasticsearch-php library which includes a default psr7, client implementation and the core library

Therefore, we would need to either "suggest" a concrete implementation and provide helper methods to construct a "default" client, along with instructions on how the user can include the concrete implementation in their application. This is not ideal, complicates setup, and forces the user to manage another dependency

Helper methods are already available by using the php-http/discovery package, but this package can be dropped if you want to bind to a specific psr7 / client implementation

Alternatively, we could just include a default concrete implementation... but then why bother with HTTPlug in the first place? Transport flexibility can be accomplished simply by supporting PSR-7 and allowing the user to inject their own compliant client.

Allowing an user to inject it's own compliant client would force this library to declare a client interface that send a PSR7 Request to get a PSR7 Response, so this would mean to recreate something close to HTTPlug, and would lead to a NIH syndrom.

It feels like an abstraction for the sake of abstraction. Depending on HTTPlug doesn't protect us from dependency changes. We just have to respond to changes in HTTPlug instead of changes in the concrete implementation (e.g. Guzzle). Worse, there are now two layers of upstream bugs

Yes, it's an abstraction, but we don't make this abstraction for the sake of an abstraction. Depending on HTTPlug, means to have a dependency only on the HttpClient Interface which his, IMO, one of the simplest thing ever to represent how to get a PSR7 Response from a PSR7 Request (and the async interface is also very simple and works well with all async implementations).

All others library, code in php-http are mainly sugar compared to this.

The purpose of this interface is interoperability among all API clients, as an example this lib use guzzlering, other one use socket-client, and many use guzzle6. When you need, as an user, to add a feature on the HTTP request you have to add it for each of your HTTP Client, and PSR7 doesn't help with that (it only help you about the data transfered, but not how to transfer it)

Also having this interface, allow us to add features that works on any existing client. Like now we can support having a HTTP Cache in front of all existing clients, and the only thing you need to do to have that feature is respecting this interface, which is far from being a hard and impossible job.

If the user needs custom/special functionality, they still need to be implementation-aware to
configure the concrete implementation (four different, unique configurations). That seems to defeat the point of abstracting over concrete implementations in the first place? If I can't get a unified API that works with all clients, why bother?

The abstraction is about transfering a request to get a response, and not abstracting the client configuration, maybe we will do that by using factory, but it's clearly not our actual goal.

However we took care that default configuration for each of the client / adapter represent the same set of feature. If one needs to change the default configuration then he will need to understand the implementation, but like you said, it's only for 5% of the people, others don't care...

To be brutally honest, 95% of users do not care about the underlying transport mechanism. Everything in PHP-land uses curl (majority) or PHP streams (minority). Everything else is just sugar on top of those two. And imo, no one cares if two different HTTP clients are used in one application (e.g. one by ES-PHP, another by their code or framework). Bloats the /vendor/ directory a bit I guess?

Agree, but the goal is not only about the 5% client who cares, it's also about all API maintainer (like you) for having an interoperable interface which allows them to create feature on top of an HTTP Client that can be reused (like sharing the same client than http://github.com/ruflin/Elastica in this library), despite having a prefered client implementation.

The various ConnectionPools need to be reimplemented, as well as the concept of Selectors. E.g. Selectors can choose connections from the Pool based on node information, like allocation tags or hot/cold setup. Cluster state sniffing needs reimplementing

I plan to do that in this PR (however ConnectionPools and Selector are already available, they just need to be extended to fullfil the need of this library)

fewer dependencies > more dependencies imo.

To take your argument, it only bloats the vendor directory and have some more seconds on the user setup (also i remove some dependencies here) which IMO doesn't do harm for the user.

The backwards compat breaks are quite large. Any custom code that users are injecting will now be broken, as well as all exceptions, and Futures/async behavior

  • For custom code, sure it will be BC but it only concern <5% who care about that
  • Exceptions, i can remove the Http namespace that i add, and also discuting this PR with other people using this lib made me realize that removing NoNodesAvailableException and the others was a mistake, so i will reintegrate them
  • Futures/async behavior: this behavior is keep for the call, however yes there is no more "lazy" access (calling wait automatically when user wants an info on a FutureArrayInterface), however i think i will add a wrapper around the Promise to keep this behavior and don't have a BC on this).

ES-PHP is already "implementation agnostic" by way of RingPHP handlers. If we were to replace it with something, I'd want it to be a standard (PSR-7) rather than something that sits on top of PSR-7 but is not standardized.

Like i said before, replace handler by PSR7 would only replace the data that it's tranfered but not the way it's transfered.

Futhermore we have the willness to make HTTPlug a standard and that's why we are doing many pull requests among many PHP Client API repositories, (and there is already many librairies that use httplug: https://github.com/KnpLabs/php-github-api, https://github.com/geocoder-php/Geocoder, https://github.com/docker-php/docker-php, https://github.com/FriendsOfSymfony/FOSHttpCache, ...). IMO making something a standard is not by having the PSR on your namespace or the FIG approval, but more about having a strong usage of it (given this, doctrine-cache is more of a standard than PSR6).

Additionally we have plans to propose our Interface as a PSR.

Our intention at HTTPlug is not to force every API Client to use our interface, on the contrary we want to work with all of them to have a better interoperability and, by this way, reduce maintenance cost and allow users to have a unified API and awesome features about all things that concern doing HTTP Requests.

Finally, even if i don't agree, i understand your position, and so i have some questions about PSR7 and this PR:

How do you see integrating PSR7 in this library?: does the endpoint should return a PSR7 Request ? Or it should only be used internally ATM (like this PR) ?

Maybe this PR is going too far for an introduction to HTTPlug, if you want i can make it much smaller by just adding a dependency on HTTPlug and PSR7 (by requiring a default PSR7 implementation) and do a wrapper with HTTPlug interface around the GuzzleRing Handler to enable him to support PSR7, WDYT ?

Sorry, also, about this huge wall of text. I don't quiet agree on argument from thephpleague and yours and wanted to make my mind clear about the subject.

@sagikazarmark
Copy link

I don't really have too much to add to @joelwurtz, just one thing:

If the user needs custom/special functionality, they still need to be implementation-aware to
configure the concrete implementation (four different, unique configurations). That seems to defeat the point of abstracting over concrete implementations in the first place? If I can't get a unified API that works with all clients, why bother?

I am a little bit confused by this argument. Why do you expect configuration to be part of the abstraction? The abstraction provides an API for the library in this case, not the user. The user has almost unlimited flexibility to configure whatever client, which increases the interoperability of the library.

@ruflin
Copy link

ruflin commented Aug 26, 2016

Thanks everyone for all the walls of text. My suggestion would be to do in a first step as many changes as possible without breaking BC. I kind of like the last suggestion by @joelwurtz to initially only add support for PSR7 by adding a handler around GuzzleRing. I'm not 100% sure of all the implications of this but it sounds like this could be done as a non BC break?

@polyfractal
Copy link
Contributor

Going to close this, as I still don't see the need for such a change (and the PR has likely diverged enough that the rebase would be hellish). Thanks for the PR, I do appreciate it, I just don't feel the need for ES-PHP client to adopt HTTPlug at this time.

@polyfractal polyfractal closed this Mar 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants