Skip to content

Should the getKey() method from GeoHashGrid return a String instead of GeoPoint #31579

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
colings86 opened this issue Jun 26, 2018 · 9 comments
Closed

Comments

@colings86
Copy link
Contributor

The GeoHashGrid class is used as the interface for the geohash_grid aggregation. It currently has a getKey() method which returns a GeoPoint and a getKeyAsString() method that returns the String of the geohash for that bucket. The REST API only returns the String geohash and this is in the key field of the bucket.

I think it would be better for the getKey() method to return the string representation of the geohash and to get rid of the getKeyAsString() method. Having the ability to get a GeoPoint from a bucket that represents a geohash cell is misleading because the cell does not represent a point but an area. Having it return a bounding box will also not suffice given that with the addition of #30320 the cells may not be rectangular. Therefore I think we should remove the ability to get a GeoPoint from this aggregation and give the client the decision on how to represent the bucket key in their application.

This will also help implementing the high level rest client aspects of #30320 as the client will not need to know about the GeoHashType being used and will only care that there is some string key for the buckets.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

@nyurik
Copy link
Contributor

nyurik commented Jun 27, 2018

I think some clients may want to know the centroid of the region - e.g. for placing text or drawing circles. For more complex drawing, they would need to know the details of each hashing algorithm. One option would be for the new HashType enum handlers to provide a utility function of "string" -> "centroid".

@cbuescher
Copy link
Member

some clients may want to know the centroid of the region

The question is if it should be the aggregations (and specifically the buckets) job to provide this kind of information, since its very much a presentation layer thing. The Hash key (as a String) whould already provide enough information about this. A client should know how to interpret this (because it sent the request) and can perform the conversion back much more easily. The proposed change is supposed to simplify matters on the backend so we have less problems carrying the hash type information around as we do have currently in #30320.

@nyurik
Copy link
Contributor

nyurik commented Jun 27, 2018

@cbuescher yep, I agree with you. I simply proposed a consistent workaround for the clients that do know the hashing type of how to get that same information. This does not mean the information will be available in the bucket. Thx for looking into it!

@jpountz
Copy link
Contributor

jpountz commented Jun 28, 2018

+1 to stop returning a GeoPoint, to me this is related to #18518: we need to have a single key as opposed to key + key_as_string, and this key must be of a json type, ie. string, number, boolean, hash or array.

I'm not sure what the key should be exactly, but we should make sure it contains sufficient information to filter documents in one particular bucket so that it's easy to drill down a specific subset of the data. Geo hashes meet this requirement, especially thanks to @imotov's work on #30698, but I agree that we might need more to support other types of grids that do not create rectangles. FWIW I would be ok with having the format of keys depend on the selected grid type.

@cbuescher
Copy link
Member

In the group discussion we agreed we should change this interface to return a String in 7.0. We would like to avoid introducing this as breaking change in 6.x, but there might not be other ways that allow us to introduce the new geo grid types in 6.x, so this might be a tradeoff to be discussed.

@nyurik
Copy link
Contributor

nyurik commented Jul 2, 2018

@cbuescher thx, could you clarify how this impacts the hash type PR? Does it mean we have to wait for 7.x to release this feature? Or is it ok for this API to simply throw an error if it cannot parse the hash string because it was generated by a different algorithm (that's what will currently happen because other algos produce incompatible hash strings)

@cbuescher
Copy link
Member

cbuescher commented Jul 2, 2018

Or is it ok for this API to simply throw an error if it cannot parse the hash string because it was generated by a different algorithm (that's what will currently happen because other algos produce incompatible hash strings)

That would be one option. Maybe there are other we haven't thought about yet. But having a cleaner solution from 7.0 onwards certainly makes choosing less ideal workarounds on 6.x a bit easier. I think this should be discussed on the PR though.

@cbuescher
Copy link
Member

@colings86 I think this can be closed now that we agreed on introducing a new aggregation in #30320 (comment) instead of changing the interface of the existing one. Assuming this is so, I'll close, but please reopen if I missed something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants