support for joins? #46
Comments
Hello sarukas,
Hi, Thanks for the quick reply. One more question: how are totals handled? Are they possible across partitions? E.g. a count grouped by country, where country is the partition key? Our use case is OLAP queries, where the lowest level of aggregation is done on several dimension paths, but higher levels would be calculated on the fly. Thanks, Sarunas
Hello sarukas,
Could you provide some details on how SploutSQL integrates with Apache Drill?
We wrote a plugin for Drill that integrates Splout as another data store Drill can query. Because Splout is partitioned and indexed, we tell Drill which partition(s) to scan and how to execute the query so that it uses the appropriate indexes. If the SQL query has an equality condition on the partition key, Drill does the same thing you would do with the normal Splout SQL API: it queries a single partition. Otherwise, it produces as many scans as needed, and Drill takes care of the rest (grouping, aggregating, etc.). Although we haven't fully tested the performance of this system, we expect it to be quite fast for queries that don't touch massive portions of the data (a full scan of the data would be much more efficient with another underlying store, such as plain Parquet files). Would you be interested in trying this for your use case?
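To make the routing described above a bit more concrete, here is a rough Python sketch of the idea. It is not the actual Drill plugin or the Splout API: `PARTITIONS`, `partitions_to_scan` and `count_by` are hypothetical names, and the data is an in-memory stand-in. It only illustrates that an equality condition on the partition key touches a single partition, while anything else scans every partition and merges the partial aggregates afterwards.

```python
from collections import Counter

# Hypothetical in-memory stand-in for a partitioned store: partition key -> rows.
PARTITIONS = {
    "US": [{"country": "US", "device": "mobile"}, {"country": "US", "device": "web"}],
    "LT": [{"country": "LT", "device": "web"}],
    "ES": [{"country": "ES", "device": "mobile"}, {"country": "ES", "device": "mobile"}],
}


def partitions_to_scan(partitions, key_equals=None):
    """Return the partition keys a query has to touch.

    With an equality condition on the partition key, only that partition
    is scanned; without one, every partition is scanned.
    """
    if key_equals is not None:
        return [key_equals] if key_equals in partitions else []
    return list(partitions)


def count_by(partitions, column, key_equals=None):
    """COUNT(*) ... GROUP BY column, merged across the scanned partitions."""
    total = Counter()
    for key in partitions_to_scan(partitions, key_equals):
        # Partial aggregate computed inside one partition.
        partial = Counter(row[column] for row in partitions[key])
        # Merge step: the part that the query engine handles across scans.
        total.update(partial)
    return dict(total)


if __name__ == "__main__":
    print(count_by(PARTITIONS, "device", key_equals="US"))  # single-partition query
    print(count_by(PARTITIONS, "country"))                  # cross-partition totals
```

In this sketch a count grouped by the partition key is just the per-partition row count, so cross-partition totals reduce to merging one partial result per scan.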
I would be interested, as I am trying to evaluate SploutSQL without all the complexity of SparkSQL. The main advantage of SploutSQL here is that it has a REST API. Could you share more info, as I am still not familiar with Drill's concepts?
Hi, If you adopt Splout SQL for your use case and are already happy with it, but need to support cross-partition queries, then you would move to Drill over Splout. I think it would be better to follow up on your use case on the user list. Feel free to write about it there and we can help you set up Splout to try it: https://groups.google.com/forum/?fromgroups#!forum/sploutdb-users
Sorry for asking this here. Does Splout DB support joins? The intended use case is joining a large batch-generated table with a small dimension table on the fly.