-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
optimize ch dicts #7051
base: main
Are you sure you want to change the base?
optimize ch dicts #7051
Conversation
@@ -39,4 +39,5 @@ message Subquery { | |||
repeated string measures = 2; | |||
Expression where = 3; | |||
Expression having = 4; | |||
string raw_sql = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't safe – it would enable even users with just viewer
access to inject arbitrary SQL
Lookup lookup = 8; // Lookup fields for the dimension | ||
} | ||
message Lookup { | ||
string table = 1; | ||
string key_column = 2; | ||
string value_column = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's okay to nest the properties in YAML, but in the protos it's easiest to keep types flat (most places we only use nested types for arrays or maps). E.g.:
string lookup_table = 8;
string lookup_key_column = 9;
string lookup_value_column = 10;
func (d Dialect) DictGetExpr(lookupTable, lookupColumn, lookupKey string) string { | ||
switch d { | ||
case DialectClickHouse: | ||
return fmt.Sprintf("dictGet('%s', '%s', %s)", lookupTable, lookupColumn, lookupKey) | ||
default: | ||
return "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this easily be generalized for Druid and DuckDB also? E.g. call the function MetricsViewDimensionLookupExpression
and use lookup
in Druid and a subquery in DuckDB?
@@ -40,6 +40,11 @@ type MetricsViewYAML struct { | |||
Ignore bool `yaml:"ignore"` // Deprecated | |||
Unnest bool | |||
URI string | |||
Lookup *struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't make this a pointer, the code gets simpler:
Lookup *struct { | |
Lookup struct { |
@@ -186,6 +187,7 @@ type Condition struct { | |||
} | |||
|
|||
type Subquery struct { | |||
RawSQL string `mapstructure:"raw_sql"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as earlier – dangerous to expose raw_sql
in the Query
type where anyone with access can set it
Value any `mapstructure:"val"` | ||
Condition *Condition `mapstructure:"cond"` | ||
Subquery *Subquery `mapstructure:"subquery"` | ||
Identifier string `mapstructure:"identifier"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment – I think it's unsafe even to expose Identifier
in the metrics query API because it can leak underlying fields that are not exposed through dimension/measure expressions.
|
||
// Rewrites filter on dictionary columns to use the key column instead of the value column. For example, if the dictionary value column is "country" and key column is "country_key", | ||
// then the filter will be rewritten to country_key = 'US' instead of country = 'USA'. | ||
func (e *Executor) rewriteClickhouseDictFilters(qry *Query) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since lookups are introduced as a general concept in the metrics view spec, I wonder if it should be handled directly inside astexpr.go
when converting an expression to SQL?
I think that would also avoid the need for exposing internal fields in the Query
type (which is supposed to be user-facing, as opposed to AST
which is intended to be internal).
Fixes #6909
Checklist: