Skip to content

Optimizer should simplify lower(col) = lower('literal') to enable predicate pushdown #27436

@ethanyzhang

Description

@ethanyzhang

When query predicates wrap columns in lower() — for example lower("col") = lower('some_value') — predicate pushdown to connectors is blocked because the optimizer treats the expression as a function call rather than a simple comparison.

This is common in application-generated SQL where case-insensitive matching is needed. The pattern typically looks like:

SELECT ...
FROM table
WHERE lower("AccountIdentifier") = lower('035778000361')
   OR lower("AccountIdentifier") = lower('135006324083')
   ...

Similarly, position('substring' in lower("col")) > 0 prevents pushdown.

Proposed improvement

The optimizer could recognize and simplify these patterns:

  1. lower(col) = lower('literal') — Since lower('literal') is a constant, fold it at plan time: lower(col) = 'already_lowercase_literal'. This at minimum eliminates redundant lower() calls on constants, and for connectors that support case-insensitive comparison or LOWER, enables pushdown.

  2. lower(col) = 'literal' (already lowercase) — Some connectors (e.g., JDBC) could push this down as LOWER(col) = 'literal' in the generated SQL, letting the underlying database handle it with an index on a computed column or a case-insensitive collation.

  3. position('literal' in lower(col)) > 0 — Could be rewritten as a case-insensitive LIKE '%literal%' pattern for connectors that support it.

Impact

Without pushdown, the connector must scan and return all rows, and Presto evaluates the filter in-engine. For large tables this significantly increases I/O and processing time.

Metadata

Metadata

Type

No type

Projects

Status

🆕 Unprioritized

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions