Status

General

  • SQL Parser
  • SQL Query Planner
  • Query Optimizer
  • Constant folding
  • Join Reordering
  • Limit Pushdown
  • Projection pushdown
  • Predicate pushdown
  • Type coercion
  • Parallel query execution
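
To make one of the optimizer entries above concrete, here is a minimal sketch of constant folding on a toy expression tree. The `Expr` enum and `fold_constants` function are hypothetical names invented for this illustration; they are not DataFusion's types or its optimizer rule, only a small model of the idea that sub-expressions made entirely of literals can be evaluated once at plan time.

```rust
/// Toy expression tree (hypothetical; not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(i64),
    Column(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Recursively replace any sub-expression whose operands are both
/// literals with a single literal. Column references are left as-is.
/// Overflow simply wraps in this toy; a real engine would define the
/// overflow behavior explicitly.
fn fold_constants(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => match (fold_constants(*l), fold_constants(*r)) {
            (Expr::Literal(a), Expr::Literal(b)) => Expr::Literal(a.wrapping_add(b)),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Expr::Mul(l, r) => match (fold_constants(*l), fold_constants(*r)) {
            (Expr::Literal(a), Expr::Literal(b)) => Expr::Literal(a.wrapping_mul(b)),
            (l, r) => Expr::Mul(Box::new(l), Box::new(r)),
        },
        // Literals and columns have nothing to fold.
        other => other,
    }
}
```

With this sketch, the tree for `x + 2 * 3` folds to the tree for `x + 6`: the multiplication has two literal operands and is evaluated, while the addition still depends on column `x` and is kept.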

SQL Support

  • Projection (SELECT)
  • Filter (WHERE)
  • Filter post-aggregate (HAVING)
  • Sorting (ORDER BY)
  • Limit (LIMIT)
  • Aggregate (GROUP BY)
  • cast / try_cast
  • VALUES lists
  • String Functions
  • Conditional Functions
  • Time and Date Functions
  • Math Functions
  • Aggregate Functions (SUM, MEDIAN, and many more)
  • Schema Queries
  • Support for nested types (ARRAY/LIST and STRUCT. See #2326 for details)
  • Subqueries
  • Common Table Expressions (CTE)
  • Set Operations (UNION [ALL], INTERSECT [ALL], EXCEPT [ALL])
  • Joins (INNER, LEFT, RIGHT, FULL, CROSS)
  • Window Functions
    • Empty (OVER())
    • Partitioning and ordering: (OVER(PARTITION BY <..> ORDER BY <..>))
    • Custom Window (OVER(ORDER BY time ROWS BETWEEN 2 PRECEDING AND 0 FOLLOWING))
    • User Defined Window and Aggregate Functions
  • Catalogs
    • Schemas (CREATE / DROP SCHEMA)
    • Tables (CREATE / DROP TABLE, CREATE TABLE AS SELECT)
  • Data Insert
    • INSERT INTO
    • COPY .. INTO ..
      • CSV
      • JSON
      • Parquet
      • Avro
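
The frame clause in the Custom Window entry, ROWS BETWEEN 2 PRECEDING AND 0 FOLLOWING, covers the current row plus up to two rows before it (0 FOLLOWING ends the frame at the current row). The sketch below models that frame for a SUM aggregate in plain Rust. The function name `rows_frame_sum` is hypothetical, and the code only illustrates the standard SQL semantics of a ROWS frame; it says nothing about how DataFusion evaluates windows internally.

```rust
/// For each row, sum the values inside the frame
/// `ROWS BETWEEN preceding PRECEDING AND following FOLLOWING`.
/// `values` must already be in the window's ORDER BY order.
/// Frames are clipped at the start and end of the partition.
fn rows_frame_sum(values: &[i64], preceding: usize, following: usize) -> Vec<i64> {
    (0..values.len())
        .map(|i| {
            // First row of the frame, clipped at index 0.
            let start = i.saturating_sub(preceding);
            // Last row of the frame, clipped at the final index.
            let end = (i + following).min(values.len() - 1);
            values[start..=end].iter().sum()
        })
        .collect()
}
```

For example, `rows_frame_sum(&[1, 2, 3, 4, 5], 2, 0)` yields `[1, 3, 6, 9, 12]`: the first two rows have fewer than two predecessors, so their frames are shorter.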

Runtime

  • Streaming Grouping
  • Streaming Window Evaluation
  • Memory limits enforced
  • Spilling (to disk) Sort
  • Spilling (to disk) Grouping
  • Spilling (to disk) Joins

Data Sources

In addition to allowing arbitrary data sources via the TableProvider trait, DataFusion includes built-in support for the following formats:

  • CSV
  • Parquet (for all primitive and nested types)
  • JSON
  • Avro
  • Arrow