Skip to content

[Regression] Query using ARRAY_AGG(DISTINCT) causes panicย #10486

Closed
@bellwether-softworks

Description

@bellwether-softworks

Describe the bug

Beginning in v37.0.0, a previously-working query is found to result in a panic:

panicked at /Users/username/.cargo/registry/src/index.crates.io-6f17d22bba15001f/datafusion-physical-expr-37.1.0/src/aggregate/array_agg_distinct.rs:158:99:
:
assertion `left == right` failed: state array should only include 1 row!
  left: 4
 right: 1

To Reproduce

The following query is known to trigger the panic, alongside the accompanying parquet payloads contained in the attached .zip file.

SELECT
        asm.floor,
        CONCAT(bi.name, '_', bi.asm_range_lower,ย ' - ', bi.asm_range_upper) AS name_with_range,
        ARRAY_AGG(DISTINCT s.code) AS codes,
        COUNT(DISTINCT asm.id) AS assembly_count,
        COUNT(DISTINCT s.id) AS item_qty
    FROM 'batch_items.parquet' bi
        INNER JOIN 'target_items.parquet' s
            ON s.id = bi.target_id
        INNER JOIN 'assemblies.parquet' asm
            ON s.assembly_id = asm.id
    GROUP BY
        asm.floor_ordinal,
        asm.floor,
        bi.batch_name,
        bi.name,
        bi.asm_range_lower,
        bi.asm_range_upper
    ORDER BY
        asm.floor_ordinal,
        bi.batch_name,
        bi.name,
        bi.asm_range_lower;

failing-query-assets.zip

Expected behavior

No response

Additional context

I've confirmed that the issue does not present in v36.0.0 or earlier, and is present in v37.0.0 up to v38.0.0. The issue doesn't present when omitting ARRAY_AGG(DISTINCT ...) from my queries.

Metadata

Metadata

Assignees

Labels

bugSomething isn't workingregressionSomething that used to work no longer does

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions