[native] Error querying a Parquet file with array columns using native C++ workers (works in Java) #27433

@psantos-denodo

Description

I have a sample table with array columns that fails when I query it in Presto 0.296 with native C++ workers.
The same query works in Presto Java.

Your Environment

  • Presto version used: 0.296
  • Storage (HDFS/S3/GCS..): S3
  • Data source and connector used: hive
  • Deployment (Cloud or On-prem): On-prem
  • presto_error_list_sample.log

Expected Behavior

The query should succeed with C++ workers, as it does in Presto Java.

Current Behavior

The query fails with the following error (full stack trace attached):

Query failed (#20260325_123821_00329_8vu2a):  Operator::getOutput failed for [operator: TableScan, plan node ID: 0]: vector::_M_range_check: __n (which is 3) >= this->size() (which is 3)
	at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCResultSetImpl.nextRow(JDBCResultSetImpl.java:193)
(...)
Caused by: java.sql.SQLException: Query failed (#20260325_123821_00329_8vu2a):  Operator::getOutput failed for [operator: TableScan, plan node ID: 0]: vector::_M_range_check: __n (which is 3) >= this->size() (which is 3)
	at com.facebook.presto.jdbc.PrestoResultSet.resultsException(PrestoResultSet.java:1841)
(...)
Caused by: VeloxRuntimeError:  Operator::getOutput failed for [operator: TableScan, plan node ID: 0]: vector::_M_range_check: __n (which is 3) >= this->size() (which is 3)
	at Unknown.# 0  _ZN8facebook5velox7process10StackTraceC1Ei(Unknown Source)
	at Unknown.# 1  _ZN8facebook5velox14VeloxExceptionC1EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bNS1_4TypeES7_(Unknown Source)
	at Unknown.# 2  _ZN8facebook5velox6detail14veloxCheckFailINS0_17VeloxRuntimeErrorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRKNS1_18VeloxCheckFailArgsET0_(Unknown Source)

Possible Solution

This is not a solution, but as additional information: if I create a new table from scratch and insert what look like equivalent values, the query works:

INSERT INTO hive.default.test_list (int64_list, utf8_list) VALUES
    (ARRAY[1, 2, 3], ARRAY['abc', 'efg', 'hij']),
    (ARRAY[NULL, 1], NULL),
    (ARRAY[4], ARRAY['efg', NULL, 'hij', 'xyz']);

Steps to Reproduce

  1. Store the parquet file inside the list_columns.zip in whatever storage you prefer
  2. Create a table using the following definition, adapting the location to the location containing the parquet file:
     ```
     CREATE TABLE hive.default.test_list (
         "int64_list" array(bigint),
         "utf8_list" array(varchar)
     )
     WITH (
         external_location = 's3a://acme/sampletable',
         format = 'PARQUET'
     );
     ```
  3. Execute select * from hive.default.test_list
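
To narrow down which column triggers the failure, each array column could be queried on its own. This is only a suggested diagnostic sketch and was not run as part of this report. It uses the table and column names from the definition in step 2 and Presto's `cardinality` function, and it assumes the failure can be reproduced per column.

```sql
-- Suggested diagnostic only (assumption: not part of the original report).
-- Read each array column separately to see which one fails in the TableScan.
SELECT int64_list FROM hive.default.test_list;
SELECT utf8_list FROM hive.default.test_list;

-- Compare per-row array lengths between the Java and C++ workers.
SELECT cardinality(int64_list) AS int64_len,
       cardinality(utf8_list) AS utf8_len
FROM hive.default.test_list;
```

If only one of the single-column queries fails on the C++ workers, that would point to the reader path for that element type rather than to array handling in general.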


