cargo test -p parquet fails with default ulimit #8406

@alamb

Describe the bug
The default parquet tests fail unless the open-file limit (`ulimit -n`) is raised above the system default.

To Reproduce
Run

cargo test -p parquet --all-features

This results in

...
---- arrow::arrow_writer::tests::statistics_null_counts_only_nulls stdout ----

thread 'arrow::arrow_writer::tests::statistics_null_counts_only_nulls' panicked at parquet/src/arrow/arrow_writer/mod.rs:2325:14:
Unable to get batch: ParquetError("External: Too many open files (os error 24)")


failures:
    arrow::arrow_reader::tests::test_fixed_length_binary_column_reader
    arrow::arrow_reader::tests::test_interval_day_time_column_reader
    arrow::arrow_reader::tests::test_primitive_single_column_reader_test
    arrow::arrow_reader::tests::test_scan_row_with_selection
    arrow::arrow_reader::tests::test_unsigned_primitive_single_column_reader_test
    arrow::arrow_reader::tests::test_utf8_single_column_reader_test
    arrow::arrow_writer::tests::binary_column_bloom_filter
    arrow::arrow_writer::tests::binary_single_column
    arrow::arrow_writer::tests::date32_single_column
    arrow::arrow_writer::tests::duration_microsecond_single_column
    arrow::arrow_writer::tests::duration_millisecond_single_column
    arrow::arrow_writer::tests::duration_nanosecond_single_column
    arrow::arrow_writer::tests::duration_second_single_column
    arrow::arrow_writer::tests::empty_string_null_column_bloom_filter
    arrow::arrow_writer::tests::f32_single_column
    arrow::arrow_writer::tests::f64_single_column
    arrow::arrow_writer::tests::fallback_flush_data_page
    arrow::arrow_writer::tests::fixed_size_binary_single_column
    arrow::arrow_writer::tests::i16_single_column
    arrow::arrow_writer::tests::i32_column_bloom_filter
    arrow::arrow_writer::tests::i32_column_bloom_filter_at_end
    arrow::arrow_writer::tests::i32_single_column
    arrow::arrow_writer::tests::i64_single_column
    arrow::arrow_writer::tests::i8_single_column
    arrow::arrow_writer::tests::interval_day_time_single_column
    arrow::arrow_writer::tests::interval_year_month_single_column
    arrow::arrow_writer::tests::large_binary_single_column
    arrow::arrow_writer::tests::large_string_single_column
    arrow::arrow_writer::tests::list_nested_nulls
    arrow::arrow_writer::tests::list_single_column
    arrow::arrow_writer::tests::statistics_null_counts_only_nulls
    arrow::arrow_writer::tests::string_single_column
    arrow::arrow_writer::tests::string_view_single_column
    arrow::arrow_writer::tests::struct_single_column
    arrow::arrow_writer::tests::test_fixed_size_binary_in_dict
    arrow::arrow_writer::tests::test_list_of_struct_roundtrip
    arrow::arrow_writer::tests::time32_millisecond_single_column
    arrow::arrow_writer::tests::time64_microsecond_single_column
    arrow::arrow_writer::tests::time64_nanosecond_single_column
    arrow::arrow_writer::tests::timestamp_microsecond_single_column
    arrow::arrow_writer::tests::timestamp_nanosecond_single_column
    arrow::arrow_writer::tests::timestamp_second_single_column
    arrow::arrow_writer::tests::u16_single_column
    arrow::arrow_writer::tests::u32_single_column
    arrow::arrow_writer::tests::u64_min_max
    arrow::arrow_writer::tests::u64_single_column

test result: FAILED. 788 passed; 46 failed; 0 ignored; 0 measured; 0 filtered out; finished in 4.34s

Expected behavior
Tests should pass with the default ulimit.

Additional context
You can fix this by increasing the ulimit:

ulimit -n 10000

But I find that annoying to remember, and I occasionally forget.
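
One way to avoid having to remember the workaround is to wrap it in a small shell function. The following is only a sketch, assuming bash: the name `raise_nofile_limit` is hypothetical and not part of the arrow-rs repository, and the value 10000 is simply the one used above. It raises the soft limit only when it is too low, and caps the request at the hard limit so the call does not fail for an unprivileged user.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of arrow-rs): raise the soft open-file
# limit to at least $1 for the current shell and its child processes.
raise_nofile_limit() {
    local target=$1
    local soft hard
    soft=$(ulimit -Sn)   # current soft limit
    hard=$(ulimit -Hn)   # hard limit; the soft limit cannot exceed it

    # Nothing to do if the soft limit is already high enough.
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$target" ]; then
        return 0
    fi

    # Cap the request at the hard limit so the ulimit call does not fail.
    if [ "$hard" != "unlimited" ] && [ "$hard" -lt "$target" ]; then
        target=$hard
    fi

    ulimit -Sn "$target"
}

# Example usage (assumes the function is sourced into the current shell):
#   raise_nofile_limit 10000 && cargo test -p parquet --all-features
```

This only changes the limit for the shell that calls it, so it still has to be run (or sourced from a shell profile) before the tests.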

Labels: bug, parquet (Changes to the parquet crate)