Want better error on edge cases like sorting into a single partition #568

@gitosaurus

Description

Feature request

When importing data that is concentrated in a small area of the sky, the import (with default argument values for the highest and lowest pixel orders and the pixel threshold) fails because all the data sorts into a single partition.

But this is only discernible through considerable forensic analysis after the fact. The only errors the user sees are the usual Dask errors about workers restarting.

On one such recent import, once worker memory was raised to 512 GB, this error was seen:

Failed REDUCING stage for shard: 3 561
  worker address: tcp://127.0.0.1:44625
offset overflow while concatenating arrays
Failed REDUCING stage for shard: 3 561
  worker address: tcp://127.0.0.1:44625
offset overflow while concatenating arrays
2025-07-02 14:05:23,562 - distributed.worker - ERROR - Compute Failed
Key:       reduce_pixel_shards-a9a06c2df6892c90e44caaef409e9ab3
State:     executing
Task:  <Task 'reduce_pixel_shards-a9a06c2df6892c90e44caaef409e9ab3' reduce_pixel_shards(, ...)>
Exception: "ArrowInvalid('offset overflow while concatenating arrays')"
Traceback:
  File "/astro/users/awoldag/.conda/envs/hack_cutouts/lib/python3.13/site-packages/hats_import/catalog/map_reduce.py", line 342, in reduce_pixel_shards
    raise exception
  File "/astro/users/awoldag/.conda/envs/hack_cutouts/lib/python3.13/site-packages/hats_import/catalog/map_reduce.py", line 313, in reduce_pixel_shards
    merged_table = merged_table.sort_by(ordering)
  File "pyarrow/table.pxi", line 2158, in pyarrow.lib._Tabular.sort_by
  File "pyarrow/table.pxi", line 2197, in pyarrow.lib._Tabular.take
  File "/astro/users/awoldag/.conda/envs/hack_cutouts/lib/python3.13/site-packages/pyarrow/compute.py", line 504, in take
    return call_function('take', [data, indices], options, memory_pool)
  File "pyarrow/_compute.pyx", line 612, in pyarrow._compute.call_function
  File "pyarrow/_compute.pyx", line 407, in pyarrow._compute.Function.call
  File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status

If the import would produce a partition large enough that PyArrow fails on its own internal limit, we should identify and report this earlier.
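One possible shape for such an early check, sketched below. The "offset overflow" error comes from PyArrow's 32-bit offsets on string/binary columns, which cap any one concatenated column at 2^31 - 1 bytes; a guard could compare an estimated per-partition column size against that limit and raise a descriptive error before the reduce stage ever calls `sort_by`. The function name, its arguments, and the idea of having a byte estimate available are all hypothetical, not existing hats-import API:

```python
# Hypothetical pre-reduce guard; not part of hats-import today.
# Assumes we can estimate, per partition, the total bytes each
# string/binary column would occupy once all shards are concatenated.

ARROW_OFFSET_LIMIT = 2**31 - 1  # max offset for 32-bit (non-"large") Arrow types


def check_partition_fits(estimated_column_bytes: dict, pixel: str) -> None:
    """Raise a clear error instead of letting PyArrow fail with
    'offset overflow while concatenating arrays' deep inside take()."""
    for column, nbytes in estimated_column_bytes.items():
        if nbytes > ARROW_OFFSET_LIMIT:
            raise ValueError(
                f"Partition {pixel} would place {nbytes:,} bytes into "
                f"string/binary column '{column}', exceeding PyArrow's "
                f"32-bit offset limit of {ARROW_OFFSET_LIMIT:,} bytes. "
                "Consider a higher highest pixel order or a lower pixel "
                "threshold so the data splits into more partitions."
            )
```

With a check like this, the failure mode becomes a single actionable message naming the partition and the fix, instead of repeated worker restarts.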

If this isn't feasible, perhaps some kind of "linter" for importing can be written that would inspect a failed output directory for hints as to why it failed (such as everything being in a single partition).
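A minimal sketch of what such a linter could look like. It only demonstrates the single-partition heuristic; the directory layout (`intermediate/**/*.parquet`, one subdirectory per pixel) is an assumption and may not match hats-import's actual on-disk structure:

```python
# Hypothetical post-mortem linter for a failed import output directory.
# Assumes intermediate shard files live under 'intermediate/', grouped
# into one subdirectory per destination pixel; adjust to the real layout.
from pathlib import Path


def lint_failed_import(output_dir: str) -> list:
    """Return human-readable hints about why an import may have failed."""
    hints = []
    shards = list(Path(output_dir).glob("intermediate/**/*.parquet"))
    pixels = {shard.parent for shard in shards}
    if shards and len(pixels) == 1:
        hints.append(
            f"All {len(shards)} shard files map to a single pixel "
            f"({pixels.pop().name}); the catalog likely covers a small "
            "area of the sky. Try a higher highest pixel order or a "
            "lower pixel threshold."
        )
    return hints
```

Run against a failed output directory, this would surface the single-partition condition directly rather than leaving the user to infer it from Dask worker logs.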

Before submitting
Please check the following:

  • I have described the purpose of the suggested change, specifying what I need the enhancement to accomplish, i.e. what problem it solves.
  • I have included any relevant links, screenshots, environment information, and data relevant to implementing the requested feature, as well as pseudocode for how I want to access the new functionality.
  • If I have ideas for how the new feature could be implemented, I have provided explanations and/or pseudocode and/or task lists for the steps.

Metadata
Assignees

No one assigned

    Labels

    blocked (Blocked by other issues)
    enhancement (New feature or request)


    Milestone

    No milestone


    Development

    No branches or pull requests