[bug?] cannot run integration test #1162
I've been facing the same issue on my Mac (Python 3.11) since yesterday, when I rebased the latest main branch onto my dev branch. I just realized that the tests are not running on the main branch itself. Error message:
Thanks for confirming. I see the same issue: https://gist.github.com/kevinjqliu/c8310b6253beab52cce93391df03bfe4. It only happens for commits at and after a particular commit; the commit right before it is fine. The CI integration tests are fine, and @sungwy confirmed that running the integration tests via codespace also works.
Full pytest report here: https://gist.github.com/kevinjqliu/a0e8e2199bd8064757eb2b40409e0794 Here's the breakdown of the errors:
One realization is that the manifest cache is implemented as a global cache (see iceberg-python/pyiceberg/table/snapshots.py, lines 234 to 238, at de47590), and entries are kept in memory until eviction. Why does this only affect M1 Macs? I have no idea.
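For illustration, a module-level lru_cache'd manifest loader along these lines would look roughly like the sketch below (the exact body at that commit may differ; FileIO, ManifestFile, and read_manifest_list are the pyiceberg types and helpers being referenced):

```python
from functools import lru_cache
from typing import List

from pyiceberg.io import FileIO
from pyiceberg.manifest import ManifestFile, read_manifest_list


@lru_cache
def _manifests(io: FileIO, manifest_list: str) -> List[ManifestFile]:
    # Module-level (global) cache: entries are keyed on (io, manifest_list)
    # and stay in memory for the lifetime of the process until evicted.
    return list(read_manifest_list(io.new_input(manifest_list)))
```

Because the cache lives at module scope, it is shared across every table and snapshot in the process.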
Hmmm, I don't think ManifestFile has an IO component. The IO is only used as an input parameter to one of the class methods (see iceberg-python/pyiceberg/manifest.py, lines 555 to 620, at de47590).
I should be more specific: reading (and caching) a ManifestFile does go through the FileIO layer (see iceberg-python/pyiceberg/io/pyarrow.py, lines 258 to 318, at de47590). I think this was part of the issue, since I had to increase the system file descriptor limit.
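If the file-descriptor limit is the suspect, one way to inspect and bump the soft limit for the current Python process is shown below (macOS/Linux only; the 10240 target is just an illustrative value):

```python
import resource

# Inspect the current open-file limits for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limit: soft={soft}, hard={hard}")

# Raise the soft limit (the OS may still cap the effective value, and the
# hard limit itself can only be raised with elevated privileges).
target = 10240 if hard == resource.RLIM_INFINITY else min(hard, 10240)
resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
```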
That's a very interesting observation 👀 I'm so curious to understand what's happening. If that's in fact what's happening, I think your proposed solution sounds promising... I'm still trying to understand this issue thoroughly. In the lru-cached code path (see iceberg-python/pyiceberg/manifest.py, lines 623 to 639, at de47590)...
I took a step back and realized the fundamental issue was the newly introduced cache. Without the cache, everything works fine. Going a layer deeper, this probably means the bug only affects cache hits, as cache misses will just recompute. Fundamentally, there are a couple of issues with the function definition.
First, the cache key is both io and manifest_list, whereas we just want the key to be manifest_list. Here's an example to showcase the different cache keys (see the sketch below):
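A minimal reconstruction of that example, assuming a hypothetical FakeIO as a stand-in for the real FileIO; only the keying behavior of functools.lru_cache matters here:

```python
from functools import lru_cache


class FakeIO:
    """Hypothetical stand-in for a pyiceberg FileIO instance."""


@lru_cache
def _manifests(io, manifest_list):
    print(f"cache miss for {manifest_list}")
    return [f"manifests of {manifest_list}"]


io_a, io_b = FakeIO(), FakeIO()
path = "s3://warehouse/metadata/snap-123.avro"

_manifests(io_a, path)  # miss: first time this (io, path) pair is seen
_manifests(io_a, path)  # hit:  same io object, same path
_manifests(io_b, path)  # miss: a different io object, even though the path is identical

print(_manifests.cache_info())  # CacheInfo(hits=1, misses=2, maxsize=128, currsize=2)
```

Because the io instance participates in the key, two equivalent FileIO objects never share a cache entry, and the cache grows with every new FileIO that gets created.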
Without digging into where it is breaking or why it only affects M1 Macs, there are two potential solutions:
Hi, I'm having the same issue. Is there a workaround I can use as a client? Can you do a new point release with the fix? Thanks!
@antonioalegria This is currently fixed in the main branch, which we use to run integration tests. I see your comment here as well: #1187 (comment)
I'm hitting this exact same issue as described in this comment (#1162 (comment)), when running polars on an iceberg table I have in S3. |
@antonioalegria thanks for the context. Does the current main branch work for you? I'll see if we can do a hotfix release.
Thanks!
Hi folks, I'm going through a career transition, so my apologies for arriving at this discussion a little late. @antonioalegria @kevinjqliu I don't think the bugged manifest caching feature was ever released. Here's the version of the manifest implementation that was released in 0.7.1 (see iceberg-python/pyiceberg/table/snapshots.py, lines 233 to 256, at f92994e):
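Roughly, the released shape is an uncached, per-call reader on the snapshot rather than a global cache; a simplified sketch (the actual 0.7.1 code is a pydantic model and may differ in detail):

```python
from typing import List, Optional

from pyiceberg.io import FileIO
from pyiceberg.manifest import ManifestFile, read_manifest_list


class Snapshot:
    """Simplified sketch showing only the manifest-reading method."""

    manifest_list: Optional[str] = None

    def manifests(self, io: FileIO) -> List[ManifestFile]:
        # No module-level cache: every call re-reads the manifest list file.
        if self.manifest_list is not None:
            return list(read_manifest_list(io.new_input(self.manifest_list)))
        return []
```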
Are we sure we are seeing the same exact error? Would you mind sharing the exception trace, @antonioalegria?
Well, maybe the bug is different, but I'm getting it when loading a metadata file via scan_iceberg in Polars. It only happens with certain objects:
See related issue: apache/arrow#40539 |
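For reference, the Polars entry point being described looks roughly like this (the table location is hypothetical; pl.scan_iceberg reads the Iceberg metadata lazily and only materializes data on collect):

```python
import polars as pl

# Hypothetical metadata file on S3; scanning is lazy until .collect() is called.
lf = pl.scan_iceberg("s3://my-bucket/my_table/metadata/v3.metadata.json")
df = lf.collect()
```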
@antonioalegria I think that's a different issue versus the original issue here.
Apache Iceberg version
main (development)
Please describe the bug 🐞
On Mac, is anyone having issues running make test-integration on the current main branch? I'm having issues, but I'm not sure if it's due to my local env.