feat(s3): Port hive.s3.min-part-size and enforce minimum part size#16935

Open
czentgr wants to merge 1 commit into facebookincubator:main from czentgr:cz_fix_s3_part_sizes

@czentgr czentgr commented Mar 26, 2026

This PR ports the hive.s3.min-part-size option from Presto Java. The option sets the minimum buffer size at which a multipart upload is used, so files smaller than min-part-size are sent in a single PUT request. Previously, even small files went through the multipart upload path. However, AWS requires every part except the last to be at least 5MB.
As a result, some S3 backends such as Apache Ozone ignored uploaded parts smaller than 5MB, which led to errors when completing the multipart upload.
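
To make the threshold behavior concrete, here is a minimal sketch of the decision described above. It is not the code from this PR: `UploadStrategy` and `chooseUploadStrategy` are hypothetical names, and the sketch assumes a payload of exactly min-part-size bytes already goes through multipart upload, since the description only says that files smaller than min-part-size use a single PUT.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only; not identifiers from the PR.
enum class UploadStrategy { kSinglePut, kMultipart };

// Picks the upload path for a payload of totalBytes.
// Payloads smaller than minPartSize go out as one PUT request;
// everything else is assumed to use multipart upload.
UploadStrategy chooseUploadStrategy(uint64_t totalBytes, uint64_t minPartSize) {
  assert(minPartSize > 0); // a zero threshold would be meaningless
  return totalBytes < minPartSize ? UploadStrategy::kSinglePut
                                  : UploadStrategy::kMultipart;
}
```

With a 10MB threshold, a 3MB file would take the single PUT path, so no undersized part is ever sent to the backend.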

The default part size is 10MB, matching the previously hard-coded value. Every part except the last is exactly min-part-size bytes, because some S3 backends require all parts but the last to be the same size. min-part-size is therefore also the effective part size, so this change may affect performance with the default configuration.
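
The fixed-size rule can be illustrated with a short sketch. It is an assumption about the resulting layout, not the PR's implementation, and `partSizes` is a hypothetical helper: it cuts a payload into parts of exactly `partSize` bytes and lets only the last part be smaller.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration; not taken from the PR.
// Returns the size of each part when totalBytes is split into
// equal parts of partSize bytes, with a possibly smaller last part.
std::vector<uint64_t> partSizes(uint64_t totalBytes, uint64_t partSize) {
  assert(partSize > 0);
  std::vector<uint64_t> sizes;
  uint64_t remaining = totalBytes;
  // Emit full-size parts while more than one part's worth remains.
  while (remaining > partSize) {
    sizes.push_back(partSize);
    remaining -= partSize;
  }
  // The final part carries whatever is left (at most partSize bytes).
  if (remaining > 0) {
    sizes.push_back(remaining);
  }
  return sizes;
}
```

For a 25MB payload and a 10MB part size this yields parts of 10MB, 10MB, and 5MB, so every part but the last has the same size.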

netlify bot commented Mar 26, 2026

Deploy Preview for meta-velox canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | f3dcfb6 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/meta-velox/deploys/69c57d8d201d61000842b63b |

@meta-cla meta-cla bot added the CLA Signed label Mar 26, 2026
@czentgr czentgr marked this pull request as ready for review March 26, 2026 18:30
@czentgr czentgr requested a review from majetideepak as a code owner March 26, 2026 18:30
@czentgr czentgr force-pushed the cz_fix_s3_part_sizes branch from 2231e30 to f3dcfb6 Compare March 26, 2026 18:40