macvincent
Contributor

Differential Revision: D81516697

@meta-cla meta-cla bot added the CLA Signed label Sep 3, 2025
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D81516697

macvincent added a commit to macvincent/nimble that referenced this pull request Sep 3, 2025
Summary: Pull Request resolved: facebookincubator#240

Differential Revision: D81516697

macvincent added a commit to macvincent/nimble that referenced this pull request Sep 4, 2025

macvincent added a commit to macvincent/nimble that referenced this pull request Sep 5, 2025
Summary:

This is an implementation of the new chunking policy described in this [doc](https://fburl.com/gdoc/gkdwwju1). It works in two phases:

**Phase 1 - Memory Pressure Management:**

The policy monitors total in-memory data size:

*  When memory usage exceeds the maximum threshold, initiates chunking to reduce memory footprint while continuing data ingestion
*   When previous chunking attempts succeeded and memory remains above the minimum threshold, continues chunking to further reduce memory usage
*   When chunking fails to reduce memory usage effectively and memory stays above the minimum threshold, forces a full stripe flush to guarantee memory relief
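
The three rules above can be read as a small decision function. The following is a minimal sketch under stated assumptions, not the code in this PR: `MemoryAction`, `LastChunkResult`, `MemoryThresholds`, and `decideMemoryAction` are hypothetical names introduced for illustration, and the result of the previous chunking attempt is assumed to be reported by the caller.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration; these are not Nimble's actual API.
enum class MemoryAction { kNone, kChunk, kFlushStripe };
enum class LastChunkResult { kNotAttempted, kReducedMemory, kIneffective };

struct MemoryThresholds {
  uint64_t minBytes; // chunking stops once usage is at or below this
  uint64_t maxBytes; // chunking starts once usage exceeds this
};

inline MemoryAction decideMemoryAction(
    const MemoryThresholds& thresholds,
    uint64_t inMemoryBytes,
    LastChunkResult lastChunk) {
  const bool aboveMin = inMemoryBytes > thresholds.minBytes;
  // Chunking was tried but did not help: fall back to a full stripe flush.
  if (lastChunk == LastChunkResult::kIneffective && aboveMin) {
    return MemoryAction::kFlushStripe;
  }
  // Chunking is working: keep going until usage drops to the minimum.
  if (lastChunk == LastChunkResult::kReducedMemory && aboveMin) {
    return MemoryAction::kChunk;
  }
  // Otherwise, start chunking only when usage exceeds the maximum.
  if (inMemoryBytes > thresholds.maxBytes) {
    return MemoryAction::kChunk;
  }
  return MemoryAction::kNone;
}

// Small self-check that walks through one pressure episode.
inline void memoryActionExamples() {
  const MemoryThresholds t{100, 200};
  assert(decideMemoryAction(t, 250, LastChunkResult::kNotAttempted) ==
         MemoryAction::kChunk);
  assert(decideMemoryAction(t, 150, LastChunkResult::kReducedMemory) ==
         MemoryAction::kChunk);
  assert(decideMemoryAction(t, 150, LastChunkResult::kIneffective) ==
         MemoryAction::kFlushStripe);
  assert(decideMemoryAction(t, 90, LastChunkResult::kReducedMemory) ==
         MemoryAction::kNone);
}
```

The gap between the two thresholds gives hysteresis: chunking starts only above the maximum but, once started, continues until usage falls to the minimum, so the policy does not toggle on every small fluctuation.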

**Phase 2 - Storage Size Optimization:**

This phase runs only when there is no memory pressure. It implements compression-aware stripe size prediction:

*  Uses a configurable compression factor as baseline, enhanced with historical compression ratios from previously encoded data when available
*   Calculates the anticipated final compressed stripe size by applying the estimated compression ratio to unencoded data
*   Triggers stripe flush when the predicted compressed size reaches the target stripe size threshold
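
The prediction described above can be sketched as follows. This is an illustration under assumptions, not the implementation in this PR: all names are hypothetical, the ratio is defined as encoded bytes divided by raw bytes, the historical ratio is assumed to replace the configured factor once any history exists, and bytes that are already encoded are assumed to count toward the stripe at their actual size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bookkeeping of data encoded so far (not Nimble's actual API).
struct CompressionHistory {
  uint64_t rawBytes;     // raw size of data that has already been encoded
  uint64_t encodedBytes; // size of that data after encoding
};

// Ratio = encoded / raw. Falls back to the configured factor when there is
// no history yet (assumed behavior).
inline double estimateRatio(
    double configuredFactor, const CompressionHistory& history) {
  if (history.rawBytes == 0) {
    return configuredFactor;
  }
  return static_cast<double>(history.encodedBytes) /
      static_cast<double>(history.rawBytes);
}

// Predicted stripe size: already-encoded bytes plus the estimated encoded
// size of the data that is still unencoded.
inline uint64_t predictStripeBytes(
    uint64_t alreadyEncodedBytes, uint64_t unencodedRawBytes, double ratio) {
  return alreadyEncodedBytes +
      static_cast<uint64_t>(static_cast<double>(unencodedRawBytes) * ratio);
}

inline bool reachesTargetStripeSize(
    uint64_t predictedBytes, uint64_t targetStripeBytes) {
  return predictedBytes >= targetStripeBytes;
}

// Small self-check with round numbers.
inline void stripePredictionExamples() {
  const CompressionHistory none{0, 0};
  const CompressionHistory seen{1000, 250};
  assert(estimateRatio(0.5, none) == 0.5);
  assert(estimateRatio(0.5, seen) == 0.25);
  assert(predictStripeBytes(100, 400, 0.25) == 200);
  assert(reachesTargetStripeSize(200, 200));
}
```

With 400 raw bytes still buffered, an observed ratio of 0.25, and 100 bytes already encoded, the predicted stripe is 200 bytes, so a 200-byte target would trigger a flush in this sketch.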

Differential Revision: D81516697

macvincent added a commit to macvincent/nimble that referenced this pull request Sep 10, 2025

macvincent added a commit to macvincent/nimble that referenced this pull request Sep 10, 2025

macvincent added a commit to macvincent/nimble that referenced this pull request Sep 10, 2025
macvincent added a commit to macvincent/nimble that referenced this pull request Sep 10, 2025
Summary:

This is an implementation of the new chunking policy described in this [doc](https://fburl.com/gdoc/gkdwwju1). It works in two phases:

**Phase 1 - Memory Pressure Management (shouldChunk)**

The policy monitors total in-memory data size:

*  When memory usage exceeds the maximum threshold, initiates chunking to reduce memory footprint while continuing data ingestion
*  When previous chunking attempts succeeded and memory remains above the minimum threshold, continues chunking to further reduce memory usage
*   When chunking fails to reduce memory usage effectively and memory stays above the minimum threshold, forces a full stripe flush to guarantee memory relief

**Phase 2 - Storage Size Optimization (shouldFlush)**

This phase runs only when there is no memory pressure. It implements compression-aware stripe size prediction:

*  Uses a configurable compression factor as baseline, enhanced with historical compression ratios from previously encoded data when available
*   Calculates the anticipated final compressed stripe size by applying the estimated compression ratio to unencoded data
*   Triggers stripe flush when the predicted compressed size reaches the target stripe size threshold

Differential Revision: D81516697

macvincent added a commit to macvincent/nimble that referenced this pull request Oct 9, 2025
…incubator#240)

Summary:
X-link: facebookincubator/velox#14846


This is an implementation of the new chunking policy described in this [doc](https://fburl.com/gdoc/gkdwwju1). It has two phases:

**Phase 1 - Memory Pressure Management (shouldChunk)**
The policy monitors total in-memory data size:
*  When memory usage exceeds the maximum threshold, initiates chunking to reduce memory footprint while continuing data ingestion
*  When previous chunking attempts succeeded and memory remains above the minimum threshold, continues chunking to further reduce memory usage

**Phase 2 - Storage Size Optimization (shouldFlush)**
Implements compression-aware stripe size prediction:
*   When chunking fails to reduce memory usage effectively and memory stays above the minimum threshold, forces a full stripe flush to guarantee memory relief
*   Calculates the anticipated final compressed stripe size by applying the estimated compression ratio to unencoded data
*   Triggers stripe flush when the predicted compressed size reaches the target stripe size threshold

`shouldChunk` is also now a separate method required by all flush policies. We updated all previous tests and code references.
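
As a rough illustration of what a separate, required `shouldChunk` means for existing policies, the sketch below uses a hypothetical interface. `FlushPolicySketch`, `Progress`, and `StripeSizeOnlyPolicy` are made-up names and the signatures are assumptions; the real interface in the repository may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical progress snapshot passed to the policy.
struct Progress {
  uint64_t inMemoryBytes;
  uint64_t predictedStripeBytes;
};

// Hypothetical interface: both decisions are pure virtual, so every policy
// must answer both questions explicitly.
class FlushPolicySketch {
 public:
  virtual ~FlushPolicySketch() = default;
  virtual bool shouldChunk(const Progress& progress) = 0;
  virtual bool shouldFlush(const Progress& progress) = 0;
};

// A policy that only cares about stripe size still has to implement
// shouldChunk, here by always declining to chunk.
class StripeSizeOnlyPolicy : public FlushPolicySketch {
 public:
  explicit StripeSizeOnlyPolicy(uint64_t targetStripeBytes)
      : targetStripeBytes_(targetStripeBytes) {}

  bool shouldChunk(const Progress& /*progress*/) override {
    return false;
  }

  bool shouldFlush(const Progress& progress) override {
    return progress.predictedStripeBytes >= targetStripeBytes_;
  }

 private:
  uint64_t targetStripeBytes_;
};

// Small self-check.
inline void flushPolicyExamples() {
  StripeSizeOnlyPolicy policy(100);
  const Progress small{0, 99};
  const Progress full{0, 100};
  assert(!policy.shouldChunk(full));
  assert(!policy.shouldFlush(small));
  assert(policy.shouldFlush(full));
}
```

Making both methods pure virtual is one way to force every existing policy and test to be updated, which matches the note that all previous tests and code references were touched.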

NOTE: The Velox repo change here is just a test copied into an experimental directory that references the flush policy.

Differential Revision: D81516697
macvincent added a commit to macvincent/nimble that referenced this pull request Oct 9, 2025
macvincent added a commit to macvincent/nimble that referenced this pull request Oct 9, 2025
macvincent added a commit to macvincent/nimble that referenced this pull request Oct 9, 2025
macvincent added a commit to macvincent/nimble that referenced this pull request Oct 13, 2025
macvincent added a commit to macvincent/nimble that referenced this pull request Oct 13, 2025
macvincent added a commit to macvincent/velox that referenced this pull request Oct 13, 2025
…incubator#14846)

Summary:

X-link: facebookincubator/nimble#240

Differential Revision: D81516697
macvincent added a commit to macvincent/nimble that referenced this pull request Oct 13, 2025
…incubator#240)

Summary:
X-link: facebookincubator/velox#14846

Pull Request resolved: facebookincubator#240

Differential Revision: D81516697
macvincent added a commit to macvincent/velox that referenced this pull request Oct 13, 2025
…incubator#14846)

Summary:
X-link: facebookexternal/presto-facebook#3412


X-link: facebookincubator/nimble#240

Differential Revision: D81516697
macvincent added a commit to macvincent/velox that referenced this pull request Oct 13, 2025
@macvincent macvincent force-pushed the export-D81516697 branch 2 times, most recently from 9475859 to ee50dfd Compare October 14, 2025 00:58
macvincent added a commit to macvincent/nimble that referenced this pull request Oct 14, 2025
…incubator#240)

Summary:
X-link: facebookexternal/presto-facebook#3412

X-link: facebookincubator/velox#14846


Differential Revision: D81516697
macvincent added a commit to macvincent/nimble that referenced this pull request Oct 14, 2025
@macvincent macvincent force-pushed the export-D81516697 branch 2 times, most recently from d3aebf3 to f66dc50 Compare October 14, 2025 07:13
macvincent added a commit to macvincent/velox that referenced this pull request Oct 14, 2025
macvincent added a commit to macvincent/nimble that referenced this pull request Oct 14, 2025
macvincent added a commit to macvincent/nimble that referenced this pull request Oct 14, 2025
…incubator#240)

Summary:
X-link: https://github.com/facebookexternal/presto-facebook/pull/3412

X-link: facebookincubator/velox#14846

Pull Request resolved: facebookincubator#240

Differential Revision: D81516697
macvincent added a commit to macvincent/nimble that referenced this pull request Oct 14, 2025
macvincent added a commit to macvincent/nimble that referenced this pull request Oct 16, 2025
…incubator#240)

Summary:
X-link: facebookexternal/presto-facebook#3412

X-link: facebookincubator/velox#14846


This is an implementation of the new chunking policy described in this [doc](https://fburl.com/gdoc/gkdwwju1). It has two phases:

**Phase 1 - Memory Pressure Management (shouldChunk)**
The policy monitors total in-memory data size:
*  When memory usage exceeds the maximum threshold, initiates chunking to reduce memory footprint while continuing data ingestion
*  While memory remains above the minimum threshold, continues chunking to further reduce memory usage
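
The two rules above describe a hysteresis band between the minimum and maximum thresholds. A minimal sketch of that behavior follows. The class name `ChunkingHysteresis`, its members, and the choice to stop exactly at the minimum threshold are assumptions made for illustration; this is not the code in this PR or Nimble's actual interface.

```cpp
#include <cstdint>

// Illustrative only: names are hypothetical, not Nimble's API.
class ChunkingHysteresis {
 public:
  ChunkingHysteresis(std::uint64_t minBytes, std::uint64_t maxBytes)
      : minBytes_(minBytes), maxBytes_(maxBytes) {}

  // Returns true when the writer should chunk, given the current
  // total in-memory data size in bytes.
  bool shouldChunk(std::uint64_t inMemoryBytes) {
    if (inMemoryBytes > maxBytes_) {
      chunking_ = true;  // exceeding the maximum starts chunking
    } else if (inMemoryBytes <= minBytes_) {
      chunking_ = false;  // at or below the minimum, chunking stops
    }
    // Between the two thresholds the previous decision is kept.
    return chunking_;
  }

 private:
  std::uint64_t minBytes_;
  std::uint64_t maxBytes_;
  bool chunking_ = false;
};
```

With thresholds of 100 and 200 bytes, a reading of 150 does not trigger chunking on its own, but it keeps chunking going once a reading above 200 has started it.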

**Phase 2 - Storage Size Optimization (shouldFlush)**
 Implements compression-aware stripe size prediction:
*   When chunking fails to reduce memory usage effectively and memory stays above the maximum threshold, forces a full stripe flush to guarantee memory relief
*   Calculates the anticipated final compressed stripe size by applying the estimated compression ratio to unencoded data
*   Triggers stripe flush when the predicted compressed size reaches the target stripe size threshold
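
The flush decision described above can be sketched as follows. All struct, field, and function names are hypothetical. The sketch also assumes that the predicted stripe size is the already-encoded bytes plus the unencoded bytes scaled by a compressed-to-raw ratio; the summary only states that the ratio is applied to unencoded data, so the real computation may differ.

```cpp
#include <cstdint>

// Illustrative only: names are hypothetical, not Nimble's API.
struct StripeProgress {
  std::uint64_t encodedBytes;    // data already chunked and encoded
  std::uint64_t unencodedBytes;  // raw data still buffered
  std::uint64_t inMemoryBytes;   // total in-memory data size
  bool lastChunkingFailed;       // previous chunking did not free enough memory
};

struct FlushConfig {
  std::uint64_t maxMemoryBytes;     // maximum memory threshold
  std::uint64_t targetStripeBytes;  // target compressed stripe size
  double estimatedCompressionRatio; // compressed size / raw size (assumed)
};

// Predicts the final compressed stripe size (assumed formula, see above).
inline std::uint64_t predictStripeBytes(
    const StripeProgress& p, const FlushConfig& c) {
  const auto predictedRaw = static_cast<std::uint64_t>(
      static_cast<double>(p.unencodedBytes) * c.estimatedCompressionRatio);
  return p.encodedBytes + predictedRaw;
}

inline bool shouldFlushSketch(const StripeProgress& p, const FlushConfig& c) {
  // Forced flush: chunking failed and memory is still above the maximum.
  if (p.lastChunkingFailed && p.inMemoryBytes > c.maxMemoryBytes) {
    return true;
  }
  // Size-based flush: predicted compressed stripe reached the target.
  return predictStripeBytes(p, c) >= c.targetStripeBytes;
}
```

Checking the memory condition first means a stripe can be flushed below its target size when chunking alone cannot relieve memory pressure.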

`shouldChunk` is also now a separate method required by all flush policies. We updated all previous tests and code references.

Reviewed By: helfman

Differential Revision: D81516697
macvincent added a commit to macvincent/velox that referenced this pull request Oct 16, 2025
…incubator#14846)

Summary:
X-link: facebookexternal/presto-facebook#3412


X-link: facebookincubator/nimble#240

Reviewed By: helfman

Differential Revision: D81516697
Summary:

As preparation for our [Nimble chunked encoding](https://fburl.com/gdoc/zjck7lo6) work, we cleaned up the previous contract by removing unused methods and attributes. This should be a no-op, since these methods and attributes were not used. We also clarified the naming of some attributes.

Reviewed By: sdruzkin, helfman

Differential Revision: D81514657