New Flush Policy Implementation With Chunking (#240)
Summary:
This is an implementation of the new chunking policy described in this [doc](https://fburl.com/gdoc/gkdwwju1). It works in two phases:
**Phase 1 - Memory Pressure Management:**
The policy monitors total in-memory data size:
* When memory usage exceeds the maximum threshold, it initiates chunking to reduce the memory footprint while data ingestion continues
* When the previous chunking attempt succeeded and memory remains above the minimum threshold, it continues chunking to reduce memory usage further
* When chunking fails to reduce memory usage effectively and memory stays above the minimum threshold, it forces a full stripe flush to guarantee memory relief
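The three rules above can be read as a small decision function. The following is a minimal sketch in Python, not the code from this change: the names `FlushDecision` and `memory_pressure_decision`, the three-state `last_chunk_succeeded` flag, and the order in which the rules are checked are all assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class FlushDecision(Enum):
    """Hypothetical outcomes of the memory-pressure phase."""
    NONE = "none"      # no memory pressure; fall through to Phase 2
    CHUNK = "chunk"    # flush a chunk but keep the stripe open
    STRIPE = "stripe"  # flush the whole stripe


def memory_pressure_decision(
    memory_bytes: int,
    min_threshold: int,
    max_threshold: int,
    last_chunk_succeeded: Optional[bool],
) -> FlushDecision:
    """Apply the Phase 1 rules.

    last_chunk_succeeded is None when no chunking has been attempted yet,
    True when the previous attempt reduced memory, False when it did not.
    The precedence below (failed chunking checked first) is an assumption.
    """
    above_min = memory_bytes > min_threshold
    # Chunking was tried, did not help, and memory is still above the minimum:
    # force a full stripe flush.
    if last_chunk_succeeded is False and above_min:
        return FlushDecision.STRIPE
    # Chunking helped but memory is still above the minimum: keep chunking.
    if last_chunk_succeeded is True and above_min:
        return FlushDecision.CHUNK
    # No chunking in progress and memory crossed the maximum: start chunking.
    if memory_bytes > max_threshold:
        return FlushDecision.CHUNK
    return FlushDecision.NONE
```

Under these assumptions, with a minimum of 50 and a maximum of 100, a writer holding 80 units of data does nothing until it first crosses 100, then keeps chunking while each attempt succeeds, and only falls back to a stripe flush once an attempt fails with memory still above 50.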
**Phase 2 - Storage Size Optimization:**
This phase runs only when there is no memory pressure. It implements compression-aware stripe size prediction:
* Uses a configurable compression factor as baseline, enhanced with historical compression ratios from previously encoded data when available
* Calculates the anticipated final compressed stripe size by applying the estimated compression ratio to unencoded data
* Triggers stripe flush when the predicted compressed size reaches the target stripe size threshold
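The prediction steps above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the code from this change: the function names are hypothetical, the historical ratio is assumed to simply replace the configured factor once encoded data exists, and the predicted size is assumed to be the already-encoded bytes plus the ratio applied to the unencoded bytes.

```python
def estimated_compression_ratio(
    default_factor: float,
    encoded_raw_bytes: int,
    encoded_compressed_bytes: int,
) -> float:
    """Return compressed/raw ratio, preferring history over the configured factor.

    Assumption: when previously encoded data is available, its observed ratio
    replaces the configured baseline outright. The real policy may blend them.
    """
    if encoded_raw_bytes > 0:
        return encoded_compressed_bytes / encoded_raw_bytes
    return default_factor


def predicted_stripe_size(
    unencoded_bytes: int,
    encoded_compressed_bytes: int,
    ratio: float,
) -> float:
    """Predict the final compressed stripe size.

    Assumption: data that is already encoded counts at its actual compressed
    size, and the estimated ratio is applied only to the unencoded remainder.
    """
    return encoded_compressed_bytes + unencoded_bytes * ratio


def should_flush_stripe(predicted_bytes: float, target_stripe_bytes: int) -> bool:
    """Flush once the predicted compressed size reaches the target stripe size."""
    return predicted_bytes >= target_stripe_bytes
```

For example, if 1000 raw bytes have already been encoded down to 250, the sketch estimates a ratio of 0.25; with 400 unencoded bytes still buffered, the predicted stripe size is 250 + 400 × 0.25 = 350 bytes, which triggers a flush for any target of 350 bytes or less.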
Differential Revision: D81516697