-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fileio] Add file-io.allow-cache for RESTTokenFileIO #5054
Conversation
<td><h5>resolving-file-io.enabled</h5></td> | ||
<td style="word-wrap: break-word;">false</td> | ||
<td>Boolean</td> | ||
<td>Whether to enable resolving fileio, when this option is enabled, in conjunction with the table's property data-file.external-paths, Paimon can read and write to external storage paths, such as OSS or S3. In order to access these external paths correctly, you also need to configure the corresponding access key and secret key.</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fileio -> file io same with line 87?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is ok in description, we can just be careful for option key, it is API part.
|
||
@Override | ||
public void close() { | ||
if (!allowCache) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means, if allowCache, the fileIo need to close by outside?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if not allow cache, the fileio should be closed.
If allow cache, the fileio is in the static cache, we cannot close them.
<td><h5>file-io.allow-cache</h5></td> | ||
<td style="word-wrap: break-word;">true</td> | ||
<td>Boolean</td> | ||
<td>Whether to allow static cache in file io implementation. If not allowed, this means that there may be a large number of FileIO instances generated, enabling caching can lead to resource leakage.</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question here:
If enabling caching, the number of FileIO instances will be reduce. Why enabling caching can lead to resource leakage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our original design was to place it in the cache of static variables, but for the case of REST Catalog, it may have a lot of FileIO generated, and there may be a FileIO for each table because each table has different file access permissions.
So if there are too many FileIOs in this situation, we cannot cache them in memory casually. If there are too many, it will lead to too many resources, that is, resource leakage.
LGTM |
Purpose
HadoopCompliantFileIO
, it should useMap.computeIfAbsent
.new Configuration(false)
to not load files in configuration for improving performance.Tests
API and Format
Documentation