SNOW-2057867 refactor and fixes to make stage related bind and pandas… #2300
… write work for Python sprocs

- relax the condition of using stage solution for array bind (so we will use bind_size >= threshold instead of bind_size > threshold)
- use _upload_stream to implement BindUploadAgent
- use _upload to implement write_pandas
- correctly use scoped keyword and temp naming pattern for write_pandas
- fix a minor bug where file_transfer_agent returns meta.dst_file_name as file name in download (dst_file_name may contain sub-directories, whereas name will be the base file name without any sub-directory prefixes). We should keep it consistent with upload, where we use meta.name as file name.

### Tests

- the new test_direct_file_operation_utils.py to validate parse_file_operation for _upload and _upload_stream
- existing test test_bindings.py to make sure the change does not break existing bind upload logic
- existing test_pandas_tools.py to make sure the change does not break existing write_pandas logic
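For reviewers, two of the behavioral changes described above can be sketched in isolation. This is only an illustration under assumptions: `should_use_stage_bind` and `reported_file_name` are hypothetical helper names that do not exist in the connector, and the real logic lives in the bind upload and file transfer code paths.

```python
import posixpath


def should_use_stage_bind(bind_size: int, threshold: int) -> bool:
    """Hypothetical helper: the relaxed condition is inclusive (>=),
    so a bind whose size equals the threshold now goes through the stage."""
    return bind_size >= threshold


def reported_file_name(dst_file_name: str) -> str:
    """Hypothetical helper: report only the base file name, dropping any
    sub-directory prefix, so download results match what upload reports."""
    return posixpath.basename(dst_file_name)


if __name__ == "__main__":
    print(should_use_stage_bind(5, 5))
    print(reported_file_name("sub/dir/data.csv.gz"))
```

The only difference from the previous condition is the boundary case where bind_size equals the threshold; for the file name, the sketch assumes forward-slash separators in the destination path.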
This should be broken down into at least two PRs.
"""Parses a file operation by constructing SQL and getting the SQL parsing result from server."""
options_in_sql, option_bind_values = self._process_options_for_upload(options)

if command_type == CMD_TYPE_UPLOAD:
This does not look like a pure refactor. We do an assertion, a file name split and some escaping. These do not exist in the original code. Why are they required?
self.cursor._upload_stream(
    input_stream=f,
    stage_location=os.path.join(self.stage_path, f"{row_idx}.csv"),
    options={},
Is this correct? The original function uses PUT, which is somewhat equivalent to _upload?
@@ -557,6 +561,7 @@ def mocked_execute(*args, **kwargs):
)


# @pytest.mark.skipolddriver
nit: unintended change
Closing this PR because we split it into two sub-PRs.
I will copy the comments over and address them in the split PRs instead.