SNOW-2057867 refactor and fixes to make pandas write work for Python … #2304

Open
wants to merge 11 commits into
base: zyao-SNOW-2057867-refactor-stage-bind-to-make-it-work-for-sprocs

@sfc-gh-zyao sfc-gh-zyao commented Apr 28, 2025

…sprocs

Note that this is based on an in-review change, #2303.

  • use _upload to implement write_pandas
  • correctly use scoped keyword and temp naming pattern for write_pandas

Tests

  • existing test_pandas_tools.py to make sure the change does not break existing write_pandas logic

Please answer these questions before submitting your pull requests. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes #NNNN

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am modifying authorization mechanisms
    • I am adding new credentials
    • I am modifying OCSP code
    • I am adding a new dependency
  3. Please describe how your code solves the related issue.

    Please write a short description of how your code change solves the related issue.

  4. (Optional) PR for stored-proc connector:

…sprocs

- use _upload to implement write_pandas
- correctly use scoped keyword and temp naming pattern for write_pandas
- fix a minor bug where file_transfer_agent returns meta.dst_file_name as the file name in download results. dst_file_name may contain sub-directories, whereas name is the base file name without any sub-directory prefix. We should keep this consistent with upload, where we use meta.name as the file name.

### Tests
- existing test_pandas_tools.py to make sure the change does not break existing write_pandas logic
@@ -818,7 +818,7 @@ def result(self) -> dict[str, Any]:

             rowset.append(
                 [
-                    meta.dst_file_name,
+                    meta.name,
Contributor

This is the result returned to the user, right? Would this imply a behavior change?

Contributor Author

Yes, it is the first item in the returned result, and yeah, I think I should do it in a separate PR and go through the behavior change process.
The current behavior is:

  1. @stage_foo/test.txt will have the result test.txt
  2. @stage_foo/dir_x/test.txt will have the result test.txt
  3. @stage_foo/dir_x/subdir_y/test.txt will have the result subdir_y/test.txt

With this change, the first 2 scenarios will remain the same, but the 3rd scenario will return test.txt.
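To make the three scenarios concrete, here is a minimal sketch. It is not the connector's code: the `current_results` mapping is simply copied from the list above, and `base_name` is a hypothetical helper that stands in for what `meta.name` is described as holding (the base file name with no sub-directory prefix).

```python
import posixpath

# Stage path -> file name currently reported, copied from the scenarios above.
current_results = {
    "@stage_foo/test.txt": "test.txt",
    "@stage_foo/dir_x/test.txt": "test.txt",
    "@stage_foo/dir_x/subdir_y/test.txt": "subdir_y/test.txt",
}


def base_name(path: str) -> str:
    """Hypothetical helper: drop every directory prefix, keep the last component."""
    return posixpath.basename(path)


# What each scenario would report if only the base file name were returned.
proposed_results = {
    stage_path: base_name(name) for stage_path, name in current_results.items()
}

# Scenarios whose reported name would differ after the change.
changed = [
    stage_path
    for stage_path in current_results
    if current_results[stage_path] != proposed_results[stage_path]
]
```

Under these assumptions only the doubly nested path ends up in `changed`, which matches the description of the 3rd scenario.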

Let me separate this change out to a standalone PR. Is there a standard process for me to follow for such behavior change(s)?

Contributor

As discussed, let's follow up with the connector team to see if this is the intended behavior. It seems like the behavior in your change is the more desirable one.

Contributor Author

Yes, let me start the conversation with the connector team and make follow-up changes based on the outcome of the discussion.

…or-sprocs' into zyao-SNOW-2057867-refactor-and-fix-write-pandas-to-make-it-work-for-sprocs
@sfc-gh-zyao added the NO-CHANGELOG-UPDATES (this pull request does not need to update CHANGELOG.md) and DO_NOT_PORT_CHANGES_TO_SP (changes in this PR do not need to be ported to the SP connector) labels on Apr 29, 2025
- this has no effect for regular / non-sproc use cases because it is the default option
- sprocs require it to be explicitly present, so we need it here (a future server-side change will make it optional for sprocs as well)
…r regular temp object

- the naming pattern applies to both regular temp and scoped temp objects, but here we only use it for scoped temp, which is incorrect
  - it works fine for non-sproc (for now), but it will break for sproc
- this change of naming pattern has no effect on customers because these objects hold intermediate results, which customers do not consume directly
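
A rough sketch of the idea in this commit message, under stated assumptions: the prefix `EXAMPLE_TEMP_`, the helper names, and the exact SQL text are all hypothetical and are not taken from the connector. The point it illustrates is that the temp name is generated the same way whether or not the object is scoped, and only the keyword in the statement differs.

```python
import secrets
import string

# Hypothetical prefix for illustration only; the real pattern is not shown in this thread.
TEMP_NAME_PREFIX = "EXAMPLE_TEMP_"


def random_temp_name(object_type: str, length: int = 10) -> str:
    """Build a temp object name that follows one pattern for every temp object."""
    alphabet = string.ascii_uppercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(length))
    return f"{TEMP_NAME_PREFIX}{object_type.upper()}_{suffix}"


def temp_keyword(use_scoped: bool) -> str:
    """Assumed keyword choice: scoped temp objects add SCOPED in front of TEMPORARY."""
    return "SCOPED TEMPORARY" if use_scoped else "TEMPORARY"


def create_stage_sql(use_scoped: bool) -> str:
    # The name uses the same pattern in both branches; only the keyword changes.
    return f"CREATE {temp_keyword(use_scoped)} STAGE {random_temp_name('stage')}"
```

Calling `create_stage_sql(True)` and `create_stage_sql(False)` yields statements that differ only in the keyword and the random suffix, which is the consistency the commit message asks for.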
@sfc-gh-zyao
Contributor Author

Hi @sfc-gh-sfan, commits 30ce232 and e841eb4 are the ones that I mentioned earlier. Could you please take a look and let me know if they look good to you?

@sfc-gh-sfan
Contributor

> could you please take a look and let me know if it looks good to you?

LGTM
