-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SNOW-536706: Binding Parameters to Array Variables #540
Comments
Hi @mihaitodor, thank you for your inquiry. The long and short of it is that bulk array binds are meant to be used internally to distinguish whether to simply bind the array through a conventional REST request or go through a staging mechanism to achieve the same thing; it's not something we leave configurable to the user. In terms of variant binding, we currently don't support it, which is in line with the failure post 1 element of the array. Please let me know if this answered your questions! The first part is something that we've decided not to expose and the second is something in our backlog but currently not available. |
@sfc-gh-jbahk Thank you for the update! I guess that makes sense and I’m looking forward to having support for variant binding. Would be great to get an update once that’s done. Should I open a separate issue for that? Also, is there some explanation for the discrepancies I described above between the two queries? To me, they should behave identically. I’m somewhat surprised that you don’t want to allow users to tweak the threshold for creating a temporary stage and it makes me uncomfortable about using this insert-based mechanism for bulk uploads, because, obviously, you could change the threshold without notice and that would have a big impact on performance. This leads me to think that you prefer to have an explicit Snowpipe as the default mechanism for bulk inserts, but, unfortunately, this library doesn’t offer much support for it, since I had to do quite a bit of digging to figure out how to construct a valid JWT and then I had to construct the REST endpoint manually. It’s not a huge burden and I’ll upstream my work as open source to https://benthos.dev (open issue here), but since there’s no guidance in the docs on which mechanism to use, I wasn’t sure which path to follow. I can add support for both, but I’d like to write some clear guidance for future users. |
No need to open a separate issue for variant binding - it's already something in our backlog (#217). The driver's implementation of it depends on a lot of things so it will take some time before it's fully rolled out. I think the discrepancy comes from how the server decides to handle this. The threshold is something we've decided to not let users control for a multitude of reasons, including the fact that the stage creation/insertion/copy mechanism puts a different type of strain where there's a tradeoff considering the overhead of setting it up in the backend. The number may change in the future, but it will be internal for the time being. I appreciate your patience with these matters. |
When inserting large amounts of data, the `snowflake_put` output should outperform `sql_insert` because it uses Snowpipe directly. However, the Snowflake SQL driver also attempts to send the data to a temporary stage when the number of messages in a batch exceeds an [undocumented](https://pkg.go.dev/github.com/snowflakedb/gosnowflake#hdr-Batch_Inserts_and_Binding_Parameters) threshold, but it's not very consistent. Details [here](snowflakedb/gosnowflake#540). This implementation should satisfy the basic use cases, but it might be possible to optimise inserts later by binding array variables to parameters. Details [here](https://pkg.go.dev/github.com/snowflakedb/gosnowflake#hdr-Binding_Parameters_to_Array_Variables). Unfortunately, I was unable to add integration tests based on MySQL as described [here](snowflakedb/gosnowflake#279) because [this code](https://github.com/snowflakedb/gosnowflake/blob/74e351e5e110c5b4c409730b47d5fa4058ab1c6f/auth.go#L45-L91) does not allow me to set the `authenticator` parameter to `tokenaccessor`, so I can't get the driver to bypass the authentication step. There might be other blockers too, not sure.
Issue description
I'm unable to figure out what is the right query to do a bulk insert of n values using array binding, where
1 < n < CLIENT_STAGE_ARRAY_BINDING_THRESHOLD
. Using the debugger, I determined thatCLIENT_STAGE_ARRAY_BINDING_THRESHOLD
is set to65280
.I get an error if I try to run
ALTER SESSION SET CLIENT_STAGE_ARRAY_BINDING_THRESHOLD = 1
, but I don't think I really need to change this parameter and I'm guessing it's just some internal field that you don't intend to expose (it doesn't show up when I runSHOW PARAMETERS
, but I saw it in your tests), so let's ignore this part now.The problem is that I can't seem to figure out which is the right way of inserting multiple records into a table with a
variant
column. I tried multiple options, but can't figure out one that works in all cases and I'd like to avoid hardcoding65280
as a threshold, since it's not a documented limit. I guess I can create a plain SQL statement without binding parameters when I need to insert between1
and65279
rows, but that's a really nasty hack...Please advise.
Example code
I am using the following query to create a table:
And the following Go code:
If I set
rows
to1
, it fails with this log:out1.log
(
SQL compilation error: Invalid expression [PARSE_JSON(?)] in VALUES clause
)If I set
rows
to any number between2
and65279
, it fails with these logs:out2.log
out65279.log
(
SQL compilation error: error line 1 at position 36 Bind variable ? not set.
)If I set
rows
to65280
, it succeeds with this log:out65280.log
Then I tried to change the Go code to this:
If I set
rows
to1
, it succeeds with this log:out1.log
But, if I set
rows
between2
and65279
, it fails with these logs:out2.log
out65279.log
(
SQL compilation error: Array bind currently not supported for this query type.
)And, finally, if I set
rows
to65280
, it fails with this log:out65280.log
(
SQL compilation error: error line 1 at position 58 Bind variable ? not set.
)Error log
See attached inline above
Configuration
Driver version (or git SHA):
v1.6.6
Go version: run
go version
in your console1.17.5
Server version:
6.1.1
Client OS:
macOS Catalina 10.15.7
The text was updated successfully, but these errors were encountered: