Add Snowflake SQL driver to SQL components #1507

Merged (1 commit) on Oct 11, 2022

Conversation

mihaitodor
Collaborator

When inserting large amounts of data, the `snowflake_put` output
should outperform `sql_insert` because it uses Snowpipe directly.
The Snowflake SQL driver also tries to send the data to a temporary
stage when the number of messages in a batch exceeds an
[undocumented](https://pkg.go.dev/github.com/snowflakedb/gosnowflake#hdr-Batch_Inserts_and_Binding_Parameters)
threshold, but this behaviour is not very consistent. Details [here](snowflakedb/gosnowflake#540).

This implementation should satisfy the basic use cases, but it
might be possible to optimise inserts later by binding array
variables to parameters. Details [here](https://pkg.go.dev/github.com/snowflakedb/gosnowflake#hdr-Binding_Parameters_to_Array_Variables).
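
As a rough illustration of what that optimisation would involve: array binding expects one array per column rather than one argument list per row, so a batch of messages would first need to be transposed. The sketch below shows only that transposition and the matching single-row statement, using the standard library. The names `columnise` and `insertStatement` are hypothetical and are not part of this PR or of gosnowflake; how the resulting column slices are passed to the driver is covered in the gosnowflake documentation linked above and is deliberately left out here.

```go
package main

import (
	"fmt"
	"strings"
)

// columnise transposes row-oriented values (one slice per message) into
// column-oriented values (one slice per column). It returns an error if
// any row does not contain exactly numCols values.
// Hypothetical helper, for illustration only.
func columnise(rows [][]any, numCols int) ([][]any, error) {
	cols := make([][]any, numCols)
	for i := range cols {
		cols[i] = make([]any, 0, len(rows))
	}
	for r, row := range rows {
		if len(row) != numCols {
			return nil, fmt.Errorf("row %d has %d values, expected %d", r, len(row), numCols)
		}
		for c, v := range row {
			cols[c] = append(cols[c], v)
		}
	}
	return cols, nil
}

// insertStatement builds a statement with one placeholder per column, e.g.
// "INSERT INTO events (id, name) VALUES (?, ?)". With array binding, each
// placeholder would be bound to a whole column of values.
// The table and column names are assumed to be trusted identifiers.
func insertStatement(table string, columns []string) string {
	placeholders := strings.TrimSuffix(strings.Repeat("?, ", len(columns)), ", ")
	return fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s)",
		table, strings.Join(columns, ", "), placeholders)
}
```

For a batch such as `[][]any{{1, "a"}, {2, "b"}}` with two columns, `columnise` yields one slice holding the `id` values and another holding the `name` values, which could then be bound to the two placeholders in a single `Exec` call instead of one call per row.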

Unfortunately, I was unable to add integration tests based on
MySQL, as described [here](snowflakedb/gosnowflake#279),
because [this code](https://github.com/snowflakedb/gosnowflake/blob/74e351e5e110c5b4c409730b47d5fa4058ab1c6f/auth.go#L45-L91)
does not allow me to set the `authenticator` parameter to
`tokenaccessor`, so I can't get the driver to bypass the
authentication step. There may be other blockers too; I'm not sure.
@mihaitodor mihaitodor requested a review from Jeffail as a code owner October 11, 2022 00:31
Collaborator

@Jeffail Jeffail left a comment

LGTM, epic docs ❤️

@Jeffail Jeffail merged commit 062b1aa into redpanda-data:main Oct 11, 2022

2 participants