Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-983911 Add support for querying the vector data type #997

Conversation

sfc-gh-rpanchapakesan
Copy link

@sfc-gh-rpanchapakesan sfc-gh-rpanchapakesan commented Dec 10, 2023

Description

Added support for parsing vectors from the serialized result set, so queries that return vectors can now be made. To do so, I also had to update execResponseRowType to add two new fields that are now being sent back, matching the change I recently made to the Python connector.

Checklist

  • Code compiles correctly
  • Run make fmt to fix inconsistent formats
  • Run make lint to get lint errors and fix all of them
  • Created tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary

@sfc-gh-rpanchapakesan sfc-gh-rpanchapakesan force-pushed the rpanchapakesan-SNOW-983911-vector-deserialization branch from 932ed9d to 47af7e8 Compare December 10, 2023 05:03
@sfc-gh-rpanchapakesan sfc-gh-rpanchapakesan marked this pull request as ready for review December 12, 2023 05:12
@sfc-gh-rpanchapakesan sfc-gh-rpanchapakesan requested a review from a team as a code owner December 12, 2023 05:12
@@ -91,10 +91,10 @@ func goTypeToSnowflake(v driver.Value, tsmode snowflakeType) snowflakeType {
}

// snowflakeTypeToGo translates Snowflake data type to Go data type.
func snowflakeTypeToGo(dbtype snowflakeType, scale int64) reflect.Type {
switch dbtype {
func snowflakeTypeToGo(rowType execResponseRowType) reflect.Type {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure of this. This function was independent of API response and it was good - separaton of concernts. I'm wondering if some specific struct can be used here instead.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could parse the response type in the caller of this function and pass down some type of options struct with the vector metadata. However, since this function is only called in one place right now (and in the context of a request), it felt cleaner to have the function take in execResponseRowType.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that separations of layers (API/mapping) is good :) I would prefer to have API parsing closer to API functions, and this function preserve with just types at it was before.

}
switch getSnowflakeType(rowType.Fields[0].Type) {
case fixedType:
return reflect.TypeOf([]int32{})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the documentation, in V2 there may be different number of bits per type. Does this PR cover only V1?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we only currently support 32 bit values.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we mention V1 in the commit message (so for now - in PR name only)?

driver_test.go Outdated Show resolved Hide resolved
driver_test.go Outdated Show resolved Hide resolved
driver_test.go Outdated Show resolved Hide resolved
driver_test.go Outdated
@@ -1263,6 +1264,48 @@ func testArray(t *testing.T, json bool) {
})
}

func TestVector(t *testing.T) {
testVector(t, false)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are lacking the test for JSON, right?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added tests for JSON to old_driver_test.go.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that adding it close to the Arrow tests is better for now.

driver_test.go Outdated Show resolved Hide resolved
@@ -610,6 +646,33 @@ func arrowToValue(
}
}
return err
case vectorType:
vectorData := srcValue.(*array.FixedSizeList)
datatype := vectorData.DataType().(*arrow.FixedSizeListType)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work only for arrow? What should happen if a client uses ALTER SESSION SET GO_QUERY_RESULT_FORMAT = JSON?

@@ -1263,6 +1264,102 @@ func testArray(t *testing.T, json bool) {
})
}

func TestVectorInt(t *testing.T) {
testVectorInt(t, false)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about testing with JSON?

},
}
runDBTest(t, func(dbt *DBTest) {
dbt.mustExec("alter session set enable_vector_data_type='Enable'")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use uppercase for parameters, can you change it?

driver_test.go Outdated
@@ -1263,6 +1264,48 @@ func testArray(t *testing.T, json bool) {
})
}

func TestVector(t *testing.T) {
testVector(t, false)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that adding it close to the Arrow tests is better for now.

@@ -131,6 +135,40 @@ func dataTypeMode(v driver.Value) (tsmode snowflakeType, err error) {
return tsmode, nil
}

type vectorElements interface {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering about one thing. In the future we are expected to implement structured types, which contains arrays. Array is a very similar structure to vector, just more generic I'd say. Maybe we can prepare the code here to use these arrays instead of native backend type as vector?

}{
{
msg: "1 element vector",
query: `select [-5]::vector(int, 1)`,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also have a test when the response contains empty/null value?

@github-actions github-actions bot locked and limited conversation to collaborators Apr 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants