-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SNOW-983911 Add support for querying the vector data type #997
SNOW-983911 Add support for querying the vector data type #997
Conversation
932ed9d
to
47af7e8
Compare
@@ -91,10 +91,10 @@ func goTypeToSnowflake(v driver.Value, tsmode snowflakeType) snowflakeType { | |||
} | |||
|
|||
// snowflakeTypeToGo translates Snowflake data type to Go data type. | |||
func snowflakeTypeToGo(dbtype snowflakeType, scale int64) reflect.Type { | |||
switch dbtype { | |||
func snowflakeTypeToGo(rowType execResponseRowType) reflect.Type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure of this. This function was independent of API response and it was good - separaton of concernts. I'm wondering if some specific struct can be used here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could parse the response type in the caller of this function and pass down some type of options struct with the vector metadata. However, since this function is only called in one place right now (and in the context of a request), it felt cleaner to have the function take in execResponseRowType
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that separations of layers (API/mapping) is good :) I would prefer to have API parsing closer to API functions, and this function preserve with just types at it was before.
} | ||
switch getSnowflakeType(rowType.Fields[0].Type) { | ||
case fixedType: | ||
return reflect.TypeOf([]int32{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the documentation, in V2 there may be different number of bits per type. Does this PR cover only V1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we only currently support 32 bit values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we mention V1 in the commit message (so for now - in PR name only)?
driver_test.go
Outdated
@@ -1263,6 +1264,48 @@ func testArray(t *testing.T, json bool) { | |||
}) | |||
} | |||
|
|||
func TestVector(t *testing.T) { | |||
testVector(t, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are lacking the test for JSON, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added tests for JSON to old_driver_test.go
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that adding it close to the Arrow tests is better for now.
@@ -610,6 +646,33 @@ func arrowToValue( | |||
} | |||
} | |||
return err | |||
case vectorType: | |||
vectorData := srcValue.(*array.FixedSizeList) | |||
datatype := vectorData.DataType().(*arrow.FixedSizeListType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work only for arrow? What should happen if a client uses ALTER SESSION SET GO_QUERY_RESULT_FORMAT = JSON
?
@@ -1263,6 +1264,102 @@ func testArray(t *testing.T, json bool) { | |||
}) | |||
} | |||
|
|||
func TestVectorInt(t *testing.T) { | |||
testVectorInt(t, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about testing with JSON?
}, | ||
} | ||
runDBTest(t, func(dbt *DBTest) { | ||
dbt.mustExec("alter session set enable_vector_data_type='Enable'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use uppercase for parameters, can you change it?
driver_test.go
Outdated
@@ -1263,6 +1264,48 @@ func testArray(t *testing.T, json bool) { | |||
}) | |||
} | |||
|
|||
func TestVector(t *testing.T) { | |||
testVector(t, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that adding it close to the Arrow tests is better for now.
@@ -131,6 +135,40 @@ func dataTypeMode(v driver.Value) (tsmode snowflakeType, err error) { | |||
return tsmode, nil | |||
} | |||
|
|||
type vectorElements interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering about one thing. In the future we are expected to implement structured types, which contains arrays. Array is a very similar structure to vector, just more generic I'd say. Maybe we can prepare the code here to use these arrays instead of native backend type as vector?
}{ | ||
{ | ||
msg: "1 element vector", | ||
query: `select [-5]::vector(int, 1)`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also have a test when the response contains empty/null value?
Description
Added support for parsing vectors from the serialized result set, so queries that return vectors can now be made. To do so, I also had to update
execResponseRowType
to add two new fields that are now being sent back, matching the change I recently made to the Python connector.Checklist
make fmt
to fix inconsistent formatsmake lint
to get lint errors and fix all of them