I've tried to upload a wide dataframe into an existing table using write_pandas and was struggling with various type-related problems, most prominently this one:
snowflake.connector.errors.ProgrammingError: 002023 (22000): SQL compilation error:
Expression type does not match column data type, expecting BINARY(20) but got VARIANT for column BOM_SYS_HD
I tried every method under the sun to coerce the data types of these dataframe columns to exactly what the target table expects, but nothing worked. Note that the data in the dataframe is correct and correctly typed. Essentially, I read from a Snowflake table into a dataframe, perform some calculations, and write the result back.
When I was close to throwing in the towel, I took a look at the code of write_pandas and noticed that when auto_create_table=True is set, a schema inference step is performed. And lo and behold, that made it work. The data arrived correctly in the already existing target table.
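For reference, a minimal sketch of that workaround. The wrapper name write_with_schema_inference is hypothetical, and conn is assumed to be an open Snowflake connection whose role is allowed to run the CREATE ... IF NOT EXISTS statement; only write_pandas and its auto_create_table argument come from the connector itself.

```python
def write_with_schema_inference(conn, df, table_name):
    """Write df to an existing table, relying on auto_create_table=True
    to trigger the schema inference step (hypothetical helper)."""
    # Imported lazily so the helper can be defined without the connector installed.
    from snowflake.connector.pandas_tools import write_pandas

    # write_pandas returns (success, number of chunks, number of rows, COPY INTO output).
    success, _num_chunks, num_rows, _output = write_pandas(
        conn,
        df,
        table_name,
        auto_create_table=True,  # side effect: schema inference is performed
    )
    return success, num_rows
```

Because the table already exists, the create statement should be a no-op and only the inference side effect matters here.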
What is the desired behavior?
Given that auto_create_table generates a CREATE ... IF NOT EXISTS statement, it’s relatively safe to use even if you already have an existing table. Nonetheless, I’d appreciate a more explicit (and even safer) way to achieve what I need: performing only the schema inference step.
This could be implemented with a new parameter for write_pandas, e.g., infer_schema, defaulting to False. I’ve implemented this in a local copy of this package, and it works well for me. I can submit a pull request if you’re interested.
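A rough sketch of how such a flag could gate the steps, not the actual patch. Both function names are hypothetical, infer_schema is the proposed parameter and does not exist in the released connector, and the shape of the select list is an assumption about how the COPY INTO statement is assembled: with inferred types each Parquet column gets an explicit cast, without them the value stays a VARIANT.

```python
def plan_write(auto_create_table=False, infer_schema=False):
    """Decide which steps run (hypothetical; infer_schema is the proposed flag)."""
    return {
        # Inference runs if either flag asks for it.
        "infer_schema": auto_create_table or infer_schema,
        # Table creation stays tied to auto_create_table only.
        "create_table": auto_create_table,
    }


def build_select_list(columns, inferred_types=None):
    """Build the column expressions selected from the staged Parquet file.

    Assumption: when inferred types are available, each column is cast to
    its type; otherwise the raw value is selected without a cast.
    """
    parts = []
    for col in columns:
        expr = f'$1:"{col}"'
        if inferred_types is not None:
            expr += f"::{inferred_types[col]}"
        parts.append(expr)
    return ", ".join(parts)


# Inference without table creation, which is the case requested here.
print(plan_write(infer_schema=True))
print(build_select_list(["BOM_SYS_HD"], {"BOM_SYS_HD": "BINARY"}))
```

Keeping create_table independent of infer_schema is the point of the proposal: the inference step becomes available without issuing any CREATE statement.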
How would this improve snowflake-connector-python?
It might resolve other type-mismatch issues like mine with a relatively low-friction parameter.
References and other background
No response
github-actions bot changed the title from "Extend write_pandas by a parameter for schema inference" to "SNOW-2019088: Extend write_pandas by a parameter for schema inference" on Apr 1, 2025
Hi, thanks for raising this with us. If you're able to, please submit a PR and the team will review it. Otherwise, we'll consider this enhancement request for later planning.
One more reason why I think this is useful: a user might want or need the schema inference but lack the CREATE TABLE privilege, and so can't use auto_create_table as a workaround.
What is the current behavior?
This feature request is born out of an issue I've had with write_pandas() from pandas_tools: https://github.com/snowflakedb/snowflake-connector-python/blob/main/src/snowflake/connector/pandas_tools.py#L250