Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-1896084: Nan / None values do not get converted to null in Snowflake when using write_pandas from snowflake-connector-python #2147

Open
smuniveruddin-mwb opened this issue Jan 24, 2025 · 0 comments

Comments

@smuniveruddin-mwb
Copy link

Python version

3.8.20

Operating system and processor architecture

python:3.8-slim-bookworm docker image - Linux-5.10.230-223.885.amzn2.x86_64-x86_64-with-glibc2.34

Installed packages

snowflake-connector-python==3.12.4
apache-airflow==2.6.3
pandas==2.0.3

What did you do?

import pandas as pd
from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook
from snowflake.connector.pandas_tools import write_pandas

snowflake_hook = SnowflakeHook(snowflake_conn_id='snowflake_stage')
connection = snowflake_hook.get_conn()

data = {
    "id": [1, 2],
    "otp_sent_at": [pd.Timestamp("2025-01-24 12:00:00"), pd.NaT],
    "otp_token_available": [None, "abc123"]
}

df = pd.DataFrame(data)
for col in df.columns:
    df[col] = df[col].astype(str)

write_pandas(conn=connection, df=df, table_name="temp_table", database="test", schema="test", auto_create_table=True, overwrite=False, table_type="temp")

What did you expect to see?

The Actual data is extracted from RDS Postgres DB, Where for some columns with null pandas DF extract reads it as None(string) / NaT(timestamp). Hence converting the types for all columns to string NaT -> NaN (I guess !!!).

Expected Results in Snowflake:

Image

Actual Results:

Image

Can you set logging to DEBUG and collect the logs?

import logging
import os

for logger_name in ('snowflake.connector',):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)
@github-actions github-actions bot changed the title Nan / None values do not get converted to null in Snowflake when using write_pandas from snowflake-connector-python SNOW-1896084: Nan / None values do not get converted to null in Snowflake when using write_pandas from snowflake-connector-python Jan 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant