Fix: pandas NaT in datetime/timedelta columns stored as garbage values #33

Merged
adsharma merged 5 commits into LadybugDB:main from mdbenito:fix/not-a-time
Jun 24, 2026
@mdbenito

Contributor

Currently, NaT/None values in datetime64 and timedelta64 DataFrame columns are stored as their raw sentinel value (INT64_MIN nanoseconds, which decodes to roughly 1677-09-21 as a timestamp) instead of NULL.

This PR fixes that, so these values are stored as NULL.

To reproduce the behavior before this PR:

import numpy as np
import pandas as pd
import ladybug

db = ladybug.Database()
conn = ladybug.Connection(db)

# datetime NaT
conn.execute("CREATE NODE TABLE t_dt (id INT64, ts TIMESTAMP, PRIMARY KEY (id))")

nat = np.datetime64("NaT", "ns")
df_dt = pd.DataFrame(
    {
        "id": [1, 2],
        "ts": np.array([np.datetime64("2024-01-15"), nat], dtype="datetime64[ns]"),
    }
)
conn.execute(
    "COPY t_dt FROM (LOAD FROM $df RETURN CAST(id AS INT64) AS id, CAST(ts AS TIMESTAMP) AS ts)",
    {"df": df_dt},
)

for row in (
    conn.execute("MATCH (t:t_dt) RETURN t.id, t.ts ORDER BY t.id")
    .get_as_df()
    .itertuples(index=False)
):
    print(f"id={row[0]:<2}  ts={row[1]!r}")

# timedelta NaT
conn.execute("CREATE NODE TABLE t_td (id INT64, dur INTERVAL, PRIMARY KEY (id))")

nat_td = np.timedelta64("NaT", "ns")
df_td = pd.DataFrame(
    {
        "id": [1, 2],
        "td": np.array(
            [np.timedelta64(3600000000000, "ns"), nat_td], dtype="timedelta64[ns]"
        ),
    }
)
conn.execute(
    "COPY t_td FROM (LOAD FROM $df RETURN CAST(id AS INT64) AS id, CAST(td AS INTERVAL) AS dur)",
    {"df": df_td},
)

for row in (
    conn.execute("MATCH (t:t_td) RETURN t.id, t.dur ORDER BY t.id")
    .get_as_df()
    .itertuples(index=False)
):
    print(f"id={row[0]:<2}  dur={row[1]!r}")

Running that script yields:

id=1   ts=Timestamp('2024-01-15 00:00:00')
id=2   ts=Timestamp('1677-09-21 00:12:43.145225')
id=1   dur=Timedelta('0 days 01:00:00')
id=2   dur=Timedelta('-106752 days +00:12:43.145225')
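
For context on where the 1677 timestamp and the -106752 day interval come from, here is a small standalone check using only NumPy and pandas (it does not touch the database). It shows that NaT is encoded as the smallest int64 in both dtypes, so reading it as a plain nanosecond count lands at the very bottom of the representable range. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

INT64_MIN = np.iinfo(np.int64).min

# NumPy encodes NaT as the smallest int64 for both datetime64 and timedelta64.
nat_dt = np.array(["NaT"], dtype="datetime64[ns]")
nat_td = np.array(["NaT"], dtype="timedelta64[ns]")
assert nat_dt.view("int64")[0] == INT64_MIN
assert nat_td.view("int64")[0] == INT64_MIN

# pandas reserves that value for NaT, so its smallest valid Timestamp and
# Timedelta sit exactly one nanosecond above the sentinel.
assert pd.Timestamp.min.value == INT64_MIN + 1
assert pd.Timedelta.min.value == INT64_MIN + 1

# Printing them shows the same date and day count as the garbage rows above.
print(pd.Timestamp.min)
print(pd.Timedelta.min)
```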

And after the fix:

id=1   ts=Timestamp('2024-01-15 00:00:00')
id=2   ts=NaT
id=1   dur=Timedelta('0 days 01:00:00')
id=2   dur=NaT
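
The actual change is in this PR's commits and is not reproduced here. As a rough sketch of the general idea only, NaT entries in a datetime64 or timedelta64 column can be detected with `np.isnat` and turned into a null mask before the raw integers are copied. The helper name `nat_mask` is hypothetical and is not part of ladybug or of this PR.

```python
import numpy as np
import pandas as pd


def nat_mask(column: pd.Series) -> np.ndarray:
    """Hypothetical helper: True where a datetime64/timedelta64 value is NaT."""
    values = column.to_numpy()
    # dtype.kind is "M" for datetime64 and "m" for timedelta64.
    if values.dtype.kind not in ("M", "m"):
        raise TypeError(f"expected datetime64 or timedelta64, got {values.dtype}")
    return np.isnat(values)


# None and pd.NaT both end up as NaT once pandas infers a datetime-like dtype.
ts = pd.Series([pd.Timestamp("2024-01-15"), None])
td = pd.Series([pd.Timedelta(hours=1), pd.NaT])

ts_mask = nat_mask(ts)
td_mask = nat_mask(td)
print(ts_mask.tolist(), td_mask.tolist())
```

On the pandas side, `column.isna()` gives the same mask for these dtypes; `np.isnat` is used here because it works directly on the underlying NumPy array.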

@adsharma adsharma merged commit 64f5f6b into LadybugDB:main Jun 24, 2026
2 checks passed
@mdbenito mdbenito deleted the fix/not-a-time branch June 24, 2026 16:54