Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] support freq='S' in id_time_grid #157

Merged
merged 14 commits into from
Mar 27, 2025

Conversation

elephaint
Copy link
Contributor

The following code fails without the fix, which is caused by the capital S not correctly being recognized in numpy's timedelta64. We prevent the error by making sure the frequency is s whenever the pd.offset is Second, and similar for sub-second frequencies.

from utilsforecast.data import generate_series
from utilsforecast.preprocessing import id_time_grid

freq = "S"
df = generate_series(n_series = 10,
                     freq = freq)

grid = id_time_grid(df, freq = freq)

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@elephaint elephaint requested a review from jmoralez March 3, 2025 13:27
@jmoralez
Copy link
Member

Can you restore the outputs?

@elephaint
Copy link
Contributor Author

elephaint commented Mar 26, 2025

Can you restore the outputs?

Out of interest - we clear the outputs in every other repo, why should they be kept here?

@jmoralez
Copy link
Member

Because we actually use them as docs here

@jmoralez
Copy link
Member

Can you elaborate on the following?

We prevent the error by making sure the frequency is s whenever the pd.offset is Second, and similar for sub-second frequencies.

The only problematic one seems to be seconds because it can be either S or s in pandas and just s in numpy, but all other aliases match, see the tables for pandas and numpy.

@jmoralez
Copy link
Member

Ah nevermind, seems like NS and US also work, which is nice.

@jmoralez
Copy link
Member

So please just remove the microseconds, since MS is month start.

@elephaint
Copy link
Contributor Author

elephaint commented Mar 27, 2025

So please just remove the microseconds, since MS is month start.

How is that an issue? Given freq='MS' as input, the offset will be <MonthBegin>, which will end up in the try pd.Timedelta where it fails?

So in fact we should be adding Month as well, i.e.

elif isinstance(offset.base, (pd.offsets.MonthBegin, pd.offsets.MonthEnd)):
    freq = 'M'

which captures all of M, MS and ME strings, and is much better than the try / except where it now falls into, no?

@jmoralez
Copy link
Member

If you want to go that way then remove the try except

@jmoralez
Copy link
Member

jmoralez commented Mar 27, 2025

And also add quarter which I think also fails atm, quarter should translate to 3months, so n=3, freq='M'

@elephaint
Copy link
Contributor Author

And also add quarter which I think also fails atm, quarter should translate to 3months, so n=3, freq='M'

Quarter was already included?

@elephaint
Copy link
Contributor Author

If you want to go that way then remove the try except

Yeah I'm thinking whether there's any easy case we're then missing by removing it. Maybe let's keep it like this for now....

@jmoralez jmoralez added the fix bug fix label Mar 27, 2025
@jmoralez jmoralez changed the title [FIX] Add capital seconds to id_time_grid [FIX] support freq='S' in id_time_grid Mar 27, 2025
@elephaint elephaint merged commit 2d808c1 into main Mar 27, 2025
21 checks passed
@elephaint elephaint deleted the fix/time_grid_failing_on_second_offset branch March 27, 2025 22:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix bug fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants