Operate along vertical (time) dimension after isolating data above threshold #9788
Unanswered. geacomputing asked this question in Q&A.
Update: This is my mask, after computing and applying an above/below threshold condition.
In essence, this is the form I would like my mask to take: a multidimensional array rather than a time series. I am currently trying a cumsum along time. Maybe something like this?
Hi all. My name's Marco, and this is my first post asking for help. I have searched far and wide but am still struggling to reach a decent solution to my problem. I hope to find some constructive comments and help here. Let me explain.
I have a sea surface temperature (SST) dataset for a patch of ocean, with no vertical levels: say 30 lats and lons (at 0.05 deg resolution), with daily values over 40 years.
I want to investigate marine heatwaves. To that end, I need to see when and to what extent my SST exceeds a (custom) climatology.
The climatology is quantile-based and involves two steps.
1st step:
t=sst.groupby("time.dayofyear").quantile(0.9)
I am creating a first threshold.
2nd step:
I apply a rolling mean to the result of the first step, naming it rt, with a half-width of 5 days on each side:
rt=t.rolling(time=11, center=True).mean()
The window length is 11 because:
5+5+1=11
In the rolling window I use the time dimension (not dayofyear) because I then combine it with (subtract it from) the original SST (grouped by day); as a result, the rolling climatology has exactly the same dimensions and number of elements as the SST.
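For illustration, the two steps and the comparison could be sketched as below. This is a sketch under assumptions, not necessarily what the original code does: the SST cube is synthetic, and the rolling mean is taken along dayofyear (the dimension the grouped quantile returns) with wrap-padding so that the window runs across the year boundary. The threshold is then broadcast back onto the daily time axis through groupby arithmetic.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the real SST cube (time, lat, lon); values are made up.
time = pd.date_range("2000-01-01", "2003-12-31", freq="D")
rng = np.random.default_rng(0)
sst = xr.DataArray(
    15 + rng.standard_normal((time.size, 3, 4)),
    coords={"time": time, "lat": np.arange(3), "lon": np.arange(4)},
    dims=("time", "lat", "lon"),
    name="sst",
)

# Step 1: 90th percentile for each calendar day -> dims (dayofyear, lat, lon).
t = sst.groupby("time.dayofyear").quantile(0.9)

# Step 2: 11-day centred rolling mean (5 + 1 + 5) along dayofyear.
# Assumption: wrap-padding so the window spans the 31 Dec / 1 Jan boundary.
half = 5
rt = (
    t.pad(dayofyear=half, mode="wrap")
    .rolling(dayofyear=2 * half + 1, center=True)
    .mean()
    .isel(dayofyear=slice(half, -half))  # drop the padding again
)

# Match each day of sst with the threshold of its calendar day, then compare.
mask = (sst.groupby("time.dayofyear") - rt) > 0
```

The resulting mask has one boolean value per day and grid cell, so it lines up with sst along time, lat and lon.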
Now, I am creating a mask by doing:
mask=sst-rt>0
This mask has the same dimensions as SST and tells me when and where my SST exceeds the rolling threshold (rt); the difference sst-rt gives the extent. In essence, it shows when and where a marine heatwave MIGHT occur. This is where I stand now.
Question:
From here I am trying to work along the time dimension ONLY. That is, I want to apply to my array a function that, along time alone, detects events whose uninterrupted duration is at least N days, separating heat spikes from marine heatwaves. Whatever survives this test is a heatwave; everything else is a heat spike.
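One loop-free way to sketch this duration test is shown below, assuming the mask is a boolean array with time on the first axis. The function name keep_long_runs and the parameter min_len (the N above) are hypothetical, and plain NumPy is used rather than any xarray-specific method. For every time step it finds the nearest False before and after, which gives the length of the run that step belongs to.

```python
import numpy as np

def keep_long_runs(mask, min_len, axis=0):
    """Keep True values only where they sit in a run of >= min_len along axis."""
    m = np.moveaxis(np.asarray(mask, dtype=bool), axis, 0)
    n = m.shape[0]
    # Time index, shaped so that it broadcasts against all other dimensions.
    idx = np.arange(n).reshape((-1,) + (1,) * (m.ndim - 1))
    # Index of the most recent False at or before each step (-1 if none yet).
    last_false = np.maximum.accumulate(np.where(m, -1, idx), axis=0)
    # Index of the next False at or after each step (n if none), via a reversed scan.
    next_false = np.minimum.accumulate(np.where(m, n, idx)[::-1], axis=0)[::-1]
    # Length of the uninterrupted run that contains each True step.
    run_len = next_false - last_false - 1
    out = m & (run_len >= min_len)
    return np.moveaxis(out, 0, axis)

# Tiny check on one series: runs of length 2, 3 and 1.
series = np.array([1, 1, 0, 1, 1, 1, 0, 1], dtype=bool)
print(keep_long_runs(series, 3))
```

The same call works unchanged on a (time, lat, lon) array, since the scans run along the time axis for all grid cells at once. The intermediate integer arrays are as large as the mask itself, so very large datasets may need to be processed in spatial blocks.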
Expected outcome:
By so doing, and by selecting a time, I could:
see on a map the spatial distribution of events (heatwaves).
quantify their spatial extent, for instance by counting the grid cells within a shapefile (a buffer zone of 10 km off the coast).
compile a taxonomy. I would add another variable, with same dimensions, to the result of the operation. I could populate that variable with categories (mild, moderate, intense). Also, I could append the traits and fingerprints of each event (parameters defining taxonomy).
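The counting and the extra category variable from the list above could be sketched as follows. Everything here is illustrative: the data are random, the heatwave mask is a stand-in for the duration-filtered mask, and the bin edges used for mild, moderate and intense are placeholders rather than an established classification. Clipping to a shapefile is left out.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(1)
# Stand-in for sst - rt (exceedance in degrees), dims (time, lat, lon).
exceed = xr.DataArray(rng.normal(0.0, 1.0, (6, 3, 4)), dims=("time", "lat", "lon"))
# Stand-in for the duration-filtered heatwave mask.
heatwave = exceed > 0

# Spatial extent: number of heatwave grid cells at each time step.
n_cells = heatwave.sum(dim=("lat", "lon"))

# Placeholder bin edges: below 0.5 -> 1 (mild), 0.5 to 1.0 -> 2 (moderate),
# 1.0 and above -> 3 (intense). Cells outside a heatwave get 0.
edges = [0.5, 1.0]
category = xr.apply_ufunc(np.digitize, exceed, kwargs={"bins": edges}) + 1
category = category.where(heatwave, 0)

# Keep the mask and the categories side by side, on the same dimensions.
ds = xr.Dataset({"heatwave": heatwave, "category": category})
```

Selecting one time step, for example ds.isel(time=0), then gives a (lat, lon) slice that can be plotted as a map.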
I am trying to vectorize everything. At the moment, as a test, I am only working with a time-varying box of 30*30 lats and lons, but my dataset is far bigger than that, so everything should stay away from for loops as much as possible, in favour of a vectorized approach.
I have a function that works on 1D time-series, but would prefer to perform (vectorized) bulk operations in the whole dataset.
I am trying to use apply_ufunc, but the syntax is not so clear to me (input and output core dims).
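As a sketch of how the core dimensions fit together: input_core_dims names the dimension the function consumes (xarray moves it to the last axis), and output_core_dims names the dimension the function returns. With vectorize=True, a function written for a single 1D series is called once per grid cell. The helper long_runs_1d and its min_len argument are hypothetical stand-ins for the existing 1D function.

```python
import numpy as np
import xarray as xr

def long_runs_1d(series, min_len):
    """Hypothetical 1D helper: flag True values in runs of at least min_len."""
    series = np.asarray(series, dtype=bool)
    # Pad with False so that every run has a start edge and an end edge.
    edges = np.diff(np.concatenate(([False], series, [False])).astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)  # exclusive end index of each run
    out = np.zeros(series.shape, dtype=bool)
    for s, e in zip(starts, ends):
        if e - s >= min_len:
            out[s:e] = True
    return out

# Two made-up grid cells with eight time steps each.
mask = xr.DataArray(
    np.array([[1, 1, 0, 1, 1, 1, 0, 1],
              [1, 1, 1, 1, 0, 0, 1, 1]], dtype=bool).T,
    dims=("time", "cell"),
)

heatwave = xr.apply_ufunc(
    long_runs_1d,
    mask,
    kwargs={"min_len": 3},
    input_core_dims=[["time"]],   # the function consumes the time axis
    output_core_dims=[["time"]],  # and returns an array with a time axis
    vectorize=True,               # call the 1D function once per grid cell
    output_dtypes=[bool],
).transpose(*mask.dims)           # core dims come back last; restore the order
```

vectorize=True relies on np.vectorize, which is a convenience rather than a speed-up, so this mainly helps reuse an existing 1D function without writing the loops by hand. For dask-backed data, dask="parallelized" can be added, in which case time generally needs to sit in a single chunk.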
What is the best strategy? What should I consider or avoid? Any constructive comment is more than welcome.
Thank you so much.
Marco