Autoregressive prewhitening (per channel) #60

kkappler · 2021-08-07T21:12:42Z

First difference prewhitening mitigated most of the spectral leakage artefacts in the synthetic test dataset. Gary suggested that we add autoregressive prewhitening as well, say to order 3 or 4. This would be implemented per channel, and thus would need to recolored.

This is a different issue than Gary's segment-by-segment prewhitening from the original FORTRAN codes, which is a seperate issue but can be implemented (see issue #3 for some details)

kkappler · 2021-12-09T05:22:19Z

To Do this practically we need:

A way to fit an ARMA(p,q) model to an input time-series that returns as a function of input arguments p, q, two sequences of floats, (ar_1, ar_2, ... ar_p), & (ma_1, ma_2, ... ma_q),
While these are expressed as coefficients in a time domain model, they may be equivalent to a polynomial fit of the same order to the frequency domain spectra ... in which case more terms would be needed when spectral lines were present ... in general, this method suffers from a lack of justification for the highest order term
https://en.wikipedia.org/wiki/Autoregressive%E2%80%93moving-average_model
https://www.informs.org/content/view/full/273349
https://www.statsmodels.org/v0.11.1/_modules/statsmodels/tsa/arima_model.html#ARIMAResults

kkappler · 2022-08-07T19:54:53Z

This approach, and prewhitening in general can be applied either to the whole time series before windowing/detrending, or after windowing it can be applied in a "window-by-window" way. Obviously for first (or second) difference there should be little if any difference, but when the prewhitening involves fitting a statistical model, this choice of when along the pipeline to apply this transformation matters, i.e. we expect slightly different ARMA coefficients for each window, and moreover, these window-by-window fits will differ from the ARMA coefficients that wouls be obtained if we fit a model to the entire time series.

In general the pipeline should be modified to give the user the option (via config) of when to apply prewhitening

give the user the option (via config) of when to apply prewhitening

kkappler · 2022-08-28T19:29:28Z

I took a crack at this on the fix_issue_60 branch, but it was not trivial. ToDo:

Setup a discussion with Gary about how to best implement this

kkappler · 2022-11-05T02:04:35Z

Created a jupyter notebook called arma_primer.ipynb which shows how to use statsmodels package to fit an arma model and handle a few gory details. At this point I am satified that given an arbitrary time series we can prewhiten it with ARMA, then FFT and then compensate for the ARMA coefficients on the frequency domain side.

The following need to be addressed:

Note: this ticket is about adding prewhitening per channel, but there has been no statement yet about whether the operation happens within a channel "per-window" or "per time series". Discussions with Gary hinted that "we may as well" just fit to the entire time series, but those discussions predated observations that the time series can get very (very) long ... i.e. larger than RAM in some cases. Because of this, I think we need to specify a prewhitening window length. The default window lengths can be either "whole time series", or "fft window" ... anything else will wind up being complicated, for what are likely diminishing returns.

Ultimately, I had hoped to show that ARMA PW on the "whole run" would yield PW quality around the same as if we had applied it "per FFT window", because in that case, I had hoped to deprecate the intrinsic windowing code and just use the built-in spectrogram method of scipy.signal. However ... showing this, in any sort of definitive way seems like a study on it's own, and is in some sense beyond the scope of the current aurora objectives.

So I think we want to

Add support in the processing metadata for this operation
- prewhitening type = "first difference", "ARMA, P, Q, fft_window", "ARMA, P, Q, total_run".
  At risk of both a performance hit, missing a novel approach, and simplicity, I will implement the 'fft_window' method first, and possibly create a follow-up ticket for 'total run'
Put a tap in the code to transform a data window into data' = {ARMA coefficients, ARMA residuals}
[ ]write a method that acts the same as run_ts_to_stft, taking as input the run_ts, and treating it as data' (or parsing it into many data' s) and returns the FFT array, with appropriate recoloring ... *this one requires careful thought,
polish the algorithm that actually computes and then applied the ARMA transformations
stitch it all together and test on the synthetic
Don't forget to look at ARIMA, which is to say, we need to decide
- whether or not to first difference the data up front, and how many times, and
- whether the differencing happens before or after calling statsmodels
  ....in fact, might it be true that a stasmodels ARIMA(0,1,0) on total run is equivalent to first difference filtering???

Finally, consider that once the time series is decomposed into white noise and an ARMA filter, the FFT of the ARMA filter is still being transformed in the ipynb with scipy.signal.freqz, which calls fft under the hood ... aren't we subject to leakage from that transformation? If so, how do we know it is less severe than fft without the ARMA decomposition? Empirically it looks like there is less leakage (high frequencies of the spectra have lower amplitudes) but it isn't obvious (to me) how exactly the leakage is being mitigated, nor by how much.

-it works for low order MA, but not for ARMA(2,3) for example Issue(s): #60

THese are incomplete and I had intended to merge them all toghether, but after noticing the bug in arma_primar, I want to get an archived copy of primer4. Primer4 doesn't suffer infinitely lfilter response because the time series is only 1024 points, not 2048 [Issue(s): #60]

kkappler added Phase 2 enhancement New feature or request labels Oct 8, 2021

kkappler mentioned this issue Nov 11, 2021

Test Autoregressive prewhitening per segment #133

Open

kkappler self-assigned this Nov 10, 2022

kkappler added a commit that referenced this issue Nov 14, 2022

Point out bug in ARMA primer

be7a30d

-it works for low order MA, but not for ARMA(2,3) for example Issue(s): #60

kkappler removed the Phase 2 label Mar 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Autoregressive prewhitening (per channel) #60

Autoregressive prewhitening (per channel) #60

kkappler commented Aug 7, 2021

kkappler commented Dec 9, 2021 •

edited

Loading

kkappler commented Aug 7, 2022 •

edited

Loading

kkappler commented Aug 28, 2022

kkappler commented Nov 5, 2022 •

edited

Loading

Autoregressive prewhitening (per channel) #60

Autoregressive prewhitening (per channel) #60

Comments

kkappler commented Aug 7, 2021

kkappler commented Dec 9, 2021 • edited Loading

kkappler commented Aug 7, 2022 • edited Loading

kkappler commented Aug 28, 2022

kkappler commented Nov 5, 2022 • edited Loading

kkappler commented Dec 9, 2021 •

edited

Loading

kkappler commented Aug 7, 2022 •

edited

Loading

kkappler commented Nov 5, 2022 •

edited

Loading