Skip to content

Lazy-compute-ESS #52

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Sep 30, 2021
Merged

Lazy-compute-ESS #52

merged 15 commits into from
Sep 30, 2021

Conversation

ParadaCarleton
Copy link
Member

No description provided.

@ParadaCarleton
Copy link
Member Author

ParadaCarleton commented Sep 27, 2021

@sethaxen This PR adds a keyword arg for ESS calculations. If set to false, psis replaces the ESS values with a singleton matrix containing one NaN. It should avoid duplicate ESS computations and it makes the code a bit shorter; do you think it's good?

@codecov
Copy link

codecov bot commented Sep 27, 2021

Codecov Report

Merging #52 (58ef87f) into main (8650285) will decrease coverage by 0.27%.
The diff coverage is 82.75%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #52      +/-   ##
==========================================
- Coverage   82.40%   82.13%   -0.28%     
==========================================
  Files          11       11              
  Lines         341      347       +6     
==========================================
+ Hits          281      285       +4     
- Misses         60       62       +2     
Impacted Files Coverage Δ
src/ModelComparison.jl 86.79% <ø> (ø)
src/TuringHelpers.jl 100.00% <ø> (ø)
src/ImportanceSampling.jl 77.88% <80.00%> (-0.47%) ⬇️
src/ESS.jl 80.00% <100.00%> (ø)
src/GPD.jl 96.00% <100.00%> (ø)
src/LeaveOneOut.jl 84.48% <100.00%> (-0.27%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8650285...58ef87f. Read the comment docs.

@sethaxen
Copy link
Member

What I don't like about this approach is that it's still allocating a vector that the user doesn't necessarily want. I still prefer a lazy computation of the vector only if the user requests it, which can happen one of two ways:

  1. The approach in Avoid computing ess and sup_ess in psis #51, using getproperty
  2. OR overload psis_ess(::Psis) and psis_sup_ess(::Psis) and export psis_ess and psis_sup_ess

I chose (1) earlier so that the same API is supported, but I slightly prefer something like (2).

@ParadaCarleton
Copy link
Member Author

What I don't like about this approach is that it's still allocating a vector that the user doesn't necessarily want. I still prefer a lazy computation of the vector only if the user requests it, which can happen one of two ways:

  1. The approach in Avoid computing ess and sup_ess in psis #51, using getproperty
  2. OR overload psis_ess(::Psis) and psis_sup_ess(::Psis) and export psis_ess and psis_sup_ess

I chose (1) earlier so that the same API is supported, but I slightly prefer something like (2).

Do you think the extra allocation is a meaningful problem? When ess=false is passed the function allocates a singleton, so it takes ~no time or memory.

That being said, I just realized that memoizing psis_ess should accomplish the same goal, so I suppose I can do that instead if you want.

@sethaxen
Copy link
Member

Do you think the extra allocation is a meaningful problem? When ess=false is passed the function allocates a singleton, so it takes ~no time or memory.

It's not about the extra allocation, it's about complexity of the code and interface. Your proposal introduces an additional keyword argument and allocates a vector that is essentially empty, and this seems unnecessary, when you can use either of the two options I suggested. The first would do neither of these things, while the second would export one additional function; in neither case is the vector allocated unless the user requests it.

That being said, I just realized that memoizing psis_ess should accomplish the same goal, so I suppose I can do that instead if you want.

There's no need to memoize. A user is unlikely to call psis_ess on the same Psis object twice, and this just adds code complexity and a new dependency.

What do you dislike about either of the options I suggested?

@ParadaCarleton
Copy link
Member Author

Do you think the extra allocation is a meaningful problem? When ess=false is passed the function allocates a singleton, so it takes ~no time or memory.

It's not about the extra allocation, it's about complexity of the code and interface. Your proposal introduces an additional keyword argument and allocates a vector that is essentially empty, and this seems unnecessary, when you can use either of the two options I suggested. The first would do neither of these things, while the second would export one additional function; in neither case is the vector allocated unless the user requests it.

That being said, I just realized that memoizing psis_ess should accomplish the same goal, so I suppose I can do that instead if you want.

There's no need to memoize. A user is unlikely to call psis_ess on the same Psis object twice, and this just adds code complexity and a new dependency.

What do you dislike about either of the options I suggested?

I'd be very surprised if no users at all want to see the ESS twice, given the ESS displays whenever a Psis object is called in the REPL. I suppose I have three concerns:

  1. Wasted computation if the ESS is recalculated every time the user wants to take a look at the ESS values.
  2. Calculating the ESS lazily makes performance behavior very unintuitive. It's weird for running a function to be fast, but then have it be very slow to access a property of a struct, which is usually close to instant. It's especially weird for it to be very slow to read a display table in the REPL when computation is finished.
  3. I feel as though the implementation here is less complex rather than more, since the code is a bit shorter. An extra keyword argument adds almost no complexity relative to having to overload several methods.

If you have concerns about returning an empty array for psis_ess, we can make it so that trying to access these fields when they don't exist throws an error. I think 1 and 2 are obviously not a big deal for people who don't want to access the ESS, but most people should be reading those -- we want them to be checking diagnostics.

@ParadaCarleton ParadaCarleton merged commit 39b5e66 into main Sep 30, 2021
@delete-merged-branch delete-merged-branch bot deleted the Lazy-compute-ESS branch September 30, 2021 00:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants