Implemented credibility_interval() #188
Would it be more appropriate to normalise as follows? Or would it be better to handle bound errors within `interp1d`? (see below)
Honestly it doesn't matter since `atol=1e-9`; this is just dealing with Python float precision. So let's stick to the simpler version to avoid looking like we're doing some actual normalisation.
I would strongly prefer for there not to be an `assert` statement (which the normalisation would fix). This kind of thing would be infuriating as part of a large automated pipeline, where floating-point errors derail a larger workflow.
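For illustration, here is a minimal sketch of the kind of normalisation being discussed, assuming the CDF is built as a cumulative sum of the sample weights (the names `weights` and `cdf` are placeholders, not the PR's actual variables):

```python
import numpy as np

# Placeholder construction of an empirical CDF from sample weights.
weights = np.random.rand(1000)   # stand-in for the samples' weights
cdf = np.cumsum(weights)
cdf /= cdf[-1]                   # normalisation removes float-precision drift

# Without the normalisation one would need something like
# `assert np.isclose(cdf[-1], 1, atol=1e-9)`, which can derail an
# automated pipeline; after normalising, the final value is exactly 1.0.
assert cdf[-1] == 1.0
```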
[...] `MCMCSamples` or `NestedSamples`), it would probably be good for all function arguments to appear in the docstring. How about a `**interpolation_kwargs`, which we then pass on to `sample_cdf`, which uses `{'kind': 'linear'}` as default but allows other interpolation kinds? Would it be possible to implement that in a way that lets one dynamically opt for the discrete empirical distribution function instead, through one of `'nearest'`, `'previous'`, or `'next'`?
Good point, updated the docstring! I would rather not add `interpolation_kwargs`, because the interpolation should not matter (thousands of data points). In any case where the interpolation does matter (10-ish points), the method is going to be wrong anyway, and there exists no good interpolation one could choose.
I agree that for a lot of samples the interpolation method hardly matters. But a lot of the discussion above was specifically about exploring different interpolation options. So I would suggest picking your favourite as default, but giving the user full flexibility. For one, this would allow the user to at least try different interpolation methods to get a sense of how much that might affect the results.
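To make the suggestion concrete, a rough sketch of how a `**interpolation_kwargs` pass-through could look; `sample_cdf` below is a self-contained stand-in (not the PR's actual implementation), and the `{'kind': 'linear'}` default mirrors the discussion above:

```python
import numpy as np
from scipy.interpolate import interp1d


def sample_cdf(samples, **interpolation_kwargs):
    """Empirical CDF of 1D samples as an interpolating function (sketch).

    Extra keyword arguments are forwarded to scipy.interpolate.interp1d,
    so e.g. kind='nearest', 'previous' or 'next' recover step-function
    variants of the empirical distribution function instead of the
    default linear interpolation.
    """
    interpolation_kwargs.setdefault('kind', 'linear')
    x = np.sort(samples)
    cdf = np.arange(1, len(x) + 1) / len(x)
    return interp1d(x, cdf, **interpolation_kwargs)


# Usage: compare linear interpolation with the discrete step function.
rng = np.random.default_rng(0)
data = rng.normal(size=1000)
F_linear = sample_cdf(data)
F_step = sample_cdf(data, kind='previous')
print(F_linear(0.0), F_step(0.0))   # both close to 0.5 for a standard normal
```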
Should the random value array `u` be part of this function? I feel like this is combining two things (sample compression and computation of the C.I.) into one function, where it would make more sense to handle them separately...? Also: shouldn't we try and fold both those uncertainties into our estimates? Currently we account for only the first point, right? If we were to use a different `u` for each iteration (i.e. for each of the `n_iter` samples), then that would contain the uncertainty from the sample compression...
The only reason for this to be here was to allow reproducible outputs, i.e. to pass through the `u` argument from `compress_samples`. But I realize that `np.random.seed(0)` before the function call already does this, so the `u` is not necessary; deleted.
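As a minimal illustration of that reproducibility point (with a made-up stand-in for the method that draws randoms internally, since the actual signature isn't shown in this thread):

```python
import numpy as np


def compress_and_estimate(x):
    """Stand-in for a method that draws random numbers internally
    (like the sample compression step); purely illustrative."""
    u = np.random.rand(len(x))        # the randoms previously passed in as `u`
    return np.mean(x[u < 0.5])


x = np.linspace(0, 1, 1000)

np.random.seed(0)                     # seed just before the call ...
first = compress_and_estimate(x)
np.random.seed(0)                     # ... and again before repeating it
second = compress_and_estimate(x)
assert first == second                # reproducible without an explicit u
```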
We cannot account for that, because the user passes only a single sample of data. If that sample is just very unlucky, there's nothing we can do to know that. I think this is the best we can do?
I agree that we don't have precise knowledge of the population. But I think we can make our results more robust to variability coming from the sample compression (now that we got rid of the `u`). If we move the line into our iteration loop, then each iteration will use a slightly different subsample, giving us a handle on the (sub-)sample variance (or maybe it is more accurate to call this the compression variance...?).
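A schematic sketch of this suggestion, with made-up helpers: `compress_samples` and `credibility_interval_1d` below are placeholders (a weighted resample and a simple percentile interval), not the PR's actual implementations. The point is only that the compression step sits inside the `n_iter` loop, so the spread of the estimates also reflects the compression variance:

```python
import numpy as np


def compress_samples(x, w, n):
    """Placeholder: draw an equal-weight subsample of size n from
    weighted samples (x, w). Not the PR's actual implementation."""
    p = w / w.sum()
    return np.random.choice(x, size=n, replace=True, p=p)


def credibility_interval_1d(x, level=0.68):
    """Placeholder: central (percentile) interval from equal-weight samples."""
    return np.percentile(x, [50 * (1 - level), 50 * (1 + level)])


def credibility_interval(x, w, level=0.68, n_iter=100, n_compress=1000):
    """Sketch: re-compress inside the loop, so the scatter of the n_iter
    estimates also captures the compression (sub-sample) variance."""
    intervals = []
    for _ in range(n_iter):
        sub = compress_samples(x, w, n_compress)   # moved inside the loop
        intervals.append(credibility_interval_1d(sub, level))
    intervals = np.array(intervals)
    return intervals.mean(axis=0), intervals.std(axis=0)


# Usage with toy weighted samples:
np.random.seed(0)
x = np.random.normal(size=5000)
w = np.random.rand(5000)
(lo, hi), (lo_err, hi_err) = credibility_interval(x, w)
print(lo, hi, lo_err, hi_err)
```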