90 average calibration functions in utils.jl #97

Closed
wants to merge 48 commits into from
Commits
733312a
function empirical_frequency
pasq-cat Jun 9, 2024
25ea642
fixed the docstring.
pasq-cat Jun 9, 2024
c290ed8
added sharpness and binary classification. i have yet to test them pr…
pasq-cat Jun 14, 2024
4ff22f4
added trapz to the list of dependencies.
pasq-cat Jun 15, 2024
6a22210
added Distributions to theproject
pasq-cat Jun 15, 2024
df3d60d
working version
pasq-cat Jun 15, 2024
09f25e8
ops forgot to add sharpness for the classification case
pasq-cat Jun 15, 2024
07b318f
working release.. changed changelog, glm_predictive_distribution, pr…
pasq-cat Jun 21, 2024
eafa7bd
function empirical_frequency
pasq-cat Jun 9, 2024
f66e08e
fixed the docstring.
pasq-cat Jun 9, 2024
5355281
added sharpness and binary classification. i have yet to test them pr…
pasq-cat Jun 14, 2024
2efaa99
added trapz to the list of dependencies.
pasq-cat Jun 15, 2024
26643ee
added Distributions to theproject
pasq-cat Jun 15, 2024
b79ca39
working version
pasq-cat Jun 15, 2024
0d71736
ops forgot to add sharpness for the classification case
pasq-cat Jun 15, 2024
5f772cf
working release.. changed changelog, glm_predictive_distribution, pr…
pasq-cat Jun 21, 2024
d146d1d
Merge branch '90-average-calibration-in-utilsjl' of https://github.co…
pasq-cat Jun 21, 2024
7af9378
changed docstrings in predicting.jl
pasq-cat Jun 21, 2024
2c42236
fixed glm_predictive_distribution
pasq-cat Jun 22, 2024
9d67ddc
Update src/utils.jl
pasq-cat Jun 22, 2024
9f07583
Update src/utils.jl
pasq-cat Jun 22, 2024
f81d226
Update src/utils.jl
pasq-cat Jun 22, 2024
6cdc503
Update src/baselaplace/predicting.jl
pasq-cat Jun 22, 2024
89bb19b
Update src/baselaplace/predicting.jl
pasq-cat Jun 22, 2024
6fe01a2
JuliaFormatter
pasq-cat Jun 22, 2024
0bba488
fixed docstrings
pasq-cat Jun 23, 2024
8311de3
made docstrings a lil bit shorter
pasq-cat Jun 23, 2024
7837333
docstrings again (added output)
pasq-cat Jun 24, 2024
b0518b2
fixed binary classification case, exported function from utils.
pasq-cat Jun 24, 2024
6a9ee1b
juliaformatter
pasq-cat Jun 24, 2024
203513d
add n_bins as argument to functions
pasq-cat Jun 29, 2024
dce9bdb
ops forgot default value
pasq-cat Jun 29, 2024
b906c3b
ops forgot default value and removed a line
pasq-cat Jun 29, 2024
2059bed
Merge branch '90-average-calibration-in-utilsjl' of https://github.co…
pasq-cat Jun 29, 2024
3258618
juliaformatter----
pasq-cat Jun 29, 2024
c86dc25
fixed small error in pred_avg
pasq-cat Jun 30, 2024
3d2ebd6
fixed error in empirical_frequency_regression
pasq-cat Jun 30, 2024
4ab04f6
Update src/utils.jl
pasq-cat Jun 30, 2024
267b8f4
docstrings fixes and predict update
pasq-cat Jul 2, 2024
d188daf
fixed typos
pasq-cat Jul 2, 2024
270b70a
moved sharpness functions units tests in calibration.jl. changed run…
pasq-cat Jul 2, 2024
3320063
more sharpness unit tests
pasq-cat Jul 2, 2024
3750dbe
fixes and more unit tests
pasq-cat Jul 2, 2024
39d4bdc
small stuff
pasq-cat Jul 3, 2024
56c3b66
fix. there is still an issue with the shape of the input to use.
pasq-cat Jul 3, 2024
908c804
fixed logit.md ,moved functions to new file, removed changes to predi…
pasq-cat Jul 4, 2024
f468803
removed calibration_plots.md
pasq-cat Jul 4, 2024
459b2fe
test plot
pasq-cat Jul 4, 2024
working version
pasq-cat committed Jun 21, 2024
commit b79ca39273a9c11ab41656d4f208cbd38e172136
1 change: 1 addition & 0 deletions src/baselaplace/predicting.jl
Member


I guess we could just keep this consistent and return everything in both cases?

Member


Edit: my bad. As discussed, let's just add an option for classification to return a distribution. By default we should still return probabilities for now, but at least we offer the option and document it in the docstring.
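
A minimal sketch of what such an option might look like, for illustration only. The function name `classification_output_sketch` and the keyword `ret_distr` are hypothetical and are not taken from this PR; the sketch assumes Distributions.jl is installed and that the binary case is represented by `Bernoulli` distributions.

```julia
using Distributions  # assumed to be available in the environment

# Hypothetical helper, not the package's API: returns plain probabilities by
# default and Bernoulli distributions when `ret_distr=true`.
function classification_output_sketch(probs::AbstractVector{<:Real}; ret_distr::Bool=false)
    all(p -> 0 <= p <= 1, probs) ||
        throw(ArgumentError("probabilities must lie in [0, 1]"))
    # Default branch keeps the existing behaviour: a vector of probabilities.
    return ret_distr ? [Bernoulli(p) for p in probs] : collect(probs)
end
```

Keeping probabilities as the default means existing callers are unaffected, while callers that need distribution objects opt in explicitly.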

@@ -1,3 +1,4 @@
using Distributions
"""
functional_variance(la::AbstractLaplace, 𝐉::AbstractArray)

50 changes: 26 additions & 24 deletions src/utils.jl
@@ -94,41 +94,43 @@ We group the p_t into intervals I-j for j= 1,2,...,m that form a partition of [0
the observed average p_j= T^-1_j ∑_{t:p_t ∈ I_j} y_j in each interval I_j.
The function was suggested by Kuleshov(2018) in https://arxiv.org/abs/1807.00263
Arguments:
Y_val: the array of outputs y_t numerically coded . 1 for the target class, 0 for the negative result.
y_binary: the array of outputs y_t numerically coded . 1 for the target class, 0 for the negative result.
sampled_distributions: an array of sampled distributions stacked column-wise where in the first row
there is the probability for the target class y_1=1 and in the second row y_0=0.
"""
function empirical_frequency_binary_classification(Y_cal,sampled_distributions)
function empirical_frequency_binary_classification(y_binary,sampled_distributions)

#unique_elements = unique(Y_cal)
# Create the mapping
#mapping = Dict(unique_elements[1] => 0, unique_elements[2] => 1)

# Convert categorical data to numeric data
#numeric_array = [mapping[c] for c in categorical_array]

# Create bins
num_bins=10
bins = range(0, stop=1, length=num_bins+1)

# Initialize arrays to hold predicted and empirical averages
pred_avg = zeros(num_bins)
emp_avg = zeros(num_bins)
total_pj_per_intervalj = zeros(num_bins)
pred_avg= collect(range(0,step=0.1,stop=0.9))
emp_avg = []
total_pj_per_intervalj = []
class_probs = sampled_distributions[1, :]

class_indices = (Y_cal .== 1)
for j in 1:10
j_float = j / 10.0 -0.1
push!(total_pj_per_intervalj,sum( j_float.<class_probs.<j_float+0.1))


if total_pj_per_intervalj[j]== 0
#println("it's zero $j")
push!(emp_avg, 0)
#push!(pred_avg, 0)
else
indices = findall(x -> j_float < x <j_float+0.1, class_probs)



push!(emp_avg, 1/total_pj_per_intervalj[j] * sum(y_binary[indices]))
println(" numero $j")
pred_avg[j] = 1/total_pj_per_intervalj[j] * sum(sampled_distributions[1,indices])
end

end

class_probs = sampled_distributions[1, :]

for j in 0:0.1:0.9

push!(total_pj_per_intervalj,sum( j<class_probs<j+0.1))

push!(emp_avg, 1/total_pj_per_intervalj[j] * sum( Int.( j<class_probs<j+0.1).*Y_val ) )
push!(pred_avg, 1/total_pj_per_intervalj[j] * sum( Int.( j<class_probs<j+0.1).*sampled_distributions[1,:] ) )


end
return (total_pj_per_intervalj,emp_avg,pred_avg)
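
The binning step described in the docstring of this hunk can be sketched as a standalone function. This is an illustrative rewrite under stated assumptions, not the code committed in the PR: the name `binned_calibration_sketch` and the `n_bins` keyword are hypothetical here, it takes the vector of target-class probabilities directly, and it uses half-open bins `[lo, hi)` with the last bin closed, whereas the loop in the diff uses strict inequalities on both sides.

```julia
# Hypothetical helper: groups predicted probabilities into equal-width bins
# and returns, per bin, the count, the observed frequency of y == 1, and the
# mean predicted probability. Empty bins are reported as 0.
function binned_calibration_sketch(y_binary::AbstractVector{<:Real},
                                   class_probs::AbstractVector{<:Real};
                                   n_bins::Int=10)
    length(y_binary) == length(class_probs) ||
        throw(ArgumentError("y_binary and class_probs must have the same length"))

    edges = range(0, stop=1, length=n_bins + 1)
    counts = zeros(Int, n_bins)
    emp_avg = zeros(n_bins)
    pred_avg = zeros(n_bins)

    for j in 1:n_bins
        lo, hi = edges[j], edges[j + 1]
        # Half-open bins [lo, hi); the last bin also includes p == 1.
        in_bin = j == n_bins ? (lo .<= class_probs .<= hi) :
                               (lo .<= class_probs .< hi)
        counts[j] = count(in_bin)
        if counts[j] > 0
            emp_avg[j] = sum(y_binary[in_bin]) / counts[j]
            pred_avg[j] = sum(class_probs[in_bin]) / counts[j]
        end
    end
    return (counts, emp_avg, pred_avg)
end
```

Preallocating the three output vectors and indexing them by the integer bin number avoids indexing by a floating-point bin edge and avoids growing untyped arrays with `push!`.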