Merge pull request #94 from JuliaTrustworthyAI/93-fixing-code-in-the-tutorials

Fixed tutorials code snippets
pat-alt authored Jun 15, 2024
2 parents 00bdfc2 + 16ed50d commit d7d0ad7
Showing 35 changed files with 13,393 additions and 11,289 deletions.
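
The change common to all three tutorial snapshots below is a new opening "Libraries" cell. Unescaped from the JSON-embedded markdown, it reads roughly as follows; the package list and the `docs` project path are taken from the diff content rather than assumed:

```julia
# Setup cell now prepended to each tutorial (reproduced from the new
# "markdown" fields in the diffs below).
using Pkg; Pkg.activate("docs")

# Import libraries
using Flux, Plots, TaijaPlotting, Random, Statistics, LaplaceRedux, LinearAlgebra
theme(:lime)
```

The multi-class tutorial's version omits `LinearAlgebra`; the cell is otherwise identical across the three files.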
4 changes: 2 additions & 2 deletions _freeze/docs/src/tutorials/logit/execute-results/md.json
@@ -1,8 +1,8 @@
{
"hash": "82a273f4f8234df1eacbcd8ef1235d6c",
"hash": "885baddb8b06f5422ad14af1d1cccd19",
"result": {
"engine": "jupyter",
"markdown": "```@meta\nCurrentModule = LaplaceRedux\n```\n\n# Bayesian Logistic Regression\n\n\n\nWe will use synthetic data with linearly separable samples:\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code}\n# Number of points to generate.\nxs, ys = LaplaceRedux.Data.toy_data_linear(100)\nX = hcat(xs...) # bring into tabular format\ndata = zip(xs,ys)\n```\n:::\n\n\nLogistic regression with weight decay can be implemented in Flux.jl as a single dense (linear) layer with binary logit crossentropy loss:\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\nnn = Chain(Dense(2,1))\nλ = 0.5\nsqnorm(x) = sum(abs2, x)\nweight_regularization(λ=λ) = 1/2 * λ^2 * sum(sqnorm, Flux.params(nn))\nloss(x, y) = Flux.Losses.logitbinarycrossentropy(nn(x), y) + weight_regularization()\n```\n:::\n\n\nThe code below simply trains the model. After about 50 training epochs training loss stagnates.\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\nusing Flux.Optimise: update!, Adam\nopt = Adam()\nepochs = 50\navg_loss(data) = mean(map(d -> loss(d[1],d[2]), data))\nshow_every = epochs/10\n\nfor epoch = 1:epochs\n for d in data\n gs = gradient(Flux.params(nn)) do\n l = loss(d...)\n end\n update!(opt, Flux.params(nn), gs)\n end\n if epoch % show_every == 0\n println(\"Epoch \" * string(epoch))\n @show avg_loss(data)\n end\nend\n```\n:::\n\n\n## Laplace approximation\n\nLaplace approximation for the posterior predictive can be implemented as follows:\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nla = Laplace(nn; likelihood=:classification, λ=λ, subset_of_weights=:last_layer)\nfit!(la, data)\nla_untuned = deepcopy(la) # saving for plotting\noptimize_prior!(la; verbose=true, n_steps=500)\n```\n:::\n\n\nThe plot below shows the resulting posterior predictive surface for the plugin estimator (left) and the Laplace approximation (right).\n\n\n\n",
"markdown": "```@meta\nCurrentModule = LaplaceRedux\n```\n\n# Bayesian Logistic Regression\n\n## Libraries\n\n::: {.cell execution_count=1}\n``` {.julia .cell-code}\nusing Pkg; Pkg.activate(\"docs\")\n# Import libraries\nusing Flux, Plots, TaijaPlotting, Random, Statistics, LaplaceRedux, LinearAlgebra\ntheme(:lime)\n```\n:::\n\n\n## Data\n\nWe will use synthetic data with linearly separable samples:\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code}\n# Number of points to generate.\nxs, ys = LaplaceRedux.Data.toy_data_linear(100)\nX = hcat(xs...) # bring into tabular format\ndata = zip(xs,ys)\n```\n:::\n\n\n## Model\n\nLogistic regression with weight decay can be implemented in Flux.jl as a single dense (linear) layer with binary logit crossentropy loss:\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\nnn = Chain(Dense(2,1))\nλ = 0.5\nsqnorm(x) = sum(abs2, x)\nweight_regularization(λ=λ) = 1/2 * λ^2 * sum(sqnorm, Flux.params(nn))\nloss(x, y) = Flux.Losses.logitbinarycrossentropy(nn(x), y) + weight_regularization()\n```\n:::\n\n\nThe code below simply trains the model. After about 50 training epochs training loss stagnates.\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\nusing Flux.Optimise: update!, Adam\nopt = Adam()\nepochs = 50\navg_loss(data) = mean(map(d -> loss(d[1],d[2]), data))\nshow_every = epochs/10\n\nfor epoch = 1:epochs\n for d in data\n gs = gradient(Flux.params(nn)) do\n l = loss(d...)\n end\n update!(opt, Flux.params(nn), gs)\n end\n if epoch % show_every == 0\n println(\"Epoch \" * string(epoch))\n @show avg_loss(data)\n end\nend\n```\n:::\n\n\n## Laplace approximation\n\nLaplace approximation for the posterior predictive can be implemented as follows:\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nla = Laplace(nn; likelihood=:classification, λ=λ, subset_of_weights=:last_layer)\nfit!(la, data)\nla_untuned = deepcopy(la) # saving for plotting\noptimize_prior!(la; verbose=true, n_steps=500)\n```\n:::\n\n\nThe plot below shows the resulting posterior predictive surface for the plugin estimator (left) and the Laplace approximation (right).\n\n::: {.cell execution_count=6}\n``` {.julia .cell-code}\nzoom = 0\np_plugin = plot(la, X, ys; title=\"Plugin\", link_approx=:plugin, clim=(0,1))\np_untuned = plot(la_untuned, X, ys; title=\"LA - raw (λ=$(unique(diag(la_untuned.prior.P₀))[1]))\", clim=(0,1), zoom=zoom)\np_laplace = plot(la, X, ys; title=\"LA - tuned (λ=$(round(unique(diag(la.prior.P₀))[1],digits=2)))\", clim=(0,1), zoom=zoom)\nplot(p_plugin, p_untuned, p_laplace, layout=(1,3), size=(1700,400))\n```\n:::\n\n\n",
"supporting": [
"logit_files"
],
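For readability, the plotting cell that this snapshot adds to the logit tutorial is reproduced below, unescaped from the new "markdown" field above. It assumes the `la`, `la_untuned`, `X` and `ys` objects created by the tutorial's earlier cells.

```julia
# Unescaped from the logit tutorial's new plotting cell; `la`, `la_untuned`,
# `X` and `ys` come from the preceding cells of that tutorial.
using Plots, TaijaPlotting, LinearAlgebra

zoom = 0
p_plugin = plot(la, X, ys; title="Plugin", link_approx=:plugin, clim=(0, 1))
p_untuned = plot(la_untuned, X, ys;
    title="LA - raw (λ=$(unique(diag(la_untuned.prior.P₀))[1]))", clim=(0, 1), zoom=zoom)
p_laplace = plot(la, X, ys;
    title="LA - tuned (λ=$(round(unique(diag(la.prior.P₀))[1], digits=2)))", clim=(0, 1), zoom=zoom)
plot(p_plugin, p_untuned, p_laplace, layout=(1, 3), size=(1700, 400))
```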
572 changes: 572 additions & 0 deletions _freeze/docs/src/tutorials/logit/figure-commonmark/cell-output-1.svg
6 changes: 3 additions & 3 deletions _freeze/docs/src/tutorials/mlp/execute-results/md.json
@@ -1,10 +1,10 @@
{
"hash": "aabc49848897fcaeba97ed162f03b881",
"hash": "979c6c2ea309744c0cf978cf3346899a",
"result": {
"engine": "jupyter",
"markdown": "```@meta\nCurrentModule = LaplaceRedux\n```\n\n# Bayesian MLP\n\n\n\nThis time we use a synthetic dataset containing samples that are not linearly separable:\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code}\n# Number of points to generate.\nxs, ys = LaplaceRedux.Data.toy_data_non_linear(200)\nX = hcat(xs...) # bring into tabular format\ndata = zip(xs,ys)\n```\n:::\n\n\nFor the classification task we build a neural network with weight decay composed of a single hidden layer.\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\nn_hidden = 10\nD = size(X,1)\nnn = Chain(\n Dense(D, n_hidden, σ),\n Dense(n_hidden, 1)\n) \nloss(x, y) = Flux.Losses.logitbinarycrossentropy(nn(x), y) \n```\n:::\n\n\nThe model is trained until training loss stagnates.\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\nusing Flux.Optimise: update!, Adam\nopt = Adam(1e-3)\nepochs = 100\navg_loss(data) = mean(map(d -> loss(d[1],d[2]), data))\nshow_every = epochs/10\n\nfor epoch = 1:epochs\n for d in data\n gs = gradient(Flux.params(nn)) do\n l = loss(d...)\n end\n update!(opt, Flux.params(nn), gs)\n end\n if epoch % show_every == 0\n println(\"Epoch \" * string(epoch))\n @show avg_loss(data)\n end\nend\n```\n:::\n\n\n## Laplace Approximation\n\nLaplace approximation can be implemented as follows:\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nla = Laplace(nn; likelihood=:classification, subset_of_weights=:all)\nfit!(la, data)\nla_untuned = deepcopy(la) # saving for plotting\noptimize_prior!(la; verbose=true, n_steps=500)\n```\n:::\n\n\nThe plot below shows the resulting posterior predictive surface for the plugin estimator (left) and the Laplace approximation (right).\n\n::: {.cell execution_count=6}\n``` {.julia .cell-code}\n# Plot the posterior distribution with a contour plot.\nzoom=0\np_plugin = plot(la, X, ys; title=\"Plugin\", link_approx=:plugin, clim=(0,1))\np_untuned = plot(la_untuned, X, ys; title=\"LA - raw (λ=$(unique(diag(la_untuned.prior.P₀))[1]))\", clim=(0,1), zoom=zoom)\np_laplace = plot(la, X, ys; title=\"LA - tuned (λ=$(round(unique(diag(la.prior.P₀))[1],digits=2)))\", clim=(0,1), zoom=zoom)\nplot(p_plugin, p_untuned, p_laplace, layout=(1,3), size=(1700,400))\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n![](mlp_files/figure-commonmark/cell-7-output-1.svg){}\n:::\n:::\n\n\nZooming out we can note that the plugin estimator produces high-confidence estimates in regions scarce of any samples. The Laplace approximation is much more conservative about these regions.\n\n::: {.cell execution_count=7}\n``` {.julia .cell-code}\nzoom=-50\np_plugin = plot(la, X, ys; title=\"Plugin\", link_approx=:plugin, clim=(0,1))\np_untuned = plot(la_untuned, X, ys; title=\"LA - raw (λ=$(unique(diag(la_untuned.prior.P₀))[1]))\", clim=(0,1), zoom=zoom)\np_laplace = plot(la, X, ys; title=\"LA - tuned (λ=$(round(unique(diag(la.prior.P₀))[1],digits=2)))\", clim=(0,1), zoom=zoom)\nplot(p_plugin, p_untuned, p_laplace, layout=(1,3), size=(1700,400))\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n![](mlp_files/figure-commonmark/cell-8-output-1.svg){}\n:::\n:::\n\n\n",
"markdown": "```@meta\nCurrentModule = LaplaceRedux\n```\n\n# Bayesian MLP\n\n## Libraries\n\n::: {.cell execution_count=1}\n``` {.julia .cell-code}\nusing Pkg; Pkg.activate(\"docs\")\n# Import libraries\nusing Flux, Plots, TaijaPlotting, Random, Statistics, LaplaceRedux, LinearAlgebra\ntheme(:lime)\n```\n:::\n\n\n## Data\n\nThis time we use a synthetic dataset containing samples that are not linearly separable:\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code}\n# Number of points to generate.\nxs, ys = LaplaceRedux.Data.toy_data_non_linear(200)\nX = hcat(xs...) # bring into tabular format\ndata = zip(xs,ys)\n```\n:::\n\n\n## Model\nFor the classification task we build a neural network with weight decay composed of a single hidden layer.\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\nn_hidden = 10\nD = size(X,1)\nnn = Chain(\n Dense(D, n_hidden, σ),\n Dense(n_hidden, 1)\n) \nloss(x, y) = Flux.Losses.logitbinarycrossentropy(nn(x), y) \n```\n:::\n\n\nThe model is trained until training loss stagnates.\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\nusing Flux.Optimise: update!, Adam\nopt = Adam(1e-3)\nepochs = 100\navg_loss(data) = mean(map(d -> loss(d[1],d[2]), data))\nshow_every = epochs/10\n\nfor epoch = 1:epochs\n for d in data\n gs = gradient(Flux.params(nn)) do\n l = loss(d...)\n end\n update!(opt, Flux.params(nn), gs)\n end\n if epoch % show_every == 0\n println(\"Epoch \" * string(epoch))\n @show avg_loss(data)\n end\nend\n```\n:::\n\n\n## Laplace Approximation\n\nLaplace approximation can be implemented as follows:\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nla = Laplace(nn; likelihood=:classification, subset_of_weights=:all)\nfit!(la, data)\nla_untuned = deepcopy(la) # saving for plotting\noptimize_prior!(la; verbose=true, n_steps=500)\n```\n:::\n\n\nThe plot below shows the resulting posterior predictive surface for the plugin estimator (left) and the Laplace approximation (right).\n\n::: {.cell execution_count=6}\n``` {.julia .cell-code}\n# Plot the posterior distribution with a contour plot.\nzoom=0\np_plugin = plot(la, X, ys; title=\"Plugin\", link_approx=:plugin, clim=(0,1))\np_untuned = plot(la_untuned, X, ys; title=\"LA - raw (λ=$(unique(diag(la_untuned.prior.P₀))[1]))\", clim=(0,1), zoom=zoom)\np_laplace = plot(la, X, ys; title=\"LA - tuned (λ=$(round(unique(diag(la.prior.P₀))[1],digits=2)))\", clim=(0,1), zoom=zoom)\nplot(p_plugin, p_untuned, p_laplace, layout=(1,3), size=(1700,400))\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n![](mlp_files/figure-commonmark/cell-7-output-1.svg){}\n:::\n:::\n\n\nZooming out we can note that the plugin estimator produces high-confidence estimates in regions scarce of any samples. The Laplace approximation is much more conservative about these regions.\n\n::: {.cell execution_count=7}\n``` {.julia .cell-code}\nzoom=-50\np_plugin = plot(la, X, ys; title=\"Plugin\", link_approx=:plugin, clim=(0,1))\np_untuned = plot(la_untuned, X, ys; title=\"LA - raw (λ=$(unique(diag(la_untuned.prior.P₀))[1]))\", clim=(0,1), zoom=zoom)\np_laplace = plot(la, X, ys; title=\"LA - tuned (λ=$(round(unique(diag(la.prior.P₀))[1],digits=2)))\", clim=(0,1), zoom=zoom)\nplot(p_plugin, p_untuned, p_laplace, layout=(1,3), size=(1700,400))\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n![](mlp_files/figure-commonmark/cell-8-output-1.svg){}\n:::\n:::\n\n\n",
"supporting": [
"mlp_files"
"mlp_files\\figure-commonmark"
],
"filters": []
}
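The training loop embedded (escaped) in the MLP snapshot above is shown below in plain Julia for reference. It assumes the `nn`, `loss` and `data` objects defined in the tutorial's earlier cells and uses Flux's implicit-parameter API, exactly as in the source.

```julia
# Unescaped from the MLP tutorial's training cell; `nn`, `loss` and `data`
# come from the preceding cells of that tutorial.
using Flux, Statistics
using Flux.Optimise: update!, Adam

opt = Adam(1e-3)
epochs = 100
avg_loss(data) = mean(map(d -> loss(d[1], d[2]), data))
show_every = epochs / 10

for epoch = 1:epochs
    for d in data
        gs = gradient(Flux.params(nn)) do
            l = loss(d...)
        end
        update!(opt, Flux.params(nn), gs)
    end
    if epoch % show_every == 0
        println("Epoch " * string(epoch))
        @show avg_loss(data)
    end
end
```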
1,766 changes: 878 additions & 888 deletions _freeze/docs/src/tutorials/mlp/figure-commonmark/cell-7-output-1.svg
1,962 changes: 998 additions & 964 deletions _freeze/docs/src/tutorials/mlp/figure-commonmark/cell-8-output-1.svg
6 changes: 3 additions & 3 deletions _freeze/docs/src/tutorials/multi/execute-results/md.json
@@ -1,10 +1,10 @@
{
"hash": "989e65afa56b460667f2aecfad7f1ce4",
"hash": "1f8a1e12de094fb4dac9c6a4336bb51d",
"result": {
"engine": "jupyter",
"markdown": "---\ntitle: Multi-class problem\n---\n\n\n\n\n\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code}\nusing LaplaceRedux.Data\nx, y = Data.toy_data_multi()\nX = hcat(x...)\ny_train = Flux.onehotbatch(y, unique(y))\ny_train = Flux.unstack(y_train',1)\n```\n:::\n\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\ndata = zip(x,y_train)\nn_hidden = 3\nD = size(X,1)\nout_dim = length(unique(y))\nnn = Chain(\n Dense(D, n_hidden, σ),\n Dense(n_hidden, out_dim)\n) \nloss(x, y) = Flux.Losses.logitcrossentropy(nn(x), y)\n```\n:::\n\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\nusing Flux.Optimise: update!, Adam\nopt = Adam()\nepochs = 100\navg_loss(data) = mean(map(d -> loss(d[1],d[2]), data))\nshow_every = epochs/10\n\nfor epoch = 1:epochs\n for d in data\n gs = gradient(Flux.params(nn)) do\n l = loss(d...)\n end\n update!(opt, Flux.params(nn), gs)\n end\n if epoch % show_every == 0\n println(\"Epoch \" * string(epoch))\n @show avg_loss(data)\n end\nend\n```\n:::\n\n\n## Laplace Approximation\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nla = Laplace(nn; likelihood=:classification)\nfit!(la, data)\noptimize_prior!(la; verbose=true, n_steps=100)\n```\n:::\n\n\n::: {.cell execution_count=6}\n``` {.julia .cell-code}\n_labels = sort(unique(y))\nplt_list = []\nfor target in _labels\n plt = plot(la, X, y; target=target, clim=(0,1))\n push!(plt_list, plt)\nend\nplot(plt_list...)\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n![](multi_files/figure-commonmark/cell-7-output-1.svg){}\n:::\n:::\n\n\n::: {.cell execution_count=7}\n``` {.julia .cell-code}\n_labels = sort(unique(y))\nplt_list = []\nfor target in _labels\n plt = plot(la, X, y; target=target, clim=(0,1), link_approx=:plugin)\n push!(plt_list, plt)\nend\nplot(plt_list...)\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n![](multi_files/figure-commonmark/cell-8-output-1.svg){}\n:::\n:::\n\n\n",
"markdown": "---\ntitle: Multi-class problem\n---\n\n\n\n## Libraries\n\n\n::: {.cell execution_count=1}\n``` {.julia .cell-code}\nusing Pkg; Pkg.activate(\"docs\")\n# Import libraries\nusing Flux, Plots, TaijaPlotting, Random, Statistics, LaplaceRedux\ntheme(:lime)\n```\n:::\n\n\n## Data\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code}\nusing LaplaceRedux.Data\nx, y = Data.toy_data_multi()\nX = hcat(x...)\ny_train = Flux.onehotbatch(y, unique(y))\ny_train = Flux.unstack(y_train',1)\n```\n:::\n\n\n## MLP\n\nWe set up a model\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\ndata = zip(x,y_train)\nn_hidden = 3\nD = size(X,1)\nout_dim = length(unique(y))\nnn = Chain(\n Dense(D, n_hidden, σ),\n Dense(n_hidden, out_dim)\n) \nloss(x, y) = Flux.Losses.logitcrossentropy(nn(x), y)\n```\n:::\n\n\ntraining:\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\nusing Flux.Optimise: update!, Adam\nopt = Adam()\nepochs = 100\navg_loss(data) = mean(map(d -> loss(d[1],d[2]), data))\nshow_every = epochs/10\n\nfor epoch = 1:epochs\n for d in data\n gs = gradient(Flux.params(nn)) do\n l = loss(d...)\n end\n update!(opt, Flux.params(nn), gs)\n end\n if epoch % show_every == 0\n println(\"Epoch \" * string(epoch))\n @show avg_loss(data)\n end\nend\n```\n:::\n\n\n## Laplace Approximation\n\nThe Laplace approximation can be implemented as follows:\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nla = Laplace(nn; likelihood=:classification)\nfit!(la, data)\noptimize_prior!(la; verbose=true, n_steps=100)\n```\n:::\n\n\n::: {.cell execution_count=6}\n``` {.julia .cell-code}\n_labels = sort(unique(y))\nplt_list = []\nfor target in _labels\n plt = plot(la, X, y; target=target, clim=(0,1))\n push!(plt_list, plt)\nend\nplot(plt_list...)\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n![](multi_files/figure-commonmark/cell-7-output-1.svg){}\n:::\n:::\n\n\n::: {.cell execution_count=7}\n``` {.julia .cell-code}\n_labels = sort(unique(y))\nplt_list = []\nfor target in _labels\n plt = plot(la, X, y; target=target, clim=(0,1), link_approx=:plugin)\n push!(plt_list, plt)\nend\nplot(plt_list...)\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n![](multi_files/figure-commonmark/cell-8-output-1.svg){}\n:::\n:::\n\n\n",
"supporting": [
"multi_files"
"multi_files\\figure-commonmark"
],
"filters": []
}
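Finally, the core LaplaceRedux workflow that all three tutorials demonstrate is reproduced here from the multi-class snapshot above; it assumes a trained Flux classifier `nn` and a `data` iterator of `(x, y)` pairs as built in the tutorial cells.

```julia
# Unescaped from the multi-class tutorial's Laplace cell; `nn` and `data`
# come from the preceding cells of that tutorial.
using LaplaceRedux

la = Laplace(nn; likelihood=:classification)    # wrap the trained classifier
fit!(la, data)                                  # fit the Laplace approximation
optimize_prior!(la; verbose=true, n_steps=100)  # tune the prior precision
```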
(Remaining changed files not shown.)
