Added the No-U-Turn sampler #132

Open

wants to merge 22 commits into base: advancedHMC

Changes from 3 commits

Commits (22)
bd90b03
changed name of previous hmc routine, changed to column major order
Red-Portal Dec 21, 2019
891eb85
Added effective size diagnostic, StatsBase dependecy
Red-Portal Dec 21, 2019
7df6981
added the No-U-Turn sampler, nuts, based on AdvancedHMC.jl
Red-Portal Dec 21, 2019
f968d5c
CompatHelper: bump compat for "Distributions" to "0.22"
github-actions[bot] Jan 6, 2020
26bf6a5
fix Documenter versions conflict
maximerischard Jan 12, 2020
cb72748
CompatHelper: bump compat for "SpecialFunctions" to "0.10"
github-actions[bot] Jan 20, 2020
3d1d1e0
Update Project.toml
fairbrot Jan 22, 2020
9561252
add a sinusoid to underlying function in optim tests
maximerischard Jan 25, 2020
6090cb7
CompatHelper: bump compat for "Optim" to "0.20"
github-actions[bot] Jan 23, 2020
1bf5f47
CompatHelper: bump compat for "RecipesBase" to "0.8"
github-actions[bot] Feb 6, 2020
db4b978
Merge branch 'master' of https://github.com/STOR-i/GaussianProcesses.…
Red-Portal Feb 10, 2020
0c56059
removed effective sample size calculation, StatsBase dependency
Red-Portal Feb 10, 2020
aee587e
removed StatsBase dependency
Red-Portal Feb 10, 2020
5c7b351
removed effective sample size calculation, StatsBase dependency
Red-Portal Feb 10, 2020
9a4ebc7
Merge branch 'advancedHMC' of github.com:Red-Portal/GaussianProcesses…
Red-Portal Feb 10, 2020
4f5d333
fixed bug in the elliptical slice implementation
Red-Portal Feb 10, 2020
a2ccbfb
Merge pull request #138 from STOR-i/compathelper/new_version/2020-02-…
chris-nemeth Feb 10, 2020
f8b65e3
Merge pull request #139 from Red-Portal/master
thomaspinder Feb 11, 2020
139a91b
Update notebook with ESS
thomaspinder Feb 11, 2020
40f3017
Version update
fairbrot Feb 12, 2020
a80e44c
fixed StanHMCAdaptor deprication warning
Red-Portal Mar 11, 2020
f3795e7
Merge branch 'master' of https://github.com/STOR-i/GaussianProcesses.…
Red-Portal Mar 11, 2020
1 change: 1 addition & 0 deletions Project.toml
@@ -23,6 +23,7 @@ ScikitLearnBase = "6e75b9c4-186b-50bd-896f-2d2496a4843e"
SpecialFunctions = "276daf66-3868-5448-9aa4-cd146d93841b"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
StatsFuns = "4c63d2b9-4356-54db-8cca-17b64c39e42c"
Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"

3 changes: 2 additions & 1 deletion src/GaussianProcesses.jl
@@ -6,6 +6,7 @@ using StatsFuns, SpecialFunctions

using LinearAlgebra, Printf, Random, Statistics
import Statistics: mean, cov
import StatsBase: autocor!
import Base: size
import PDMats: dim, Matrix, diag, pdadd!, *, \, inv, logdet, eigmax, eigmin, whiten!, unwhiten!, quad, quad!, invquad, invquad!, X_A_Xt, Xt_A_X, X_invA_Xt, Xt_invA_X

@@ -14,7 +15,7 @@ import PDMats: dim, Matrix, diag, pdadd!, *, \, inv, logdet, eigmax, eigmin, whi
export GPBase, GP, GPE, GPA, ElasticGPE, Approx, predict_f, predict_y, Kernel, Likelihood, CompositeKernel, SumKernel, ProdKernel, Masked, FixedKernel, fix, Noise, Const, SE, SEIso, SEArd, Periodic, Poly, RQ, RQIso, RQArd, Lin, LinIso, LinArd, Matern, Mat12Iso, Mat12Ard, Mat32Iso, Mat32Ard, Mat52Iso, Mat52Ard, #kernel functions
MeanZero, MeanConst, MeanLin, MeanPoly, SumMean, ProdMean, MeanPeriodic, #mean functions
GaussLik, BernLik, ExpLik, StuTLik, PoisLik, BinLik, #likelihood functions
mcmc, ess, lss, optimize!, vi, var_exp, dv_var_exp, elbo, initialise_Q, #inference functions
mcmc, ess, hmc, nuts, nuts_hamiltonian, lss, optimize!, vi, var_exp, dv_var_exp, elbo, initialise_Q, #inference functions
set_priors!,set_params!, update_target!, autodiff, update_Q!
using ForwardDiff: GradientConfig, Dual, partials, copyto!, Chunk
import ForwardDiff: seed!
131 changes: 115 additions & 16 deletions src/mcmc.jl
@@ -1,13 +1,111 @@
using ProgressMeter

"""
mcmc(gp::GPBase; kwargs...)
effective_sample_size(chain::AbstractMatrix)

Runs Hamiltonian Monte Carlo algorithm for estimating the hyperparameters of Gaussian process `GPE` and the latent function in the case of `GPA`.
Routine for computing the effective sample size, following

Gelman, Andrew, et al., 2013. Bayesian Data Analysis, 3rd ed.

The samples are assumed to be in column-major order.
"""
function mcmc(gp::GPBase; nIter::Int=1000, burn::Int=1, thin::Int=1, ε::Float64=0.1,

function effective_sample_size(X::AbstractMatrix)
Review comment (Contributor): Why does compute_ess need to be defined inside of effective_sample_size?

Reply (Contributor, PR author): The name is indeed confusing. I'll change it.

function compute_ess(ρ_scalar)
τ_inv = 1 + 2 * ρ_scalar[1]
K = 2
for k = 2:2:N-2
Δ = ρ_scalar[k] + ρ_scalar[k + 1]
if Δ < 0
break
else
τ_inv += 2*Δ
end
end
return min(1 / τ_inv, one(τ_inv))
end

N = size(X, 1)
lags = collect(1:N-1)
ρ = zeros(length(lags), size(X, 2))
autocor!(ρ, X, lags)
return N * [compute_ess(ρ[:,i]) for i = 1:size(X,2)]
end
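
For orientation, the truncation in compute_ess above implements, per column of `X`, the standard estimator from the reference in the docstring; roughly,

\[
\widehat{\mathrm{ESS}} \approx \min\!\left(\frac{N}{1 + 2\sum_{t=1}^{T}\hat{\rho}_t},\; N\right),
\]

where the sum over lag autocorrelations is cut off at the first even lag at which a consecutive pair sums to a negative value, and the result is capped at N.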

function mcmc(gp::GPBase; nIter::Int=1000, burn::Int=100, thin::Int=1, ε::Float64=0.1,
Lmin::Int=5, Lmax::Int=15, lik::Bool=true, noise::Bool=true,
domean::Bool=true, kern::Bool=true)
return hmc(gp, nIter=nIter, burn=burn, thin=thin, ε=ε, Lmin=Lmin, Lmax=Lmax,
lik=lik, noise=noise, domean=domean, kern=kern)
end

"""
nuts_hamiltonian(gp::GPBase; metric=DiagEuclideanMetric(...), kwargs...)

Generate the AdvancedHMC.jl Hamiltonian for the GP target, so that a custom
metric or Hamiltonian can be constructed once and passed to `nuts`.
"""
function nuts_hamiltonian(gp::GPBase; lik::Bool=true, noise::Bool=true, domean::Bool=true, kern::Bool=true,
metric=DiagEuclideanMetric(num_params(gp; get_params_kwargs(gp, domean=domean, kern=kern, noise=noise, lik=lik)...)))
precomp = init_precompute(gp)
params_kwargs = get_params_kwargs(gp; domean=domean, kern=kern, noise=noise, lik=lik)
function calc_target_and_dtarget!(θ::AbstractVector)
set_params!(gp, θ; params_kwargs...)
# Cholesky exceptions are handled by DynamicHMC
update_target_and_dtarget!(gp, precomp; params_kwargs...)
return (gp.target, gp.dtarget)
end

function calc_target!(θ::AbstractVector)
set_params!(gp, θ; params_kwargs...)
update_target!(gp; params_kwargs...)
return gp.target
end
return Hamiltonian(metric, calc_target!, calc_target_and_dtarget!)
end
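
A minimal sketch of the intended use, mirroring the `Use` testset near the end of this diff; the data `X`, `y` and the SE kernel here are illustrative stand-ins, not taken from the PR:

    using GaussianProcesses, Distributions, AdvancedHMC

    X = rand(1, 20)                        # illustrative data, 1 feature x 20 observations
    y = sin.(vec(X)) .+ 0.1 .* randn(20)
    gp = GP(X, y, MeanZero(), SE(0.0, 0.0))
    set_priors!(gp.kernel, [Distributions.Normal(-1.0, 1.0) for _ in 1:2])

    # Build the Hamiltonian once with a non-default metric, then hand it to `nuts`
    kwargs = GaussianProcesses.get_params_kwargs(gp; domean=true, kern=true, noise=true, lik=true)
    metric = AdvancedHMC.DenseEuclideanMetric(GaussianProcesses.num_params(gp; kwargs...))
    h = nuts_hamiltonian(gp, metric=metric)

    chain = nuts(gp, nIter=1000, burn=100, metric=metric, hamiltonian=h, progress=false)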

"""
nuts(gp::GPBase; kwargs...)

Runs the No-U-Turn sampler (NUTS) from AdvancedHMC.jl for estimating the hyperparameters of
a Gaussian process `GPE`, or the latent function in the case of `GPA`.
Refer to AdvancedHMC.jl for details of the keyword options.
"""
function nuts(gp::GPBase; nIter::Int=1000, burn::Int=100, thin::Int=1,
lik::Bool=true, noise::Bool=true, domean::Bool=true, kern::Bool=true,
metric=DiagEuclideanMetric(num_params(gp; get_params_kwargs(gp, domean=domean, kern=kern, noise=noise, lik=lik)...)),
hamiltonian=nuts_hamiltonian(gp; metric=metric),
ε::Float64=find_good_eps(hamiltonian, get_params(gp; get_params_kwargs(gp, domean=domean, kern=kern, noise=noise, lik=lik)...)),
maxDepth::Int64=10, δ::Float64=0.8, integrator=Leapfrog(ε),
proposals=NUTS{MultinomialTS, GeneralisedNoUTurn}(integrator, maxDepth),
adaptor=StanHMCAdaptor(burn, Preconditioner(metric), NesterovDualAveraging(δ, integrator)),
progress=true)
params_kwargs = get_params_kwargs(gp; domean=domean, kern=kern, noise=noise, lik=lik)
θ_init = get_params(gp; params_kwargs...)
dim = length(θ_init)
post, stats = sample(hamiltonian, proposals, θ_init, nIter - burn, adaptor,
burn; drop_warmup=true, progress=progress, verbose=false)
post = hcat(post...)
post = post[:,1:thin:end]
set_params!(gp, θ_init; params_kwargs...)

step_stats = [[step_stat.acceptance_rate, step_stat.tree_depth] for step_stat in stats]
avg_accept, avg_depth = mean(step_stats)
ε = stats[end-1].step_size
@printf("Number of iterations = %d, Thinning = %d, Burn-in = %d \n", nIter,thin,burn)
@printf("Step size = %f, Average tree depth = %f \n", ε,avg_depth)
@printf("Acceptance rate: %f \n", avg_accept)
@printf("Average effective sample size: %f\n", mean(effective_sample_size(post')))
return post
end
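
A minimal call with the defaults, assuming a `gp::GPBase` whose priors are already set (e.g. the sketch above); the returned matrix holds one retained draw per column:

    chain = nuts(gp, nIter=1000, burn=200)
    size(chain, 2)   # number of retained (post-burn-in, thinned) samples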

"""
hmc(gp::GPBase; kwargs...)

Runs the Hamiltonian Monte Carlo algorithm for estimating the hyperparameters of
a Gaussian process `GPE`, or the latent function in the case of `GPA`.
"""
function hmc(gp::GPBase; nIter::Int=1000, burn::Int=100, thin::Int=1, ε::Float64=0.1,
Lmin::Int=5, Lmax::Int=15, lik::Bool=true, noise::Bool=true,
domean::Bool=true, kern::Bool=true)
precomp = init_precompute(gp)
params_kwargs = get_params_kwargs(gp; domean=domean, kern=kern, noise=noise, lik=lik)
count = 0
@@ -33,8 +131,8 @@ function mcmc(gp::GPBase; nIter::Int=1000, burn::Int=1, thin::Int=1, ε::Float64
θ_cur = get_params(gp; params_kwargs...)
D = length(θ_cur)
leapSteps = 0 #accumulator to track number of leap-frog steps
post = Array{Float64}(undef, nIter, D) #posterior samples
post[1,:] = θ_cur
post = Array{Float64}(undef, D, nIter) #posterior samples
post[:,1] = θ_cur

@assert calc_target!(gp, θ_cur)
target_cur, grad_cur = gp.target, gp.dtarget
@@ -61,7 +159,7 @@ function mcmc(gp::GPBase; nIter::Int=1000, burn::Int=1, thin::Int=1, ε::Float64
ν -= 0.5*ε * grad

if reject
post[t,:] = θ_cur
post[:,t] = θ_cur
else
α = target - 0.5 * ν'ν - target_cur + 0.5 * ν_cur'ν_cur
u = log(rand())
@@ -72,19 +170,19 @@ function mcmc(gp::GPBase; nIter::Int=1000, burn::Int=1, thin::Int=1, ε::Float64
target_cur = target
grad_cur = grad
end
post[t,:] = θ_cur
post[:,t] = θ_cur
end
end
post = post[burn:thin:end,:]
post = post[:,burn:thin:end]
set_params!(gp, θ_cur; params_kwargs...)
@printf("Number of iterations = %d, Thinning = %d, Burn-in = %d \n", nIter,thin,burn)
@printf("Step size = %f, Average number of leapfrog steps = %f \n", ε,leapSteps/nIter)
println("Number of function calls: ", count)
@printf("Acceptance rate: %f \n", num_acceptances/nIter)
return post'
@printf("Average effective sample size: %f\n", mean(effective_sample_size(post')))
return post
end
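
For comparison, a sketch of the renamed fixed-step-size routine on the same assumed `gp`; `ε`, `Lmin` and `Lmax` are the leapfrog knobs from the signature above:

    chain = hmc(gp, nIter=1000, burn=100, thin=2, ε=0.05, Lmin=5, Lmax=15)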


"""
ess(gp::GPBase; kwargs...)

@@ -95,7 +193,7 @@ Journal of Machine Learning Research 9 (2010): 541-548.

Requires hyperparameter priors to be Gaussian.
"""
function ess(gp::GPE; nIter::Int=1000, burn::Int=1, thin::Int=1, lik::Bool=true,
function ess(gp::GPE; nIter::Int=1000, burn::Int=100, thin::Int=1, lik::Bool=true,
noise::Bool=true, domean::Bool=true, kern::Bool=true)
params_kwargs = get_params_kwargs(gp; domean=domean, kern=kern, noise=noise, lik=lik)
count = 0
@@ -141,20 +239,21 @@ function ess(gp::GPE; nIter::Int=1000, burn::Int=1, thin::Int=1, lik::Bool=true,
total_proposals = 0
θ_cur = get_params(gp; params_kwargs...)
D = length(θ_cur)
post = Array{Float64}(undef, nIter, D)
post = Array{Float64}(undef, D, nIter)

for i = 1:nIter
θ_cur, num_proposals = sample!(θ_cur)
post[i,:] = θ_cur
post[:,i] = θ_cur
total_proposals += num_proposals
end

post = post[burn:thin:end,:]
post = post[:,burn:thin:end]
set_params!(gp, θ_cur; params_kwargs...)
@printf("Number of iterations = %d, Thinning = %d, Burn-in = %d \n", nIter,thin,burn)
println("Number of function calls: ", count)
@printf("Acceptance rate: %f \n", nIter / total_proposals)
return post'
@printf("Average effective sample size: %f\n", mean(effective_sample_size(post')))
return post
end
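
A sketch mirroring the elliptical-slice testset at the end of this diff, assuming `X`, `y` and `kern` as in the test file; only the log-noise carries the required Gaussian prior here:

    gpess = GP(X, y, MeanZero(), kern)   # constructor is an illustrative assumption
    set_priors!(gpess.logNoise, [Distributions.Normal(-1.0, 1.0)])
    ess_chain = ess(gpess)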


56 changes: 53 additions & 3 deletions test/mcmc.jl
@@ -1,5 +1,6 @@
module TestMCMC
using GaussianProcesses, Distributions
using AdvancedHMC
using Test, Random

Random.seed!(1)
@@ -13,18 +14,67 @@ Random.seed!(1)
kern = RQ(1.0, 1.0, 1.0)

# Just checks that it doesn't crash
@testset "Legacy MCMC" begin
@testset "Without likelihood" begin
gp = GP(X, y, MeanZero(), kern)
set_priors!(gp.kernel, [Distributions.Normal(-1.0, 1.0) for i in 1:3])
global hmc_chain = mcmc(gp, ε=0.05)
end

@testset "With likelihood" begin
lik = GaussLik(-1.0)
gp = GP(X, y, MeanZero(), kern, lik)
set_priors!(gp.kernel, [Distributions.Normal(-1.0, 1.0) for i in 1:3])
mcmc(gp, ε=0.05)
end
end

@testset "HMC" begin
@testset "Without likelihood" begin
gp = GP(X, y, MeanZero(), kern)
set_priors!(gp.kernel, [Distributions.Normal(-1.0, 1.0) for i in 1:3])
global hmc_chain = mcmc(gp)
global hmc_chain = hmc(gp, ε=0.05)
end

@testset "With likelihood" begin
lik = GaussLik(-1.0)
gp = GP(X, y, MeanZero(), kern, lik)
set_priors!(gp.kernel, [Distributions.Normal(-1.0, 1.0) for i in 1:3])
mcmc(gp)
hmc(gp, ε=0.05)
end
end

@testset "AdvancedHMC" begin
@testset "Without likelihood" begin
gp = GP(X, y, MeanZero(), kern)
set_priors!(gp.kernel, [Distributions.Normal(-1.0, 1.0) for i in 1:3])
global hmc_chain = nuts(gp, progress=false)
end

@testset "With likelihood" begin
lik = GaussLik(-1.0)
gp = GP(X, y, MeanZero(), kern, lik)
set_priors!(gp.kernel, [Distributions.Normal(-1.0, 1.0) for i in 1:3])
nuts(gp, nIter=1000, burn=200, progress=true)
end

@testset "Use" begin
lik = GaussLik(-1.0)
gp = GP(X, y, MeanZero(), kern, lik)
set_priors!(gp.kernel, [Distributions.Normal(-1.0, 1.0) for i in 1:3])
kwargs = GaussianProcesses.get_params_kwargs(
gp; domean=true, kern=true, noise=true, lik=true)

metric = AdvancedHMC.DenseEuclideanMetric(
GaussianProcesses.num_params(gp; kwargs...))
hamiltonian = nuts_hamiltonian(gp, metric=metric)
ε = 0.1
integrator = AdvancedHMC.Leapfrog(ε)
prop = AdvancedHMC.NUTS{SliceTS, ClassicNoUTurn}(integrator)
adaptor = AdvancedHMC.NaiveHMCAdaptor(
Preconditioner(metric), NesterovDualAveraging(0.8, integrator))
nuts(gp, nIter=1000, burn=100, metric=metric, hamiltonian=hamiltonian,
ε=ε, integrator=integrator, proposals=prop, adaptor=adaptor, progress=false)
end
end

@@ -34,6 +84,6 @@ Random.seed!(1)
set_priors!(gpess.logNoise, [Distributions.Normal(-1.0, 1.0)])
global ess_chain = ess(gpess)
end

end
end