Performance dip when using ForwardDiff compared to Turing #81

@burtonjosh

I've noticed a performance dip when using ForwardDiff with a model defined in TuringGLM, compared to defining the same model directly in Turing. I've set up an MWE to show this.

First I set up four models: two in TuringGLM (with and without custom priors), and two in Turing that use the same default and custom priors as the TuringGLM models.

using Turing, TuringGLM, TuringBenchmarking, BenchmarkTools
using ReverseDiff: ReverseDiff
using CSV, DataFrames, LinearAlgebra

hibbs_df = CSV.read(
    download("https://raw.githubusercontent.com/avehtari/ROS-Examples/master/ElectionsEconomy/data/hibbs.dat"),
    DataFrame
);

# TuringGLM model
f = @formula(vote ~ growth)
m_glm = turing_model(f, hibbs_df)

# TuringGLM model with custom priors
priors = CustomPrior(Normal(0, 10), Normal(52, 14), nothing)
m_glm_custom = turing_model(f, hibbs_df; priors=priors)

# extract data for Turing models
y = TuringGLM.data_response(f, hibbs_df)
X = TuringGLM.data_fixed_effects(f, hibbs_df)

# model with default priors
@model function regression_default(X, y; residual=std(y))
    α ~ 50.755 + TDist(3.0)*6.071256084780443
    β ~ filldist(TDist(3.0), size(X,2))
    σ ~ Exponential(residual)

    y ~ MvNormal(α .+ X*β, σ^2*I)
end

m_turing = regression_default(X, y; residual=std(y))

# model with custom priors
@model function regression_custom(X, y; residual=std(y))
    α ~ Normal(52, 14)
    β ~ filldist(Normal(0, 10), size(X,2))
    σ ~ Exponential(residual)

    y ~ MvNormal(α .+ X*β, σ^2*I)
end

m_turing_custom = regression_custom(X, y; residual=std(y))

Then, using TuringBenchmarking.jl, I benchmark each of the four models with both the ForwardDiff and ReverseDiff backends (the benchmark code and full output are included further down).

The results of the benchmark are shown in the table below. You can see that for ReverseDiff the timings are roughly the same, but with ForwardDiff the Turing models take about 23-31% less time than their TuringGLM counterparts (equivalently, TuringGLM is about 30-45% slower). I've included the full results below.

| Model | ForwardDiff, linked (time, μs) | ReverseDiff, linked (time, μs) | ForwardDiff, not linked (time, μs) | ReverseDiff, not linked (time, μs) |
|---|---|---|---|---|
| TuringGLM (default prior) | 3.967 | 2.772 | 3.976 | 1.990 |
| Turing (default prior) | 3.046 | 2.676 | 3.059 | 1.931 |
| TuringGLM (custom prior) | 4.013 | 2.102 | 3.905 | 1.868 |
| Turing (custom prior) | 2.776 | 1.986 | 2.827 | 1.829 |
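
For reference, the relative ForwardDiff gap can be recomputed from the table with a few lines of plain Julia. This is only a sketch of the arithmetic: the timings are copied by hand from the table above, and the names `forwarddiff_times`, `slowdown_pct`, and `saving_pct` are hypothetical helpers introduced here, not part of Turing, TuringGLM, or TuringBenchmarking.

```julia
# ForwardDiff gradient timings in μs, copied by hand from the table above.
# Each entry is (label, TuringGLM time, Turing time).
forwarddiff_times = [
    ("default prior, linked",     3.967, 3.046),
    ("default prior, not linked", 3.976, 3.059),
    ("custom prior, linked",      4.013, 2.776),
    ("custom prior, not linked",  3.905, 2.827),
]

# Extra time TuringGLM needs relative to Turing, in percent.
slowdown_pct(t_glm, t_turing) = 100 * (t_glm / t_turing - 1)

# Time Turing saves relative to TuringGLM, in percent.
saving_pct(t_glm, t_turing) = 100 * (1 - t_turing / t_glm)

for (label, t_glm, t_turing) in forwarddiff_times
    println(
        rpad(label, 28),
        "TuringGLM slower by ", round(slowdown_pct(t_glm, t_turing); digits=1), "%, ",
        "Turing faster by ", round(saving_pct(t_glm, t_turing); digits=1), "%",
    )
end
```

The two helpers describe the same gap from opposite baselines, which is why "how much slower" and "how much faster" give different percentages for the same pair of timings.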
Detailed output:

TuringGLM model 1 (default priors)

suite_glm = TuringBenchmarking.make_turing_suite(
    m_glm,
    adbackends = [TuringBenchmarking.ForwardDiffAD{40}(), TuringBenchmarking.ReverseDiffAD{true}()]
)
run(suite_glm)

Output:

2-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "linked" => 3-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "evaluation" => Trial(2.882 μs)
          "Turing.Essential.ReverseDiffAD{true}()" => Trial(2.772 μs)
          "Turing.Essential.ForwardDiffAD{40, true}()" => Trial(3.967 μs)
  "not_linked" => 3-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "evaluation" => Trial(2.836 μs)
          "Turing.Essential.ReverseDiffAD{true}()" => Trial(1.990 μs)
          "Turing.Essential.ForwardDiffAD{40, true}()" => Trial(3.976 μs)

Turing model 1 (default priors)

suite_turing = TuringBenchmarking.make_turing_suite(
    m_turing,
    adbackends = [TuringBenchmarking.ForwardDiffAD{40}(), TuringBenchmarking.ReverseDiffAD{true}()]
)
run(suite_turing)

Output:

2-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "linked" => 3-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "evaluation" => Trial(1.256 μs)
          "Turing.Essential.ReverseDiffAD{true}()" => Trial(2.676 μs)
          "Turing.Essential.ForwardDiffAD{40, true}()" => Trial(3.046 μs)
  "not_linked" => 3-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "evaluation" => Trial(1.207 μs)
          "Turing.Essential.ReverseDiffAD{true}()" => Trial(1.931 μs)
          "Turing.Essential.ForwardDiffAD{40, true}()" => Trial(3.059 μs)

TuringGLM model 2 (custom priors)

suite_glm_custom = TuringBenchmarking.make_turing_suite(
    m_glm_custom,
    adbackends = [TuringBenchmarking.ForwardDiffAD{40}(), TuringBenchmarking.ReverseDiffAD{true}()]
)
run(suite_glm_custom)

Output:

2-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "linked" => 3-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "evaluation" => Trial(2.724 μs)
          "Turing.Essential.ReverseDiffAD{true}()" => Trial(2.102 μs)
          "Turing.Essential.ForwardDiffAD{40, true}()" => Trial(4.013 μs)
  "not_linked" => 3-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "evaluation" => Trial(2.737 μs)
          "Turing.Essential.ReverseDiffAD{true}()" => Trial(1.868 μs)
          "Turing.Essential.ForwardDiffAD{40, true}()" => Trial(3.905 μs)

Turing model 2 (custom priors)

suite_turing_custom = TuringBenchmarking.make_turing_suite(
    m_turing_custom,
    adbackends = [TuringBenchmarking.ForwardDiffAD{40}(), TuringBenchmarking.ReverseDiffAD{true}()]
)
run(suite_turing_custom)

Output:

2-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "linked" => 3-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "evaluation" => Trial(1.176 μs)
          "Turing.Essential.ReverseDiffAD{true}()" => Trial(1.986 μs)
          "Turing.Essential.ForwardDiffAD{40, true}()" => Trial(2.776 μs)
  "not_linked" => 3-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "evaluation" => Trial(1.160 μs)
          "Turing.Essential.ReverseDiffAD{true}()" => Trial(1.829 μs)
          "Turing.Essential.ForwardDiffAD{40, true}()" => Trial(2.827 μs)
