Skip to content
/ splintr Public

Natural cubic SPLines with interpretable INTercepts

License

Notifications You must be signed in to change notification settings

simisc/splintr

Repository files navigation

splintr

check-standard DOI Licence Lifecycle codecov

Natural cubic splines with interpretable intercepts: ‘centres’ a basis generated using splines::ns() on a specified x-value. When used in a model formula, this allows the model intercept to be interpreted with respect to that central x-value, rather than with respect to the x-value of the first splines::ns() knot.

Installation

You can install splintr from github with:

# install.packages("devtools")
devtools::install_github("simisc/splintr")

Example

library(tidyverse)
library(splines)
library(splintr)
library(broom)
library(knitr)
library(ggthemes)
library(ggrepel)
women <- women %>%
  mutate(height_centred = height - mean(height))
women %>%
  head() %>%
  kable()
height weight height_centred
58 115 -7
59 117 -6
60 120 -5
61 123 -4
62 126 -3
63 129 -2

Different parametrisations of the same model:

  • fit0: using ns with raw heights
  • fit1: using ns with centred heights
  • fit2: using splintr with centred heights
  • fit3: using splintr with raw heights and explicit (arbitrary) centre
model_formulae <- list(
  fit0 = weight ~ ns(height, df = 3),
  fit1 = weight ~ ns(height_centred, df = 3),
  fit2 = weight ~ splintr(height_centred, df = 3),
  fit3 = weight ~ splintr(height, df = 3, centre = explicit_centre)
)

explicit_centre = 68.45
model_fits <- map(model_formulae, lm, data = women)

The models are identical:

model_fits %>%
  map_dfr(glance, .id = "model") %>%
  select(model, sigma, logLik, deviance) %>%
  kable()
model sigma logLik deviance
fit0 0.321018 -1.914046 1.133578
fit1 0.321018 -1.914046 1.133578
fit2 0.321018 -1.914046 1.133578
fit3 0.321018 -1.914046 1.133578

But they have different intercepts:

model_fits %>%
  map_dfr(~tidy(.x, quick = TRUE), .id = "model") %>%
  filter(term == "(Intercept)") %>%
  select(model, estimate, std.error, statistic) %>%
  kable()
model estimate std.error statistic
fit0 114.5595 0.2455372 466.5666
fit1 114.5595 0.2455372 466.5666
fit2 135.1067 0.1288238 1048.7711
fit3 147.8057 0.1513002 976.9034

Using splines::ns(), the model intercept represents the estimated value of weight at the first boundary knot, i.e. when height takes its minimum value of 58. This is unchanged by centring the predictor, so fit0 and fit1 have identical intercepts.

Using splintr() instead, the intercept representes the estimated value of weight when the predictor height_centred takes a value of zero. An arbitrary “centre” can be specified directly in the splintr() call.

int_pts <- tibble(
  model = paste0("fit", 0:3),
  x = c(
    min(women$height),
    min(women$height),
    mean(women$height),
    explicit_centre
  ),
  y = map_dbl(model_fits, ~coef(.x)[1])
)

augment(model_fits[[1]], data = women) %>%
  ggplot(aes(x = height)) +
  geom_point(aes(y = weight)) +
  geom_line(aes(y = .fitted), col = "blue") +
  geom_point(data = int_pts, aes(x = x, y = y), col = "red") +
  geom_label_repel(data = int_pts, aes(x = x, y = y, label = model),
                   col = "red", nudge_y = 7) +
  theme_few()

About

Natural cubic SPLines with interpretable INTercepts

Resources

License

Stars

Watchers

Forks

Languages