---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

[Travis CI build status](https://travis-ci.org/jacob-long/panelr) [AppVeyor build status](https://ci.appveyor.com/project/jacob-long/panelr) [Codecov test coverage](https://codecov.io/github/jacob-long/panelr?branch=master)

# panelr

`panelr` is an R package designed to aid in the analysis of panel data:
designs in which the same group of respondents/entities is contacted/measured
multiple times. `panelr` provides some useful infrastructure, like a
`panel_data` object class, and automates some emerging methods for
analyzing these data.

It automates the "within-between" (also known as
"between-within" and "hybrid") specification, which combines the
desirable aspects of both fixed effects and random effects econometric models,
and fits these models using the `lme4` package as a backend. Bayesian
estimation of these models is supported by interfacing with the `brms` package.

## Installation

At the moment, `panelr` is only available through GitHub. A submission to
CRAN is coming soon.

```{r eval = FALSE}
install.packages("devtools")
devtools::install_github("jacob-long/panelr")
```

`panelr` has several dependencies: `dplyr`, `tidyr`, `lme4`, `pbkrtest`,
`jtools`, `magrittr`, `stringr`, and `rlang`. You will also need `brms` (and its
dependencies, like `rstan`) to do Bayesian estimation.

## Usage

### `panel_data` frames

While not strictly required, the best way to start is to declare your data
as panel data. I'll load the example data `WageData` to demonstrate.

```{r}
library(panelr)
data("WageData")
colnames(WageData)
```

The two key variables here are `t` and `id`. `t` is the wave of the survey that
the row of data refers to, while `id` identifies the survey respondent. This is
a perfectly balanced data set, so there are 7 observations for each of the 595
respondents. We will use those two pieces of information to create a
`panel_data` object.

```{r}
wages <- panel_data(WageData, id = id, wave = t)
```

We have to tell `panel_data()` which column refers to the unique identifiers
for respondents/entities (the latter when you have something like countries
or companies instead of people) and which column refers to the period/wave of
data collection. If the waves are not numeric and indexed starting at 1,
the function will attempt to coerce them to that kind of numbering scheme.

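To give a sense of what a numbering scheme "indexed starting at 1" looks like, here is a minimal base R sketch that recodes arbitrary wave labels into consecutive integers. This is only an illustration under my own assumptions: the `years` values are made up, and this is not necessarily how `panel_data()` performs the coercion internally.

```r
# Hypothetical survey years used as wave labels (made-up values).
years <- c(1976, 1978, 1980, 1976, 1978, 1980)

# Map each distinct label to its position among the sorted unique labels,
# which gives consecutive integer waves that start at 1.
wave <- match(years, sort(unique(years)))

wave
```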
Note that the resulting `panel_data` object will always use the column names
`id` and `wave`, so it will overwrite those columns if they already exist in the
source data. `panel_data` frames are modified tibbles
([`tibble` package](http://tibble.tidyverse.org/)) that are grouped by entity.

### `wbm` --- the within-between model

Anyone can fit a within-between model without this package, since it is
just a particular specification of a multilevel model. That said, doing so
requires some programming and can be rather error-prone. Even in the best
case, it is cumbersome and inefficient to create the necessary variables.

`wbm` is the primary function that you'll use from this package. It fits
within-between models for you, using
[`lme4`](https://cran.r-project.org/web/packages/lme4/index.html) as a
backend.

A three-part model syntax is used that goes like this:

`dv ~ varying_variables | invariant_variables | cross_level_interactions`

It works like a typical formula otherwise. The bars just tell `panelr` how to
treat the variables. Note also that you can specify random slopes using
`lme4`-style syntax in the third part of the formula.

Lagged variables are supported through the `lag` function. Unlike base
R, `panelr` lags the variables correctly: wave 1 observations will have `NA`
values for the lagged variable rather than taking the final wave value of the
previous entity.

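The difference between a naive lag and a panel-aware lag can be seen in a small base R sketch. The `toy` data and the names `naive_lag` and `panel_lag` are hypothetical and exist only for this illustration; this is not `panelr`'s internal code, just the general idea of lagging within each entity.

```r
# Toy long-format panel: 2 entities observed over 3 waves (made-up values).
toy <- data.frame(
  id    = rep(c("a", "b"), each = 3),
  wave  = rep(1:3, times = 2),
  union = c(0, 1, 1, 1, 0, 0)
)

# Naive lag: shift the whole column down one row, ignoring entity boundaries.
# Entity "b" at wave 1 wrongly inherits the wave 3 value of entity "a".
naive_lag <- c(NA, head(toy$union, -1))

# Panel-aware lag: shift within each entity, so every wave 1 row is NA.
panel_lag <- ave(toy$union, toy$id, FUN = function(x) c(NA, head(x, -1)))

cbind(toy, naive_lag, panel_lag)
```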
Here we will specify a model using the `wages` data. We will predict
logged wages (`lwage`) using two time-varying variables, lagged
union membership (`union`) and contemporaneous weeks worked (`wks`), along
with a time-invariant predictor, a binary indicator for black race (`blk`).
For demonstration purposes, we'll fit a random slope for `wks` and an
interaction between `blk` and `lag(union)`.

```{r message = FALSE}
model <- wbm(lwage ~ lag(union) + wks | blk | blk * lag(union) + (wks | id),
             data = wages)
summary(model)
```

Note that `imean` is an internal function that calculates the individual-level
mean, which represents the between-subjects effects of the time-varying
predictors. The within effects are the time-varying predictors at the occasion
level with the individual-level mean subtracted. If you want the model specified
such that the occasion-level predictors do not have the mean subtracted, use
the `model = "contextual"` argument. The "contextual" label refers to the way
these terms are normally interpreted when the model is specified that way.

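For intuition, the decomposition described above can be sketched by hand in base R. This is a rough illustration rather than `panelr`'s actual implementation: the `toy` data and the column names `wks_between` and `wks_within` are made up for the example.

```r
# Toy panel: 2 entities, 3 waves each (made-up values).
toy <- data.frame(
  id  = rep(c("a", "b"), each = 3),
  wks = c(40, 44, 48, 30, 30, 36)
)

# Between component: each entity's mean across waves, repeated on every row.
toy$wks_between <- ave(toy$wks, toy$id, FUN = mean)

# Within component: each observation's deviation from its entity's mean.
toy$wks_within <- toy$wks - toy$wks_between

toy
```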
## Contributing

I'm happy to receive bug reports, suggestions, questions, and (most of all)
contributions to fix problems and add features. I prefer that you use the GitHub
issues system rather than trying to reach me in other ways. Pull requests for
contributions are encouraged.

Please note that this project is released with a
[Contributor Code of Conduct](CONDUCT.md). By participating in this project you
agree to abide by its terms.

## License

The source code of this package is licensed under the
[MIT License](http://opensource.org/licenses/mit-license.php).