diff --git a/_freeze/html/factors/execute-results/html.json b/_freeze/html/factors/execute-results/html.json
index 7ad80ee5..fae3bea1 100644
--- a/_freeze/html/factors/execute-results/html.json
+++ b/_freeze/html/factors/execute-results/html.json
@@ -1,7 +1,8 @@
 {
   "hash": "49601350c49c834abf551b0a9230b77d",
   "result": {
-    "markdown": "---\ntitle: \"Factors with forcats :: Cheatsheet\"\ndescription: \" \"\nimage-alt: \"\"\nexecute:\n eval: true\n output: false\n warning: false\n---\n\n::: {.cell .column-margin}\n\"Hex

\n

Download PDF

\n\"\"/\n
\n

Translations (PDF)

\n* Japanese\n* Spanish\n:::\n\n\nThe **forcats** package provides tools for working with factors, which are R's data structure for categorical data.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(forcats)\n```\n:::\n\n\n\n\n## Factors\n\nR represents categorical data with factors.\nA **factor** is an integer vector with a **levels** attribute that stores a set of mappings between integers and categorical values.\nWhen you view a factor, R displays not the integers but the levels associated with them.\n\nFor example, R will display `c(\"a\", \"c\", \"b\", \"a\")` with levels `c(\"a\", \"b\", \"c\")` but will store `c(1, 3, 2, 1)` where 1 = a, 2 = b, and 3 = c.\n\nR will display:\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n```\n[1] a c b a\nLevels: a b c\n```\n:::\n:::\n\n\nR will store:\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n```\n[1] 1 3 2 1\nattr(,\"levels\")\n[1] \"a\" \"b\" \"c\"\n```\n:::\n:::\n\n\nCreate a factor with `factor()`:\n\n- `factor(x = character(), levels, labels = levels, exclude = NA, ordered = is.ordered(x), nmax = NA)`: Convert a vector to a factor.\n Also `as_factor()`.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f <- factor(c(\"a\", \"c\", \"b\", \"a\"), levels = c(\"a\", \"b\", \"c\"))\n ```\n :::\n\n\nReturn its levels with `levels()`:\n\n- `levels(x)`: Return/set the levels of a factor.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n levels(f)\n levels(f) <- c(\"x\", \"y\", \"z\")\n ```\n :::\n\n\nUse `unclass()` to see its structure.\n\n## Inspect Factors\n\n- `fct_count(f, sort = FALSE, prop = FALSE)`: Count the number of values with each level.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_count(f)\n ```\n :::\n\n\n- `fct_match(f, lvls)`: Check for `lvls` in `f`.\n\n\n \n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_match(f, \"a\")\n ```\n :::\n\n\n- `fct_unique(f)`: Return the unique values, removing duplicates.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_unique(f)\n ```\n :::\n\n\n## Combine 
Factors\n\n- `fct_c(...)`: Combine factors with different levels.\n Also `fct_cross()`.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f1 <- factor(c(\"a\", \"c\"))\n f2 <- factor(c(\"b\", \"a\"))\n fct_c(f1, f2)\n ```\n :::\n\n\n- `fct_unify(fs, levels = lvls_union(fs))`: Standardize levels across a list of factors.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_unify(list(f2, f1))\n ```\n :::\n\n\n## Change the order of levels\n\n- `fct_relevel(.f, ..., after = 0L)`: Manually reorder factor levels.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_relevel(f, c(\"b\", \"c\", \"a\"))\n ```\n :::\n\n\n- `fct_infreq(f, ordered = NA)`: Reorder levels by the frequency in which they appear in the data (highest frequency first).\n Also `fct_inseq()`.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f3 <- factor(c(\"c\", \"c\", \"a\"))\n fct_infreq(f3)\n ```\n :::\n\n\n- `fct_inorder(f, ordered = NA)`: Reorder levels by order in which they appear in the data.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_inorder(f2)\n ```\n :::\n\n\n- `fct_rev(f)`: Reverse level order.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f4 <- factor(c(\"a\",\"b\",\"c\"))\n fct_rev(f4)\n ```\n :::\n\n\n- `fct_shift(f)`: Shift levels to left or right, wrapping around end.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_shift(f4)\n ```\n :::\n\n\n- `fct_shuffle(f, n = 1L)`: Randomly permute order of factor levels.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_shuffle(f4)\n ```\n :::\n\n\n- `fct_reorder(.f, .x, .fun = median, ..., .desc = FALSE)`: Reorder levels by their relationship with another variable.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n boxplot(PlantGrowth, weight ~ fct_reorder(group, weight))\n ```\n :::\n\n\n- `fct_reorder2(.f, .x, .y, .fun = last2, ..., .desc = TRUE)`: Reorder levels by their final values when plotted with two other variables.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n ggplot(\n diamonds,\n aes(carat, price, color = fct_reorder2(color, carat, price))\n ) + \n 
geom_smooth()\n ```\n :::\n\n\n## Change the value of levels\n\n- `fct_recode(.f, ...)`: Manually change levels.\n Also `fct_relabel()` which obeys `purrr::map` syntax to apply a function or expression to each level.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_recode(f, v = \"a\", x = \"b\", z = \"c\")\n fct_relabel(f, ~ paste0(\"x\", .x))\n ```\n :::\n\n\n- `fct_anon(f, prefix = \"\")`: Anonymize levels with random integers.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_anon(f)\n ```\n :::\n\n\n- `fct_collapse(.f, …, other_level = NULL)`: Collapse levels into manually defined groups.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_collapse(f, x = c(\"a\", \"b\"))\n ```\n :::\n\n\n- `fct_lump_min(f, min, w = NULL, other_level = \"Other\")`: Lumps together factors that appear fewer than `min` times.\n Also `fct_lump_n()`, `fct_lump_prop()`, and `fct_lump_lowfreq()`.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_lump_min(f, min = 2)\n ```\n :::\n\n\n- `fct_other(f, keep, drop, other_level = \"Other\")`: Replace levels with \"other.\"\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_other(f, keep = c(\"a\", \"b\"))\n ```\n :::\n\n\n## Add or drop levels\n\n- `fct_drop(f, only)`: Drop unused levels.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f5 <- factor(c(\"a\",\"b\"),c(\"a\",\"b\",\"x\"))\n f6 <- fct_drop(f5)\n ```\n :::\n\n\n- `fct_expand(f, ...)`: Add levels to a factor.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_expand(f6, \"x\")\n ```\n :::\n\n\n- `fct_na_value_to_level(f, level = \"(Missing)\")`: Assigns a level to NAs to ensure they appear in plots, etc.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f <- factor(c(\"a\", \"b\", NA))\n fct_na_value_to_level(f, level = \"(Missing)\")\n ```\n :::\n\n\n------------------------------------------------------------------------\n\nCC BY SA Posit Software, PBC • [info\\@posit.co](mailto:info@posit.co) • [posit.co](https://posit.co)\n\nLearn more at 
[forcats.tidyverse.org](https://forcats.tidyverse.org).\n\nUpdated: 2023-06.\n\n\n::: {.cell}\n\n```{.r .cell-code}\npackageVersion(\"forcats\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] '1.0.0'\n```\n:::\n:::\n\n\n------------------------------------------------------------------------\n",
+    "engine": "knitr",
+    "markdown": "---\ntitle: \"Factors with forcats :: Cheatsheet\"\ndescription: \" \"\nimage-alt: \"\"\nexecute:\n eval: true\n output: false\n warning: false\n---\n\n::: {.cell .column-margin}\n\"Hex

\n

Download PDF

\n\"\"/\n
\n

Translations (PDF)

\n* Japanese\n* Portuguese\n* Spanish\n:::\n\n\n\nThe **forcats** package provides tools for working with factors, which are R's data structure for categorical data.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(forcats)\n```\n:::\n\n\n\n\n\n## Factors\n\nR represents categorical data with factors.\nA **factor** is an integer vector with a **levels** attribute that stores a set of mappings between integers and categorical values.\nWhen you view a factor, R displays not the integers but the levels associated with them.\n\nFor example, R will display `c(\"a\", \"c\", \"b\", \"a\")` with levels `c(\"a\", \"b\", \"c\")` but will store `c(1, 3, 2, 1)` where 1 = a, 2 = b, and 3 = c.\n\nR will display:\n\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] a c b a\nLevels: a b c\n```\n\n\n:::\n:::\n\n\n\nR will store:\n\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 1 3 2 1\nattr(,\"levels\")\n[1] \"a\" \"b\" \"c\"\n```\n\n\n:::\n:::\n\n\n\nCreate a factor with `factor()`:\n\n- `factor(x = character(), levels, labels = levels, exclude = NA, ordered = is.ordered(x), nmax = NA)`: Convert a vector to a factor.\n Also `as_factor()`.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f <- factor(c(\"a\", \"c\", \"b\", \"a\"), levels = c(\"a\", \"b\", \"c\"))\n ```\n :::\n\n\n\nReturn its levels with `levels()`:\n\n- `levels(x)`: Return/set the levels of a factor.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n levels(f)\n levels(f) <- c(\"x\", \"y\", \"z\")\n ```\n :::\n\n\n\nUse `unclass()` to see its structure.\n\n## Inspect Factors\n\n- `fct_count(f, sort = FALSE, prop = FALSE)`: Count the number of values with each level.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_count(f)\n ```\n :::\n\n\n\n- `fct_match(f, lvls)`: Check for `lvls` in `f`.\n\n\n\n \n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_match(f, \"a\")\n ```\n :::\n\n\n\n- `fct_unique(f)`: Return the unique values, removing duplicates.\n\n\n\n ::: {.cell}\n \n ```{.r 
.cell-code}\n fct_unique(f)\n ```\n :::\n\n\n\n## Combine Factors\n\n- `fct_c(...)`: Combine factors with different levels.\n Also `fct_cross()`.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f1 <- factor(c(\"a\", \"c\"))\n f2 <- factor(c(\"b\", \"a\"))\n fct_c(f1, f2)\n ```\n :::\n\n\n\n- `fct_unify(fs, levels = lvls_union(fs))`: Standardize levels across a list of factors.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_unify(list(f2, f1))\n ```\n :::\n\n\n\n## Change the order of levels\n\n- `fct_relevel(.f, ..., after = 0L)`: Manually reorder factor levels.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_relevel(f, c(\"b\", \"c\", \"a\"))\n ```\n :::\n\n\n\n- `fct_infreq(f, ordered = NA)`: Reorder levels by the frequency in which they appear in the data (highest frequency first).\n Also `fct_inseq()`.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f3 <- factor(c(\"c\", \"c\", \"a\"))\n fct_infreq(f3)\n ```\n :::\n\n\n\n- `fct_inorder(f, ordered = NA)`: Reorder levels by order in which they appear in the data.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_inorder(f2)\n ```\n :::\n\n\n\n- `fct_rev(f)`: Reverse level order.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f4 <- factor(c(\"a\",\"b\",\"c\"))\n fct_rev(f4)\n ```\n :::\n\n\n\n- `fct_shift(f)`: Shift levels to left or right, wrapping around end.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_shift(f4)\n ```\n :::\n\n\n\n- `fct_shuffle(f, n = 1L)`: Randomly permute order of factor levels.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_shuffle(f4)\n ```\n :::\n\n\n\n- `fct_reorder(.f, .x, .fun = median, ..., .desc = FALSE)`: Reorder levels by their relationship with another variable.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n boxplot(PlantGrowth, weight ~ fct_reorder(group, weight))\n ```\n :::\n\n\n\n- `fct_reorder2(.f, .x, .y, .fun = last2, ..., .desc = TRUE)`: Reorder levels by their final values when plotted with two other variables.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n 
ggplot(\n diamonds,\n aes(carat, price, color = fct_reorder2(color, carat, price))\n ) + \n geom_smooth()\n ```\n :::\n\n\n\n## Change the value of levels\n\n- `fct_recode(.f, ...)`: Manually change levels.\n Also `fct_relabel()` which obeys `purrr::map` syntax to apply a function or expression to each level.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_recode(f, v = \"a\", x = \"b\", z = \"c\")\n fct_relabel(f, ~ paste0(\"x\", .x))\n ```\n :::\n\n\n\n- `fct_anon(f, prefix = \"\")`: Anonymize levels with random integers.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_anon(f)\n ```\n :::\n\n\n\n- `fct_collapse(.f, …, other_level = NULL)`: Collapse levels into manually defined groups.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_collapse(f, x = c(\"a\", \"b\"))\n ```\n :::\n\n\n\n- `fct_lump_min(f, min, w = NULL, other_level = \"Other\")`: Lumps together factors that appear fewer than `min` times.\n Also `fct_lump_n()`, `fct_lump_prop()`, and `fct_lump_lowfreq()`.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_lump_min(f, min = 2)\n ```\n :::\n\n\n\n- `fct_other(f, keep, drop, other_level = \"Other\")`: Replace levels with \"other.\"\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_other(f, keep = c(\"a\", \"b\"))\n ```\n :::\n\n\n\n## Add or drop levels\n\n- `fct_drop(f, only)`: Drop unused levels.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f5 <- factor(c(\"a\",\"b\"),c(\"a\",\"b\",\"x\"))\n f6 <- fct_drop(f5)\n ```\n :::\n\n\n\n- `fct_expand(f, ...)`: Add levels to a factor.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_expand(f6, \"x\")\n ```\n :::\n\n\n\n- `fct_na_value_to_level(f, level = \"(Missing)\")`: Assigns a level to NAs to ensure they appear in plots, etc.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f <- factor(c(\"a\", \"b\", NA))\n fct_na_value_to_level(f, level = \"(Missing)\")\n ```\n :::\n\n\n\n------------------------------------------------------------------------\n\nCC BY SA Posit Software, PBC • 
[info\\@posit.co](mailto:info@posit.co) • [posit.co](https://posit.co)\n\nLearn more at [forcats.tidyverse.org](https://forcats.tidyverse.org).\n\nUpdated: 2024-05.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\npackageVersion(\"forcats\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] '1.0.0'\n```\n\n\n:::\n:::\n\n\n\n------------------------------------------------------------------------\n",
     "supporting": [
       "factors_files"
     ],
diff --git a/_freeze/html/factors/figure-html/unnamed-chunk-21-1.png b/_freeze/html/factors/figure-html/unnamed-chunk-21-1.png
index 3c857a85..e1f898b3 100644
Binary files a/_freeze/html/factors/figure-html/unnamed-chunk-21-1.png and b/_freeze/html/factors/figure-html/unnamed-chunk-21-1.png differ
diff --git a/factors.pdf b/factors.pdf
index 9d2bfda8..9829d9d0 100644
Binary files a/factors.pdf and b/factors.pdf differ
diff --git a/html/images/logo-forcats.png b/html/images/logo-forcats.png
index df00f945..f189970c 100644
Binary files a/html/images/logo-forcats.png and b/html/images/logo-forcats.png differ
diff --git a/keynotes/factors.key b/keynotes/factors.key
index 0aebca41..ca631364 100644
Binary files a/keynotes/factors.key and b/keynotes/factors.key differ
diff --git a/pngs/factors.png b/pngs/factors.png
index 2c6cb84c..d13d0e3f 100644
Binary files a/pngs/factors.png and b/pngs/factors.png differ