weim-mkt/decision-tree-teaching-demo


Decision Tree Teaching Demo

An interactive Shiny application for demonstrating how decision trees are trained step by step. Upload a CSV dataset, pick your target and predictors, and explore how split candidates are evaluated, how impurity changes, and what the resulting tree looks like.

✨ Features

  • CSV upload with flexible parsing – toggle header handling and delimiter options before loading your data.
  • Dynamic variable selection – choose the target column and include/exclude predictors on the fly.
  • Classification or regression – automatically picks the appropriate rpart mode based on the target, or lets you override it.
  • Interactive training stepper – scrub through each split in the fitted tree to inspect node sizes and impurity metrics.
  • Candidate split analytics – inspect primary vs. competing splits, visualize threshold gain curves, and review summary tables.
  • Decision tree visualization – render the tree with rpart.plot, including class probabilities or regression summaries.

🚀 Getting Started

Prerequisites

  • R 4.0 or higher
  • The app relies on the following packages (installed automatically on first run via pacman):
    • shiny
    • rpart and rpart.plot
    • dplyr and purrr
    • ggplot2
    • DT

If pacman is missing, it will be installed for you. To pre-install everything manually instead, run:

install.packages(c("pacman", "shiny", "rpart", "rpart.plot", "dplyr", "purrr", "ggplot2", "DT"))

Run the app locally

  1. Clone this repository or download the source.
  2. Open an R session in the project directory.
  3. Launch the Shiny app:
shiny::runApp("app.R")

The application will open in your default browser. Any messages about package installation appear in the R console.

📂 Repository structure

├── app.R       # Shiny UI + server logic for the demo
└── README.md   # Project documentation (this file)

🧪 Using your own data

  • Upload a tidy CSV file with one target column and one or more predictor columns.
  • Toggle the Header row checkbox if your file contains column names.
  • Adjust the Separator option when working with semicolon- or tab-delimited files.
  • Pick the Target variable and at least one Predictor variable.
  • Choose the tree method (auto/classification/regression), maximum depth, and minimum split size to suit your dataset.
  • Walk through each split with the Training step slider to inspect impurity changes and sample coverage.
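The options above map fairly directly onto rpart arguments. The following is a minimal sketch of that mapping, not the code in app.R: the helper name `fit_tree` and its argument names are hypothetical, and it assumes "auto" means regression for a numeric target and classification otherwise.

```r
library(rpart)

# Hypothetical helper: fit a tree from a target, a set of predictors,
# and the three tuning options exposed in the UI.
fit_tree <- function(df, target, predictors,
                     method = "auto", max_depth = 3, min_split = 20) {
  # Assumed "auto" rule: numeric target -> regression ("anova"),
  # anything else -> classification ("class").
  if (method == "auto") {
    method <- if (is.numeric(df[[target]])) "anova" else "class"
  }
  form <- stats::reformulate(predictors, response = target)
  rpart(form, data = df, method = method,
        control = rpart.control(maxdepth = max_depth, minsplit = min_split))
}

fit <- fit_tree(iris, "Species", c("Petal.Length", "Petal.Width"),
                max_depth = 2)

# One row per internal (non-leaf) node: the split variable, the number of
# observations reaching the node, and rpart's deviance for that node.
splits <- fit$frame[fit$frame$var != "<leaf>", c("var", "n", "dev")]
print(splits)
```

A step slider could then index into `splits` row by row. Note that the `dev` column uses rpart's own scaling, which need not match the impurity values the app displays.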

Data tips

  • Classification mode requires at least two target classes; regression mode needs a numeric target.
  • Rows containing missing values in the selected variables are dropped before training—watch the notification area for counts.
  • Categorical predictors are automatically converted to factors; logical columns become two-level factors.
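The cleaning rules above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the app's actual code: `prepare_data` is a hypothetical name, and the sample data is made up for the example.

```r
# Hypothetical helper: keep only the selected columns, drop incomplete
# rows, and convert character/logical predictors to factors.
prepare_data <- function(df, target, predictors) {
  df <- df[, c(target, predictors), drop = FALSE]

  keep <- stats::complete.cases(df)
  n_dropped <- sum(!keep)          # count to report in a notification
  df <- df[keep, , drop = FALSE]

  for (p in predictors) {
    if (is.logical(df[[p]])) {
      # Fix both levels so the factor stays two-level even if only one
      # value survives the NA filter.
      df[[p]] <- factor(df[[p]], levels = c(FALSE, TRUE))
    } else if (is.character(df[[p]])) {
      df[[p]] <- factor(df[[p]])
    }
  }
  list(data = df, n_dropped = n_dropped)
}

# Made-up semicolon-delimited input with a header row and one missing value.
csv_text <- "y;x1;x2\nyes;1;TRUE\nno;NA;FALSE\nyes;3;TRUE"
raw <- read.csv(text = csv_text, header = TRUE, sep = ";")

prepared <- prepare_data(raw, target = "y", predictors = c("x1", "x2"))
prepared$n_dropped
str(prepared$data)
```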

🛠️ Implementation notes

The app is a single-file Shiny deployment with several helper utilities:

  • calc_impurity() computes Gini impurity for factors and variance for numeric targets.
  • build_tree_steps() extracts per-split metrics (size, impurity, candidate rules) from the fitted rpart model, powering the stepper UI.
  • compute_split_curve() generates gain curves across possible thresholds for numeric features.
  • The UI combines tabbed outputs for data preview, tree visualization, and candidate analysis with slider-driven interactivity.
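To make the impurity and gain-curve ideas concrete, here is a minimal base-R sketch. The function names are borrowed from the list above, but the bodies and signatures are assumptions and may differ from app.R. In particular, it uses population variance for numeric targets, midpoints between sorted unique values as candidate thresholds, and the rule "left child is `x < threshold`".

```r
# Gini impurity for factor targets; population variance for numeric targets.
calc_impurity <- function(y) {
  if (length(y) == 0) return(0)
  if (is.factor(y)) {
    p <- table(y) / length(y)      # class proportions
    return(1 - sum(p^2))
  }
  mean((y - mean(y))^2)
}

# Impurity gain for every candidate threshold of one numeric feature.
compute_split_curve <- function(x, y) {
  values <- sort(unique(x))
  if (length(values) < 2) {
    return(data.frame(threshold = numeric(0), gain = numeric(0)))
  }
  # Candidate thresholds: midpoints between consecutive unique values.
  thresholds <- (head(values, -1) + tail(values, -1)) / 2
  parent <- calc_impurity(y)
  n <- length(y)

  gain <- vapply(thresholds, function(t) {
    left <- y[x < t]
    right <- y[x >= t]
    # Gain = parent impurity minus size-weighted child impurities.
    parent -
      (length(left) / n) * calc_impurity(left) -
      (length(right) / n) * calc_impurity(right)
  }, numeric(1))

  data.frame(threshold = thresholds, gain = gain)
}

x <- c(1, 2, 3, 4)
y <- factor(c("a", "a", "b", "b"))
curve <- compute_split_curve(x, y)
curve[which.max(curve$gain), ]
```

In this toy example the two classes separate perfectly at 2.5, so that threshold recovers the full parent impurity of 0.5 as gain. Plotting `gain` against `threshold` gives the kind of curve the candidate analysis tab describes.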
