mragetsars/Statistical-Distributions-and-Random-Variable-Transformations

Statistical Distributions and Random Variable Transformations

Engineering Probability and Statistics – University of Tehran – Department of Electrical & Computer Engineering

Overview

This repository contains an R and Jupyter Notebook implementation of statistical-distribution simulations, analytical PMF/CDF calculations, normal-approximation analysis, exponential memorylessness checks, and transformations of random variables. It was developed as the first computer assignment for the Engineering Probability and Statistics course at the University of Tehran.

The project follows a complete experimental pipeline, including assignment interpretation, stochastic simulation, manual probability-function implementation, approximation-error analysis, timing and cost comparison, visualization, and final technical interpretation across four standardized notebooks.

Project Objectives

  • ✅ Simulate and analyze the hypergeometric distribution for an election-audit scenario.
  • ✅ Compare theoretical and empirical mean/variance values for simulated random variables.
  • ✅ Implement base-R PMF and CDF calculations for binomial and hypergeometric models.
  • ✅ Evaluate normal approximation with and without continuity correction.
  • ✅ Study the memoryless property of the exponential distribution through simulation.
  • ✅ Transform uniform random variables into exponential and standard-normal random variables.
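
As a rough illustration of the base-R PMF/CDF objective above, the following sketch builds both probability functions from `choose()` and cross-checks them against R's built-in `pbinom` and `phyper`. The helper names (`binom_pmf`, `hyper_pmf`, `binom_cdf`, `hyper_cdf`) and the numeric arguments are hypothetical and are not taken from the notebooks.

```r
# Hypothetical helper names; the notebooks' actual function names are not
# shown in this README.

# Binomial PMF: P(X = k) for n trials with success probability p.
binom_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

# Hypergeometric PMF: k marked items in a sample of size n drawn without
# replacement from a population of N items, K of which are marked.
hyper_pmf <- function(k, N, K, n) {
  choose(K, k) * choose(N - K, n - k) / choose(N, n)
}

# CDFs as cumulative sums of the PMF over 0..k.
binom_cdf <- function(k, n, p) sum(binom_pmf(0:k, n, p))
hyper_cdf <- function(k, N, K, n) sum(hyper_pmf(0:k, N, K, n))

# Cross-check against the built-in distribution functions.
# phyper(q, m, n, k) takes m marked items, n unmarked items, and k draws.
all.equal(binom_cdf(3, 10, 0.4), pbinom(3, 10, 0.4))
all.equal(hyper_cdf(2, 50, 10, 8), phyper(2, 10, 40, 8))
```

For large populations the binomial coefficients can overflow; working with `lchoose()` on the log scale is one common way around that.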

Methodology

1️⃣ Hypergeometric and Binomial Distribution Analysis

The first notebook models an election-audit process using a finite population of polling stations. It simulates the hypergeometric distribution, compares empirical statistics against theoretical formulas, and investigates the relationship between hypergeometric sampling without replacement and binomial sampling with replacement.
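
A minimal sketch of this kind of comparison is shown below. The population size, number of flagged stations, audit size, and seed are assumed values chosen for illustration; they are not the assignment's parameters.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable

# Assumed, illustrative audit parameters (not the assignment's values).
N_total   <- 500   # polling stations in the population
K_flagged <- 60    # stations with an irregularity
n_audit   <- 40    # stations audited without replacement
n_sim     <- 1e5   # number of simulated audits

# rhyper(nn, m, n, k): m marked items, n unmarked items, k items drawn.
x <- rhyper(n_sim, K_flagged, N_total - K_flagged, n_audit)

p <- K_flagged / N_total
theo_mean <- n_audit * p
# The finite-population correction (N - n) / (N - 1) is what separates the
# hypergeometric variance from the binomial variance n * p * (1 - p).
theo_var  <- n_audit * p * (1 - p) * (N_total - n_audit) / (N_total - 1)
binom_var <- n_audit * p * (1 - p)

c(empirical_mean = mean(x), theoretical_mean = theo_mean)
c(empirical_var = var(x), hypergeometric_var = theo_var, binomial_var = binom_var)
```

As the population grows relative to the sample, the correction factor approaches 1 and the two variances agree, which is the sense in which sampling without replacement approaches sampling with replacement.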

2️⃣ Binomial CDF and Normal Approximation

The second notebook computes the exact binomial CDF for a product-invitation scenario, then compares it with normal approximations. It evaluates approximation error, visualizes CDF behavior near the mean, measures runtime, and applies a cost model to decide when exact computation is preferable.
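
The effect of the continuity correction can be sketched as follows. The values of `n` and `p` are assumptions for illustration, and the exact CDF comes from the built-in `pbinom` here rather than from the notebook's own implementation.

```r
# Assumed, illustrative parameters (not the assignment's values).
n <- 200
p <- 0.3
mu    <- n * p
sigma <- sqrt(n * p * (1 - p))
k <- 0:n

exact     <- pbinom(k, n, p)              # exact binomial CDF
plain     <- pnorm(k, mu, sigma)          # normal approximation, no correction
corrected <- pnorm(k + 0.5, mu, sigma)    # with continuity correction

# Largest absolute CDF error over all k for each approximation.
max_err_plain     <- max(abs(exact - plain))
max_err_corrected <- max(abs(exact - corrected))
c(plain = max_err_plain, corrected = max_err_corrected)
```

The correction evaluates the normal CDF at `k + 0.5` because the binomial mass at the integer `k` is spread over the interval from `k - 0.5` to `k + 0.5` under the continuous approximation.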

3️⃣ Exponential Memorylessness

The third notebook simulates customer interarrival times using the inverse-transform method for the exponential distribution. It compares residual waiting-time distributions after a 12-minute threshold and checks the theoretical memoryless identity:

P(X > s + t | X > s) = P(X > t)
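
One way to check this identity by simulation is sketched below. The threshold `s = 12` comes from the description above; the rate, the additional wait `t`, the sample size, and the seed are assumed values.

```r
set.seed(1)       # arbitrary seed so the sketch is repeatable
rate  <- 1 / 8    # assumed: mean interarrival time of 8 minutes
s     <- 12       # threshold in minutes, as described above
t     <- 5        # assumed additional waiting time in minutes
n_sim <- 1e6

# Inverse-transform sampling: if U ~ Uniform(0, 1), then
# -log(1 - U) / rate follows an Exponential(rate) distribution.
u <- runif(n_sim)
x <- -log(1 - u) / rate

survivors <- x[x > s]
lhs <- mean(survivors > s + t)   # estimate of P(X > s + t | X > s)
rhs <- mean(x > t)               # estimate of P(X > t)

c(conditional = lhs, unconditional = rhs, theoretical = exp(-rate * t))
```

All three numbers should agree up to sampling noise; the conditional estimate is the noisiest because it uses only the draws that exceed `s`.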

4️⃣ Random Variable Transformations

The fourth notebook verifies two transformation techniques. The logarithmic transform converts a uniform random variable into an exponential random variable, while the Box-Muller transform converts two independent uniform variables into standard-normal variables.
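
Both transformations can be sketched in a few lines of base R. The rate `lambda`, the sample size, and the seed are assumptions for illustration, and the variable names are hypothetical.

```r
set.seed(1)     # arbitrary seed so the sketch is repeatable
n_sim  <- 1e5
lambda <- 2     # assumed exponential rate

# Logarithmic transform: U and 1 - U have the same distribution, so
# -log(U) / lambda is Exponential(lambda).
u    <- runif(n_sim)
expo <- -log(u) / lambda

# Box-Muller transform: two independent uniforms give two independent
# standard-normal variables.
u1 <- runif(n_sim)
u2 <- runif(n_sim)
r  <- sqrt(-2 * log(u1))
z1 <- r * cos(2 * pi * u2)
z2 <- r * sin(2 * pi * u2)

c(exp_mean = mean(expo), expected_mean = 1 / lambda)
c(z1_mean = mean(z1), z1_sd = sd(z1), z2_mean = mean(z2), z2_sd = sd(z2))
cor(z1, z2)     # should be close to 0
```

Moment checks like these are only a first pass; comparing the empirical CDF with `pexp` and `pnorm`, or drawing a Q-Q plot, gives a stronger check of the full distribution.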

Repository Structure

The project is organized as follows:

statistical-distributions-random-variable-transformations/
├── description/          # Original assignment statement
│   └── EPS_CA1.pdf
├── notebooks/            # Standardized R Jupyter notebooks
│   ├── 01-hypergeometric-binomial-distributions.ipynb
│   ├── 02-binomial-normal-approximation.ipynb
│   ├── 03-memoryless-exponential-distribution.ipynb
│   └── 04-random-variable-transformations.ipynb
├── .gitignore            # Git ignore rules for R, Jupyter, and local files
├── LICENSE               # MIT license
└── README.md             # Project documentation

Setup & Usage

The notebooks are written in R and require a Jupyter environment with the R kernel installed.

  1. Install R and Jupyter Notebook.

  2. Install and register the R kernel from an R session:

         install.packages("IRkernel")
         IRkernel::installspec(user = TRUE)

  3. Clone the repository and open the notebooks:

         git clone <repo-url>
         cd statistical-distributions-random-variable-transformations
         jupyter notebook notebooks/

  4. Run each notebook from top to bottom.

No external dataset is required. All simulations are generated programmatically inside the notebooks using base R functions.

Results

The notebooks produce simulation histograms, PMF/CDF plots, approximation-error plots, runtime comparisons, and distribution-transformation checks. The key findings are:

  • empirical hypergeometric statistics converge toward the theoretical mean and variance as the number of simulations increases;
  • continuity correction improves the normal approximation of the binomial CDF;
  • exponential residual waiting times remain consistent with the memoryless property;
  • logarithmic and Box-Muller transformations produce the expected exponential and standard-normal distributions.

Notes

The notebooks rely only on base R functionality and the standard Jupyter R kernel. Notebook outputs were cleared to keep the repository lightweight; because the notebooks use a fixed seed, rerunning them regenerates the same results.
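
For readers unfamiliar with seeding in R, the short sketch below shows why a fixed seed makes reruns repeatable. The seed value 42 is arbitrary and is not necessarily the one used in the notebooks.

```r
# Resetting the same seed before sampling reproduces the same draws.
set.seed(42)   # arbitrary seed, assumed for illustration
a <- runif(5)

set.seed(42)
b <- runif(5)

identical(a, b)
```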

License

This project is licensed under the MIT License. See LICENSE for details.
