Engineering Probability and Statistics – University of Tehran – Department of Electrical & Computer Engineering
This repository contains Statistical Distributions and Random Variable Transformations, a set of R Jupyter notebooks that implement statistical-distribution simulations, analytical PMF/CDF calculations, normal-approximation analysis, a check of exponential memorylessness, and transformations of random variables. The project was developed as the first computer assignment for the Engineering Probability and Statistics course at the University of Tehran.
The project follows a complete experimental pipeline, including assignment interpretation, stochastic simulation, manual probability-function implementation, approximation-error analysis, timing and cost comparison, visualization, and final technical interpretation across four standardized notebooks.
- ✅ Simulate and analyze the hypergeometric distribution for an election-audit scenario.
- ✅ Compare theoretical and empirical mean/variance values for simulated random variables.
- ✅ Implement base-R PMF and CDF calculations for binomial and hypergeometric models.
- ✅ Evaluate normal approximation with and without continuity correction.
- ✅ Study the memoryless property of the exponential distribution through simulation.
- ✅ Transform uniform random variables into exponential and standard-normal random variables.
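As a rough illustration of the base-R PMF and CDF calculations listed above, the sketch below builds both functions from `choose()` and checks them against R's built-in `dbinom`, `pbinom`, `dhyper`, and `phyper`. The helper names (`binom_pmf`, `hyper_pmf`, and so on) and the numeric arguments are hypothetical, chosen for illustration; the notebooks' own implementations may differ.

```r
# Hypothetical helpers; names and arguments are illustrative only.

# Binomial PMF: P(X = k) for n trials with success probability p
binom_pmf <- function(k, n, p) choose(n, k) * p^k * (1 - p)^(n - k)

# Binomial CDF: P(X <= k) for a single (scalar) k
binom_cdf <- function(k, n, p) {
  if (k < 0) return(0)
  sum(binom_pmf(0:k, n, p))
}

# Hypergeometric PMF: population of size N with K marked items,
# sample of size n drawn without replacement
hyper_pmf <- function(k, N, K, n) {
  choose(K, k) * choose(N - K, n - k) / choose(N, n)
}

# Hypergeometric CDF: P(X <= k) for a single (scalar) k
hyper_cdf <- function(k, N, K, n) {
  if (k < 0) return(0)
  sum(hyper_pmf(0:k, N, K, n))
}

# Compare with the built-in functions (note dhyper's parameterization:
# m = marked items, n = unmarked items, k = sample size)
all.equal(binom_pmf(3, 10, 0.4), dbinom(3, 10, 0.4))
all.equal(binom_cdf(3, 10, 0.4), pbinom(3, 10, 0.4))
all.equal(hyper_pmf(2, 50, 10, 8), dhyper(2, 10, 40, 8))
all.equal(hyper_cdf(2, 50, 10, 8), phyper(2, 10, 40, 8))
```

Because `choose(n, k)` returns 0 when `k` is negative or exceeds `n`, the PMFs evaluate to 0 outside the support without extra bounds checks.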
The first notebook models an election-audit process using a finite population of polling stations. It simulates the hypergeometric distribution, compares empirical statistics against theoretical formulas, and investigates the relationship between hypergeometric sampling without replacement and binomial sampling with replacement.
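The comparison described above can be sketched in a few lines of base R. The population size, number of marked stations, sample size, simulation count, and seed below are assumed values for illustration, not the figures from the assignment.

```r
set.seed(1)  # arbitrary seed, for reproducibility of this sketch only

# Assumed scenario: N polling stations, K of them flagged, n audited
N <- 100
K <- 20
n <- 15
n_sims <- 10000

# rhyper(nn, m, n, k): m = flagged, n = unflagged, k = sample size
sims <- rhyper(n_sims, m = K, n = N - K, k = n)

# Theoretical moments of the hypergeometric distribution
p <- K / N
theo_mean <- n * p
theo_var  <- n * p * (1 - p) * (N - n) / (N - 1)

# Empirical moments from the simulation
emp_mean <- mean(sims)
emp_var  <- var(sims)

# Binomial variance for sampling with replacement; it exceeds the
# hypergeometric variance by the finite-population factor (N - n) / (N - 1)
binom_var <- n * p * (1 - p)

print(c(theo_mean = theo_mean, emp_mean = emp_mean))
print(c(theo_var = theo_var, emp_var = emp_var, binom_var = binom_var))
```

As `N` grows with `K / N` held fixed, the correction factor `(N - n) / (N - 1)` approaches 1, which is the sense in which sampling without replacement approaches binomial sampling.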
The second notebook computes the exact binomial CDF for a product-invitation scenario, then compares it with normal approximations. It evaluates approximation error, visualizes CDF behavior near the mean, measures runtime, and applies a cost model to decide when exact computation is preferable.
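A minimal sketch of the exact-versus-approximate CDF comparison follows. The values of `n` and `p` are assumptions for illustration and are not taken from the product-invitation scenario; the runtime and cost-model steps are not reproduced here.

```r
# Assumed parameters, for illustration only
n <- 100
p <- 0.3
k <- 0:n

mu    <- n * p
sigma <- sqrt(n * p * (1 - p))

# Exact binomial CDF, P(X <= k)
exact <- pbinom(k, n, p)

# Normal approximation without continuity correction
approx_plain <- pnorm(k, mean = mu, sd = sigma)

# Normal approximation with continuity correction: evaluate at k + 0.5
approx_cc <- pnorm(k + 0.5, mean = mu, sd = sigma)

# Maximum absolute error of each approximation over the support
err_plain <- max(abs(exact - approx_plain))
err_cc    <- max(abs(exact - approx_cc))

print(c(err_plain = err_plain, err_cc = err_cc))
```

The largest error of the uncorrected approximation occurs near the mean, where the binomial CDF jumps by the largest PMF values; shifting the evaluation point by 0.5 centers the normal curve on each jump.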
The third notebook simulates customer interarrival times using the inverse-transform method for the exponential distribution. It compares residual waiting-time distributions after a 12-minute threshold and checks the theoretical memoryless identity:
P(X > s + t | X > s) = P(X > t)
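The identity above can be checked numerically with inverse-transform sampling. In this sketch the 12-minute threshold comes from the description, while the rate, the value of `t`, the sample size, and the seed are assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, for this sketch only

rate <- 1 / 10   # assumed: mean interarrival time of 10 minutes
n    <- 200000

# Inverse-transform method: if U ~ Uniform(0, 1),
# then X = -log(1 - U) / rate is Exponential(rate)
u <- runif(n)
x <- -log(1 - u) / rate

s <- 12  # threshold from the notebook description (minutes)
t <- 5   # assumed additional waiting time (minutes)

# Left side: P(X > s + t | X > s), estimated among samples exceeding s
lhs <- mean(x[x > s] > s + t)

# Right side: P(X > t), estimated from all samples
rhs <- mean(x > t)

# Closed form for both sides: exp(-rate * t)
theory <- exp(-rate * t)

print(c(conditional = lhs, unconditional = rhs, theory = theory))
```

Equivalently, the residuals `x[x > s] - s` should follow the same exponential distribution as `x` itself, which is what a comparison of residual waiting-time histograms shows.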
The fourth notebook verifies two transformation techniques. The logarithmic transform converts a uniform random variable into an exponential random variable, while the Box-Muller transform converts two independent uniform variables into standard-normal variables.
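Both transformations fit in a short base-R sketch. The sample size and seed are arbitrary assumptions, and the unit rate for the exponential target is chosen for simplicity.

```r
set.seed(1)  # arbitrary seed, for this sketch only
n <- 100000

u1 <- runif(n)
u2 <- runif(n)

# Logarithmic transform: -log(U) is Exponential(1) when U ~ Uniform(0, 1)
e <- -log(u1)

# Box-Muller transform: two independent uniforms give two
# independent standard-normal variables
r     <- sqrt(-2 * log(u1))
theta <- 2 * pi * u2
z1 <- r * cos(theta)
z2 <- r * sin(theta)

# Quick moment checks against the target distributions
print(c(exp_mean = mean(e), exp_var = var(e)))      # targets: 1 and 1
print(c(z1_mean = mean(z1), z1_sd = sd(z1)))        # targets: 0 and 1
print(c(z2_mean = mean(z2), z2_sd = sd(z2)))        # targets: 0 and 1
print(c(correlation = cor(z1, z2)))                 # target: 0
```

Moment checks are only a first pass; overlaying `dexp` and `dnorm` on histograms of `e` and `z1`, or inspecting a normal Q-Q plot via `qqnorm(z1)`, gives a fuller comparison.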
The project is organized as follows:
statistical-distributions-random-variable-transformations/
├── description/ # Original assignment statement
│ └── EPS_CA1.pdf
├── notebooks/ # Standardized R Jupyter notebooks
│ ├── 01-hypergeometric-binomial-distributions.ipynb
│ ├── 02-binomial-normal-approximation.ipynb
│ ├── 03-memoryless-exponential-distribution.ipynb
│ └── 04-random-variable-transformations.ipynb
├── .gitignore # Git ignore rules for R, Jupyter, and local files
├── LICENSE # MIT license
└── README.md # Project documentation
The notebooks are written in R and require a Jupyter environment with the R kernel installed.
- Install R and Jupyter Notebook.
- Install and register the R kernel from an R session:
    install.packages("IRkernel")
    IRkernel::installspec(user = TRUE)
- Clone the repository and open the notebooks:
    git clone <repo-url>
    cd statistical-distributions-random-variable-transformations
    jupyter notebook notebooks/
- Run each notebook from top to bottom.
No external dataset is required. All simulations are generated programmatically inside the notebooks using base R functions.
The notebooks produce simulation histograms, PMF/CDF plots, approximation-error plots, runtime comparisons, and distribution-transformation checks. The key findings are:
- empirical hypergeometric statistics converge toward the theoretical mean and variance as the number of simulations increases;
- continuity correction improves the normal approximation of the binomial CDF;
- exponential residual waiting times remain consistent with the memoryless property;
- logarithmic and Box-Muller transformations produce the expected exponential and standard-normal distributions.
The notebooks rely only on base R functionality and the standard Jupyter R kernel. Notebook outputs were cleared to keep the repository lightweight; rerunning the notebooks with the fixed seed regenerates them.
This project is licensed under the MIT License. See LICENSE for details.