25f2005869-glitch/basic-statistics-projects

📊 Applied Statistics Portfolio (Statistics 1 & 2)

A structured portfolio of extra activities, empirical practice problems, and Python mini-projects for:

  • Statistics 1: Descriptive & Foundational Methods
  • Statistics 2: Inferential & Probability Models

The projects show how statistical theory is applied in spreadsheets and Python, from hand-calculated frequency tables to Monte Carlo simulations.


🎯 Portfolio Core Focus

  • Descriptive Statistics: Computing mean, median, mode, and variance, and building frequency distributions
  • Probability & Simulations: Custom Monte Carlo simulations, including simulated dice rolls
  • Theoretical Probability Models: Normal distribution fits on real datasets, with empirical-rule (68-95-99.7) checks
  • Central Limit Theorem (CLT): Sampling simulations that demonstrate the asymptotic normality of sample means
  • Bivariate Relationships: Covariance matrix and scatter-plot analysis of financial ratios

🛠 Analytical Tech Stack

  • Spreadsheet Modeling: Microsoft Excel (advanced formulas, Analysis ToolPak)
  • Python/Notebooks: Google Colab, Jupyter (.ipynb)
  • Core Libraries: numpy, pandas, matplotlib, scipy.stats, openpyxl

📂 Repository Architecture

basic-statistics-projects/
├── Statistics 1 — Descriptive Foundations/
│   ├── STATISTICS-1-EXTRA_ACTIVITY_2/   # Frequency distributions
│   ├── STATISTICS-1-EXTRA_ACTIVITY_3/   # Central tendency
│   └── STATISTICS-1-EXTRA_ACTIVITY_4/   # Categorical/continuous plotting
│
└── Statistics 2 — Probability & Inference/
    ├── STATISTICS-2-EXTRA_ACTIVITY_1/   # Netflix dataset exploration
    ├── STATISTICS-2-EXTRA_ACTIVITY_2/   # Monte Carlo Dice Simulation
    ├── STATISTICS-2-EXTRA_ACTIVITY_3/   # Corporate Ratio Covariance
    ├── STATISTICS-2-EXTRA_ACTIVITY_4/   # Normal Curve Profit Analysis
    └── STATISTICS-2-EXTRA_ACTIVITY_5/   # Central Limit Theorem (CLT)

🚀 Deep Dive: Statistical Experiments

🎲 Monte Carlo Dice Simulation

  • Concept: Law of Large Numbers (LLN)
  • Implementation: Python notebook (.ipynb) that simulates random dice rolls and plots the resulting distribution as the number of rolls grows
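
The Law of Large Numbers can be illustrated with a few lines of NumPy. The sketch below is not the notebook's actual code; the function name `running_mean_of_rolls` and the fixed seed are assumptions made for this example. It tracks the running mean of simulated fair-die rolls, which should drift toward the theoretical expectation of 3.5.

```python
import numpy as np

def running_mean_of_rolls(n_rolls, seed=0):
    """Roll a fair six-sided die n_rolls times and return the running mean.

    Hypothetical helper for illustration; the seed is fixed only so the
    run is reproducible.
    """
    rng = np.random.default_rng(seed)
    rolls = rng.integers(1, 7, size=n_rolls)  # upper bound 7 is exclusive
    return np.cumsum(rolls) / np.arange(1, n_rolls + 1)

means = running_mean_of_rolls(100_000)

# The running mean should settle near E[X] = (1 + 2 + ... + 6) / 6 = 3.5.
for n in (10, 1_000, 100_000):
    print(f"n = {n:>7}: running mean = {means[n - 1]:.4f}")
```

Plotting `means` against the roll index with matplotlib gives the familiar picture of a noisy curve flattening onto the line y = 3.5.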

📉 Corporate Ratio Covariance

  • Concept: Correlation & Covariance ($r$, $\sigma_{XY}$)
  • Implementation: Balance-sheet ratios (Current, Quick, Cash) compared pairwise, with a scatter-matrix visualization
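
As a rough sketch of the two quantities named above, the sample covariance $\sigma_{XY}$ and the Pearson correlation $r$ can be computed directly from their definitions. The ratio values below are made up for illustration and are not taken from the repository's dataset; the helper names are likewise assumptions.

```python
import numpy as np

# Hypothetical ratio values, for illustration only (not the repository's data).
current_ratio = np.array([1.8, 2.1, 1.5, 2.4, 1.9, 2.2])
quick_ratio = np.array([1.1, 1.5, 0.9, 1.7, 1.4, 1.5])

def sample_covariance(x, y):
    """Sample covariance: sum((x - mean_x) * (y - mean_y)) / (n - 1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)

def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both sample standard deviations."""
    return sample_covariance(x, y) / (np.std(x, ddof=1) * np.std(y, ddof=1))

cov_xy = sample_covariance(current_ratio, quick_ratio)
r_xy = pearson_r(current_ratio, quick_ratio)
print(f"covariance = {cov_xy:.4f}, r = {r_xy:.4f}")
```

In practice `np.cov` and `np.corrcoef` (or `DataFrame.cov` and `DataFrame.corr` in pandas) return the same values for a whole matrix of ratios at once; the hand-written versions are here only to make the formulas visible.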

🔔 Normal Model on Google Profits

  • Concept: Gaussian fit, Z-scores
    $$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$
  • Implementation: A Normal distribution fitted to Google's quarterly profits, then used to evaluate probability densities for future returns
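
A minimal sketch of the fit-then-query workflow follows. It uses the standard library's `statistics.NormalDist` rather than `scipy.stats`, purely to keep the example dependency-free; `scipy.stats.norm` offers equivalent `pdf` and `cdf` methods. The profit figures are synthetic placeholders, not Google's reported numbers, and the threshold is arbitrary.

```python
from statistics import NormalDist

# Synthetic quarterly profits in arbitrary units, for illustration only.
profits = [15.2, 17.9, 16.4, 18.5, 20.6, 19.3, 21.8, 23.1]

# Estimate mu and sigma from the data (sigma is the sample standard deviation).
model = NormalDist.from_samples(profits)

def z_score(x, dist):
    """Number of standard deviations that x lies from the fitted mean."""
    return (x - dist.mean) / dist.stdev

threshold = 22.0  # arbitrary value chosen for the example
z = z_score(threshold, model)
density = model.pdf(threshold)        # f(x) from the formula above
p_above = 1 - model.cdf(threshold)    # P(X > threshold) under the fitted model

# Empirical-rule check: about 68% of the probability lies within one sigma.
within_one_sigma = model.cdf(model.mean + model.stdev) - model.cdf(model.mean - model.stdev)

print(f"mu = {model.mean:.2f}, sigma = {model.stdev:.2f}")
print(f"z = {z:.2f}, f(x) = {density:.4f}, P(X > {threshold}) = {p_above:.4f}")
print(f"P(|X - mu| < sigma) = {within_one_sigma:.4f}")
```

With only a handful of quarters the fitted parameters are rough estimates, so probabilities read off the curve should be treated as indicative rather than precise.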

⚖️ Central Limit Theorem (CLT)

  • Concept: Sampling distribution of the mean, Central Limit Theorem
  • Implementation: Python simulation that draws repeated samples from non-normal populations and shows the distribution of sample means approaching normal as the sample size $n$ grows
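
The idea can be sketched as follows, assuming an exponential population as the non-normal source (the notebook may use a different one; the function names and seed are also assumptions). Skewness of the sample means is used as a simple numeric proxy for "how normal" the distribution looks: it is 2 for a single exponential draw and should shrink toward 0 as $n$ increases.

```python
import numpy as np

def sample_means(n, n_samples=20_000, seed=0):
    """Draw n_samples samples of size n from Exponential(1) and return their means."""
    rng = np.random.default_rng(seed)
    draws = rng.exponential(scale=1.0, size=(n_samples, n))
    return draws.mean(axis=1)

def skewness(x):
    """Moment-based skewness: third central moment over variance^(3/2)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return np.mean(centered**3) / np.mean(centered**2) ** 1.5

# For Exponential(1): population mean = 1, population std = 1,
# so the sample means should have std close to 1 / sqrt(n).
for n in (1, 5, 30, 200):
    m = sample_means(n)
    print(f"n = {n:>3}: mean = {m.mean():.3f}, std = {m.std():.3f}, "
          f"skewness = {skewness(m):.3f}")
```

A histogram of `sample_means(n)` for each `n` shows the same effect visually. Note that a simulation like this illustrates the theorem rather than proving it.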

🎮 How to Run / Interact

A. Excel Workspace

  1. Open the relevant STATISTICS-* activity folder, download the .xlsx file, and explore the formulas.

B. Python & Jupyter

pip install numpy pandas matplotlib scipy openpyxl notebook
jupyter notebook

Alternatively, upload the .ipynb file to Google Colab to run it in the cloud without a local install.


📈 Learning Outcomes

  • Hands-on practice with descriptive and inferential statistics in spreadsheets and code
  • Translating real financial and business data into statistical models
  • Visualization, reporting, and interpretation of results for research portfolios

👩‍💻 Author

Saloni Tiwari
🎓 IIT Madras BS Data Science
🎓 B.Sc Mathematics


⭐️

If you found this portfolio useful, please star it on GitHub!
