This project is a collection of data analyses and visualizations exploring the statistical profile of NBA teams, with a focus on identifying what metrics correlate most strongly with winning and winning championships.
Using NBA game data from 1980 to 2023, this analysis cleans and aggregates team-level statistics for each season to answer several key questions:
- Is it true that "defense wins championships"?
- Is it better to take more shots or better shots?
- Which offensive stats have the strongest correlation with a high win percentage?
- What does the statistical "fingerprint" of a championship-caliber team look like?
This is the central analysis of the project. It calculates a weighted Offensive Rating and Defensive Rating for every team in every season since 1985. The results are plotted on a "Championship Quadrant" chart.
- Plot: A Plotly scatter plot with all team-seasons as faint gray dots and championship-winning teams highlighted as gold stars.
- Quadrants: The plot is divided into four quadrants based on the league-average offense and defense for that season:
  - Championship Zone (High Offense, High Defense)
  - Grit & Grind (Low Offense, High Defense)
  - Offense Only (High Offense, Low Defense)
  - Lottery Bound (Low Offense, Low Defense)
- Key Finding: The vast majority of NBA champions fall into the top-right "Championship Zone." A great offense is important, but an above-average defense looks close to a non-negotiable requirement for winning a title.
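The quadrant assignment described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`off_score`, `def_score`), the sample values, and the helper `label_quadrants` are all hypothetical, and it assumes the ratings are already computed and that a higher value is better on both axes. If the defensive rating is expressed as points allowed, the defensive comparison would need to be flipped.

```python
import pandas as pd

# Hypothetical team-season table; names and values are illustrative only.
ratings = pd.DataFrame({
    "season": [2000, 2000, 2000, 2000],
    "team": ["A", "B", "C", "D"],
    "off_score": [110.0, 108.0, 100.0, 98.0],
    "def_score": [105.0, 95.0, 106.0, 94.0],  # assumed: higher = better defense
})

def label_quadrants(df):
    """Label each team-season relative to that season's league averages."""
    out = df.copy()
    # League average per season, broadcast back onto every row.
    season_avg = out.groupby("season")[["off_score", "def_score"]].transform("mean")
    high_off = out["off_score"] > season_avg["off_score"]
    high_def = out["def_score"] > season_avg["def_score"]
    out["quadrant"] = "Lottery Bound"
    out.loc[high_off & high_def, "quadrant"] = "Championship Zone"
    out.loc[~high_off & high_def, "quadrant"] = "Grit & Grind"
    out.loc[high_off & ~high_def, "quadrant"] = "Offense Only"
    return out

labeled = label_quadrants(ratings)
print(labeled[["team", "quadrant"]])
```

Comparing against the per-season average, rather than one average across all years, keeps teams from different eras on the same footing.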
This analysis explores the relationship between shot quantity (Field Goal Attempts) and shot quality (Field Goal Percentage) and how they relate to winning.
- Plot: A scatter plot of Field Goal Attempts per Game (FGA/G) vs. Field Goal Percentage (FG%).
- Color: Points are colored by Win Percentage.
- Key Finding: The plot is divided into quadrants based on the median FGA and FG%. The analysis shows that teams in the "High FG%, Low FGA" quadrant (quality over quantity) have a higher average win percentage than teams in the "Low FG%, High FGA" quadrant.
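One way the median split and the per-quadrant comparison could be computed is sketched below. The column names (`fga_per_game`, `fg_pct`, `win_pct`) and the helper `shot_quadrant_win_pct` are assumptions for illustration, and the four rows are made-up values that only demonstrate the mechanics, not the finding.

```python
import pandas as pd

# Made-up rows; they demonstrate the mechanics, not the project's results.
teams = pd.DataFrame({
    "fga_per_game": [80.0, 82.0, 90.0, 92.0],
    "fg_pct": [0.48, 0.44, 0.49, 0.43],
    "win_pct": [0.65, 0.45, 0.70, 0.35],
})

def shot_quadrant_win_pct(df):
    """Average win percentage for each median-split quadrant."""
    high_fga = df["fga_per_game"] > df["fga_per_game"].median()
    high_fg = df["fg_pct"] > df["fg_pct"].median()
    # Build a text label such as "High FG%, Low FGA" for every row.
    labels = (
        high_fg.map({True: "High FG%", False: "Low FG%"})
        + ", "
        + high_fga.map({True: "High FGA", False: "Low FGA"})
    )
    return df.groupby(labels)["win_pct"].mean()

summary = shot_quadrant_win_pct(teams)
print(summary)
```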
This visualization creates a statistical "fingerprint" for the top contenders in a given season by comparing them to the league average across multiple categories.
- Plot: A polar (spider) chart showing the % deviation from the league average for key metrics like:
  - Offensive & Defensive Efficiency
  - Shooting % (FG%, 3P%, FT%)
  - Defensive Impact (Steals, Blocks)
  - Rebounding
- Use Case: The chart profiles the champion (e.g., GSW in 2022) against other top teams, identifying their unique strengths and weaknesses. It also includes a "recommendation" for the best-fit team for a hypothetical free agent (nicknamed "DeBron").
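The "% deviation from the league average" behind each spoke of the chart can be computed along these lines. This is a sketch under assumptions: the metric columns (`fg_pct`, `stl`, `reb`), the team labels, and the helper name are hypothetical, and the values are invented. Metrics where lower is better (turnovers, for example) would probably need their sign flipped before plotting.

```python
import pandas as pd

# Invented per-team season averages, indexed by a placeholder team label.
season_stats = pd.DataFrame(
    {
        "fg_pct": [0.50, 0.45, 0.40],
        "stl": [9.0, 7.5, 6.0],
        "reb": [44.0, 45.0, 46.0],
    },
    index=["Team A", "Team B", "Team C"],
)

def pct_deviation_from_league(df):
    """Percent difference between each team's value and the league mean."""
    league_avg = df.mean()  # one mean per metric column
    return (df - league_avg) / league_avg * 100

fingerprint = pct_deviation_from_league(season_stats)
print(fingerprint.round(1))
```

Each row of the result is one team's fingerprint, so a single row can be passed to a polar chart as the radial values.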
To determine which stats matter most, this analysis plots Win Percentage against seven different offensive metrics.
- Plots: A grid of scatter plots and binned bar charts showing the relationship between Win % and:
  - Average Points per Game
  - Field Goal %
  - 3-Point %
  - Free Throw %
  - Average Assists
  - Average Offensive Rebounds
  - Average Turnovers
- Key Finding: Average Points per Game and Field Goal Percentage show the strongest positive correlations with winning, while Average Turnovers has the strongest negative correlation.
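A ranking like the one in the key finding could be produced by correlating each metric with win percentage. The sketch below assumes Pearson correlation (the README does not say which measure the notebook uses) and uses hypothetical column names (`win_pct`, `avg_pts`, `avg_tov`) with invented values.

```python
import pandas as pd

# Invented team-season rows; real data would come from the aggregated CSV.
team_seasons = pd.DataFrame({
    "win_pct": [0.30, 0.45, 0.55, 0.70],
    "avg_pts": [98.0, 103.0, 104.0, 110.0],
    "avg_tov": [16.0, 15.0, 14.5, 13.0],
})

def rank_correlations(df, target="win_pct"):
    """Pearson correlation of every other column with the target, high to low."""
    corr = df.drop(columns=target).corrwith(df[target])
    return corr.sort_values(ascending=False)

ranked = rank_correlations(team_seasons)
print(ranked)
```

Sorting from highest to lowest puts the strongest positive relationships first and the strongest negative ones last.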
This project relies on several CSV files, which are assumed to be in a csv/ sub-directory:
- game.csv (the primary file for most analyses)
- line_score.csv
- team_details.csv
- team_history.csv
- final_team_win_rate.csv
The script in Cell 2 also generates an intermediate file, csv/team_offense_vs_wins.csv.
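Before running the notebook, a quick check like the one below could confirm that the input files are in place. It is only a sketch: the helper `missing_csvs` is hypothetical and not part of the project, and it assumes the `csv/` layout described above.

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_CSVS = [
    "game.csv",
    "line_score.csv",
    "team_details.csv",
    "team_history.csv",
    "final_team_win_rate.csv",
]

def missing_csvs(data_dir="csv"):
    """Return the required file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in REQUIRED_CSVS if not (data_dir / name).is_file()]

missing = missing_csvs()
if missing:
    print("Missing input files:", ", ".join(missing))
```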
- Clone the repository:

      git clone https://github.com/your-username/your-repo-name.git
      cd your-repo-name

- Create a virtual environment (recommended):

      python -m venv venv
      source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

- Install the required libraries:

      pip install -r requirements.txt

- Add the data:
  - Create a folder named `csv` in the root of the project.
  - Place all the required CSV files (listed above) into the `csv/` folder.
  - Note: The notebook is not perfectly consistent with paths. You may need to change `pd.read_csv("game.csv")` in Cell 1 to `pd.read_csv("csv/game.csv")`.

- Run the Jupyter Notebook:

      jupyter notebook "Final Project.ipynb"

  (You will need to rename the provided `.html` file back to `.ipynb` or copy the code into a new notebook.)
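As an alternative to editing each `pd.read_csv` call by hand, the path inconsistency noted in the data step could be absorbed by a small helper. This is a hypothetical sketch, not code from the notebook: `resolve_csv` is an invented name, and it assumes a file lives either in `csv/` or in the working directory.

```python
from pathlib import Path

def resolve_csv(name, data_dir="csv"):
    """Prefer data_dir/name; fall back to name in the working directory."""
    candidate = Path(data_dir) / name
    return candidate if candidate.is_file() else Path(name)

# Usage in a notebook cell (assumes pandas is imported as pd):
# games = pd.read_csv(resolve_csv("game.csv"))
print(resolve_csv("game.csv"))
```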