rocketlanding is a data science project focused on collecting, analyzing, and visualizing data related to SpaceX rocket landings. The repository showcases the entire workflow: from web scraping and data wrangling, to exploratory data analysis (EDA), machine learning, and building interactive dashboards.
This project demonstrates the end-to-end pipeline for analyzing rocket landing data:
- Collecting raw SpaceX launch data via web scraping
- Cleaning and wrangling data
- Running exploratory data analysis (EDA) using SQL and visualization tools
- Applying machine learning models to predict rocket landing outcomes
- Building interactive dashboards (with Dash and Folium) for visual representation
- Automated Web Scraping: Fetches latest rocket launch data from the web
- Comprehensive Data Wrangling: Cleans and formats multiple CSV datasets
- EDA: Explore trends with SQL queries and visualizations
- Machine Learning: Train models to predict successful landings
- Dashboards: Interactive visualization using Dash and Folium
- Reproducible Notebooks: All steps included as Jupyter notebooks
- Clone the repository:
git clone https://github.com/jdhruv555/rocketlanding.git
cd rocketlanding
- Set up a Python virtual environment (recommended):
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
Data Collection & Wrangling:
- Run Web_Scrapping.ipynb and Data_Collection.ipynb to scrape and organize the data.
- Use Data_Wrangling.ipynb to clean and prepare datasets for analysis.
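As a rough illustration of the cleaning step, the sketch below fills missing payload values and derives a binary landing label with pandas. The column names (`FlightNumber`, `PayloadMass`, `Outcome`, `Class`), the sample rows, and the `wrangle` helper are assumptions made for this example; the actual notebooks may use different names and logic.

```python
import pandas as pd

# Hypothetical sample rows; the real notebooks work on the project's CSV files.
launches = pd.DataFrame(
    {
        "FlightNumber": [1, 2, 3, 4],
        "PayloadMass": [6104.0, None, 3170.0, 500.0],
        "Outcome": ["None None", "False Ocean", "True ASDS", "True RTLS"],
    }
)


def wrangle(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy with no missing payloads and a 0/1 landing label."""
    out = df.copy()
    # Replace missing payload masses with the mean of the known values.
    out["PayloadMass"] = out["PayloadMass"].fillna(out["PayloadMass"].mean())
    # Assumed convention: outcomes starting with "True" count as a landing.
    out["Class"] = out["Outcome"].str.startswith("True").astype(int)
    return out


clean = wrangle(launches)
print(clean)
```

Deriving a single 0/1 column early keeps the later analysis and modeling steps independent of the free-text outcome strings.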
Exploratory Data Analysis:
- Explore EDA_SQL.ipynb for SQL-based analysis.
- Visualize data with EDA_Visualization.ipynb.
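To give a feel for the SQL-based analysis, here is a minimal sketch that counts landings per launch site with Python's built-in sqlite3 module. The table name `launches`, its columns, and the sample rows are invented for illustration and are not taken from my_data1.db.

```python
import sqlite3

# In-memory database with a hypothetical schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE launches (launch_site TEXT, landing_outcome TEXT)")
conn.executemany(
    "INSERT INTO launches VALUES (?, ?)",
    [
        ("CCAFS SLC-40", "Success"),
        ("CCAFS SLC-40", "Failure"),
        ("CCAFS SLC-40", "Success"),
        ("KSC LC-39A", "Success"),
        ("VAFB SLC-4E", "Failure"),
    ],
)

# Total launches and successful landings for each site.
rows = conn.execute(
    """
    SELECT launch_site,
           COUNT(*) AS total,
           SUM(CASE WHEN landing_outcome = 'Success' THEN 1 ELSE 0 END) AS successes
    FROM launches
    GROUP BY launch_site
    ORDER BY launch_site
    """
).fetchall()

for site, total, successes in rows:
    print(f"{site}: {successes}/{total} successful landings")

conn.close()
```

The same query shape should carry over to a file-backed database by passing its path to `sqlite3.connect`, provided the table and column names are adjusted to the real schema.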
Machine Learning:
- Open and run Machine_Learning.ipynb to train models and make predictions.
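The modeling step could be sketched roughly as follows, using scikit-learn's logistic regression on synthetic data. The two features, the way the labels are generated, and the choice of model are assumptions for this example; the notebook may use other features and compare several classifiers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: this is NOT the project's dataset.
rng = np.random.default_rng(0)
n = 200
flight_number = rng.integers(1, 100, size=n)
payload_mass = rng.uniform(500, 10000, size=n)

# Assumed pattern for the toy labels: later flights land more often.
prob_landing = 1 / (1 + np.exp(-(flight_number - 50) / 10))
landed = (rng.uniform(size=n) < prob_landing).astype(int)

X = np.column_stack([flight_number, payload_mass])
X_train, X_test, y_train, y_test = train_test_split(
    X, landed, test_size=0.2, random_state=0, stratify=landed
)

# Scale the features, then fit a logistic regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2f}")
```

Holding out a test split before fitting gives an accuracy estimate on launches the model has not seen, which is the number worth comparing across models.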
Dashboard:
- Serve the interactive dashboard via Dashboard_dash.ipynb or use server.py to run it locally.
- For geographic visualizations, check Folium_Dashboard.ipynb.
Web App:
- Use index.html, script.js, and styles.css for the static site (if any).
- Web_Scrapping.ipynb: Scrapes SpaceX data from the web
- Data_Collection.ipynb: Data assembly and merging
- Data_Wrangling.ipynb: Data cleaning and feature engineering
- EDA_SQL.ipynb: EDA using SQL queries
- EDA_Visualization.ipynb: Data visualizations (plots, charts)
- Folium_Dashboard.ipynb: Map-based dashboard with Folium
- Dashboard_dash.ipynb: Interactive dashboard with Dash
- Machine_Learning.ipynb: Predictive modeling and evaluation
- requirements.txt: Required dependencies
- index.html, script.js, styles.css: Frontend/static files
- spacex_web_scrapped.csv, dataset_part_*.csv: Processed datasets
- my_data1.db: SQL database (if used)
- server.py: Backend server for dashboards
- Languages: Python, SQL, JavaScript, HTML, CSS
- Libraries:
- Data: pandas, numpy, sqlite3
- Visualization: matplotlib, seaborn, folium, plotly, dash
- Machine Learning: scikit-learn
- Web: Flask (optional), Dash
- Jupyter Notebook
- jdhruv555 (Owner & Maintainer)
If you find this project useful, please ⭐ star and fork the repository! Happy Coding!