[ASTGCN and Causal Inference integrated code not uploaded here - Waiting for my independent research to be graded]
The prediction model aims to optimize the placement of bikes throughout the network using a Spatio-Temporal Graph Convolutional Network (STGCN) model.
The data, sourced from Citi Bike's System Data, covers rides in Jersey City from January 1, 2017, to October 1, 2023.
- Data Preprocessing: Cleaning data, removing outliers, and preparing a final dataframe for analysis.
- Exploratory Data Analysis: Conducting network visualization, centrality analysis, and heatmap generation to understand activity patterns.
- Feature Selection: Including factors like time, station IDs, and number of rides, and encoding stations as features in the model.
- Model Implementation: Utilizing a Spatio-Temporal Graph Convolutional Network (STGCN) to predict bike traffic (departures and arrivals) at each station.
- Model Testing and Prediction: Evaluating the model's performance and predicting net rides for a set duration.
- The ride data in Jersey City is very sparse, implying that most rides follow a limited set of routes through the city. Demand is generated by only a few stations.
- High-demand stations such as Grove St Path and Hamilton Park have a high degree of imbalance in net rides.
- Training the model on all stations took a lot of computation time. In hindsight, the model could be run only for the most important stations to reduce compute time.
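
The net-ride imbalance and the idea of restricting the model to important stations can be illustrated with a minimal sketch. This is not the project's code. It assumes that net rides means arrivals minus departures and that "important" means highest total activity (departures plus arrivals). The ride records, the helper names `net_rides` and `top_stations`, and every station name other than Grove St Path and Hamilton Park are hypothetical.

```python
from collections import Counter

# Hypothetical ride records as (start_station, end_station) pairs.
# Only "Grove St Path" and "Hamilton Park" come from the project notes;
# the other station names and all counts are made up for illustration.
rides = [
    ("Grove St Path", "Hamilton Park"),
    ("Grove St Path", "Newport Pkwy"),
    ("Hamilton Park", "Grove St Path"),
    ("Grove St Path", "Hamilton Park"),
    ("Liberty Light Rail", "Grove St Path"),
]


def net_rides(rides):
    """Return {station: arrivals - departures} (assumed sign convention)."""
    departures = Counter(start for start, _ in rides)
    arrivals = Counter(end for _, end in rides)
    stations = set(departures) | set(arrivals)
    # Counter returns 0 for stations with no departures or no arrivals.
    return {s: arrivals[s] - departures[s] for s in stations}


def top_stations(rides, k):
    """Return the k busiest stations by departures plus arrivals."""
    activity = Counter()
    for start, end in rides:
        activity[start] += 1
        activity[end] += 1
    return [station for station, _ in activity.most_common(k)]


if __name__ == "__main__":
    print(net_rides(rides))
    print(top_stations(rides, 2))
```

A negative net value marks a station that loses bikes over the period and a positive value marks one that accumulates them. Training only on the output of `top_stations` is one possible way to cut compute time, at the cost of ignoring low-traffic stations.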