| 💰 Predict Loan Default Customers |
VIX - Home Credit Indonesia: Data Scientist |
Data Wrangling, EDA, Supervised Learning - Classification |
pandas, matplotlib, seaborn, scikit-learn, scipy |
Predicted which customers would default or experience payment difficulties. Cleaned the raw data and analyzed over 100 features using statistical methods for feature selection. The best model, a Logistic Regression, achieved an accuracy of 87% and an AUC of 73%. Simulated loan approval prediction by deploying a web application with Streamlit. |
| ☎️ Telco Customer Churn |
FGA x Binar Academy: Data Science [Team] |
Data Wrangling, EDA, Supervised Learning - Classification |
pandas, matplotlib, seaborn, scikit-learn, shap |
Developed a machine learning model to predict customer churn at a telecom company. The Random Forest model yielded the highest accuracy, reaching 89%. The most influential feature was the total day charge: a higher charge indicates a higher likelihood of churn. |
| 📲 Predict Clicked Ads Customer Classification |
Mini Project by Rakamin Academy |
Data Wrangling, EDA, Supervised Learning - Classification |
pandas, matplotlib, seaborn, scikit-learn, shap, etc |
Experimented with several machine learning algorithms and found that the Random Forest model performed best, with an accuracy of 96% in identifying users likely to click on advertisements. Analyzed the most influential features with SHAP to sharpen targeting and improve conversion rates and cost efficiency. |
| 🙂 Predict Customer Personality to Boost Marketing Campaign |
Mini Project by Rakamin Academy |
Data Wrangling, EDA, Unsupervised Learning - Clustering |
pandas, matplotlib, seaborn, scikit-learn, yellowbrick |
Analyzed the customer characteristics of an e-grocery store by building a clustering model with K-means. Before clustering, a decomposition step was applied, and the number of clusters was chosen using the inertia or distortion score. This resulted in 4 clusters based on customer behavior, considering factors such as the number of transactions, spending levels, response to campaigns, and website visit frequency. |
| 🏬 Investigate Hotel Business using Data Visualization |
Mini Project by Rakamin Academy |
Data Wrangling, EDA, Data Visualization |
pandas, matplotlib, seaborn |
Analyzed the performance of City Hotels and Resort Hotels, identifying the frequently visited hotel type and exploring the relationships between booking cancellations, length of stay, and lead time through Python visualization. Identified potential causes for these patterns and provided business recommendations based on the analysis. |
| 🚲 Data Quality Assessment and Customer Segmentation |
VIX - KPMG Australia: Data Analytic Consulting |
Data Wrangling, EDA, RFM analysis |
pandas, matplotlib, seaborn |
Developed and optimized a bike company's marketing strategy by analyzing its data. Conducted a data quality assessment and identified strategies to mitigate the data quality issues found. Performed customer segmentation using a simple RFM (Recency, Frequency, Monetary) analysis to recommend potential new customers for targeted marketing. Visualized insights about the targeted customer demographics on a dashboard. |
| 🛒 Online Shoppers Purchasing Intention |
Final Project - Rakamin Academy [Team] |
Data Wrangling, EDA, Supervised Learning - Classification |
pandas, matplotlib, seaborn, scikit-learn, shap |
Built a model to predict whether a website visitor will make a purchase. After testing several algorithms, a Random Forest with tuned hyperparameters performed best, achieving a ROC-AUC score of 90%. A simulation projected that this model could increase the conversion rate by 58%. |
| ✈️ Airline Customer Segmentation Based on LRFMC Model Using K-Means |
Assignment - Rakamin Academy [Team] |
Data Wrangling, EDA, Unsupervised Learning - Clustering |
pandas, matplotlib, seaborn, scikit-learn, yellowbrick |
Developed a clustering model using LRFMC scores and the K-Means algorithm, which identified 5 customer clusters: New Users, Loyal Customers (20%), Potential Loyalists/The Champion (19%), Need Attention (18%), and Hibernating (16%). |
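
The RFM (Recency, Frequency, Monetary) analysis mentioned in the customer segmentation project above can be illustrated with a minimal sketch. This is not the code used in that project: the transaction rows, the snapshot date, and the helper names `rfm_table` and `rank_scores` are hypothetical, and the rank-based scoring is just one simple choice among several. It uses only the standard library so it runs anywhere; in practice the same aggregation would more likely be a pandas `groupby`.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount)
transactions = [
    ("A", date(2023, 1, 5), 120.0),
    ("A", date(2023, 3, 20), 80.0),
    ("B", date(2022, 11, 2), 40.0),
    ("C", date(2023, 3, 28), 300.0),
    ("C", date(2023, 2, 14), 150.0),
    ("C", date(2023, 1, 30), 90.0),
]
# Assumed reference date from which recency is measured
snapshot = date(2023, 4, 1)


def rfm_table(rows, snapshot_date):
    """Aggregate transactions into recency, frequency, monetary per customer."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, day, amount in rows:
        last_purchase[customer_id] = max(last_purchase.get(customer_id, day), day)
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        customer_id: {
            "recency": (snapshot_date - last_purchase[customer_id]).days,
            "frequency": frequency[customer_id],
            "monetary": monetary[customer_id],
        }
        for customer_id in last_purchase
    }


def rank_scores(table, key, higher_is_better=True):
    """Score each customer from 1 (worst) to n (best) by rank on one metric."""
    ordered = sorted(
        table,
        key=lambda customer_id: table[customer_id][key],
        reverse=not higher_is_better,
    )
    return {customer_id: rank for rank, customer_id in enumerate(ordered, start=1)}


rfm = rfm_table(transactions, snapshot)

# A lower recency (a more recent purchase) is better, so the ranking is inverted.
r_score = rank_scores(rfm, "recency", higher_is_better=False)
f_score = rank_scores(rfm, "frequency")
m_score = rank_scores(rfm, "monetary")

# Combined score per customer: the sum of the three rank scores
total_score = {c: r_score[c] + f_score[c] + m_score[c] for c in rfm}

for customer_id in sorted(total_score, key=total_score.get, reverse=True):
    print(customer_id, rfm[customer_id], total_score[customer_id])
```

Customers with the highest combined score are the most recent, frequent, and high-spending buyers. Real projects typically bucket each metric into quantiles rather than raw ranks, which handles ties and large customer bases more gracefully.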