diff --git a/data-generation-GCP.md b/data-generation-GCP.md
new file mode 100644
index 0000000..c3c524b
--- /dev/null
+++ b/data-generation-GCP.md
@@ -0,0 +1,209 @@
+# Advanced Data Generation on GCP with FightChurn
+
+This guide walks you through setting up a full Google Cloud environment to generate the advanced CRM churn simulation data using the `fightchurn` Python package. It includes VM and SQL database setup, code installation, simulation execution, and exporting the data to your local PostgreSQL setup.
+
+---
+
+## ✨ Project & SQL Setup on Google Cloud
+
+1. **Go to [Google Cloud Platform](https://cloud.google.com)**
+
+ ![](./readme_files/advanced-data-generation/image1.png)
+
+2. Click the project selector next to the search bar and **create a new project** (e.g. `fighting-churn`).
+
+### ⚖️ Set Up Cloud SQL (PostgreSQL)
+
+1. In the GCP search bar, type `SQL` and go to **SQL**.
+
+ ![](./readme_files/advanced-data-generation/image5.png)
+
+2. Click **Create instance** > Choose **PostgreSQL** > Click **Enable API** when prompted.
+
+3. Choose **Enterprise Plus**.
+4. Select a region (e.g. `southamerica-east1`) and choose a machine type (e.g. `N2 16 vCPU, 128 GB`) with good resources for faster setup.
+5. Name your instance (e.g. `churn-db`) and click **Create Instance**.
+
+ ![](./readme_files/advanced-data-generation/image29.png)
+ ![](./readme_files/advanced-data-generation/image31.png)
+
+### 📁 Set Up a VM Instance
+
+1. Search for **VM Instances** and click **Create Instance**.
+2. Set region = same as your SQL instance. Zone = any.
+3. Machine type: `e2-highcpu-16 (16 vCPU, 16 GB)`.
+4. Name your VM, then click **Create**.
+
+ ![](./readme_files/advanced-data-generation/image15.png)
+
+Wait for the VM to be ready. Once the green check appears under **status**, click **SSH** to open a terminal.
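The console steps above could also be scripted. As a rough sketch, assuming the `gcloud` CLI is installed and authenticated, the script below only prints the roughly equivalent commands (a dry run) and does not create anything. The PostgreSQL version (`POSTGRES_15`), the tier name (`db-perf-optimized-N-16`), the zone suffix `-a`, and the VM name `churn-vm` are assumptions that the guide does not state; check them against your own project before running the printed commands.

```shell
#!/bin/bash
# Dry-run sketch: prints gcloud commands that mirror the console steps.
# Nothing is created; review the output before running it yourself.

PROJECT_ID="fighting-churn"      # example project name from the guide
REGION="southamerica-east1"      # example region from the guide
ZONE="${REGION}-a"               # assumption: any zone in the region works
SQL_INSTANCE="churn-db"          # example instance name from the guide
VM_NAME="churn-vm"               # hypothetical VM name, not from the guide

print_gcp_commands() {
  # Cloud SQL for PostgreSQL, Enterprise Plus edition.
  # POSTGRES_15 and the tier name are assumptions, not taken from the guide.
  echo "gcloud sql instances create ${SQL_INSTANCE} --project=${PROJECT_ID} --database-version=POSTGRES_15 --edition=enterprise-plus --tier=db-perf-optimized-N-16 --region=${REGION}"

  # Compute Engine VM in the same region as the SQL instance.
  echo "gcloud compute instances create ${VM_NAME} --project=${PROJECT_ID} --zone=${ZONE} --machine-type=e2-highcpu-16"
}

print_gcp_commands
```

If the printed commands match your setup, you can paste them into Cloud Shell one at a time. The console flow in this guide remains the reference.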
+
+---
+
+## ⚒️ Set Up Environment on the VM
+
+ ![](./readme_files/advanced-data-generation/image10.png)
+
+
+Create and run a shell script to install Python 3.9, dependencies, and the `fightchurn` package. Start by opening a new file in the editor:
+
+```bash
+nano setup_churn_vm.sh
+```
+
+Paste this script:
+
+```bash
+#!/bin/bash
+
+# Update packages
+sudo apt-get update
+sudo apt-get upgrade -y
+
+# Install dependencies
+sudo apt-get install -y g++ wget build-essential libssl-dev zlib1g-dev \
+libncurses5-dev libncursesw5-dev libreadline-dev libsqlite3-dev \
+libgdbm-dev libdb5.3-dev libbz2-dev libexpat1-dev liblzma-dev tk-dev \
+uuid-dev libffi-dev postgresql-client
+
+# Install Python 3.9
+wget https://www.python.org/ftp/python/3.9.18/Python-3.9.18.tgz
+tar xvf Python-3.9.18.tgz
+cd Python-3.9.18
+./configure --enable-optimizations
+make -j$(nproc)
+sudo make altinstall
+cd ..
+
+# Set up virtual environment
+python3.9 -m venv venv
+source venv/bin/activate
+
+# Install fightchurn
+pip install --upgrade pip
+pip install fightchurn
+
+# Create output directory (no error if it already exists)
+mkdir -p churn_output
+
+echo "✅ Python 3.9 + fightchurn environment ready!"
+```
+
+Save with `Ctrl+O`, Enter, and exit with `Ctrl+X`. Then run:
+
+```bash
+chmod +x setup_churn_vm.sh
+./setup_churn_vm.sh
+```
+
+If successful, your terminal will look like this:
+ ![](./readme_files/advanced-data-generation/image28.png)
+
+Use
+```bash
+pwd
+```
+to find your username on GCP, then activate the environment like this:
+
+```bash
+cd /home/YOUR_USERNAME
+source venv/bin/activate
+```
+You will see “(venv)” before the directory in your prompt. Do not close this window.
+
+---
+
+## 🔗 Connect VM to Cloud SQL
+
+1. Copy your **VM external IP**.
+ ![](./readme_files/advanced-data-generation/image18.png)
+2. In SQL > Instance (churn-db) > Connections, add a new authorized network with your VM's **external IP**.
+
+ ![](./readme_files/advanced-data-generation/image21.png)
+3. 
Copy the database instance's **Public IP**.
+ ![](./readme_files/advanced-data-generation/image30.png)
+
+Then back in the VM terminal:
+
+```bash
+psql "dbname=postgres user=postgres hostaddr=YOUR_DB-INSTANCE_PUBLIC_IP"
+```
+
+Enter your password (e.g. `churn`) and create your database:
+
+```sql
+create database churn;
+\q
+```
+
+---
+
+## ✨ Run the Advanced Simulation
+
+1. Activate your Python environment and start the Python interpreter:
+
+```bash
+source /home/YOUR_USERNAME/venv/bin/activate
+python
+```
+
+2. Run the simulation:
+
+```python
+from fightchurn import run_churn_listing
+
+run_churn_listing.set_churn_environment('churn', 'postgres', 'churn', '/usr/src/churn_output', host='YOUR_DB-INSTANCE_PUBLIC_IP')
+run_churn_listing.run_standard_simulation(schema='crm5', n_parallel=16)
+```
+
+Choose `crm5` when prompted. Wait for the simulation to finish.
+
+ ![](./readme_files/advanced-data-generation/image24.png)
+
+Now your database is populated.
+
+ ![](./readme_files/advanced-data-generation/image6.png)
+
+
+---
+
+## 🛫 Export the Data to Local PostgreSQL (Optional)
+
+### 1. On GCP:
+
+- Go to your SQL instance > **Connections** > add your **local IP address** as an authorized network. **[Find out your IP Address here](https://whatismyipaddress.com)**
+
+### 2. On your PC (pgAdmin4):
+
+- Create a new Server and connect to your GCP instance using the database instance's Public IP, your username (e.g. `postgres`), and your password (e.g. `churn`).
+ ![](./readme_files/advanced-data-generation/image8.png)
+ ![](./readme_files/advanced-data-generation/image2.png)
+
+This creates a connection from your PC to your database on GCP. Now you can back up the data.
+- Right-click `churn` DB on pgAdmin4 > **Backup** → Format: `Custom` → Save the file.
+
+### 3. Create a Local DB:
+
+- Right-click Databases > Create > `churn-local`
+
+### 4. Restore:
+
+- Right-click `churn-local` > Restore > select the `.backup` file (it will sometimes not appear as a `.backup` file, so switch the file filter to "All files (*.*)") → Run.
+- Some warnings (e.g. 
`google_vacuum_mgmt`) and/or a "Failed" status can be ignored.
+
+ ![](./readme_files/advanced-data-generation/image19.png)
+
+---
+
+## ⚠️ Cleanup to Avoid Charges
+
+Once you have finished and no longer need them, **stop and delete your SQL and VM instances** on GCP to avoid further charges.
+
+---
+
+## ❓ Questions?
+
+If you need help, open an issue on GitHub or contact me at andrefeitosa9@gmail.com or on **[my LinkedIn](https://www.linkedin.com/in/andrefeitosa/)**.
+
+Happy simulating! 🚀
+
diff --git a/readme_files/advanced-data-generation/image1.png b/readme_files/advanced-data-generation/image1.png
new file mode 100644
index 0000000..4899e8d
Binary files /dev/null and b/readme_files/advanced-data-generation/image1.png differ
diff --git a/readme_files/advanced-data-generation/image10.png b/readme_files/advanced-data-generation/image10.png
new file mode 100644
index 0000000..49db1e9
Binary files /dev/null and b/readme_files/advanced-data-generation/image10.png differ
diff --git a/readme_files/advanced-data-generation/image11.png b/readme_files/advanced-data-generation/image11.png
new file mode 100644
index 0000000..1c4945c
Binary files /dev/null and b/readme_files/advanced-data-generation/image11.png differ
diff --git a/readme_files/advanced-data-generation/image12.png b/readme_files/advanced-data-generation/image12.png
new file mode 100644
index 0000000..679f66d
Binary files /dev/null and b/readme_files/advanced-data-generation/image12.png differ
diff --git a/readme_files/advanced-data-generation/image13.png b/readme_files/advanced-data-generation/image13.png
new file mode 100644
index 0000000..c694dec
Binary files /dev/null and b/readme_files/advanced-data-generation/image13.png differ
diff --git a/readme_files/advanced-data-generation/image14.png b/readme_files/advanced-data-generation/image14.png
new file mode 100644
index 0000000..55b7e96
Binary files /dev/null and b/readme_files/advanced-data-generation/image14.png differ
diff --git
a/readme_files/advanced-data-generation/image15.png b/readme_files/advanced-data-generation/image15.png
new file mode 100644
index 0000000..3a792a8
Binary files /dev/null and b/readme_files/advanced-data-generation/image15.png differ
diff --git a/readme_files/advanced-data-generation/image16.png b/readme_files/advanced-data-generation/image16.png
new file mode 100644
index 0000000..ba4bc03
Binary files /dev/null and b/readme_files/advanced-data-generation/image16.png differ
diff --git a/readme_files/advanced-data-generation/image17.png b/readme_files/advanced-data-generation/image17.png
new file mode 100644
index 0000000..a48467e
Binary files /dev/null and b/readme_files/advanced-data-generation/image17.png differ
diff --git a/readme_files/advanced-data-generation/image18.png b/readme_files/advanced-data-generation/image18.png
new file mode 100644
index 0000000..b165479
Binary files /dev/null and b/readme_files/advanced-data-generation/image18.png differ
diff --git a/readme_files/advanced-data-generation/image19.png b/readme_files/advanced-data-generation/image19.png
new file mode 100644
index 0000000..183c815
Binary files /dev/null and b/readme_files/advanced-data-generation/image19.png differ
diff --git a/readme_files/advanced-data-generation/image2.png b/readme_files/advanced-data-generation/image2.png
new file mode 100644
index 0000000..72dc1a6
Binary files /dev/null and b/readme_files/advanced-data-generation/image2.png differ
diff --git a/readme_files/advanced-data-generation/image20.png b/readme_files/advanced-data-generation/image20.png
new file mode 100644
index 0000000..9a3ce65
Binary files /dev/null and b/readme_files/advanced-data-generation/image20.png differ
diff --git a/readme_files/advanced-data-generation/image21.png b/readme_files/advanced-data-generation/image21.png
new file mode 100644
index 0000000..29639a3
Binary files /dev/null and b/readme_files/advanced-data-generation/image21.png differ
diff --git
a/readme_files/advanced-data-generation/image22.png b/readme_files/advanced-data-generation/image22.png
new file mode 100644
index 0000000..2f669f9
Binary files /dev/null and b/readme_files/advanced-data-generation/image22.png differ
diff --git a/readme_files/advanced-data-generation/image23.png b/readme_files/advanced-data-generation/image23.png
new file mode 100644
index 0000000..01aba45
Binary files /dev/null and b/readme_files/advanced-data-generation/image23.png differ
diff --git a/readme_files/advanced-data-generation/image24.png b/readme_files/advanced-data-generation/image24.png
new file mode 100644
index 0000000..13e5f4b
Binary files /dev/null and b/readme_files/advanced-data-generation/image24.png differ
diff --git a/readme_files/advanced-data-generation/image25.png b/readme_files/advanced-data-generation/image25.png
new file mode 100644
index 0000000..e14e384
Binary files /dev/null and b/readme_files/advanced-data-generation/image25.png differ
diff --git a/readme_files/advanced-data-generation/image26.png b/readme_files/advanced-data-generation/image26.png
new file mode 100644
index 0000000..6f28fa7
Binary files /dev/null and b/readme_files/advanced-data-generation/image26.png differ
diff --git a/readme_files/advanced-data-generation/image27.png b/readme_files/advanced-data-generation/image27.png
new file mode 100644
index 0000000..999a1b1
Binary files /dev/null and b/readme_files/advanced-data-generation/image27.png differ
diff --git a/readme_files/advanced-data-generation/image28.png b/readme_files/advanced-data-generation/image28.png
new file mode 100644
index 0000000..bcc39e2
Binary files /dev/null and b/readme_files/advanced-data-generation/image28.png differ
diff --git a/readme_files/advanced-data-generation/image29.png b/readme_files/advanced-data-generation/image29.png
new file mode 100644
index 0000000..bde013c
Binary files /dev/null and b/readme_files/advanced-data-generation/image29.png differ
diff --git
a/readme_files/advanced-data-generation/image3.png b/readme_files/advanced-data-generation/image3.png
new file mode 100644
index 0000000..c2dae01
Binary files /dev/null and b/readme_files/advanced-data-generation/image3.png differ
diff --git a/readme_files/advanced-data-generation/image30.png b/readme_files/advanced-data-generation/image30.png
new file mode 100644
index 0000000..f6b2525
Binary files /dev/null and b/readme_files/advanced-data-generation/image30.png differ
diff --git a/readme_files/advanced-data-generation/image31.png b/readme_files/advanced-data-generation/image31.png
new file mode 100644
index 0000000..5669958
Binary files /dev/null and b/readme_files/advanced-data-generation/image31.png differ
diff --git a/readme_files/advanced-data-generation/image32.png b/readme_files/advanced-data-generation/image32.png
new file mode 100644
index 0000000..1a48820
Binary files /dev/null and b/readme_files/advanced-data-generation/image32.png differ
diff --git a/readme_files/advanced-data-generation/image33.png b/readme_files/advanced-data-generation/image33.png
new file mode 100644
index 0000000..0f0f752
Binary files /dev/null and b/readme_files/advanced-data-generation/image33.png differ
diff --git a/readme_files/advanced-data-generation/image4.png b/readme_files/advanced-data-generation/image4.png
new file mode 100644
index 0000000..addb141
Binary files /dev/null and b/readme_files/advanced-data-generation/image4.png differ
diff --git a/readme_files/advanced-data-generation/image5.png b/readme_files/advanced-data-generation/image5.png
new file mode 100644
index 0000000..6100df9
Binary files /dev/null and b/readme_files/advanced-data-generation/image5.png differ
diff --git a/readme_files/advanced-data-generation/image6.png b/readme_files/advanced-data-generation/image6.png
new file mode 100644
index 0000000..966f3a6
Binary files /dev/null and b/readme_files/advanced-data-generation/image6.png differ
diff --git a/readme_files/advanced-data-generation/image7.png
b/readme_files/advanced-data-generation/image7.png
new file mode 100644
index 0000000..b454376
Binary files /dev/null and b/readme_files/advanced-data-generation/image7.png differ
diff --git a/readme_files/advanced-data-generation/image8.png b/readme_files/advanced-data-generation/image8.png
new file mode 100644
index 0000000..458f03d
Binary files /dev/null and b/readme_files/advanced-data-generation/image8.png differ
diff --git a/readme_files/advanced-data-generation/image9.png b/readme_files/advanced-data-generation/image9.png
new file mode 100644
index 0000000..b9d2106
Binary files /dev/null and b/readme_files/advanced-data-generation/image9.png differ