This is a Bayesian model that predicts employee retention and analyzes potential reasons for turnover based on various factors including workload, salary, and tenure.
- Predicts expected length of employment
- Calculates probability of employee leaving
- Analyzes most likely reason for departure (if applicable)
- Provides confidence intervals for predictions
- Uses Bayesian inference for robust uncertainty quantification
- Python 3.8 or higher
- Dependencies listed in requirements.txt
- Clone this repository
- Create a virtual environment (recommended):
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
The model requires two CSV files:
- Historical data file for training (see data.example.csv)
- Current employees file for prediction (see current_employees.example.csv)
python run.py --learning data.csv --input current_employees.csv --output report.csv
The historical data file must include the following columns:
- employee_id: Unique identifier (email or employee ID)
- months_in_company: Duration of employment, in months
- workload_quota: Workload achievement as a fraction of quota (e.g., 0.95 = 95%)
- salary_ratio: Employee's salary as a fraction of the reference salary (e.g., 1.10 = 110%)
- stayed: Boolean (1 = still employed, 0 = left)
- termination_reason: Code for why the employee left (0 = N/A, 1 = underperformance, 2 = better offer)
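The column requirements above can be checked before training. The following is a minimal sketch using only the standard library; the function name check_historical_csv is hypothetical and not part of this repository, and it assumes that stayed and termination_reason are stored as the plain integer codes listed above.

```python
import csv
import io

# Column names and codes come from the list above.
REQUIRED_COLUMNS = [
    "employee_id", "months_in_company", "workload_quota",
    "salary_ratio", "stayed", "termination_reason",
]
VALID_STAYED = {"0", "1"}
VALID_REASONS = {"0", "1", "2"}


def check_historical_csv(handle):
    """Return a list of human-readable problems found in the file (hypothetical helper)."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if row["stayed"] not in VALID_STAYED:
            problems.append(f"line {line_no}: stayed must be 0 or 1")
        if row["termination_reason"] not in VALID_REASONS:
            problems.append(f"line {line_no}: termination_reason must be 0, 1, or 2")
    return problems


# Invented sample data: the second row uses an undefined reason code (3).
sample = io.StringIO(
    "employee_id,months_in_company,workload_quota,salary_ratio,stayed,termination_reason\n"
    "a@example.com,24,0.95,1.10,1,0\n"
    "b@example.com,7,0.60,0.90,0,3\n"
)
problems = check_historical_csv(sample)
print(problems)
```

To check a real file, pass an open file handle instead of the in-memory sample, for example open("data.csv", newline="").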
The current employees file must include:
- employee_id: Unique identifier (email or employee ID)
- months_in_company: Current duration of employment, in months
- workload_quota: Current workload achievement
- salary_ratio: Current salary ratio
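As an illustration of the prediction input format, a current employees file could be produced along these lines. This is only a sketch: the helper write_current_employees and the two employees are invented for the example, and the real file may be exported from any other tool as long as the columns match.

```python
import csv
import io

# Column names come from the list above.
FIELDS = ["employee_id", "months_in_company", "workload_quota", "salary_ratio"]

# Hypothetical employees, invented purely for illustration.
employees = [
    {"employee_id": "a@example.com", "months_in_company": 24,
     "workload_quota": 0.95, "salary_ratio": 1.10},
    {"employee_id": "b@example.com", "months_in_company": 7,
     "workload_quota": 0.60, "salary_ratio": 0.90},
]


def write_current_employees(rows, handle):
    """Write rows as CSV with the expected header (hypothetical helper)."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_current_employees(employees, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead of an in-memory buffer, open the target with open("current_employees.csv", "w", newline="") and pass that handle.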
The output CSV will contain all input columns plus:
- employee_id: Employee identifier from input
- predicted_longevity: Expected total months of employment
- longevity_ci_50_low/high: 50% confidence interval
- longevity_ci_75_low/high: 75% confidence interval
- probability_of_leaving: Chance of employee leaving (0-1)
- predicted_reason: Most likely reason if leaving
- prob_stays: Probability of staying
- prob_underperformance: Probability of leaving due to underperformance
- prob_better_offer: Probability of leaving for a better offer
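One way to use the report is to rank employees by probability_of_leaving. The sketch below assumes only that the report is a CSV with the employee_id and probability_of_leaving columns described above; the function highest_risk and the sample rows are hypothetical.

```python
import csv
import io


def highest_risk(handle, top_n=3):
    """Return (employee_id, probability_of_leaving) pairs, riskiest first (hypothetical helper)."""
    rows = list(csv.DictReader(handle))
    rows.sort(key=lambda r: float(r["probability_of_leaving"]), reverse=True)
    return [(r["employee_id"], float(r["probability_of_leaving"])) for r in rows[:top_n]]


# Invented rows showing only the two columns this sketch reads;
# a real report contains the other columns as well.
sample_report = io.StringIO(
    "employee_id,probability_of_leaving\n"
    "a@example.com,0.12\n"
    "b@example.com,0.40\n"
    "c@example.com,0.81\n"
)
top = highest_risk(sample_report, top_n=2)
print(top)
```

For a real run, replace the in-memory sample with open("report.csv", newline="").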
The repository includes example files:
- data.example.csv: Example historical data
- current_employees.example.csv: Example current employees data
You can test the model using these files:
python run.py --learning data.example.csv --input current_employees.example.csv --output report.csv
This project is licensed under the BSD 2-Clause License - see the LICENSE file for details.