Debugging class imbalance #3142
I've been investigating ways to improve a model that predicts a binary feature. Currently my loss is ~0.07, and based on what I've read here I think I want it below 0.02. Here is my dataset as well as my config, although I've tweaked the config quite a bit throughout the course of this.

Config: https://drive.google.com/file/d/1irPEMR7HFILD3G2GhblCF_Rw5PGKEIsR/view?usp=sharing

Attempt #1: Decrease the threshold

I've modified my yml file as follows:

```yaml
output_features:
  - name: submitted_proposal
    type: binary
    decoder:
      threshold: 0.01
```

After running train, I'm not sure if this is actually doing anything based on the output of the train command.
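One way to check is a minimal sketch like the following, which re-thresholds the saved probabilities by hand; the model path, dataset path, and prediction column names below are assumptions and vary by Ludwig version, so inspect `preds.columns` first:

```python
from ludwig.api import LudwigModel

# Placeholder paths; the prediction column names vary by Ludwig version,
# so print preds.columns and adjust the names below before relying on this.
model = LudwigModel.load("results/experiment_run/model")
preds, _ = model.predict(dataset="dataset.csv")
print(preds.columns)

# Assumed column holding P(submitted_proposal is True):
probs = preds["submitted_proposal_probabilities_True"]
rethresholded = probs > 0.01
print((rethresholded == preds["submitted_proposal_predictions"]).all())
# True only if the configured 0.01 threshold is actually in effect.
```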
I've also tried changing the top-level key. It's possible the threshold shown in the output from the train command is different from the one it's actually using.

Attempt #2: Increase positive_class_weight

The default positive_class_weight is 1. I've tried changing it to 2 and 10:
```yaml
output_features:
  - name: submitted_proposal
    type: binary
    loss:
      type: binary_weighted_cross_entropy
      positive_class_weight: 10
```

This increases loss to 0.11-0.33.
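That increase in the raw loss number is expected: assuming binary_weighted_cross_entropy follows the standard weighted form, the positive-class term is multiplied by positive_class_weight, so loss values trained with different weights are not directly comparable. A standalone NumPy sketch of the arithmetic (not Ludwig's internal code):

```python
import numpy as np

def weighted_bce(y_true, p_pred, positive_class_weight=1.0):
    # mean of -(w * y * log(p) + (1 - y) * log(1 - p))
    p = np.clip(p_pred, 1e-12, 1 - 1e-12)
    terms = positive_class_weight * y_true * np.log(p) + (1 - y_true) * np.log(1 - p)
    return float(-np.mean(terms))

y = np.array([1.0, 1.0, 0.0, 0.0])
p = np.array([0.9, 0.8, 0.2, 0.1])
print(weighted_bce(y, p))                             # ~0.16 baseline
print(weighted_bce(y, p, positive_class_weight=10.0)) # ~0.90, same predictions
```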
Attempt #3: Increase confidence_penalty

```yaml
output_features:
  - name: submitted_proposal
    type: binary
    loss:
      type: binary_weighted_cross_entropy
      positive_class_weight: 1
      confidence_penalty: 0.01
```

I wasn't able to get this to work - I get the following error for any value I put in.
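For context on what the knob is meant to do: a confidence penalty (Pereyra et al., 2017) subtracts a scaled entropy term from the loss, which discourages over-confident, low-entropy predictions. The sketch below is a standalone illustration of that idea, not Ludwig's implementation, and the function names are made up:

```python
import numpy as np

def binary_entropy(p, eps=1e-12):
    # Shannon entropy of a Bernoulli(p) prediction, elementwise.
    p = np.clip(p, eps, 1 - eps)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def loss_with_confidence_penalty(bce_value, p_pred, beta=0.01):
    # Objective = BCE - beta * mean entropy: confident (low-entropy)
    # predictions get less of a discount, so over-confidence is discouraged.
    return bce_value - beta * float(np.mean(binary_entropy(p_pred)))

p = np.array([0.99, 0.98, 0.01])  # very confident predictions
print(loss_with_confidence_penalty(0.05, p))
```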
Any other ideas to improve this model beyond feature engineering (which is definitely something we're looking into)?
Hi @Overload119! Thanks for explaining the steps you took so clearly - it was useful and made it easy to follow along. I had a chance to use the same dataset and config and run a variety of tests, and I wanted to share those results with you.

The snapshot of the dataset you added to Google Drive is balanced, with an almost exact 50-50 split between 1s and 0s. In case you balanced these datasets out manually by dropping rows from the majority class so that the majority and minority classes were equal, it may be an unfair representation of the true dataset (and of what you will actually see in a production scenario when you're running inference against your trained model). Instead, I would suggest…
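A quick way to verify that split yourself, as a minimal sketch (the CSV path stands in for the snapshot shared on Google Drive):

```python
import pandas as pd

# "dataset.csv" is a placeholder for the shared snapshot.
df = pd.read_csv("dataset.csv")
print(df["submitted_proposal"].value_counts(normalize=True))
# A balanced snapshot prints roughly 0.5 for each class.
```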
These are really good performance metrics for a classification task. To support this claim, here are some plots I created; all of them show that the model is extremely good at discriminating between the two classes (and you can generate all three plots yourself using Ludwig 0.7).
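A comparable discrimination plot can also be rebuilt directly from saved predictions with scikit-learn; a sketch, where the file path and column names are assumptions:

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import roc_auc_score, roc_curve

# Assumed file with ground-truth labels joined to the saved probabilities.
df = pd.read_csv("predictions_with_labels.csv")
y_true = df["submitted_proposal"]
y_score = df["submitted_proposal_probability"]

fpr, tpr, _ = roc_curve(y_true, y_score)
plt.plot(fpr, tpr, label=f"AUC = {roc_auc_score(y_true, y_score):.3f}")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```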
The main things to note here are swapping out the default combiner (…).
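The sentence above is truncated, so the actual combiner used is not recoverable; purely as an illustration, swapping Ludwig's default concat combiner for the tabnet combiner would look like this (the choice of tabnet and the input feature are assumptions):

```python
from ludwig.api import LudwigModel

# Illustration only: tabnet is an assumed example, and the input
# feature below is a placeholder for the real ones in the config.
config = {
    "input_features": [{"name": "some_feature", "type": "number"}],
    "combiner": {"type": "tabnet"},
    "output_features": [{"name": "submitted_proposal", "type": "binary"}],
}
model = LudwigModel(config)
# model.train(dataset="dataset.csv")  # placeholder path
```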
This is marginally better than just using the defaults that come out of Ludwig, but I think any further improvements would be going down the road of diminishing returns: the model has already overfit, and the performance at the point of overfitting is already very good. Let me know if this helps and if there are other questions you might have.
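Since overfitting is the limiting factor here, one related knob is Ludwig's trainer-level early stopping; a minimal config sketch (the feature names are placeholders, and 5 is Ludwig's documented default for early_stop):

```python
# Sketch: stop training once the validation metric stops improving,
# rather than running far past the point of overfitting.
config = {
    "input_features": [{"name": "some_feature", "type": "number"}],
    "output_features": [{"name": "submitted_proposal", "type": "binary"}],
    # early_stop counts evaluation rounds without validation improvement
    # (5 is the default; -1 disables early stopping).
    "trainer": {"early_stop": 5},
}
```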