Open
Description
I trained the official DropoutNet model using the provided sample Taobao dataset and the sample configuration file. As training progressed, the AUC decreased and the losses increased. I expected the opposite: the AUC should increase and the losses should decrease as training continues.
Steps to reproduce
OS: Ubuntu 20.04
GPU: 1 NVIDIA RTX 3090
Python: 3.10.16
TensorFlow: 2.14.0 with CUDA
- git clone the EasyRec repo (commit SHA: 4b0b1f5)
- install EasyRec
- download the sample Taobao dataset:
wget http://easyrec.oss-cn-beijing.aliyuncs.com/data/git_oss_sample_data/data_test_tb_data_b1579db090d72b3b70b59ba3c7692701 -O tb_data.tar.gz
tar -zxf tb_data.tar.gz
- run training with the sample DropoutNet config and the sample dataset:
python -m easy_rec.python.train_eval --pipeline_config_path samples/model_config/dropoutnet_on_taobao.config
Actual training result
TensorBoard:
tensorboard --logdir experiments/dropoutnet_taobao_ckpt/eval_val
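To share the trend as text rather than screenshots, the eval scalars can be read straight from the event files. This is only a minimal sketch: it assumes the `tensorboard` package is installed and that the eval metrics were written as plain scalar summaries (metrics stored in another summary format would not be listed). The helper names `load_scalars` and `summarize` are hypothetical and not part of EasyRec.

```python
from pathlib import Path

# Same directory as in the tensorboard command above.
LOGDIR = "experiments/dropoutnet_taobao_ckpt/eval_val"


def load_scalars(logdir):
    """Return {tag: [(step, value), ...]} for every scalar series under logdir."""
    # Imported lazily so summarize() can be used without TensorBoard installed.
    from tensorboard.backend.event_processing.event_accumulator import (
        EventAccumulator,
    )

    acc = EventAccumulator(str(logdir))
    acc.Reload()  # parse the event files on disk
    return {
        tag: [(event.step, event.value) for event in acc.Scalars(tag)]
        for tag in acc.Tags()["scalars"]
    }


def summarize(points):
    """Compare the first and last logged value of one scalar series."""
    ordered = sorted(points)  # order by step
    (first_step, first), (last_step, last) = ordered[0], ordered[-1]
    return {
        "first_step": first_step,
        "first": first,
        "last_step": last_step,
        "last": last,
        "delta": last - first,  # negative means the value went down
    }


if __name__ == "__main__" and Path(LOGDIR).is_dir():
    for tag, points in load_scalars(LOGDIR).items():
        if not points:
            continue
        s = summarize(points)
        print(
            f"{tag}: step {s['first_step']} -> {s['last_step']}, "
            f"{s['first']:.4f} -> {s['last']:.4f} (delta {s['delta']:+.4f})"
        )
```

A negative delta on an AUC tag together with a positive delta on the loss tags would match the behavior reported here.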
Expected behavior
- AUC should increase with more training steps.
- Losses should decrease with more training steps.
Could you please confirm whether this is expected behavior, or whether there might be an issue with the sample configuration or dataset? I am happy to provide more debugging information if needed.
Thank you!