having trouble with convergence #21
Comments
I have the same question.
Does lowering your learning rate help?
Actually, the blobs example in general is fairly unreliable - I can get poor results occasionally after repeated runs. Honestly, I didn't do any tuning of hyperparams - it was just a small, fast experiment to validate the implementation when I was writing it. If you find hyperparams that work better, please share them and I can update the example.
For me, the biggest discrepancy with the blobs example was that source-only training resulted in trivial (50%) accuracy. Did you get this as well? As for hyperparameters, adding more dimensions to the feature extractor (i.e. going from 8 to 50) was what allowed it to converge at all for me.
I take it that the 50% was the source accuracy, not the target accuracy? In that case, there is certainly something wrong, but 50% accuracy on the target domain is not unusual if you only train on the source. One thing that might help is annealing the gradient reversal parameter. I do this in the MNIST example, following the schedule presented in the paper, but for the blobs example I keep it fixed at -1 throughout training. That is almost certainly not the optimal thing to do.
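For anyone who wants to try the annealing idea on the blobs example: the schedule in the DANN paper ramps the gradient reversal weight from 0 towards 1 as `lambda_p = 2 / (1 + exp(-gamma * p)) - 1` with `gamma = 10`, where `p` is the training progress in [0, 1]. Below is a minimal sketch in plain Python. The name `grl_lambda` and the step loop are illustrative, not taken from this repository, and the sign you pass to the gradient reversal op depends on how that op is defined in your implementation.

```python
import math


def grl_lambda(progress, gamma=10.0):
    """Gradient reversal weight for a training progress value in [0, 1].

    Follows lambda_p = 2 / (1 + exp(-gamma * p)) - 1, so the weight starts
    at 0 and saturates just below 1 by the end of training.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


# Hypothetical usage: compute the weight once per training step.
num_steps = 5  # assumed total number of steps, kept tiny for illustration
schedule = [grl_lambda(step / (num_steps - 1)) for step in range(num_steps)]
```

The point of starting near 0 is that the domain classifier's noisy early gradients have little effect on the feature extractor, which may help with the runs that fail to converge at all.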
Hi, first off, thank you for the wonderful code. I am trying to replicate the toy blobs example in PyTorch. I am finding that it does not reliably converge to the accuracies that you report. Sometimes it will not converge at all, and other times it will get to 97% source / 97% target accuracy. Also, the source-only training yields 50% accuracy on the target domain. I was wondering if there were any snags you encountered that hindered convergence?
Thanks
Austin