Large scale dataset training #29

Open
Ruazzm opened this issue Jun 14, 2024 · 1 comment
Assignees
Labels
bug Something isn't working

Ruazzm commented Jun 14, 2024

Hi,
I have run into an issue where my input dataset is too large to be loaded. When the dataset is particularly large, the process gets killed. For example:

Loading extension module split_decision...
Using /root/.cache/torch_extensions/py38_cu118 as PyTorch extensions root...
No modifications detected for re-loaded extension module split_decision, skipping build step...
Loading extension module split_decision...
Killed

How can I solve this problem? Does PGBM support batch training?
Thanks

@elephaint
Owner

Thanks for reporting, looking into it. Unfortunately, PGBM currently doesn't support batch training. I'd suggest trying the CPU version based on scikit-learn. Let me know if that one works for you.
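
For anyone hitting the same symptom: a bare `Killed` with no Python traceback typically means the operating system terminated the process, most often because it ran out of memory. Before training, it can help to estimate whether the dataset fits in RAM at all. Below is a minimal sketch of such a check, assuming a dense float32 feature matrix. The helper names, the example dataset shape, and the overhead factor of 3 (to allow for intermediate copies) are illustrative assumptions, not part of PGBM.

```python
import os


def dense_array_bytes(n_rows, n_features, itemsize=4):
    """Bytes needed for a dense n_rows x n_features array (itemsize=4 is float32)."""
    return n_rows * n_features * itemsize


def physical_memory_bytes():
    """Total physical RAM in bytes, or None if the platform does not expose it."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None


def fits_in_memory(n_rows, n_features, itemsize=4, overhead=3.0, total_bytes=None):
    """Heuristic check: raw array size times an assumed overhead factor vs. total RAM.

    Returns True or False, or None when total RAM cannot be determined.
    """
    if total_bytes is None:
        total_bytes = physical_memory_bytes()
    if total_bytes is None:
        return None
    needed = dense_array_bytes(n_rows, n_features, itemsize) * overhead
    return needed <= total_bytes


if __name__ == "__main__":
    # Hypothetical dataset shape, chosen only for illustration.
    n_rows, n_features = 50_000_000, 100
    raw = dense_array_bytes(n_rows, n_features)
    print(f"raw float32 size: {raw / 1e9:.1f} GB")
    print("likely fits in RAM:", fits_in_memory(n_rows, n_features))
```

If the check comes back `False`, training on a row subsample or storing the features in a smaller dtype are the usual first things to try, since batch training is not available.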

@elephaint elephaint self-assigned this Jul 29, 2024
@elephaint elephaint added the bug Something isn't working label Jul 29, 2024