CUDA out of memory. #57

Open
FiorenzoParascandolo1 opened this issue Jan 15, 2025 · 0 comments

Comments

@FiorenzoParascandolo1

Hi,
I'm using KANLinear in my own project and I'm running into a CUDA out of memory error.
Specifically:

  • model A uses a single MLP layer (one and only one MLP layer in the whole network) to map a (160, 8, 197, 197) tensor to a (160, 8, 197, 197) tensor.
  • model B uses a single KANLinear layer (one and only one KANLinear layer in the whole network) to map a (160, 8, 197, 197) tensor to a (160, 8, 197, 197) tensor.

The "whole network" is a transformer based on MLP both for model A and model B.
The model A uses the 60% of the VRAM of a GPU with 24GB of VRAM, while the second model shows CUDA out of memory problem. Since the difference in the number of parameters for the two models is negligible:
the difference is equal to the difference between a single nn.Linear(197, 197) and a single KanLinear(197, 197), how is it possible to have a CUDA out of memory problem?
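For reference, here is a rough sketch of the two setups (the KANLinear import path and call are assumptions about how the layer is used, not a verified reproduction of either model; both layers act on the last dimension of the input):

```python
import torch
import torch.nn as nn
# from efficient_kan import KANLinear  # assumed import of the KANLinear layer from this repo

x = torch.randn(160, 8, 197, 197, device="cuda")

# Model A: a single nn.Linear(197, 197) applied to the last dimension
linear = nn.Linear(197, 197).to("cuda")
out_a = linear(x)    # fits comfortably on a 24 GB GPU (~60% VRAM for the full model)

# Model B: the same mapping done by a single KANLinear(197, 197)
# kan = KANLinear(197, 197).to("cuda")
# out_b = kan(x)     # fails with "CUDA out of memory" on the same GPU
```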
