Inference for low gpu and less number of points #13

Open
zeynytu opened this issue Mar 11, 2024 · 5 comments

zeynytu commented Mar 11, 2024

Hello,
You have done great work, I really appreciate it!
I have been trying to run the model to track some specific points in videos, but I could not figure out how to do that exactly. I tried the following call:

model({"video": video[None], "query_points": torch.Tensor([[[1, 15, 51]]]).cuda()})

but the GPU ran out of memory. Am I doing it right, or is there another way to do this?

@16lemoing (Owner)

Hi @zeynytu, our method is meant to track all the pixels in a frame together. If you only want to track a few points, you have two options: either (1) use point tracking directly with `--model pt`, or (2) use our method to track densely and then deduce the tracks for the points you are interested in with `--model dot`. The inference mode in both cases is `"tracks_for_queries"`, as is done here:

pred = model(gt, mode="tracks_for_queries", **vars(args))

Please provide more information on your GPU setup, video length, and spatial resolution if you need further assistance with the OOM errors.
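For reference, a rough sketch of what such a call could look like, pieced together from the two snippets in this thread. The dict keys, tensor shapes, and the (t, y, x) ordering of the query points are assumptions I have not verified against the repo, and the model construction itself (with `--model pt` or `--model dot`) is omitted:

```python
import torch

def track_queries(model, video, query_points, device="cuda"):
    """Run a pre-built tracking model in "tracks_for_queries" mode (assumed API).

    video: (T, C, H, W) float tensor, query_points: (N, 3) tensor of (t, y, x).
    """
    data = {
        "video": video[None].to(device),              # add batch dim -> (1, T, C, H, W)
        "query_points": query_points[None].to(device),  # -> (1, N, 3)
    }
    with torch.no_grad():                             # no gradients needed at inference
        return model(data, mode="tracks_for_queries")

# Hypothetical usage with a couple of query points on a short clip:
# video = torch.rand(24, 3, 480, 856)
# query_points = torch.tensor([[1.0, 15.0, 51.0], [3.0, 120.0, 200.0]])
# pred = track_queries(model, video, query_points)
```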


zeynytu commented Mar 16, 2024

Actually, I have a long video, and I want to track around 50 points at specific coordinates. The length of the video is not a big deal; I can trim the video into separate parts. The GPU is an NVIDIA RTX 3060 Ti with 8 GB of VRAM.
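A toy sketch of the trimming idea mentioned above (a hypothetical helper, not part of the repo): split the clip into short overlapping chunks so each forward pass fits in 8 GB, then track each chunk separately and stitch the tracks at the shared frame.

```python
import torch

def split_video(video: torch.Tensor, chunk_len: int = 20, overlap: int = 1):
    """Yield (start_frame, chunk) pairs for a (T, C, H, W) video tensor.

    Consecutive chunks share `overlap` frames so tracks can be stitched later.
    """
    num_frames = video.shape[0]
    start = 0
    while start < num_frames:
        end = min(start + chunk_len, num_frames)
        yield start, video[start:end]
        if end == num_frames:
            break
        start = end - overlap

video = torch.rand(100, 3, 480, 856)  # dummy 100-frame clip
for start, chunk in split_video(video):
    print(f"chunk starting at frame {start}: {tuple(chunk.shape)}")
```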

@Billy-ZTB

So, in the function `interpolate`, is S = H*W?
[screenshot of the `interpolate` function]

@16lemoing (Owner)

Hi @Billy-ZTB, S is the number of initial tracks (which are then densified) so in general S<<H*W.
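To illustrate the shapes only (this is not the repo's `interpolate`, just a toy nearest-neighbour densification): S sparse tracks are expanded to one track per pixel of an H x W grid, so S stays far below H*W.

```python
import torch

T, S, H, W = 8, 64, 32, 32                      # hypothetical sizes; S << H*W = 1024
src_pos = torch.rand(T, S, 2)                   # sparse track positions in [0, 1]

# build the dense (H*W, 2) grid of query pixels at frame 0
ys, xs = torch.meshgrid(torch.linspace(0, 1, H),
                        torch.linspace(0, 1, W), indexing="ij")
pixels = torch.stack([ys, xs], dim=-1).reshape(-1, 2)

dists = torch.cdist(pixels, src_pos[0])         # (H*W, S) distances at frame 0
nearest = dists.argmin(dim=1)                   # nearest initial track per pixel
dense_tracks = src_pos[:, nearest]              # (T, H*W, 2) densified tracks
print(dense_tracks.shape)                       # torch.Size([8, 1024, 2])
```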

@Billy-ZTB

Thanks!
