Inference for low gpu and less number of points #13

Open
zeynytu opened this issue Mar 11, 2024 · 5 comments

zeynytu commented Mar 11, 2024

Hello,
You have done great work, I really appreciate it!
I have been trying to run the model to track some specific points in videos, but I could not figure out how to do that exactly. I tried the following call:

model({"video": video[None], "query_points": torch.Tensor([[[1, 15, 51]]]).cuda()})

but the GPU ran out of memory. Am I doing it right, or is there another way to do this?

@16lemoing (Owner)

Hi @zeynytu, our method is meant to track all the pixels in a frame together. If you only want to track a few points, you have two options: either (1) use point tracking directly with `--model pt`, or (2) use our method to track densely and then deduce the tracks for the points you are interested in with `--model dot`. The inference mode in both cases is `"tracks_for_queries"`, as is done here:

pred = model(gt, mode="tracks_for_queries", **vars(args))

Please provide more information on your GPU setup, video length, and spatial resolution if you need further assistance with the OOM errors.
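For reference, a rough sketch of what such a call could look like, pieced together from the two snippets in this thread. The dict keys, tensor shapes, and the (t, y, x) ordering of the query points are assumptions I have not verified against the repo, and the model construction itself (with `--model pt` or `--model dot`) is omitted:

```python
import torch

def track_queries(model, video, query_points, device="cuda"):
    """Run a pre-built tracking model in "tracks_for_queries" mode (assumed API).

    video: (T, C, H, W) float tensor, query_points: (N, 3) tensor of (t, y, x).
    """
    data = {
        "video": video[None].to(device),              # add batch dim -> (1, T, C, H, W)
        "query_points": query_points[None].to(device),  # -> (1, N, 3)
    }
    with torch.no_grad():                             # no gradients needed at inference
        return model(data, mode="tracks_for_queries")

# Hypothetical usage with a couple of query points on a short clip:
# video = torch.rand(24, 3, 480, 856)
# query_points = torch.tensor([[1.0, 15.0, 51.0], [3.0, 120.0, 200.0]])
# pred = track_queries(model, video, query_points)
```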


zeynytu commented Mar 16, 2024

Actually, I have a long video, and I want to track around 50 points at specific coordinates. The length of the video is not a big deal; I can trim the video into separate parts. The GPU is an NVIDIA RTX 3060 Ti with 8 GB of VRAM.
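A toy sketch of the trimming idea mentioned above (a hypothetical helper, not part of the repo): split the clip into short overlapping chunks so each forward pass fits in 8 GB, then track each chunk separately and stitch the tracks at the shared frame.

```python
import torch

def split_video(video: torch.Tensor, chunk_len: int = 20, overlap: int = 1):
    """Yield (start_frame, chunk) pairs for a (T, C, H, W) video tensor.

    Consecutive chunks share `overlap` frames so tracks can be stitched later.
    """
    num_frames = video.shape[0]
    start = 0
    while start < num_frames:
        end = min(start + chunk_len, num_frames)
        yield start, video[start:end]
        if end == num_frames:
            break
        start = end - overlap

video = torch.rand(100, 3, 480, 856)  # dummy 100-frame clip
for start, chunk in split_video(video):
    print(f"chunk starting at frame {start}: {tuple(chunk.shape)}")
```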

@Billy-ZTB

So, in the function `interpolate`, is S = H*W?
[screenshot of the `interpolate` function]

@16lemoing (Owner)

Hi @Billy-ZTB, S is the number of initial tracks (which are then densified) so in general S<<H*W.
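To illustrate the shapes only (this is not the repo's `interpolate`, just a toy nearest-neighbour densification): S sparse tracks are expanded to one track per pixel of an H x W grid, so S stays far below H*W.

```python
import torch

T, S, H, W = 8, 64, 32, 32                      # hypothetical sizes; S << H*W = 1024
src_pos = torch.rand(T, S, 2)                   # sparse track positions in [0, 1]

# build the dense (H*W, 2) grid of query pixels at frame 0
ys, xs = torch.meshgrid(torch.linspace(0, 1, H),
                        torch.linspace(0, 1, W), indexing="ij")
pixels = torch.stack([ys, xs], dim=-1).reshape(-1, 2)

dists = torch.cdist(pixels, src_pos[0])         # (H*W, S) distances at frame 0
nearest = dists.argmin(dim=1)                   # nearest initial track per pixel
dense_tracks = src_pos[:, nearest]              # (T, H*W, 2) densified tracks
print(dense_tracks.shape)                       # torch.Size([8, 1024, 2])
```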

@Billy-ZTB

Thanks!
