-
Hi @cxrodgers,

Glad you found part of your answer! For the grid search:

**Data Augmentation**

Data augmentation will directly help account for variations. I would keep the rotation angle constant (±180° for a bird's-eye view, otherwise ±15°), estimate and fix the scale min/max (how zoomed in/out you expect keypoints to be relative to your ground-truth labels), and leave random flip at None (we haven't seen much improvement from this parameter). The other augmentation parameters (uniform noise, Gaussian noise, contrast, and brightness) you can include in your search to see what works best for your data.

**Optimization**

The title of this section makes it sound like we are optimizing for how fast the network trains, but these parameters don't just affect training speed. I would keep the number of epochs, Stop Training on Plateau, and the number of plateau patience epochs constant (we never reach the epoch limit if we stop on plateau). In the grid search I would target the initial learning rate and batch size (although a larger batch size may be memory constrained). Online mining could help, but it's a toss-up whether it's critical to training.

**Model**

For the UNet backbone, focus the search on filters and filters rate (possibly also max stride), but keep the rest constant. The filters and filters rate determine how much learning capacity your network has; if you have too much capacity and not enough complexity in your data, this can lead to overfitting. I would recommend sticking to the UNet architecture. As for the values to use, I would vary slightly from the preset values. (The presets are the values we found to be most useful in our own grid search.)

Thanks,
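If it helps, here is a minimal sketch of how such a sweep could be scripted. This is not an official SLEAP utility: `baseline_profile.json` and `labels.pkg.slp` are placeholder names, and the dotted config paths assume the usual training profile layout (`optimization.*`, `model.backbone.unet.*`), which is worth verifying against a profile exported from your own GUI session:

```python
import copy
import itertools
import json
import subprocess
from pathlib import Path

# Placeholder: a baseline training profile exported from the SLEAP GUI.
BASELINE = Path("baseline_profile.json")

# Vary only the parameters suggested above; keep everything else constant.
grid = {
    "optimization.initial_learning_rate": [1e-4, 5e-4],
    "optimization.batch_size": [4, 8],
    "model.backbone.unet.filters": [16, 32],
    "model.backbone.unet.filters_rate": [1.5, 2.0],
}

def set_key(cfg: dict, dotted: str, value) -> None:
    """Set a nested config value addressed by a dotted key path."""
    *parents, leaf = dotted.split(".")
    for key in parents:
        cfg = cfg[key]
    cfg[leaf] = value

base_cfg = json.loads(BASELINE.read_text())
names, choices = zip(*grid.items())

for i, combo in enumerate(itertools.product(*choices)):
    cfg = copy.deepcopy(base_cfg)
    for name, value in zip(names, combo):
        set_key(cfg, name, value)
    run_dir = Path(f"sweep/run_{i:03d}")
    run_dir.mkdir(parents=True, exist_ok=True)
    profile = run_dir / "profile.json"
    profile.write_text(json.dumps(cfg, indent=2))
    # Train each variant on the same self-contained labels package.
    subprocess.run(["sleap-train", str(profile), "labels.pkg.slp"], check=True)
```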
-
You could also export a SLEAP package to have your videos embedded in the file, instead of referenced elsewhere.
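If you'd rather do that export outside the GUI, here's a minimal sketch using sleap-io, assuming your installed version's `save_file` supports the `embed` option (worth checking in its docs first):

```python
import sleap_io as sio

# Load labels that currently reference external video files.
labels = sio.load_file("labels.v001.slp")

# Write a self-contained package with the labeled frames' images embedded.
# Assumption: embed="user" embeds only user-labeled frames in your sleap-io version.
sio.save_file(labels, "labels.v001.pkg.slp", embed="user")
```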
-
Chiming in with another tip: you can also use sleap-io for generating splits with embedded images:

```python
import sleap_io as sio

# Load source labels.
labels = sio.load_file("labels.v001.slp")

# Make splits and export with embedded images.
labels.make_training_splits(n_train=0.8, n_val=0.1, n_test=0.1, save_dir="split1", seed=42)

# Splits will be saved as self-contained SLP package files with images and labels.
labels_train = sio.load_file("split1/train.pkg.slp")
labels_val = sio.load_file("split1/val.pkg.slp")
labels_test = sio.load_file("split1/test.pkg.slp")
```

The docs also have examples of how to build the labels via the API as well.
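As a quick follow-up (my own sanity check, not from the docs): each split package should load standalone, without the original videos present.

```python
# Quick sanity check: each split package loads on its own.
for name in ("train", "val", "test"):
    split = sio.load_file(f"split1/{name}.pkg.slp")
    print(name, len(split), "labeled frames")
```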
-
Hello, I have a large dataset of thousands of similar videos, and I'd like to determine the best way to analyze them all, ideally without labeling frames from every single one.

I'm facing an immediate technical issue: the SLEAP GUI is slow to respond when I load that many videos. I think it's because they're all located on a remote server, and fetching all that metadata and decoding frames is intrinsically slow. I also want to use my own custom code to select frames to label, rather than the built-in options.

So I'm wondering if it's possible to create an SLP file for labeling and/or training that doesn't link to the videos, but instead contains the actual labeled frames themselves. Would this be a "labels file", a "training package", or something else? I understand some custom coding/hacking will be necessary; it would just be really helpful if you could point me in the right direction to get started. I'd like to create this file programmatically, and then use the GUI solely for the user labeling (the GUI works great for this and I don't want to reinvent it!).
This would also address a long-term concern, which is I'd like to train a ton of different SLEAP models with different hyperparameters on this training package, in order to select the best hyperparameters. Like a huge grid search. The goal is to find hyperparameters that best generalize to new videos (so that I can eventually stop labeling new videos, and just use my trained model, even when the video conditions change slightly). Is there any guidance on the most useful parameters to vary in such a grid search?
thanks for any tips!
EDIT
I think I figured it out! I followed these instructions to create my own labels file: #1534
And then I put my frames into individual image files and referenced them with `sleap.Video.from_image_filenames`. I was worried it wouldn't work since the aspect ratios differ, but this seems to be fine.
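In case it's useful to others, here's roughly what that looks like end to end. This is my own sketch, not the exact code from #1534; the skeleton nodes, file paths, and coordinates are placeholders:

```python
import sleap
from sleap.instance import Point

# Placeholder skeleton with two nodes; swap in your real skeleton.
skeleton = sleap.Skeleton()
skeleton.add_node("head")
skeleton.add_node("tail")

# One video object backed by individual image files.
video = sleap.Video.from_image_filenames(["frames/f000.png", "frames/f001.png"])

# Frame 0 is left empty to label in the GUI; frame 1 gets a pre-made instance.
lf0 = sleap.LabeledFrame(video=video, frame_idx=0, instances=[])
inst = sleap.Instance(
    skeleton=skeleton,
    points={"head": Point(x=10, y=20), "tail": Point(x=30, y=40)},
)
lf1 = sleap.LabeledFrame(video=video, frame_idx=1, instances=[inst])

labels = sleap.Labels(labeled_frames=[lf0, lf1])
sleap.Labels.save_file(labels, "my_labels.slp")
```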
I would still appreciate any guidance about what parameters are most useful to vary in a grid search.