Fine-tuning DeepForest for Sapling Detection with Label Studio #1108

tripathi-shiv · 2025-08-25T14:07:19Z

Hello,

I’m working on a project where I’d like to adapt the DeepForest model to specifically identify saplings (young trees), rather than general canopy crowns. Since the pretrained model is optimized for canopy detection, I want to fine-tune it with a custom dataset of saplings.

Here’s my approach so far:

I’m using Label Studio for annotation.

I set up the Label Studio ML backend in an attempt to use DeepForest for pre-annotations before correcting and expanding the dataset.

However, I wasn’t able to get any pre-annotations showing up — it seems like the backend code isn’t working as expected.

My questions are:

Is there an established workflow for integrating DeepForest with Label Studio ML backend for pre-annotation?

Should I treat saplings as a separate class in the annotations (e.g., "sapling") or train DeepForest only with sapling labels for best results?

Since saplings are much smaller than full crowns, are there adjustments you recommend (e.g., tile size, anchor scales, NMS settings) to improve detection of smaller objects?

Are there any examples or pointers for successfully setting up DeepForest + Label Studio integration?

My end goal is to build a fine-tuned sapling detection model while still leveraging DeepForest’s pretrained backbone. Any guidance on debugging the ML backend or alternative recommended annotation pipelines would be super helpful 🙏

Thanks for building such a powerful tool!

jveitchmichaelis · 2025-09-03T17:26:01Z

Maintainer

Sorry for the delay in replying. Here are some thoughts:

> Is there an established workflow for integrating DeepForest with Label Studio ML backend for pre-annotation?

Could you share your backend code?

It'd be helpful to understand whether you're not getting any detections from DeepForest or whether it's the connection to Label Studio that's not working. I don't think we have an established workflow documented, but this would be useful to have. It's something we're actively working on for some other features in DeepForest, so a straightforward integration example would be nice. In principle, it should be a case of:

1. Predict via whatever means with DeepForest.
2. Convert/transform annotations to use with Label Studio.
3. Use the Label Studio API/SDK to authenticate and upload samples for annotation.

The DeepForest default models are well tested, so unless your images are particularly challenging for the model you should have some boxes to start with.

> Should I treat saplings as a separate class in the annotations (e.g., "sapling") or train DeepForest only with sapling labels for best results?

Either way would probably work, but I would label them as a separate class for now. You can always change the class name from sapling -> tree later. For example, if you export your labels, merging the classes is essentially a find + replace. Going from tree -> sapling later would require a lot more work, because every box would need to be reviewed again.
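To illustrate the "find + replace" point, here is a small sketch that merges classes in an exported annotation CSV. It assumes the export has a `label` column, as in the DeepForest annotation format (`image_path, xmin, ymin, xmax, ymax, label`); the function name `rename_labels` is hypothetical.

```python
import csv
import io


def rename_labels(csv_text, mapping):
    """Return the CSV text with values in the 'label' column renamed via mapping.

    Labels not present in mapping are left unchanged.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["label"] = mapping.get(row["label"], row["label"])
        writer.writerow(row)
    return out.getvalue()


annotations = (
    "image_path,xmin,ymin,xmax,ymax,label\n"
    "plot1.png,0,0,10,10,sapling\n"
    "plot1.png,5,5,50,50,tree\n"
)
# Merge the two classes into one: every "sapling" becomes "tree".
merged = rename_labels(annotations, {"sapling": "tree"})
```

The reverse direction has no such shortcut: once everything is labeled "tree", the file no longer records which boxes were saplings.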

The important thing is that you annotate your images as completely as you can. If you only want to locate saplings then you don't need to label trees in your new training data. That might save a lot of time. If you want both trees and saplings then you need separate classes and you also need to label all the trees in your data. Whether it makes sense to do each one probably depends on your dataset, and without seeing any images I can't give you a good answer yet.

> Since saplings are much smaller than full crowns, are there adjustments you recommend (e.g., tile size, anchor scales, NMS settings) to improve detection of smaller objects?

The spatial scale of the images is probably the most important factor. A good rule of thumb (in my experience) is to have labels at least 20 px wide. I would label and train at 5 cm/px if you have that resolution available.
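As a quick worked example of that rule of thumb: the width of an object in pixels is its real-world width divided by the ground sampling distance. The sketch below uses a hypothetical 0.5 m sapling crown purely for illustration; the function names are made up for this example.

```python
def width_in_pixels(object_width_m, gsd_cm_per_px):
    """Width of an object in pixels at a given ground sampling distance."""
    return object_width_m * 100.0 / gsd_cm_per_px


def coarsest_gsd_cm(object_width_m, min_width_px=20):
    """Coarsest resolution (cm/px) at which the object is still min_width_px wide."""
    return object_width_m * 100.0 / min_width_px


# A 1 m wide crown at 5 cm/px is exactly at the 20 px rule of thumb.
crown_px = width_in_pixels(1.0, 5.0)

# A hypothetical 0.5 m wide sapling at 5 cm/px is only 10 px wide ...
sapling_px = width_in_pixels(0.5, 5.0)

# ... and would need 2.5 cm/px imagery to reach 20 px.
sapling_gsd = coarsest_gsd_cm(0.5)
```

So if the saplings are narrower than about 1 m, 5 cm/px imagery puts them under the 20 px guideline, and finer imagery (if available) is likely to help more than model settings.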

NMS settings shouldn't make a difference, unless you have saplings that fall inside the corner of a tree box. NMS is used to deal with overlapping predictions; it doesn't affect which boxes the model predicts on its own.
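To make the overlap case concrete: NMS drops the lower-scoring of two boxes when their intersection over union (IoU) exceeds a threshold. The sketch below computes IoU for a sapling box sitting in the corner of a tree box. The box sizes and the threshold are illustrative assumptions, not DeepForest's configured values, and whether suppression is applied across classes depends on the implementation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


tree = (0, 0, 200, 200)
small_sapling = (0, 0, 20, 20)    # fully inside the tree box's corner
large_sapling = (0, 0, 100, 100)  # covers a quarter of the tree box

illustrative_threshold = 0.05  # ASSUMPTION: check the value in your own config

# IoU 0.01: below the threshold, so the sapling box would survive NMS.
small_overlap = iou(tree, small_sapling)

# IoU 0.25: above the threshold, so the lower-scoring box would be dropped.
large_overlap = iou(tree, large_sapling)
```

A box fully contained in a much larger one has a small IoU, so suppression only becomes a concern when the sapling box is a sizeable fraction of the tree box or the threshold is very low.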

You could try adding some smaller anchor scales, or dropping the scales that detect larger objects if you never expect to see them. I would suggest not touching that unless you really have to. The more you twiddle anchors the more retraining you're likely to need. Let's figure out the connection issues first.

That said, we do have models that detect birds, which are very small compared to trees. I don't know whether those models use particularly different anchors, but @bw4sz could comment more on that.

> Are there any examples or pointers for successfully setting up DeepForest + Label Studio integration?

@henrykironde or @bw4sz can probably give you a better answer here. Again, please share any code you currently have or outline what you've tried. As I said above, try to use as much library code from Label Studio as you can. You just need to export annotations from DeepForest and then upload them. The Label Studio API reference is here: https://api.labelstud.io/api-reference/introduction/getting-started
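For the upload step, a rough sketch using only the standard library is shown below. It builds a request against Label Studio's task import endpoint (`POST /api/projects/{id}/import`) without sending it. The host, project ID and token are placeholders, the function name is hypothetical, and the `Token` authorization scheme is an assumption that depends on your Label Studio version and token type; the official SDK is likely the better choice once things work.

```python
import json
import urllib.request


def build_import_request(base_url, project_id, token, tasks):
    """Build (but do not send) a request that imports tasks into a project.

    tasks: list of task dicts, each with "data" and optional "predictions".
    """
    url = f"{base_url.rstrip('/')}/api/projects/{project_id}/import"
    body = json.dumps(tasks).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # ASSUMPTION: legacy-style access token; newer setups may differ.
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
request = build_import_request(
    "http://localhost:8080/",
    1,
    "YOUR_TOKEN",
    [{"data": {"image": "http://example.com/plot1.png"}, "predictions": []}],
)
```

Sending it is then `urllib.request.urlopen(request)`. An authentication error at this point indicates a connection or token problem, while a successful import with no boxes visible points back at the prediction format or the labeling config.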
