GSOC 2025 || Discussion on Proposal 3 - Airborne Wildlife Benchmark Dataset #984
Abhishek-Dimri started this conversation in General
Replies: 2 comments
-
I am not a mentor, so I don't know the exact scope of this project, but I think issue #915 is related to it.
-
This all sounds good, but I would drop any of the vision agent or LLM integration; that is a whole other, very exploratory project that shouldn't be part of this one. I think the community has moved more toward COCO annotation, but more importantly, the entire idea behind wrapping datasets into PyTorch datasets is to spare the user from dealing with any of that: the user just sees
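To make the point about hiding annotation formats concrete, here is a minimal sketch. It is not DeepForest or MillionTrees code: the class name `BoxDataset`, the file name, and the sample annotations are all hypothetical. It assumes COCO-style annotations already loaded into a dict, and it implements only `__len__` and `__getitem__`, the two methods a PyTorch map-style dataset is expected to provide, so torch itself is not imported.

```python
from collections import defaultdict


class BoxDataset:
    """Hypothetical map-style dataset over a COCO-style annotation dict."""

    def __init__(self, coco):
        self.images = sorted(coco["images"], key=lambda im: im["id"])
        self.boxes = defaultdict(list)
        for ann in coco["annotations"]:
            x, y, w, h = ann["bbox"]  # COCO stores [x, y, width, height]
            # Expose corner coordinates so callers never see the COCO layout.
            self.boxes[ann["image_id"]].append([x, y, x + w, y + h])

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        return image["file_name"], self.boxes[image["id"]]


# Made-up sample data, for illustration only.
coco = {
    "images": [{"id": 1, "file_name": "flight_001.png"}],
    "annotations": [{"image_id": 1, "bbox": [10, 20, 30, 40]}],
}
dataset = BoxDataset(coco)
print(len(dataset), dataset[0])
```

Whatever format the annotations are stored in, code that consumes the dataset only indexes it and receives a file name plus a list of boxes.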
-
Hi @bw4sz, @henrykironde, and @ethanwhite,
I’m excited to work on Proposal 3: The Airborne Wildlife Benchmark Dataset and would love to discuss my approach before finalizing my proposal. Below is an outline of my understanding, project breakdown, and a few key questions to ensure alignment with project goals.
Understanding the Problem & Approach
Airborne wildlife datasets are fragmented, inconsistently annotated, and lack standardization, making it difficult to train general-purpose animal detectors. Inspired by MillionTrees, this project aims to create a MillionAnimals benchmark—organizing datasets, training a general wildlife detector, and integrating it with DeepForest.
🔹 Proposed Workflow:
Key Discussion Points & Questions
🔸 Dataset Standardization
Pascal VOC is the preferred annotation format for DeepForest. Should we stick to this format for MillionAnimals, or is there a need to explore other formats?
🔸 Baseline Model Development
🔸 DeepForest & VisionAgent Integration
Proposed Deliverables
📌 Minimal Deliverables:
✅ Standardized MillionAnimals benchmark dataset
✅ Baseline wildlife detection model
✅ DeepForest integration for reproducible training
✅ Complete documentation for dataset and model usage
🚀 Stretch Goals (If Time Permits):
Next Steps
📌 1. Feedback on Approach: Does this plan align with the project’s objectives? Any suggestions to refine it?
📌 2. Resources & References: Any specific repositories or datasets I should explore before finalizing my proposal?
📌 3. Getting Started: To validate feasibility, I could start by standardizing a small subset of airborne datasets and training a minimal model. Would this be a useful starting point?
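As a rough sketch of what standardizing a single annotation file could look like, the snippet below converts one Pascal VOC XML record into a COCO-style dict using only the Python standard library. The function name `voc_to_coco`, the sample XML, and the choice of COCO as the target format are assumptions for illustration, not part of DeepForest or any existing MillionAnimals code.

```python
import xml.etree.ElementTree as ET

# Made-up Pascal VOC annotation, for illustration only.
VOC_XML = """<annotation>
  <filename>flight_001.png</filename>
  <size><width>800</width><height>600</height></size>
  <object>
    <name>bird</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>40</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""


def voc_to_coco(xml_text, image_id=1):
    """Convert one Pascal VOC XML string into a COCO-style dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    categories = {}  # class name -> category id, assigned in order of appearance
    annotations = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        category_id = categories.setdefault(name, len(categories) + 1)
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        width, height = xmax - xmin, ymax - ymin
        annotations.append({
            "id": len(annotations) + 1,
            "image_id": image_id,
            "category_id": category_id,
            "bbox": [xmin, ymin, width, height],  # VOC corners -> COCO x, y, w, h
            "area": width * height,
            "iscrowd": 0,
        })
    return {
        "images": [{
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(root.findtext("size/width")),
            "height": int(root.findtext("size/height")),
        }],
        "annotations": annotations,
        "categories": [{"id": i, "name": n} for n, i in categories.items()],
    }


coco = voc_to_coco(VOC_XML)
print(coco["annotations"][0]["bbox"])
```

A real pipeline would also need to merge many files, keep image and category ids consistent across source datasets, and handle point or polygon annotations, none of which this sketch attempts.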
Looking forward to your insights!
Best,
Abhishek Dimri