wip: add detr keypoint architecture #1182
base: main
37e04fb to 5d3e5f5
High level:
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1182 +/- ##
==========================================
- Coverage 87.38% 85.52% -1.86%
==========================================
Files 20 22 +2
Lines 2569 3165 +596
==========================================
+ Hits 2245 2707 +462
- Misses 324 458 +134
@bw4sz integration is done, I'll add tests. Lots of TODOs dotted around which relate to tidying up and merging some duplicated code, but I anticipate polygons/segmentation will follow in a similar way. The pre-commit failure is weird. It thinks that pandas is undefined, but the import is unchanged and ruff passes locally.

Currently training a DETR backbone on the lidar pretrain dataset, will swap that in once done.
d6f90e9 to bd9a09a
  def load_image(self, idx):
-     img_name = os.path.join(self.root_dir, self.image_names[idx])
+     img_name = os.path.join(self.root_dir, os.path.basename(self.image_names[idx]))
Will remove this, or make it an option.
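One way the "make it an option" route could look, as a minimal sketch only: `resolve_image_path` and the `use_basename` flag are hypothetical names for illustration and are not part of the existing dataset class. The default keeps the original behaviour of joining the stored name as-is.

```python
import os


def resolve_image_path(root_dir, image_name, use_basename=False):
    """Join root_dir and image_name.

    use_basename is a hypothetical flag: when True, any directory
    components in image_name are dropped before joining, which mirrors
    the os.path.basename change in the diff above.
    """
    if use_basename:
        image_name = os.path.basename(image_name)
    return os.path.join(root_dir, image_name)
```

With the flag off, annotation files that store relative sub-paths keep working; with it on, all images are looked up directly under `root_dir`.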
8eb5caf to 53820cd
53820cd to 4ed7f25
@bw4sz no direct training integration yet, but this is what I'm thinking for the modeling.

The implementation looks verbose, but that's because a lot of the pieces are copied from `transformers` with very minor changes, like a reference to bbox changed to point. To modify the arch we need:

- `DeformableDetrKeypointConfig`: Same as object detection, but with an explicit parameter for the point loss cost.
- `DeformableDetrKeypointDetectionOutput`: Mostly similar, but with name changes.
- `DeformableDetrForKeypointDetection`: The model. Changes the number of decoder outputs to 2 and removes some bbox-specific bits.
- `DeformableDetrKeypointMatcher`: Hungarian matcher that uses an L1 or L2 cost instead of IoU.
- `DeformableDetrKeypointLoss`: L1 or L2.
- `DeformableDetrKeypointImageProcessor`: Normalizes `keypoints` to relative pixel coords, plus some other bits.

I don't anticipate checking in the current test suite, as the overfit test takes a minute or two to run, but it does converge.
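To make the processor and matcher ideas above concrete, here is a rough standard-library sketch, not the code in this PR. All function names are hypothetical. It normalizes keypoints to relative coordinates and then finds the minimum-cost one-to-one assignment between predicted and target points using an L1 or L2 cost. It uses a brute-force search over assignments as a stand-in for a Hungarian solver, so it is only suitable for a handful of points, and it ignores any classification term a real matcher would likely include.

```python
import itertools
import math


def normalize_keypoints(keypoints, width, height):
    """Convert absolute pixel (x, y) keypoints to relative coords in [0, 1]."""
    return [(x / width, y / height) for x, y in keypoints]


def point_cost(pred, target, norm="l1"):
    """L1 or L2 distance between two (x, y) points."""
    dx, dy = pred[0] - target[0], pred[1] - target[1]
    if norm == "l1":
        return abs(dx) + abs(dy)
    if norm == "l2":
        return math.hypot(dx, dy)
    raise ValueError(f"unknown norm: {norm!r}")


def match_points(preds, targets, norm="l1"):
    """Return (total_cost, [(pred_idx, target_idx), ...]) with minimum total cost.

    Brute force over every way of giving each target a distinct prediction.
    This is a stand-in for a Hungarian solver and only scales to tiny inputs.
    """
    if len(targets) > len(preds):
        raise ValueError("need at least as many predictions as targets")
    best_cost, best_pairs = math.inf, []
    for pred_ids in itertools.permutations(range(len(preds)), len(targets)):
        cost = sum(
            point_cost(preds[p], targets[t], norm) for t, p in enumerate(pred_ids)
        )
        if cost < best_cost:
            best_cost = cost
            best_pairs = [(p, t) for t, p in enumerate(pred_ids)]
    return best_cost, best_pairs


# Two ground-truth keypoints in a 400x200 image, three predicted points.
targets = normalize_keypoints([(100, 50), (300, 150)], width=400, height=200)
preds = [(0.8, 0.7), (0.5, 0.5), (0.2, 0.3)]
cost, pairs = match_points(preds, targets, norm="l1")
```

Here prediction 2 is matched to target 0 and prediction 0 to target 1, while prediction 1 is left unmatched, which is the role the "no object" class plays in DETR-style training.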