Description
It has become increasingly important to allow pose estimation models to train on heterogeneous datasets, which are composed of multiple sub-datasets that may be similar in appearance (e.g., head-fixed mice, or top-down views of mice) but have different subsets of labeled keypoints.
To accommodate training on such heterogeneous datasets, Lightning Pose must be able to distinguish between labels that are missing due to occlusion and labels that are missing because the keypoint is not part of the original keypoint set for that dataset. Therefore, we propose to introduce a new (optional) coding scheme in the labeled data CSV file to account for these different scenarios. We will follow the COCO format, adding a third column v for each keypoint (alongside x and y) that encodes its visibility:
visibility flag v defined as v=0: not labeled; v=1: labeled but not visible; and v=2: labeled and visible. A keypoint is considered visible if it falls inside the object segment.
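For illustration, here is a minimal sketch of how COCO-style keypoints decode into coordinates and visibility flags. In COCO annotations, keypoints are stored as a flat list `[x1, y1, v1, x2, y2, v2, ...]`. The helper name `split_coco_keypoints` and the example values are hypothetical, and this is not the proposed Lightning Pose CSV reader.

```python
import numpy as np


def split_coco_keypoints(flat):
    """Split a flat COCO keypoint list into (xy, v).

    flat: sequence of length 3 * num_keypoints, ordered x, y, v per keypoint.
    Returns xy with shape (num_keypoints, 2) and v with shape (num_keypoints,).
    """
    arr = np.asarray(flat, dtype=float).reshape(-1, 3)
    xy = arr[:, :2]
    v = arr[:, 2].astype(int)
    return xy, v


# Hypothetical example: three keypoints, one for each visibility value.
flat = [0, 0, 0, 120.5, 64.0, 1, 88.0, 40.25, 2]
xy, v = split_coco_keypoints(flat)
```

Here `v` holds one flag per keypoint, so downstream code can treat "not labeled" (v=0) and "labeled but not visible" (v=1) differently instead of collapsing both into a missing value.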
By default, keypoints where v=0 are not included in the model loss; for keypoints where v=1, the behavior will be controlled by the existing config option training.uniform_heatmap_for_nan_keypoints; and keypoints where v=2 will have the standard heatmap loss applied.
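The default behavior described above could be sketched roughly as follows. This is an illustrative sketch only, not the Lightning Pose implementation: the function name `heatmap_target_kind` and the returned string labels are hypothetical, and it assumes that when `uniform_heatmap_for_nan_keypoints` is false, v=1 keypoints are dropped from the loss just like v=0 keypoints.

```python
NOT_LABELED = 0          # keypoint is not part of this dataset's keypoint set
LABELED_NOT_VISIBLE = 1  # keypoint is labeled but occluded
LABELED_VISIBLE = 2      # keypoint is labeled and visible


def heatmap_target_kind(v, uniform_heatmap_for_nan_keypoints):
    """Return how a keypoint with visibility v would enter the heatmap loss.

    "skip":     excluded from the loss
    "uniform":  supervised with a uniform heatmap target
    "standard": supervised with the standard heatmap target
    """
    if v == NOT_LABELED:
        return "skip"
    if v == LABELED_NOT_VISIBLE:
        # Assumption: with the option off, occluded keypoints are skipped.
        return "uniform" if uniform_heatmap_for_nan_keypoints else "skip"
    if v == LABELED_VISIBLE:
        return "standard"
    raise ValueError(f"unknown visibility flag: {v}")


kinds = [heatmap_target_kind(v, True) for v in (0, 1, 2)]
```

With the option enabled, the three flags map to three distinct treatments; with it disabled, v=0 and v=1 are handled identically under the assumption stated above.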