# Model Symmetry for a Tennis Ball #249
Hello.

I'm trying to properly define the symmetry for a tennis ball. Currently, I've specified nothing for the discrete symmetries. However, I defined the continuous symmetries like so: `[{"axis": [0, 0, 1], "offset": [0, 0, 0]}, {"axis": [1, 0, 0], "offset": [0, 0, 0]}, {"axis": [0, 1, 0], "offset": [0, 0, 0]}]`.

I didn't know how to define the align axes, so I left them the same as for the cylinder object: `"align_axes": [{"object": [0, 1, 0], "camera": [0, 0, 1]}, {"object": [0, 0, 1], "camera": [0, 1, 0]}]`.

Another thing that confuses me is that a tennis ball has a logo on only one side. How does that affect the model_info.json?

Any help would be appreciated, thank you in advance.

---
Damn, when I developed the symmetry handling, I knew somebody would try to model a sphere at some point, but I hoped it wouldn't be so soon. Spheres are not really well supported at the moment. :-( The computation of symmetries happens in `scripts/nvisii_data_gen/utils.py`, lines 1490 to 1535 (commit c6d0033).
What it does is multiply each discrete symmetry with each continuous symmetry. It does not multiply a continuous symmetry with another continuous symmetry. That is how the original code from the BOP toolkit did it, and I didn't fix it. It works for all objects except spheres: there is no geometrical body with exactly two continuous symmetries, and the sphere is the only body with three continuous symmetries. You can however fake it: simply pre-compute two of the continuous symmetries and put them into `symmetries_discrete`. I've written a script for you that does this (see the sketch after the script for how these symmetries are then combined):

```python
#!/usr/bin/env python3
from math import sin, cos

import numpy as np
import simplejson as json

DISCRETE_STEPS_COUNT = 64


def rpy_to_transformation_matrix(roll=0.0, pitch=0.0, yaw=0.0):
    """Build a 4x4 homogeneous transform from roll/pitch/yaw (ZYX Euler angles)."""
    R = np.zeros((4, 4))
    R[0, 0] = cos(yaw) * cos(pitch)
    R[0, 1] = -sin(yaw) * cos(roll) + cos(yaw) * sin(pitch) * sin(roll)
    R[0, 2] = sin(yaw) * sin(roll) + cos(yaw) * sin(pitch) * cos(roll)
    R[0, 3] = 0.0
    R[1, 0] = sin(yaw) * cos(pitch)
    R[1, 1] = cos(yaw) * cos(roll) + sin(yaw) * sin(pitch) * sin(roll)
    R[1, 2] = -cos(yaw) * sin(roll) + sin(yaw) * sin(pitch) * cos(roll)
    R[1, 3] = 0.0
    R[2, 0] = -sin(pitch)
    R[2, 1] = cos(pitch) * sin(roll)
    R[2, 2] = cos(pitch) * cos(roll)
    R[2, 3] = 0.0
    R[3, 0] = 0.0
    R[3, 1] = 0.0
    R[3, 2] = 0.0
    R[3, 3] = 1.0
    return R


# Pre-compute two of the three continuous symmetries (roll and pitch) as a
# 64 x 64 grid of discrete rotations; the third axis (yaw) stays continuous.
symmetries_discrete = []
for roll in np.arange(DISCRETE_STEPS_COUNT) / DISCRETE_STEPS_COUNT * 2.0 * np.pi:
    for pitch in np.arange(DISCRETE_STEPS_COUNT) / DISCRETE_STEPS_COUNT * 2.0 * np.pi:
        R = rpy_to_transformation_matrix(roll=roll, pitch=pitch)
        symmetries_discrete.append(list(R.flatten()))

model_info_dict = {
    "symmetries_discrete": symmetries_discrete,
    "symmetries_continuous": [{"axis": [0, 0, 1], "offset": [0, 0, 0]}],
    "align_axes": [{"object": [0, 1, 0], "camera": [0, 0, 1]}, {"object": [0, 0, 1], "camera": [0, 1, 0]}],
}

with open("model_info.json", "w") as fp:
    json.dump(model_info_dict, fp, indent=4)
```
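For intuition, here is an editor's sketch (not code from the repository) of roughly how the data-generation step then combines these symmetries, following the description above: every entry of `symmetries_discrete` is multiplied with every discretized step of the remaining continuous symmetry. The name `CONTINUOUS_STEPS_COUNT` and its value are assumptions; the actual discretization lives in `utils.py`.

```python
import numpy as np

CONTINUOUS_STEPS_COUNT = 64  # assumed value; the real discretization is in utils.py


def rotation_z(theta):
    """4x4 homogeneous rotation about the z axis (the remaining continuous symmetry)."""
    R = np.eye(4)
    R[0, 0], R[0, 1] = np.cos(theta), -np.sin(theta)
    R[1, 0], R[1, 1] = np.sin(theta), np.cos(theta)
    return R


def combine_symmetries(symmetries_discrete):
    """Multiply each discrete symmetry with each step of the continuous symmetry."""
    transforms = []
    for flat in symmetries_discrete:
        discrete = np.asarray(flat).reshape(4, 4)
        for theta in np.arange(CONTINUOUS_STEPS_COUNT) / CONTINUOUS_STEPS_COUNT * 2.0 * np.pi:
            transforms.append(discrete @ rotation_z(theta))
    return transforms


# With the 64 x 64 = 4096 discrete symmetries from the script above and 64
# continuous steps, this yields 4096 * 64 = 262144 candidate transforms that
# all have to be considered per object.
```

As the comment at the end notes, the candidate set becomes enormous, which matters for the performance discussion further down.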
I think you only have two options:

1. Strictly follow the definition of symmetries. Then the logo and the rest of the texture make your object non-symmetrical, so you declare no symmetries at all.
2. Ignore the logo and the texture and treat the ball as a perfectly symmetric sphere, using the model_info.json generated above.

I believe however that DOPE will do much better if you go with option 2.

---
Thank you so much, Martin. I will try out your script and let you know the results.

---
Hello. I ran the script above and was able to successfully generate a few new images. When I trained the DOPE network on this data, the performance was significantly better, and I was able to successfully use the inference.py script. However, the image generation is incredibly slow: I'm running 5 GPUs around the clock and can still only generate 480 images per day. I'm assuming I will need 40-50k images, which makes the current approach infeasible for my timeline. Are there any alternatives or solutions to this issue? Thank you for your help.

---
Ah damn, I was afraid that would happen. The problem is that what my proposal does is very inefficient: it creates 64 × 64 = 4096 discrete symmetries, each of which is additionally combined with the discretized continuous symmetry, so the alignment score has to be computed for every resulting transform in every single frame.

In this special case, what we really want is the identity rotation: the "best symmetry-corrected transform" is the one where the rotation of the object is (0, 0, 0) in the camera frame. Here is a dirty hack that directly sets the identity transform. WARNING: with this patch, utils.py only works for spheres, but it should solve your problem!

```diff
diff --git a/scripts/nvisii_data_gen/utils.py b/scripts/nvisii_data_gen/utils.py
index 0e624a1..ff7bf6b 100644
--- a/scripts/nvisii_data_gen/utils.py
+++ b/scripts/nvisii_data_gen/utils.py
@@ -1348,72 +1348,10 @@ def update_symmetry_corrected_transform(obj_id, model_info, symmetry_transforms,
     identical for two poses which are symmetrical.
     """
 
-    # convert align_axes vectors to visii.vec3
-    align_axes = []
-    try:
-        # e.g., "align_axes": [{"object": [0, 1, 0], "camera": [0, 0, 1]}, {"object": [0, 0, 1], "camera": [0, 1, 0]}]
-        for axis_pair in model_info["align_axes"]:
-            object_axis = axis_pair["object"]
-            object_axis = visii.vec3(object_axis[0], object_axis[1], object_axis[2])
-            camera_axis = axis_pair["camera"]
-            camera_axis = visii.vec3(camera_axis[0], camera_axis[1], camera_axis[2])
-            align_axes.append({"object": object_axis, "camera": camera_axis})
-    except (KeyError, IndexError):
-        align_axes.append({"object": visii.vec3(1, 0, 0), "camera": visii.vec3(0, 0, 1)})
-
-    # find the symmetry transform with the best alignment between the first object_axis and the first camera_axis,
-    # using the second object_axis/camera_axis as a tie breaker
-
-    cam_matrix = visii.entity.get(camera_name).get_transform().get_world_to_local_matrix()
-
-    # calculate rotation error for each symmetry transform
-    def calc_alignment_score(object_axis, camera_axis, symmetry_trans):
-        """
-        Transforms the `object_axis` vector into the camera frame using the transformation `trans` and calculates
-        the "alignment score" between the rotated vector and the `camera_axis` vector. The "alignment score" is
-        just the dot product between the vectors, so `math.acos(alignment_score)` is the angle between the vectors.
-        `alignment_score` is between -1 (180 degree angle, worst alignment) and 1 (0 degree angle, best alignment).
-        """
-        mat_trans = visii.transform.get(f"{obj_id}").get_local_to_world_matrix() * symmetry_trans
-        sym_object_to_camera_rotation = visii.quat(cam_matrix * mat_trans)
-        sym_object_axis = sym_object_to_camera_rotation * object_axis
-        return visii.dot(sym_object_axis, camera_axis)
-
-    # score according to the first element of align_axes
-    scored_symmetry_transforms = []
-    for i, trans in enumerate(symmetry_transforms):
-        alignment_score = calc_alignment_score(align_axes[0]["object"], align_axes[0]["camera"], trans)
-        scored_symmetry_transforms.append((alignment_score, trans))
-
-    # Sort from best to worst. Using only first element (score) for sorting to
-    # avoid undefined "<" operator between second element (transforms).
-    scored_symmetry_transforms.sort(key=lambda x: x[0], reverse=True)
-
-    best_symmetry_transform = scored_symmetry_transforms[0][1]
-
-    if len(align_axes) >= 2:
-        # collect best transforms with identical score
-        if len(scored_symmetry_transforms) == 0:
-            return
-        reference_score = scored_symmetry_transforms[0][0]
-        top_symmetry_transforms = []
-        EPSILON = 1e-3
-        for (score, symmetry_transform) in scored_symmetry_transforms:
-            if math.fabs(score - reference_score) > EPSILON:
-                break
-            top_symmetry_transforms.append(symmetry_transform)
-
-        # use second element of align_axes as a tie breaker
-        max_alignment_score = float("-inf")
-        for trans in top_symmetry_transforms:
-            alignment_score = calc_alignment_score(align_axes[1]["object"], align_axes[1]["camera"], trans)
-            if alignment_score > max_alignment_score:
-                max_alignment_score = alignment_score
-                best_symmetry_transform = trans
-
     # update the symmetry_corrected transform
+    object_rotation_quat = visii.transform.get(f"{obj_id}").get_rotation()
     symmetry_corrected_trans = visii.transform.get(f"{obj_id}_symmetry_corrected")
-    symmetry_corrected_trans.set_transform(best_symmetry_transform)
+    symmetry_corrected_trans.set_rotation(visii.inverse(object_rotation_quat))
 
 # Modified from https://github.com/thodan/bop_toolkit/
--
2.25.1
```
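To see why the hack is enough: for a perfect sphere, any orientation is a valid symmetry, so the best symmetry-corrected rotation is simply the inverse of the object's current rotation, and composing the two always yields the identity. A minimal standalone sketch of that identity, using scipy's `Rotation` in place of the visii quaternion API and an arbitrary made-up object rotation:

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Arbitrary made-up rotation, standing in for the object's pose in the scene.
object_rotation = Rotation.from_euler("xyz", [0.3, -1.2, 2.1])

# The patch sets the symmetry-corrected transform to the inverse rotation, so
# composing it with the object's rotation collapses to the identity for every pose.
symmetry_corrected = object_rotation.inv() * object_rotation
assert np.allclose(symmetry_corrected.as_matrix(), np.eye(3))
```

Instead of scoring hundreds of thousands of candidate transforms per frame, the corrected rotation is now obtained in constant time, which is where the speed-up comes from.

---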
Thank you for the script. Should I leave the generated model_info.json inside the tennis ball's directory?

---
With the hack above, the model_info.json doesn't matter anymore. You can leave it or delete it; it makes no difference.

---
Sounds good. I'm trying to train on these custom images with multiple GPUs using the command below. However, I'm running into CUDA memory issues, since it seems DOPE is using all the memory on the first GPU and none of the memory on the other GPUs. I've attached the nvidia-smi output below. How would I go about fixing this?

---
I don't know how to solve the problem that it only uses 1 GPU. In general, you have to reduce the batch size if you run into memory problems. Also, 2.5k images is definitely not enough. I usually use 60k images and 60 epochs; this takes about 48 hours on a single RTX 3090 or RTX A6000. I vaguely remember I had some problems getting it to run on our A100s and with multiple GPUs, so that's why I'm using single RTX GPUs for now. To debug any problems, it's really valuable to visualize the training progress, including the belief maps, like this:
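(The screenshot that followed here is not preserved.) As a rough stand-in, here is an editor's sketch of such a visualization, not the repository's own code, assuming the belief maps arrive as a `(num_keypoints, H, W)` array, e.g. one map per cuboid corner plus the centroid:

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_belief_maps(image, belief_maps, path="belief_maps.png"):
    """Overlay each keypoint's belief map on the input image and save a grid."""
    n = belief_maps.shape[0]
    fig, axes = plt.subplots(1, n, figsize=(3 * n, 3))
    for i, ax in enumerate(np.atleast_1d(axes)):
        ax.imshow(image, cmap="gray")
        ax.imshow(belief_maps[i], cmap="jet", alpha=0.5)  # heatmap overlay
        ax.set_title(f"keypoint {i}")
        ax.axis("off")
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)


# Example call with random data standing in for a real training batch:
plot_belief_maps(np.zeros((64, 64)), np.random.rand(9, 64, 64))
```

---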
The old training code should train fine on multiple GPUs; I used 4 P100s for the original paper. Recently I have used 4 A100s as well with the new training code, which uses a different torch data-distribution mechanism.
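For reference, a minimal sketch of the torch distributed-data pattern the new training code presumably uses; this is a generic `DistributedDataParallel` skeleton with a placeholder model, dataset, and hyperparameters, not the actual DOPE training script:

```python
# Launch with: torchrun --nproc_per_node=4 ddp_sketch.py
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler


def main():
    dist.init_process_group(backend="nccl")  # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model and data standing in for the DOPE network and dataset.
    model = DDP(torch.nn.Linear(32, 18).cuda(local_rank), device_ids=[local_rank])
    dataset = TensorDataset(torch.randn(1024, 32), torch.randn(1024, 18))

    # DistributedSampler gives each GPU its own shard of the dataset, which is
    # what spreads memory and compute across all GPUs instead of just one.
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=16, sampler=sampler)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = torch.nn.MSELoss()
    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle the shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()  # DDP syncs gradients across GPUs
            optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

---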
@TontonTremblay Can you provide the command you used with the new training script when training on your cluster? I've got a problem where the cluster is only using one GPU, as shown above.