adding control cost and decouple it with temperature #98
Really you should thank @artofnothingness; this was his brainchild, and I can claim (at most) about 25% of it. Regarding velocity, I answered via your email. The sharp turning and very slow driving that this PR introduces is definitely something we need to work through. Maybe things just need to be retuned? I'll play around with this this afternoon. The "slow" thing could be retuning, but ignoring turns is very, very strange.
Yeah, after a bit of testing, these settings don't work. It's very wobbly and isn't responsive to commands. I think it's purely coincidental that these settings make the robot go in a straight line in the first place; increasing the STD that way doesn't seem like it should actually help, and it's odd that going just from 0.2 -> 0.8 makes it go from not working at all to somewhat working, given the trajectories I'm looking at using I commented out all of the For very large values of I tried afterwards to apply some of these changes to the main branch without the action smoothing, since the new gamma stuff in this PR should, in theory, be used either way. So I'm basically trying to show that these changes help, but that isn't solving the underlying issue in my
Temperature = 10
So I think maybe that was a bug @artofnothingness and I had. We were using Using Temperature = 10 in the This confuses me at the moment why ours works well with what seems to be the inverse of the values I see in other implementations
Adding Gamma
Added what this PR does with the gamma / control sequence / bounded noises (recomputed locally as I modified the code though, since I think something was wrong (isn't it supposed to be inverse sampling noise? Though I think that's supposed to be a matrix, so this might not quite be right), to:
I'll let @tkkim-robot be the judge of that. I used the same So it seems like with that change + #99 PR that implements these findings for the main Either decreasing
Another hint here is that if I delete the
@artofnothingness Thanks for your great work again!
What?
Based on @SteveMacenski's comments, I've resolved some issues here. I have fixed the incorrect implementation of the control cost and have added separate parameters for control constraints. It seems to be kind of working, I guess.
Why?
costs_ += gamma / pow(s.sampling_std.vx, 2) * xt::sum(xt::eval(xt::view(control_sequence_.vx, xt::newaxis(), xt::all())) * bounded_noise_vx, 1, immediate);
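To make the term in that line easier to follow, here is a minimal plain-C++ sketch of the same computation for a single velocity axis, without xtensor. The function name `addControlCost` and its argument names are hypothetical and not part of this PR; `bounded_noise` stands in for whatever `bounded_noise_vx` holds in the branch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the PR's API): adds the control-cost term for one
// velocity axis to every sampled trajectory's cost, i.e.
//   costs[k] += gamma / std^2 * sum_t control[t] * bounded_noise[k][t]
void addControlCost(
  std::vector<double> & costs,                            // one cost per sampled trajectory
  const std::vector<double> & control,                    // nominal control sequence over time
  const std::vector<std::vector<double>> & bounded_noise, // per-trajectory samples over time
  double gamma, double sampling_std)
{
  assert(costs.size() == bounded_noise.size());
  const double scale = gamma / (sampling_std * sampling_std);
  for (std::size_t k = 0; k < bounded_noise.size(); ++k) {
    assert(bounded_noise[k].size() == control.size());
    double sum = 0.0;
    for (std::size_t t = 0; t < control.size(); ++t) {
      sum += control[t] * bounded_noise[k][t];  // element-wise product, summed over time
    }
    costs[k] += scale * sum;
  }
}
```

Because `gamma` is a separate argument here, the weight of this term can be changed without touching the temperature used in the softmax over trajectory costs.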
How?
d_vx_max: 0.2
d_vx_min: -0.2
d_vy_max: 0.2
d_wz_max: 0.4
Testing?
Future Works?
Misc.
I used this controller_server setting, which is basically the same as the example in the README. The only things that have been changed are
controller_server:
  ros__parameters:
    controller_frequency: 30.0
    FollowPath:
      plugin: "mppi::Controller"
      time_steps: 30
      model_dt: 0.1
      batch_size: 2000
      vx_std: 0.2
      vy_std: 0.2
      wz_std: 0.4
      vx_max: 0.5
      vx_min: -0.35
      vy_max: 0.5
      wz_max: 1.3
      d_vx_max: 0.2 # d_constraint
      d_vx_min: -0.2 # d_constraint
      d_vy_max: 0.2 # d_constraint
      d_wz_max: 0.4 # d_constraint
      iteration_count: 1
      prune_distance: 1.7
      transform_tolerance: 0.1
      temperature: 1. # 0.15 # this is inverse_temperature
      motion_model: "DiffDrive"
      visualize: true
      TrajectoryVisualizer:
        trajectory_step: 5
        time_step: 3
      AckermannConstrains:
        min_turning_r: 0.2
      critics: ["ObstaclesCritic", "GoalCritic", "GoalAngleCritic", "PathAlignCritic", "PathFollowCritic", "PathAngleCritic", "PreferForwardCritic"]
      GoalCritic:
        enabled: true
        cost_power: 1
        cost_weight: 5.0
      GoalAngleCritic:
        enabled: true
        cost_power: 1
        cost_weight: 3.0
        threshold_to_consider_goal_angle: 0.35
      ObstaclesCritic:
        enabled: true
        cost_power: 2
        cost_weight: 1.65
        consider_footprint: false
        collision_cost: 2000.0
      PathAlignCritic:
        enabled: true
        cost_power: 1
        cost_weight: 2.0
        path_point_step: 2
        trajectory_point_step: 3
      PathFollowCritic:
        enabled: true
        cost_power: 1
        cost_weight: 3.0
        offset_from_furthest: 10
        max_path_ratio: 0.40
      PathAngleCritic:
        enabled: true
        cost_power: 1
        cost_weight: 2.0
        offset_from_furthest: 4
      PreferForwardCritic:
        enabled: true
        cost_power: 1
        cost_weight: 3.0
      # TwirlingCritic:
      #   twirling_cost_power: 1
      #   twirling_cost_weight: 10.0
Yes 😄 That's one of the things that, when I first took a glance at @artofnothingness's project, made me very excited about MPPI's prospects to replace TEB. It does this maneuver in a super clean, smooth way, where timed elastic bands tend to "whip" around dangerously. If you start the robot with its front up against a wall too (like if docked), it does a really nice back out -> turn -> move forward 3-point move as well that's just delightful. That simulates what would happen if a person started to charge a robot: it would actually back up a bit to give itself some space before moving around the person, so that it doesn't scrape super close by them (like DWA would). I can say from experience deploying service robots that having that behavior built into a controller is worth its weight in gold. I haven't tested this in a crowd setting yet, but I'm very optimistic. Maybe someone at ROSCon will have a robot and we can test during a reception. I'll play with this today! First off though is getting #99, since that's an incremental change to decouple the SMPPI stuff from just fixing general MPPI stuff. Your video looks very promising though!
Yes, I'll test, but if it's no better/worse than before, then it's not a problem that needs to be solved here. I don't love that it does that, but I understand why it's happening. This is really only an issue in this situation where we have a minefield of relatively small obstacles that a 3 second time horizon can actually "engulf" in the wrong direction. While this map isn't great for most testing, it's pretty good for controller testing.
Why did you choose the values you did for the What does
I also don't see the behavior you do when I apply these changes to my smppi branch. I'm not seeing things getting up to full speed. With these changes, I'm getting about 0.34m/s of my requested 0.5m/s and commented out all of the
I actually also tried to just clone your exact branch using those parameters and still get the same ~0.34m/s number. Did you miss a config change or change something else here that wasn't mentioned or pushed to the branch? Even when I reduce to 0.26m/s as you showed in the videos, I see less than that. So I think something is missing 😄
Absolutely! I was very excited to see this behavior for the first time. I have always thought that the MPPI would definitely be able to solve problems including backward motions of the robot, such as the parking task demonstrated in the box-DDP paper. The motion here from this repo is nice and smooth!
I understand this point. It should be improved then. I will follow #96 as well.
Yes, I didn't tune seriously about You should notice that this It is strange that you could not reproduce this. I will double check my commit and test on 0.5 m/s max speed within a few days. Thanks for checking this out.
I tried with Got it! Thanks for the details! That all makes sense and is good to know
During the last couple of days, I tried to achieve 0.5 m/s full speed with the SMPPI implementation. All the tests here were based on my last commit. I didn't change any code during these days.
The first thing I did yesterday was to naively increase the max speed from 0.26 m/s to 0.5 m/s. I found that
Secondly, I tried to make it consistently reach full speed, not like oscillating around
So, I just found a simple trick to achieve full speed, which is setting a slightly higher max speed in the controller. I set 0.6 m/s here. Then, I set the actual max speed (i.e., 0.5 m/s) in the velocity smoother. This guarantees that every action command is bounded to 0.5 m/s, while allowing the SMPPI to consider higher speeds. This helps to achieve the full speed consistently even though we are using sampling-based optimization. This is a tricky solution, but it is acceptable because there is no theoretical violation and the commands sent to the robot never exceed the actual max speed.
This is the video showing my latest results. The robot usually gets full speed and there are few wobbly motions. Please check this out and let me know your ideas.
BTW, I wonder if the reason why you could not reproduce my results is the yaml settings. I'm attaching my yaml settings, for
controller_server:
  ros__parameters:
    controller_frequency: 30.0
    FollowPath:
      plugin: "mppi::Controller"
      time_steps: 30
      model_dt: 0.1
      batch_size: 2000
      vx_std: 0.2
      vy_std: 0.2
      wz_std: 0.4
      vx_max: 0.6 #0.5
      vx_min: -0.35
      vy_max: 0.5
      wz_max: 1.3
      d_vx_max: 0.2
      d_vx_min: -0.2
      d_vy_max: 0.2
      d_wz_max: 0.4
      iteration_count: 1
      prune_distance: 1.7
      transform_tolerance: 0.1
      temperature: 0.1 # 015 # this is inverse_temperature
      motion_model: "DiffDrive"
      visualize: true
      TrajectoryVisualizer:
        trajectory_step: 5
        time_step: 3
      AckermannConstrains:
        min_turning_r: 0.2
      critics: ["ObstaclesCritic", "GoalCritic", "GoalAngleCritic", "PathAlignCritic", "PathFollowCritic", "PathAngleCritic", "PreferForwardCritic"]
      GoalCritic:
        enabled: true
        cost_power: 1
        cost_weight: 5.0
      GoalAngleCritic:
        enabled: true
        cost_power: 1
        cost_weight: 3.0
        threshold_to_consider_goal_angle: 0.35
      ObstaclesCritic:
        enabled: true
        cost_power: 2
        cost_weight: 1.65
        consider_footprint: false
        collision_cost: 2000.0
      PathAlignCritic:
        enabled: true
        cost_power: 1
        cost_weight: 2.0
        path_point_step: 2
        trajectory_point_step: 3
      PathFollowCritic:
        enabled: true
        cost_power: 1
        cost_weight: 3.0
        offset_from_furthest: 10
        max_path_ratio: 0.40
      PathAngleCritic:
        enabled: true
        cost_power: 1
        cost_weight: 2.0
        offset_from_furthest: 4
      PreferForwardCritic:
        enabled: true
        cost_power: 1
        cost_weight: 3.0
      # TwirlingCritic:
      #   twirling_cost_power: 1
      #   twirling_cost_weight: 10.0
velocity_smoother:
  ros__parameters:
    smoothing_frequency: 20.0
    scale_velocities: False
    feedback: "OPEN_LOOP"
    max_velocity: [0.5, 0.0, 1.0]
    min_velocity: [-0.35, 0.0, -1.0]
    max_accel: [2.5, 0.0, 3.2]
    max_decel: [-2.5, 0.0, -3.2]
    odom_topic: "odom"
    odom_duration: 0.1
    deadband_velocity: [0.0, 0.0, 0.0]
    velocity_timeout: 1.0
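The effect of the two limits in this config (vx_max: 0.6 in the controller, max_velocity 0.5 in the velocity smoother) can be sketched as a simple clamp applied after the controller. This is only an illustration: `clampToRobotLimit` is a hypothetical name, and the real velocity smoother also applies acceleration limits, which this sketch ignores.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the downstream velocity limit: whatever the
// controller samples (up to its own, larger vx_max), the command sent to
// the robot is bounded to [min_v, max_v].
double clampToRobotLimit(double cmd, double min_v, double max_v)
{
  assert(min_v <= max_v);
  return std::min(max_v, std::max(min_v, cmd));
}
```

With these limits, a controller output of 0.58 m/s would be sent as 0.5 m/s, while 0.3 m/s passes through unchanged.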
I agree that this makes it faster, but also not stable; it's wobbly when doing path tracking in a way that would be unacceptable for any practical application (e.g. unsmooth). Is that what the Removing Perhaps it's also just retuning of the entire system due to the new control cost function. As explored in #99, the squared control cost STD term requires basically retuning all of the penalty functions from scratch. So maybe I need to retune the system in accordance with the changes in #99 and then circle back here to see if those weights make this more stable and whether the SMPPI helps make things smoother. But it seems concerning to me that this actually seems much worse with the same gains than what we see in #99 or current behavior with the current weights. It doesn't seem to me like retuning the system for the Sigma control costs term will fix the issues for the Action part of these branches. But that's my intuition, not fact; for all I know, that is it. But maybe retuning the
If I use exactly your parameters with Either way, that's a pretty hacky solution. But that's also not really the problem we're trying to solve right now; if it's ~0.45m/s of a 0.5m/s requested velocity, that's actually fine for right now. If we can do 90% speed smoothly and stably, I'd be pretty happy with that. I'm hoping that making it smoother via this method will let us drop some of temperature or gamma terms to up the cost more (or the
That's really a sharp observation. Let me answer them one by one.
I understand your point that just naively achieving target velocity is not what we are trying to solve here. So, please ignore the related section of my latest comment. I knew that it is a tricky solution, so we can just simply wipe out this option.
MPPI is a powerful implementation for addressing nonlinear and non-convex tasks. The beauty of MPPI, I believe, is that there are two types of theoretical derivation that can result in the same final form of MPPI: 1) Stochastic Optimal Control, and 2) Information Theoretic. Given that they have enticing theoretical integrity, I believe
Applying
This is the video I recorded last time (the first video of my previous comment): https://youtu.be/UnhKPjepPBg. It was tested with my branch, and without my tricky part (i.e., I set 0.5 m/s as max velocity). The velocity in the recorded video was from
Lastly, I think it is better for us to solve the problem in MPPI first, and then pull those improvements into this branch and continue with the SMPPI stuff later. Like you said, one thing at a time!
Agreed, it should be there. I just removed it for testing purposes to try to isolate changes to see what effect was coming from where.
Got it. So why aren't we seeing something more smooth like you show in Figure 6 of your paper, where it converges quite quickly / smoothly whereas MPPI / SGF MPPI rattled around a bit? From looking, usual acceleration params for robots are between 2-4m/s^2, so values of 0.2-0.4 are in line with that over a Maybe the I'm also wondering now if maybe one of the critics is causing the wobbliness, since if we set a high temperature, then we smooth out a lot. That would mask it when we drop lambda down. Most are pretty smooth, so the only ones I could think of that would cause wobbling are the path follow/align/angle. I'll try removing them in a more trivial path tracking environment and see if that helps at all. That would leave only the goal and obstacle critics (and prefer forward, which wouldn't have any kind of wobbly behavior). That should absolutely be smooth, so if we're still seeing wobble, that's got to be from somewhere else (noise distribution, balancing of critics to low-energy/smoothness term, SMPPI action stuff). The other thing is that our dynamics model is just pass through (noised velocity in is velocity out), so maybe we should be applying acceleration constraints there? But again, without action stuff, that seems to work "fine".
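As a quick check of that acceleration arithmetic, assuming the d_*_max values are per-step velocity changes and that the step is the model_dt: 0.1 from the YAML posted earlier (this is my reading of the parameters, not something confirmed in the thread):

$$a_{vx} = \frac{\Delta v_x}{\Delta t} = \frac{0.2\ \text{m/s}}{0.1\ \text{s}} = 2\ \text{m/s}^2, \qquad a_{wz} = \frac{\Delta \omega_z}{\Delta t} = \frac{0.4\ \text{rad/s}}{0.1\ \text{s}} = 4\ \text{rad/s}^2$$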
OK, so I took a look at the current Goal critic, goal angle critic, prefer forward, path angle, and path follow together make for pretty smooth, non-jerky behavior in both X and theta axes. When adding in Obstacles or Path Align, that's when things get a little more jerky.
Obstacles / Path Align
I think what's happening here is that there's a fight between the Obstacles/PathAlign critics and the Path Follow/Goal critics, where the path follow critic wants to drive towards a point on the path just outside of the trajectories being scored (e.g. we find the last point on the trajectories then correlate it to the nearest path point -- then add some However, there's just a lot of noise in the calculations of Obstacles and Path Align. When I include either, looking at RQT Plot, the rotational velocity becomes much less smooth even on straight-aways where there shouldn't be that much in-fighting.
Obstacles
So one thing I tried was to use the integrated cost of the trajectory rather than the single highest point cost. That seems to make things a bit better. It's still jerky, but on a different scale (spikes in ~0.15 rad/s range rather than ~0.35 rad/s). Probably passable for now; adding to #99.
Path Align
This is definitely the highest contributor of noise, by far. Without this, things are pretty acceptable(ish). This term is what is causing the appreciable wobble that is noticeable in the end behavior, @artofnothingness. Any thoughts on how to maybe resolve that? I tried going back to full resolution on the path / trajectory markers (e.g. steps = 1) but see the same issue. What I think is happening is that the very discrete, unaligned, and not similarly-distributed nature of the path point markers and the trajectory point markers makes this comparison not great. Maybe instead of discrete points, we should use the line segments that make up the path / trajectories and integrate the area more directly.
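A rough sketch of the two ideas above, with hypothetical names that are not taken from the critics' code. The first pair of functions contrasts scoring a trajectory by its single highest point cost with scoring by the summed cost along it. The second pair swaps in a simpler metric than the area integration suggested above: each trajectory point is measured against the nearest path segment rather than the nearest path point, so the result no longer depends on how the path markers happen to be spaced.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

// Obstacle scoring variant 1: only the worst point on the trajectory counts.
double maxPointCost(const std::vector<double> & point_costs)
{
  assert(!point_costs.empty());
  return *std::max_element(point_costs.begin(), point_costs.end());
}

// Obstacle scoring variant 2: every point contributes (discrete integral).
double integratedCost(const std::vector<double> & point_costs)
{
  return std::accumulate(point_costs.begin(), point_costs.end(), 0.0);
}

struct Point { double x; double y; };

// Distance from p to the closest point on segment a-b.
double distanceToSegment(const Point & p, const Point & a, const Point & b)
{
  const double dx = b.x - a.x;
  const double dy = b.y - a.y;
  const double len_sq = dx * dx + dy * dy;
  double s = 0.0;  // projection parameter along the segment, clamped to [0, 1]
  if (len_sq > 0.0) {
    s = ((p.x - a.x) * dx + (p.y - a.y) * dy) / len_sq;
    s = std::min(1.0, std::max(0.0, s));
  }
  return std::hypot(p.x - (a.x + s * dx), p.y - (a.y + s * dy));
}

// Mean distance from each trajectory point to the path treated as a polyline.
double meanDistanceToPath(const std::vector<Point> & trajectory, const std::vector<Point> & path)
{
  assert(!trajectory.empty() && path.size() >= 2);
  double total = 0.0;
  for (const Point & p : trajectory) {
    double best = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i + 1 < path.size(); ++i) {
      best = std::min(best, distanceToSegment(p, path[i], path[i + 1]));
    }
    total += best;
  }
  return total / static_cast<double>(trajectory.size());
}
```

For example, per-point costs {0, 0, 50, 0} and {0, 10, 50, 10} have the same maximum (50) but different sums (50 and 70), so only the integrated variant separates the two trajectories.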
I haven't thought too deeply about the technical difficulty of that, but that's what I'm thinking, since I think it would reduce the "jitter" in that function. I have to think we can find a method for computing the area between 2 sets of contiguous line segments. Someone in graphics must need to do that. If not, maybe fit a spline to the path locally and use that to have a continuous function to integrate against. Or maybe there's a more direct point-set comparison distance metric we can come up with that is less jittery (or use this same one but try to smooth it out a bit?)
Edit: Sorry, I hate doing this, but there's a lot of cross pollination between these 2 PRs. It's basically 1 huge topic now. I'm going to focus on adding the smoothness/low-energy term in the other PR first to get our foundations in order. If we can reduce jitter and tune the system over there, hopefully applying this over here will be more straightforward.
Quadratic Critics
I started this (didn't have time to get far though; the other critic analysis here took the majority of the time I had to give to this today) and found that "yes", if I increase the orders to quadratic, it makes sense to push temperature to 0.7 (from 0.15) and gamma to 0.05 (from 0.015). I think your + the MPPI paper had things like
Thanks for your experiments. It makes sense that the parameters' magnitudes depend on the critic functions. The first thing to resolve now seems to be the critic functions. I will get back to this when #99 has been resolved.
What?
I've implemented the control cost and decoupled the inverse temperature from the control cost.
Why?
As described in Algorithm 1 of the paper (IT-MPC, T-RO, 2018), there is a control cost derived from free energy and importance sampling. This helps to minimize the KL divergence between the controlled distribution and the base distribution.
Also, I added a variable 'gamma', which decouples the inverse temperature from the control cost. Please see Section III.D.4 of the same paper.
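For reference, a sketch of the trajectory cost and weighting this describes, assuming the form commonly given for information-theoretic MPPI. The symbols are my own notation, not taken from this PR's code: $c$ is the state cost from the critics, $u_t$ the nominal control, $v_t^k = u_t + \epsilon_t^k$ the sampled control of trajectory $k$, $\Sigma$ the sampling covariance, $\lambda$ the temperature, and $\gamma$ the separate control-cost weight.

$$S^k = \sum_{t=0}^{T-1} \left[ c\left(x_{t+1}^k\right) + \gamma\, u_t^\top \Sigma^{-1} v_t^k \right], \qquad w^k = \frac{\exp\left(-\frac{1}{\lambda}\left(S^k - \min_j S^j\right)\right)}{\sum_i \exp\left(-\frac{1}{\lambda}\left(S^i - \min_j S^j\right)\right)}$$

With $\gamma = \lambda$ the control cost is tied to the temperature; keeping $\gamma$ as its own parameter lets the two be tuned independently. For a diagonal $\Sigma$ with per-axis standard deviation $\sigma$, the control term reduces to $\gamma\, u_t v_t^k / \sigma^2$ on each axis.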
How?
Testing?
[see this video] I think it has been considerably improved. There is no wavy driving. I also noticed that the linear longitudinal velocity is smooth.
Future Works?
BTW, I've finally reviewed your entire implementation today for the first time. Thank you for your hard work! @SteveMacenski