
Adding control cost and decoupling it from temperature #98

Closed

@tkkim-robot commented Oct 6, 2022

What?

I've implemented the control cost and decoupled the inverse temperature from the control cost.

Why?

As described in Algorithm 1 of the paper (IT-MPC, T-RO, 2018), there is a control cost derived from the free energy and importance sampling. It helps minimize the KL divergence between the controlled distribution and the base distribution.

Also, I added a variable 'gamma', which decouples the inverse temperature from the control cost. Please see Section III.D.4 of the same paper.
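As a quick sketch of the expression I mean (my rendering of the paper's result, so the notation may differ slightly): the per-sample cost with the control term becomes

$$\tilde{S}(V) = S(V) + \gamma \sum_{t=0}^{T-1} u_t^{\top} \Sigma^{-1} \varepsilon_t, \qquad \gamma = \lambda (1 - \alpha),$$

and the importance-sampling weights are $w(V) \propto \exp\!\big(-\tfrac{1}{\lambda}(\tilde{S}(V) - \rho)\big)$ with $\rho = \min_V \tilde{S}(V)$ for numerical stability. Choosing $\alpha$ (and hence $\gamma$) independently is what decouples the control-cost weight from the inverse temperature $\lambda$.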

How?

  • I implemented the function evalControlCost() in 'optimizer.cpp'.
  • This function is called just before calculating the importance weights (a rough sketch of the call order is shown after this list).
  • Set lambda to 10.0 (previously 0.35) and gamma to 0.1 (based on my previous experience).
  • Increased the sampling std to [vx: 0.8, vy: 0.8, wz: 0.8] (previously [vx: 0.2, vy: 0.2, wz: 0.4]). I increased it while testing the robot taking a 90-degree turn, since the controller could not make a sharp turn with the previous setting (see below for more details). It could be reduced again.
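A rough sketch of where the call sits (illustrative only - the function names other than evalControlCost() are placeholders, not necessarily the repo's exact API):

// Illustrative sketch: where evalControlCost() sits in one optimization pass.
void Optimizer::optimize()
{
  generateNoisedControls();    // sample noised control sequences (placeholder name)
  integrateStateVelocities();  // roll out trajectories (placeholder name)
  evalTrajectoriesScores();    // critics accumulate into costs_ (placeholder name)
  evalControlCost();           // new: add gamma * u^T * Sigma^{-1} * eps to costs_
  updateControlSequence();     // softmax importance weights over costs_
}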

Testing?

[see this video] I think it has been considerably improved. There is no wavy driving, and I also noticed that the linear longitudinal velocity is smooth.

Future Works?

  1. I found that there is no velocity cost. Are 'path_align_critic()' and 'path_follow_critic()' actually enough to achieve full velocity?
  2. [see this video] I found the controller was unable to make a sharp turn in this setting. We should check 'threshold_to_consider_goal_angle' in 'GoalAngleCritic()'.

BTW, I've finally reviewed your entire implementation today for the first time. Thank you for your hard work! @SteveMacenski

@SteveMacenski (Collaborator) commented:

Really you should thank @artofnothingness; this was his brainchild, of which I can only claim (at most) like 25%.

Regarding velocity, I answered via your email.

The sharp-turning and very-slow-driving problems this PR introduces are definitely something we need to work through. Maybe things just need to be retuned? I'll play around with this this afternoon. The "slow" thing could be retuning, but ignoring turns is very, very strange.

@SteveMacenski (Collaborator) commented Oct 6, 2022

Yeah, after a bit of testing, these settings don't work. It's very wobbly and isn't responsive to commands. I think it's purely coincidental that these settings make the robot go in a straight line in the first place; increasing the std that way doesn't seem like it should actually help, and it's odd that going just from 0.2 -> 0.8 takes it from not working at all to somewhat working, given that the trajectories I see using 0.2, 0.2, 0.4 look reasonable and none of them are being used.

I commented out all of the $\Omega$ stuff (the costs_ += xt::sum(d_avx * d_avx + d_awz * d_awz, {1}, immediate) * weight;) since you showed in the paper that it helps but isn't required, to reduce the number of total changes we need to consider when looking for what's wrong. So while the $\Omega$ stuff may or may not be right, with it totally deleted we still have issues, so at least that's not the main problem.

For very large values of gamma (10.0) it becomes more stable, but that has nothing to do with the action-space changes that were made; this should have applied in the original method too. It feels to me like something is wrong with the action-space stuff. Plus, lambda > 1 isn't valid.


I then tried to apply some of these changes to the main branch without the action smoothing - since the new gamma stuff in this PR should, in theory, be useful either way. The goal was basically to show that these changes help but aren't solving the underlying issue in my smoothmppi branch implementing your paper.

Temperature = 10

So I think maybe that was a bug @artofnothingness and I had. We were using 0.35, but we were taking 1/temp for the calculation. The gamma in Autorally is 0.15 (which, according to the literature, is actually lambda), but they use it directly in the cost-to-go exponential function, so that's just a straight 0.15, where we were actually applying 1/0.35 ≈ 2.86.

Using temperature = 10 in the dev branch gives us super slow motion. Why is that? It seems like our value of temperature (i.e., lambda) is inverted in the relationship relative to elsewhere.

It confuses me at the moment why ours works well with what seems to be the inverse of the values I see in other implementations.
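Spelling out the two conventions side by side (just to pin down the inversion I mean):

$$w_i \propto \exp\!\big(-\tfrac{1}{\lambda} S_i\big) \qquad \text{vs.} \qquad w_i \propto \exp\!\big(-\lambda\, S_i\big)$$

In the first form a larger temperature flattens the weights (more averaging across samples: smoother but slower), while in the second it sharpens them, so the same numeric value behaves oppositely depending on which convention an implementation uses.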

Adding Gamma

Added what this PR does with the gamma / control sequence / bounded noises (recomputed locally as state_.cvx - control_sequence_.vx to account for clamping of the noised velocities, since dev doesn't have the getBoundedNoises added in the action-space PR).

I modified the code, though, since I think something was wrong (isn't it supposed to be divided by the inverse sampling noise? Though I think that's supposed to be a matrix, so this might not quite be right), to:

costs_ += gamma / s.sampling_std.vx * xt::sum(xt::view(control_sequence_.vx, xt::newaxis(), xt::all()) * bounded_noises_vx, 1, immediate);

I'll let @tkkim-robot be the judge of that.

I used the same 0.15 temperature value (i.e., not 10 like above) and gamma = 0.1, and by itself it appears to make things a bit smoother! We only reach about 80% of maximum speed, though. If I make gamma higher (e.g. 1.0), it twitches in place and goes nowhere. If I drop it (e.g. 0.001), it becomes more wobbly again but does get back up to 90% of maximum speed - not surprising, since 0.001 is basically 0. So 0.075 - 0.125 seems like the sweet spot.

So that change + gamma = 0.1 + temperature = 0.15 gets us relatively stable behavior, moving at approximately 80% of maximum speed - definitely an improvement over the current behavior's over-reliance on temperature.

#99 is the PR that implements these findings for the main dev branch to add the gamma term. But I think we can agree that this (1) doesn't solve the smoothmppi branch problems but does (2) help with general smoothness. So once we fix the issues here, using this term will further help. I showed that this formulation applied to the smoothmppi branch does help a bit, but it still waves and bobbles around (all other parameters equal).

Decreasing either lambda or temperature boosts our speed, but then things start to become wobbly again. I think the goal of the action-space changes in SMPPI, then, is to let us drop those parameters further to get to full speed.

@SteveMacenski (Collaborator) commented:

Another hint: if I delete the $\Omega$ stuff, it's super jerky / wobbly but generally does things OK, as if just very poorly tuned. When adding the $\Omega$ stuff back in, that's when it goes absolutely crazy. Smaller values of action_dt (and adding * action_dt into the noise generator's "get bounded noise" function) make things super unreactive but generally help a bit with stability, just with high latency and overshoot.

@tkkim-robot (Author) commented:

@artofnothingness Thanks for your great work again!

What?

Based on @SteveMacenski's comments, I've resolved some issues here. I have fixed the incorrect implementation of the control cost and have added separate parameters for control constraints. It seems to be kind of working, I guess.

Why?

  1. The control cost I mentioned above includes multiplying by an inverse covariance matrix. Since the covariance matrix is diagonal, this is equivalent to dividing each control sequence by (std)^2 (see the expansion after this list):
costs_ += gamma / pow(s.sampling_std.vx, 2) * xt::sum(xt::eval(xt::view(control_sequence_.vx, xt::newaxis(), xt::all())) * bounded_noise_vx, 1, immediate);
  2. In my SMPPI implementation with PyTorch, I use a separate set of parameters for the control constraints. It is very reasonable to apply different constraints to the control space (i.e., the derivative action) and the action space: it is not feasible for a robot to move/rotate from the minimum velocity to the maximum velocity within a single optimization step. This prevents kinematically or physically infeasible samples from being drawn by the noise generator.
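Concretely, with a diagonal covariance the per-timestep control cost expands to

$$\Sigma = \operatorname{diag}\!\big(\sigma_{vx}^{2},\, \sigma_{vy}^{2},\, \sigma_{wz}^{2}\big) \;\Rightarrow\; \gamma\, u_t^{\top} \Sigma^{-1} \varepsilon_t = \gamma \left( \frac{u_{vx,t}\, \varepsilon_{vx,t}}{\sigma_{vx}^{2}} + \frac{u_{vy,t}\, \varepsilon_{vy,t}}{\sigma_{vy}^{2}} + \frac{u_{wz,t}\, \varepsilon_{wz,t}}{\sigma_{wz}^{2}} \right),$$

which is exactly the per-axis division by pow(std, 2) in the line above.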

How?

  1. I found the implementation in the previous commit was wrong, so I fixed it here.
  2. I added d_constraints for applying derivative-action constraints (a sketch of the clamping is shown after this list), with the parameters:
      d_vx_max: 0.2
      d_vx_min: -0.2
      d_vy_max: 0.2
      d_wz_max: 0.4
  3. I set the sampling std back to normal: [vx: 0.2, vy: 0.2, wz: 0.4].
  4. Set lambda to 1.0.
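A minimal sketch of what the clamping amounts to (member and parameter names are assumed from this thread, not the exact code):

// Sketch only: SMPPI samples in the derivative action space, so clipping the
// noise directly bounds how much the velocity can change per optimization step.
noises_vx_ = xt::clip(noises_vx_, s.constraints.d_vx_min, s.constraints.d_vx_max);
noises_vy_ = xt::clip(noises_vy_, -s.constraints.d_vy_max, s.constraints.d_vy_max);
noises_wz_ = xt::clip(noises_wz_, -s.constraints.d_wz_max, s.constraints.d_wz_max);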

Testing?

  • I finally understood @SteveMacenski's point about what 90% of maximum speed actually means. The maximum longitudinal speed is set to 0.26, so today I focused on this point. [see this video from 00:24]
  • In this setting, the vx speed mostly holds at 0.26 and there is no chattering on the vx action.
  • You can also see there is little wobbly motion, at least from my perspective.
  • One very interesting thing, personally, can be seen from 00:53 of the same video. When the goal is set in the opposite direction, the robot starts to move backward until it has enough room to take a continuous turn (not a pivot turn). When it reaches a larger space, it starts to turn. It looks like a three-point turn. Is this a common behavior in MPPI too?

Future Works?

  1. Please test this and sanity-check that you can see the same improvements.
  2. At the start of the video, you can see that the robot keeps its heading when the global path is difficult to follow. This is the same problem mentioned in issue #96. I think it depends on the design choice of the controller. If someone wants a robot that exactly follows the global path, it should be fixed accordingly, maybe using optimization recovery or zero-mean sampling. However, I think this behavior is more natural, because the global path will be adjusted to account for it.
  3. I don't clearly understand why lambda = 10.0 is bad while lambda = 1.0 works.
  4. I tried to tune the $\Omega$ stuff with 2~3 trials but failed. I will try again next time.

Misc.

I used this controller_server setting, which is basically the same as the example in the README. The only things that have changed are the temperature and the d_constraints.

controller_server:
  ros__parameters:
    controller_frequency: 30.0
    FollowPath:
      plugin: "mppi::Controller"
      time_steps: 30
      model_dt: 0.1
      batch_size: 2000
      vx_std: 0.2 
      vy_std: 0.2 
      wz_std: 0.4 
      vx_max: 0.5
      vx_min: -0.35
      vy_max: 0.5
      wz_max: 1.3
      d_vx_max: 0.2 # d_constraint
      d_vx_min: -0.2 # d_constraint
      d_vy_max: 0.2 # d_constraint
      d_wz_max: 0.4 # d_constraint
      iteration_count: 1
      prune_distance: 1.7
      transform_tolerance: 0.1
      temperature: 1. # 0.15 # this is inverse_temperature
      motion_model: "DiffDrive"
      visualize: true
      TrajectoryVisualizer:
        trajectory_step: 5
        time_step: 3
      AckermannConstrains:
        min_turning_r: 0.2
      critics: ["ObstaclesCritic", "GoalCritic", "GoalAngleCritic", "PathAlignCritic", "PathFollowCritic", "PathAngleCritic", "PreferForwardCritic"]
      GoalCritic:
        enabled: true
        cost_power: 1
        cost_weight: 5.0
      GoalAngleCritic:
        enabled: true
        cost_power: 1
        cost_weight: 3.0
        threshold_to_consider_goal_angle: 0.35
      ObstaclesCritic:
        enabled: true
        cost_power: 2
        cost_weight: 1.65
        consider_footprint: false
        collision_cost: 2000.0
      PathAlignCritic:
        enabled: true
        cost_power: 1
        cost_weight: 2.0
        path_point_step: 2
        trajectory_point_step: 3
      PathFollowCritic:
        enabled: true
        cost_power: 1
        cost_weight: 3.0
        offset_from_furthest: 10
        max_path_ratio: 0.40
      PathAngleCritic:
        enabled: true
        cost_power: 1
        cost_weight: 2.0
        offset_from_furthest: 4
      PreferForwardCritic:
        enabled: true
        cost_power: 1
        cost_weight: 3.0
      # TwirlingCritic:
      #   twirling_cost_power: 1
      #   twirling_cost_weight: 10.0

@SteveMacenski (Collaborator) commented Oct 7, 2022

When the goal is set in the opposite direction, the robot starts to move backward until it has enough room to take a continuous turn (not a pivot turn). When it reaches a larger space, it starts to turn. It looks like a three-point turn. Is this a common behavior in MPPI too?

Yes 😄 That's one of the things that, when I first took a glance at @artofnothingness' project, made me very excited about MPPI's prospects to replace TEB. It does this maneuver in a super clean, smooth way, where timed elastic bands tend to "whip" around dangerously. If you start the robot with its front up against a wall too (like if docked), it does a really nice back out -> turn -> move forward 3-point move as well that's just delightful. That simulates what would happen if a person started to charge at a robot - it would actually back up a bit to give itself some space before moving around the person so that it doesn't scrape super close by them (like DWA would). I can say from experience deploying service robots that having that behavior built into a controller is worth its weight in gold. I haven't tested this in a crowd setting yet, but I'm very optimistic. Maybe someone at ROSCon will have a robot and we can test during a reception.

I'll play with this today! First up, though, is getting #99 in, since that's an incremental change that decouples the SMPPI stuff from just fixing general MPPI stuff. Your video looks very promising though!

You can see that the robot keeps its heading when the global path is difficult to follow. This is the same problem mentioned in #96.

Yes, I'll test, but if it's no better/worse than before, then it's not a problem that needs to be solved here. I don't love that it does that, but I understand why it's happening. This is really only an issue in this situation, where we have a minefield of relatively small obstacles that a 3-second time horizon can actually "engulf" in the wrong direction. While this map isn't great for most testing, it's pretty good for controller testing.


Why did you choose the values you did for the d_ constraints? They seem kind of arbitrary. Do they represent something physical that we can describe for users - or at least describe ideologically in the README's tuning guide? Can they be thought of as the change in velocity possible due to acceleration over a timestep model_dt? If so, could we derive them analytically from acceleration parameters (e.g. the parameter is instead accel_x, which we multiply by model_dt to get the constraint, as sketched below)? Not asking you to make that change, but I will if that's the case. That's actually a really interesting interpretation this adds to MPPI, and it buries the lede in the SMPPI paper for those of us not using NN dynamics models: it enforces rudimentary acceleration constraints by limiting the change in each timestep.
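i.e., something like this (hypothetical parameter names, just to illustrate the reparameterization):

// Hypothetical: derive the derivative constraints from acceleration limits.
double d_vx_max = accel_x_max * model_dt;   // e.g. 2.0 m/s^2 * 0.1 s = 0.2 m/s per step
double d_vx_min = -decel_x_max * model_dt;  // e.g. 2.0 m/s^2 * 0.1 s -> -0.2 m/s per step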

What does d mean in d_constraints? I assume it's derivative, but I'm asking so that when I clean this all up I know what to call these 2 constraint types.

I also don't see the behavior you do when I apply these changes to my smppi branch. I'm not seeing things get up to full speed. With these changes, I'm getting about 0.34 m/s of my requested 0.5 m/s, with all of the $\Omega$ stuff commented out. I have lambda (temperature) at 1.0 and I think everything else is the same. I do totally agree, though, that this is an improvement - things aren't wacko now.

I also tried just cloning your exact branch with those parameters and still get the same ~0.34 m/s number. Did you miss a config change, or change something else that wasn't mentioned or pushed to the branch? Even when I reduce to 0.26 m/s as you showed in the videos, I see less than that. So I think something is missing 😄

@tkkim-robot (Author) commented Oct 8, 2022

Yes 😄 That's one of the things that, when I first took a glance at @artofnothingness' project, made me very excited about MPPI's prospects to replace TEB.

Absolutely! I was very excited to see this behavior for the first time. I have always thought that MPPI should be able to solve problems involving backward motions of the robot, such as the parking task demonstrated in the box-DDP paper. The motion from this repo is nice and smooth!

This is really only an issue in this situation, where we have a minefield of relatively small obstacles that a 3-second time horizon can actually "engulf" in the wrong direction.

I understand this point. It should be improved, then. I will follow #96 as well.

Why did you choose the values you did for the d_ constraints? ...... If so, could we derive them analytically from acceleration parameters (e.g. the parameter is instead accel_x, which we multiply by model_dt to get the constraint)?

Yes, d_ means derivative. As you mentioned, the d_constraints here amount to clamping the control sequence based on acceleration properties. When I used SMPPI to control actuators, the constraints corresponded to the maximum motor angle and the d_constraints to the maximum motor speed (angular velocity). Because I can directly handle the derivative action space (which is the sampling domain) in SMPPI, it was easy to introduce this simple idea. It is very reasonable to force the sampled trajectories to stay inside the kinematically and dynamically feasible region, so your explanation is basically correct. But I empirically found that the d_constraints should not be set precisely according to the accel parameters; I usually set them with a fairly large margin. This helps when there are dynamic obstacles and the robot needs to adjust the entire future optimal sequence rapidly. In a static environment with zero obstacles (such as an ideal Matlab simulation), setting the d_constraints parameters tightly also works well.

I didn't seriously tune the d_constraints parameters here. I just copied the sampling std for each control variable; there is no exact meaning, because I was not familiar with the system parameters of the turtlebot's hardware. We may achieve better performance if we tune these parameters later, but I don't think it is urgent.

Note that the d_constraints should not be applied when you use the 1% zero-mean sampling in SMPPI; they will hinder those samples from being drawn at zero mean.

It is strange that you could not reproduce this. I will double-check my commit and test at 0.5 m/s max speed within a few days. Thanks for checking this out.

@SteveMacenski (Collaborator) commented:

I tried with 0.26 m/s too and wasn't able to get to full speed with your branch either. That's why I thought maybe there were changes not reflected here from your latest experiment in the video, where things appear to get to full speed quickly.

Got it! Thanks for the details! That all makes sense and is good to know.

@tkkim-robot (Author) commented:

Over the last couple of days, I tried to achieve 0.5 m/s full speed with the SMPPI implementation. All the tests here were based on my last commit; I didn't change any code during these days.

The first thing I did yesterday was to naively increase the max speed from 0.26 m/s to 0.5 m/s. I found that lambda = 1.0 is not a valid option for this. I mentioned above that lambda = 1.0 is a good parameter for 0.26 m/s, but I eventually found that this was wrong. I set lambda = 0.1 and it works well for both 0.26 m/s and 0.5 m/s; lambda = 0.1 even works better than lambda = 1.0 at 0.26 m/s. I was biased toward thinking lambda should be larger than 0.1, and it took me a long time to figure out that lambda was the key parameter for achieving full speed. Anyway, this is the video I recorded yesterday, and you can see it almost reaches full speed, around 0.45 m/s.

Secondly, I tried to make it reach full speed consistently, rather than oscillating around 90% ~ 95% of max speed. As you might know, with sampling-based optimization an exact 100% of full speed is not achievable. This is because some trajectories have slower speeds and will inevitably get some weight: they are bad in terms of speed, but they are not lethal, so they still influence the optimized trajectory when they pass through the softmax function (see the small example below).
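To make that concrete (the numbers are purely illustrative): the optimized control is the softmax-weighted average of the samples,

$$u^{*}_t = \sum_i w_i\, u^{i}_t, \qquad w_i = \frac{\exp(-S_i/\lambda)}{\sum_j \exp(-S_j/\lambda)},$$

so if, say, 90% of the weight lands on samples commanding 0.5 m/s and 10% on samples commanding 0.3 m/s, the averaged command is 0.9 · 0.5 + 0.1 · 0.3 = 0.48 m/s, i.e. only 96% of full speed.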

So I found a simple trick to achieve full speed: set a slightly higher max speed in the controller. I set 0.6 m/s here. Then I set the actual max speed (i.e., 0.5 m/s) in the velocity smoother. This guarantees that every action command is bounded to 0.5 m/s while allowing SMPPI to consider higher speeds, which helps it achieve full speed consistently even with sampling-based optimization. It is a tricky solution, but acceptable, because there is no theoretical violation and the commands sent to the robot never exceed the actual max speed. This is the video showing my latest results. The robot usually reaches full speed and there are few wobbly motions. Please check it out and let me know your thoughts.

BTW, I wonder if the reason you could not reproduce my results is the yaml settings. I'm attaching mine, for the controller_server and velocity smoother:

controller_server:
  ros__parameters:
    controller_frequency: 30.0
    FollowPath:
      plugin: "mppi::Controller"
      time_steps: 30
      model_dt: 0.1
      batch_size: 2000
      vx_std: 0.2
      vy_std: 0.2
      wz_std: 0.4
      vx_max: 0.6 #0.5
      vx_min: -0.35
      vy_max: 0.5
      wz_max: 1.3
      d_vx_max: 0.2
      d_vx_min: -0.2
      d_vy_max: 0.2
      d_wz_max: 0.4
      iteration_count: 1
      prune_distance: 1.7
      transform_tolerance: 0.1
      temperature: 0.1 # 0.15 # this is inverse_temperature
      motion_model: "DiffDrive"
      visualize: true
      TrajectoryVisualizer:
        trajectory_step: 5
        time_step: 3
      AckermannConstrains:
        min_turning_r: 0.2
      critics: ["ObstaclesCritic", "GoalCritic", "GoalAngleCritic", "PathAlignCritic", "PathFollowCritic", "PathAngleCritic", "PreferForwardCritic"]
      GoalCritic:
        enabled: true
        cost_power: 1
        cost_weight: 5.0
      GoalAngleCritic:
        enabled: true
        cost_power: 1
        cost_weight: 3.0
        threshold_to_consider_goal_angle: 0.35
      ObstaclesCritic:
        enabled: true
        cost_power: 2
        cost_weight: 1.65
        consider_footprint: false
        collision_cost: 2000.0
      PathAlignCritic:
        enabled: true
        cost_power: 1
        cost_weight: 2.0
        path_point_step: 2
        trajectory_point_step: 3
      PathFollowCritic:
        enabled: true
        cost_power: 1
        cost_weight: 3.0
        offset_from_furthest: 10
        max_path_ratio: 0.40
      PathAngleCritic:
        enabled: true
        cost_power: 1
        cost_weight: 2.0
        offset_from_furthest: 4
      PreferForwardCritic:
        enabled: true
        cost_power: 1
        cost_weight: 3.0
      # TwirlingCritic:
      #   twirling_cost_power: 1
      #   twirling_cost_weight: 10.0
velocity_smoother:
  ros__parameters:
    smoothing_frequency: 20.0
    scale_velocities: False
    feedback: "OPEN_LOOP"
    max_velocity: [0.5, 0.0, 1.0] 
    min_velocity: [-0.35, 0.0, -1.0] 
    max_accel: [2.5, 0.0, 3.2]
    max_decel: [-2.5, 0.0, -3.2]
    odom_topic: "odom"
    odom_duration: 0.1
    deadband_velocity: [0.0, 0.0, 0.0]
    velocity_timeout: 1.0

@SteveMacenski (Collaborator) commented Oct 10, 2022

I agree that this makes it faster, but it's also not stable; it's wobbly when path tracking, in a way that would be unacceptable for any practical application (i.e., unsmooth). Is that what the d_avx and d_awz terms would help resolve?

Removing evalControlCost from this branch, it's still not stable, which leaves just the action stuff again (though in #99 I mention that the control cost does actually help make things a bit more stable, which I also see here when it's removed - the action stuff seems to be causing the instability; the control cost helps mask it).

Perhaps it's also just that the entire system needs retuning due to the new control cost function. As explored in #99, the squared-std control cost term basically requires retuning all of the penalty functions from scratch. So maybe I need to retune the system in accordance with the changes in #99 and then circle back here to see whether those weights make this more stable, and whether SMPPI then makes things smoother. But it seems concerning that this is much worse with the same gains than what we see in #99 or the current behavior with the current weights. It doesn't seem to me like retuning the system for the Sigma control cost term will fix the issues in the action part of these branches. But that's my intuition, not fact; for all I know, it will.

But maybe retuning the d_constraints would; I'm not sure. Something about the action stuff isn't quite right and is itself causing more unsmoothness than existed before. Is it that 0.2 is unreasonably large? I could believe that: running with model_dt = 0.1 as we do now, it would mean that in a given $\Delta t$ across a trajectory we can change up to 0.2 m/s (at most, though typically much less under the normal distribution) in a 0.1 s timestep. If I drop all of the d_ constraints by half (arbitrarily), it seems more reasonable (though I'm not sure it's "more" smooth than before any of these changes, it's definitely on the right track). Is that part of the key of this technique that maybe wasn't described in the paper - the constraints on the control space being used to limit the changes? Is there something bad that happens if we overconstrain these too small? Originally none of this worked at all until the d_constraint was added, which makes it somewhat palatable now, so maybe we just didn't restrict it enough?

So I found a simple trick to achieve full speed: set a slightly higher max speed in the controller. I set 0.6 m/s here.

If I use exactly your parameters with 0.6 it "seems" more stable because the velocity smoother is quashing the noise coming out of the controller. If you use the same parameters but go back down to 0.5 m/s for the max, it gets wobbly again. Do you not see that behavior? The only other source of error could be that you haven't pulled dev recently and are missing new commits. I'm looking at values from the /cmd_vel_nav topic, which is the output of the controller before the velocity smoother is applied. The velocity smoother can remove small issues but shouldn't be relied upon to fix an unstable controller algorithm.

Either way, that's a pretty hacky solution. But that's also not really the problem we're trying to solve right now; if it's ~0.45 m/s of a 0.5 m/s requested velocity, that's actually fine for now. If we can do 90% speed smoothly and stably, I'd be pretty happy with that. I'm hoping that making it smoother via this method will let us drop the temperature or gamma terms so the costs count for more (or the $\Omega$ term, when applied, would push velocities higher, no? If that can get us 3-5% more, I'd call it done).

@tkkim-robot (Author) commented:

Those are really sharp observations. Let me answer them one by one.

If you use the same parameters but go back down to 0.5 m/s for the max, it gets wobbly again. Do you not see that behavior?

I understand your point that naively achieving the target velocity is not what we are trying to solve here, so please ignore the related section of my latest comment. I knew it was a tricky solution, so we can simply drop that option.

Removing evalControlCost from this branch, it's still not stable, which leaves just the action stuff again (though in #99 I mention that the control cost does actually help make things a bit more stable, which I also see here when it's removed - the action stuff seems to be causing the instability; the control cost helps mask it).

MPPI is a powerful method for addressing nonlinear and non-convex tasks. The beauty of MPPI, I believe, is that two different theoretical derivations arrive at the same final form: 1) stochastic optimal control, and 2) information theory. Given that theoretical integrity, I believe evalControlCost() with the squared inverse std should be applied here, and my tests were all done on top of it. I might be wrong, but I still think this way; removing evalControlCost() had not been an option for me.

Is that part of the key of this technique that maybe wasn't described in the paper - the constraints on the control space being used to limit the changes? Is there something bad that happens if we overconstrain these too small?

Applying d_constraints is an important technique in SMPPI, but the performance is not heavily dependent on them. I usually tune these only to within a factor of two and don't tune them heavily. The only thing we have to do is set them larger than the robot's acceleration system parameters. If they are too small, it becomes harder to avoid dynamic obstacles (and this is the only disadvantage). So I usually set them with a large margin.

If you use the same parameters but back down to 0.5m/s for the max, it gets wobbly again. Do you not see that behavior?

This is the video I recorded last time (the first video of my previous comment): https://youtu.be/UnhKPjepPBg. It was tested with my branch and without the tricky part (i.e., I set 0.5 m/s as the max velocity). The velocity in the recorded video was from cmd_vel, not cmd_vel_nav, though. But are the motions in the video wobbly? If so, I have to revisit my code and look for improvements. I noticed that I had not pulled the latest commit (#95) on the dev branch. I will pull it and test again to see if I can reproduce the same results as my video.

Lastly, I think it is better for us to solve the problem in MPPI first, then pull those improvements into this branch and continue with the SMPPI stuff later. Like you said, one thing at a time!

@SteveMacenski (Collaborator) commented:

So removing evalControlCost() had not been an option for me.

Agreed, it should be there. I just removed it for testing purposes to try to isolate changes to see what effect was coming from where.

Applying d_constraints is an important technique in SMPPI.

Got it. So why aren't we seeing something smoother, like you show in Figure 6 of your paper, where it converges quite quickly / smoothly whereas MPPI / SGF-MPPI rattle around a bit? From looking around, usual acceleration params for robots are between 2-4 m/s^2, so values of 0.2-0.4 over a 0.1 model_dt are in line with that; I agree those parameters are generally correct in magnitude. I thought they were too high, because dropping them makes it smoother - but that also lowers the acceleration below that of the robot platform itself.

Maybe the $\Omega$ term we commented out (or set weights to 0 on) would help? It should help smooth things out within the trajectories themselves, no?

I'm also wondering now if one of the critics is causing the wobbliness, since if we set a high temperature then we smooth out a lot, which would mask it when we drop lambda down. Most are pretty smooth, so the only ones I can think of that would cause wobbling are the path follow/align/angle critics. I'll try removing them in a more trivial path-tracking environment and see if that helps. That would leave only the goal and obstacle critics (and prefer forward, which shouldn't produce any kind of wobbly behavior). That should absolutely be smooth, so if we still see wobble, it has to come from somewhere else (the noise distribution, the balancing of critics against the low-energy/smoothness term, or the SMPPI action stuff).

The other thing is that our dynamics model is just a pass-through (noised velocity in is velocity out), so maybe we should be applying acceleration constraints there? But again, without the action stuff, that seems to work "fine".

@SteveMacenski (Collaborator) commented Oct 11, 2022

OK, so I took a look at the current dev behavior and the behavior of the other PR against dev with gamma, then started from the basic critic functions and added them one at a time, watching the rqt_plot of the velocities from the controller before velocity smoothing.

Goal critic, goal angle critic, prefer forward, path angle, and path follow together make for pretty smooth, non-jerky behavior in both the X and theta axes.

When adding in Obstacles or Path Align, that's when things get a little more jerky.

Obstacles / Path Align

I think what's happening here is that there's a fight between the Obstacles/PathAlign critics and the PathFollow/Goal critics: the path follow critic wants to drive towards a point on the path just beyond the trajectories being scored (e.g. we find the last point on the trajectories, correlate it to the nearest path point, then add some N points after that), while the obstacles critic scores that negatively because it incentivizes motion that could lead to a collision in a turn. The path align critic helps a bit because it incentivizes trajectories closer to the original path, which are presumably collision-free. Something to keep in mind while tuning, but probably not the root of the main noise.

However, there's just a lot of noise in the calculations of Obstacles and Path Align. When I include either, the rotational velocity in rqt_plot becomes much less smooth even on straightaways, where there shouldn't be much in-fighting.

Obstacles

One thing I tried was using the integrated cost of the trajectory rather than the single highest point cost (roughly the change sketched below). That seems to make things a bit better. It's still jerky, but on a different scale (spikes in the ~0.15 rad/s range rather than ~0.35 rad/s). Probably passable for now - adding to #99.
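Roughly this kind of change (names assumed, following the xtensor style used elsewhere in this thread):

// Before: penalize only the single worst point along each trajectory.
// costs_ += xt::amax(trajectory_costs, {1}, immediate) * weight;
// After: integrate the obstacle cost along the whole trajectory instead.
costs_ += xt::sum(trajectory_costs, {1}, immediate) * weight;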

Path Align

This is definitely the highest contributor of noise, by far. Without it, things are pretty acceptable(ish). This term is what is causing the appreciable and noticeable wobble in the end behavior, @artofnothingness. Any thoughts on how to resolve that? I tried going back to full resolution on the path / trajectory markers (i.e. steps = 1) but see the same issue.

What I think is happening is that the very discrete, unaligned, and not-similarly-distributed nature of the path point markers and the trajectory point markers makes this comparison not great. Maybe instead of discrete points we should use the line segments that make up the path / trajectories and integrate the area between them more directly. I haven't thought too deeply about the technical difficulty of that, but I think it would reduce the "jitter" in that function. There must be a method for computing the area between 2 sets of contiguous line segments - someone in graphics must need to do that. If not, maybe fit a spline to the path locally and use that as a continuous function to integrate against.

Or maybe there's a more direct point-set comparison distance metric we can come up with that is less jittery (or we use this same one but try to smooth it out a bit?). One candidate is sketched below.
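As a sketch only (plain C++, all names hypothetical): score each trajectory point against the nearest path segment instead of the nearest discrete path point, so the metric varies continuously as the trajectory slides along the path:

#include <algorithm>
#include <cmath>

struct Pt { double x, y; };

// Distance from p to the segment ab (its interior, not just the endpoints).
double distToSegment(const Pt & p, const Pt & a, const Pt & b)
{
  const double dx = b.x - a.x, dy = b.y - a.y;
  const double len2 = dx * dx + dy * dy;
  if (len2 == 0.0) {return std::hypot(p.x - a.x, p.y - a.y);}  // degenerate segment
  // Project p onto the segment and clamp the projection to the segment's extent.
  const double t = std::clamp(((p.x - a.x) * dx + (p.y - a.y) * dy) / len2, 0.0, 1.0);
  return std::hypot(p.x - (a.x + t * dx), p.y - (a.y + t * dy));
}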

Edit: Sorry, I hate doing this, but there's a lot of cross-pollination between these 2 PRs; it's basically 1 huge topic now. I'm going to focus on adding the smoothness/low-energy term in the other PR first to get our foundations in order. If we can reduce jitter and tune the system over there, hopefully applying this over here will be more straightforward.

Quadratic Critics

I started on this (I didn't have time to get far; the critic analysis above took the majority of the time I had today) and found that, yes, if I increase the orders to quadratic, it makes sense to push temperature to 0.7 (from 0.15) and gamma to 0.05 (from 0.015). I think your paper and the MPPI paper had values like 10 for lambda because your main critic for desired speed produces large values for motion in the km/h range, whereas we're talking about 0.5 m/s, so our critic costs are orders of magnitude smaller. But the trend holds: higher critic values = higher lambdas (see the identity below).
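One way to see why: scaling every cost by a factor $k$ is exactly undone by scaling $\lambda$ by the same factor,

$$\exp\!\left(-\frac{k\,S_i}{k\,\lambda}\right) = \exp\!\left(-\frac{S_i}{\lambda}\right),$$

so quadratic critics, which inflate the cost scale, need a proportionally larger temperature to produce the same weighting.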

@tkkim-robot (Author) commented:

Thanks for your experiments. It makes sense that the parameters' magnitudes depend on the critic functions. The first thing to resolve now seems to be the critic functions. I will come back here once #99 is resolved.
