Optimize tensors further #88

Merged: 15 commits merged from sep_ten_traj into develop on Jul 25, 2022
Conversation

artofnothingness (Owner)

No description provided.

SteveMacenski (Collaborator) commented Jul 21, 2022

Well, I don't see that this helps much with the jitter, but it did produce another big drop in CPU, which will correspondingly reduce the jitter's impact. I see this running well under 10 ms now (we started the week at 20 ms!), with maximum peaks around 25 ms, which is below the 30 Hz rate. 6-11 ms is typical.

I'd say it's worth the time to finish up this PR. The general performance is more stable now and the average is slightly lower.

SteveMacenski (Collaborator) commented Jul 21, 2022

I cleaned things up and handled the merge conflicts. The remaining work is the visualization functions getOptimizedTrajectory and getGeneratedTrajectories, which need updating for the new Trajectories wrapped tensors. Could you work your xtensor magic? You can merge this once that's added; I've tested everything else and am very happy with the stabilization improvement this adds.

SteveMacenski (Collaborator) commented Jul 21, 2022

After this change, I re-benchmarked where the jitter is coming from, and PathAlign is now the biggest contributor. The trajectory generation process spikes from 2-3 ms to 5-6 ms (2-3 ms deltas are hardly worth doing much about; I doubt we even could). PathAlign can go from a 4-6 ms steady state up to 11-12 ms.

I think if we can improve PathAlign, that would be the point where we can/should stop hunting for optimizations and pivot to #79/#80 to complete the first iteration of this software.

By the way, I'm going to be on vacation from July 24 - Aug 17 so you won't hear from me much over the next couple of weeks while I'm traveling. But this remains a priority for me! I'd like to close out the optimization ticket before I leave, but I fear I will not get the Path Align optimizations done tomorrow 😢

SteveMacenski (Collaborator) commented Jul 22, 2022

I tested TBB parallel for loops for path align; it doesn't appear to be any faster, and the jitter still has the same consistent range. I also attempted to use xtensor's xaxis_iterator, but after about an hour I couldn't get it to compile with the types involved. It might be worth a shot on your end, since iterators compile down to xstrides, which may be more optimized than manually looping over the data.
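
For reference, this is roughly the shape of the TBB experiment, reconstructed from memory with illustrative names and tensor shapes (scorePathAlign and the layouts here are stand-ins, not the actual critic code):

#include <tbb/blocked_range.h>
#include <tbb/parallel_for.h>
#include <xtensor/xtensor.hpp>
#include <algorithm>
#include <cmath>
#include <limits>

// Score each trajectory by its mean distance to the nearest path point,
// parallelized over the batch dimension with TBB.
void scorePathAlign(
  const xt::xtensor<double, 3> & trajectories,  // [batch, time, 3]
  const xt::xtensor<double, 2> & path,          // [points, 3]
  xt::xtensor<double, 1> & costs)               // [batch]
{
  tbb::parallel_for(
    tbb::blocked_range<size_t>(0, trajectories.shape(0)),
    [&](const tbb::blocked_range<size_t> & range) {
      for (size_t b = range.begin(); b != range.end(); ++b) {
        double sum = 0.0;
        for (size_t t = 0; t < trajectories.shape(1); ++t) {
          // The O(path points) inner search per trajectory point is the
          // cost the potential field / spline ideas below try to remove.
          double min_d2 = std::numeric_limits<double>::max();
          for (size_t p = 0; p < path.shape(0); ++p) {
            const double dx = trajectories(b, t, 0) - path(p, 0);
            const double dy = trajectories(b, t, 1) - path(p, 1);
            min_d2 = std::min(min_d2, dx * dx + dy * dy);
          }
          sum += std::sqrt(min_d2);
        }
        costs(b) = sum / static_cast<double>(trajectories.shape(1));
      }
    });
}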

Besides that, we could try to reduce the number of steps, but otherwise I think we're reaching the end of what we can do with the critic as currently implemented; it might be worth rethinking the implementation to see if there's a lighter-weight alternative.

Perhaps if we established a distance potential field around the path, we could look up the distance from the path for each trajectory point without iterating through the path points and computing distances each time. The discrete nature of a grid potential field would reduce the accuracy of the exact alignment, however. A rough sketch of the idea follows.
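
Something like this, assuming a known grid resolution and origin (all names are illustrative; the field fill is brute force for clarity, where a real version would use a wavefront/BFS pass). Each trajectory point then becomes an O(1) lookup instead of an O(path) search:

#include <xtensor/xtensor.hpp>
#include <algorithm>
#include <cmath>
#include <limits>

// Precompute the distance to the nearest path point for every grid cell.
xt::xtensor<float, 2> buildDistanceField(
  const xt::xtensor<double, 2> & path,  // [points, 2] in world coordinates
  size_t cells_x, size_t cells_y,
  double resolution, double origin_x, double origin_y)
{
  xt::xtensor<float, 2> field({cells_x, cells_y});
  for (size_t i = 0; i < cells_x; ++i) {
    for (size_t j = 0; j < cells_y; ++j) {
      const double wx = origin_x + (i + 0.5) * resolution;
      const double wy = origin_y + (j + 0.5) * resolution;
      float min_d = std::numeric_limits<float>::max();
      for (size_t p = 0; p < path.shape(0); ++p) {
        const double dx = wx - path(p, 0);
        const double dy = wy - path(p, 1);
        min_d = std::min(min_d, static_cast<float>(std::hypot(dx, dy)));
      }
      field(i, j) = min_d;
    }
  }
  return field;
}

// Lookup: quantizing to the grid is what costs exact-alignment accuracy.
inline float distanceToPath(
  const xt::xtensor<float, 2> & field,
  double x, double y, double resolution, double origin_x, double origin_y)
{
  const auto i = static_cast<size_t>((x - origin_x) / resolution);
  const auto j = static_cast<size_t>((y - origin_y) / resolution);
  return field(i, j);
}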

We could also try to fit a spline to the path in a localized region to have an analytical equation to work with. That would buy us some simplification of the math if the spline's order made the closest-point problem analytically differentiable (or at least Wolfram Alpha-able). A concrete sketch of what I mean follows.
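
Purely exploratory, under strong assumptions (nothing here exists in the codebase): fit a local quadratic Bezier through three consecutive path points, then run a few Newton steps on the derivative of the squared distance to find the closest point on the curve:

#include <algorithm>
#include <array>
#include <cmath>

struct Pt { double x, y; };

// Quadratic Bezier with control points c[0], c[1], c[2].
inline Pt bezier(const std::array<Pt, 3> & c, double t)
{
  const double u = 1.0 - t;
  return {u * u * c[0].x + 2 * u * t * c[1].x + t * t * c[2].x,
          u * u * c[0].y + 2 * u * t * c[1].y + t * t * c[2].y};
}

// First derivative B'(t) = 2(1-t)(c1-c0) + 2t(c2-c1).
inline Pt bezierDeriv(const std::array<Pt, 3> & c, double t)
{
  return {2 * ((1 - t) * (c[1].x - c[0].x) + t * (c[2].x - c[1].x)),
          2 * ((1 - t) * (c[1].y - c[0].y) + t * (c[2].y - c[1].y))};
}

// Newton iterations on f(t) = (B(t) - q) . B'(t), the half-derivative of
// the squared distance from query point q to the curve.
double distanceToLocalSpline(const std::array<Pt, 3> & c, Pt q)
{
  double t = 0.5;
  // The second derivative of a quadratic Bezier is constant.
  const Pt dd{2 * (c[2].x - 2 * c[1].x + c[0].x),
              2 * (c[2].y - 2 * c[1].y + c[0].y)};
  for (int i = 0; i < 4; ++i) {
    const Pt b = bezier(c, t);
    const Pt d = bezierDeriv(c, t);
    const double f = (b.x - q.x) * d.x + (b.y - q.y) * d.y;
    const double fp = d.x * d.x + d.y * d.y + (b.x - q.x) * dd.x + (b.y - q.y) * dd.y;
    if (std::abs(fp) < 1e-9) break;
    t = std::clamp(t - f / fp, 0.0, 1.0);
  }
  const Pt b = bezier(c, t);
  return std::hypot(b.x - q.x, b.y - q.y);
}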

But I'll have a few hours tomorrow before I leave to think about this more and see if there are solutions with what we have.

SteveMacenski (Collaborator) commented Jul 22, 2022

I'm quite surprised by the degree to which increasing the point steps for both trajectories and path points fails to impact the critic's performance. Making them larger than is reasonable (5 and 10, respectively) reduces the time by only 0.5 ms, and the peaks still hit 11 ms. No amount of tuning those numbers higher than currently set will meaningfully improve performance (though lower values obviously cost a lot more compute).

I also removed the contents of the loops so that they iterate emptily, and I can still see spikes upwards of 10 ms. The CPU time drops by about half, to ~2 ms steady state, so half of the compute time isn't even in the main loop.
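
For reference, the numbers above are simple wall-clock measurements around the critic; a minimal sketch of the harness (critic.score(data) is a stand-in, not the real API):

#include <chrono>
#include <cstdio>

// Time a callable and return the elapsed milliseconds.
template <typename Fn>
double timeMs(Fn && fn)
{
  const auto start = std::chrono::steady_clock::now();
  fn();
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(end - start).count();
}

// Usage (hypothetical call):
// printf("PathAlign: %.2f ms\n", timeMs([&] { critic.score(data); }));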

Given those three pieces of evidence (multithreading the loops doesn't help, reducing the resolution of the loops doesn't help, and removing the loop contents doesn't help), I don't think the expense of this critic actually comes from the core logic itself, which is interesting. Or perhaps it's the lambdas.

artofnothingness (Owner, Author)

This now works with 35 time steps and a batch size of 2000 on my 4th gen notebook, within the 20 Hz rate.

artofnothingness merged commit 4b51d5f into develop on Jul 25, 2022
artofnothingness deleted the sep_ten_traj branch on Jul 25, 2022 at 19:58
artofnothingness changed the title from "Sep ten traj" to "Optimize tensors further" on Jul 25, 2022
SteveMacenski (Collaborator) left a comment

Back from PTO. This pretty much all looks great to me, but I think I found one error and wanted to ask one question that might also be a problem (or at least is good to clarify).

xt::view(yaw_offseted, xt::all(), xt::range(1, _)) =
  xt::view(yaw, xt::all(), xt::range(_, -1));
xt::view(yaw_offseted, xt::all(), 0) = initial_yaw;
traj_yaws = xt::cumsum(utils::normalize_angles(wz) * settings_.model_dt, 0) + initial_yaw;
SteveMacenski (Collaborator):

I believe this needs to be utils::normalize_angles(wz * settings_.model_dt + initial_yaw) so it can be normalized as angles after the time & offsets are applied
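
To spell out the ordering I mean (a sketch, not a drop-in patch; where the cumsum lands depends on the rest of the function):

// Integrate the angular velocities first, then wrap the result as angles,
// rather than wrapping the raw rates before integrating.
auto integrated = xt::cumsum(wz * settings_.model_dt, 0) + initial_yaw;
traj_yaws = utils::normalize_angles(integrated);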

SteveMacenski (Collaborator):

The other implementation of this function actually does this properly.

artofnothingness (Owner, Author):

yeah that's a mistake

SteveMacenski (Collaborator) commented

@artofnothingness I did some runtime testing on the current state of develop. I never see anything > 20 ms anymore! Really great work on finishing this up 🥇 That puts us at 50 Hz with the settings I've been playing with.
