Functional system redesign #57
|
Whoah. I need to take some time to digest this, but I admire your commitment ;) Looks like a nice idea, though - making the crate cooperate with the compiler optimizations better will certainly be a plus. |
|
Oh, I did forget to mention two things:
Also, |
|
I think I need to take some extra time to optimize the case where both |
|
This is done via the only kind of specialization Rust currently supports, which is a default implementation of a hidden function on The added hidden function, However, when the This is in addition to the default implementation of EDIT: After I posted this I realized there is still that one mix of arguments that cannot be optimized, even with the new technique. Rust's lack of real specialization makes it even more difficult. The only one that isn't optimized is I'll try to find a way to do that one, too. |
Added missing MappedGenericSequence implementation for &mut S
|
So it turns out that I could use that exact same trick to specialize the default implementation of
and their mutable equivalents is correctly optimized. No compromises for real this time, I hope. |
|
Is there anything else you'd like me to add? Perhaps a EDIT: It was so simple I went ahead and added it anyway. I can't think of anything else to add or improve, but if you have any thoughts I'd love to hear them! |
|
This is awesome. I don't have any other ideas, except maybe |
|
Yeah, those wouldn’t be possible because of the compile time length. However, if it sounds like a good idea to you, after this is merged I can open a new PR to add standard library support with a cargo feature. Or rather, disable it with a Then we can have |
|
Alright, this is ready to merge. I also have the |
|
Awesome! I'll merge it. Regarding the Anyway, we can discuss it separately - for now, thanks a lot for this PR, it is a piece of awesome work :) |
|
So while browsing the auto-generated documentation I noticed that So either you can do it or I can open a new pull request, but Sorry about that. |
|
No worries, I'll add this. Nice catch! |
This explanation is going to be a bit wild.
Basically, my entire motivation for this and the previous work with zip, map and so forth was to organize safe operations in a way conducive to compiler optimizations, specifically auto-vectorization.
Unfortunately, this only seems to work on slices with a known size at compile-time, I guess because they are an intrinsic type. Any and all attempts to get a custom iterator to optimize like that have failed, even with unstable features.
Even though they technically worked, I wasn't happy with how the previous work did functional operations, with map/map_ref and zip/zip_ref. It felt a bit unintuitive. They were also strictly attached to the GenericArray type, so they were useless with GenericSequence by itself, like with generics.
So, I've redefined GenericSequence like this:
where Sequence is defined as Self for the GenericArray implementation.
That may seem redundant, but now GenericSequence is broadly implemented for &'a S and &'a mut S, and carries over the same Sequence.
So:
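A minimal, self-contained sketch of that arrangement follows. It is not the crate's code: Arr (a const-generic stand-in for GenericArray), Seq (standing in for GenericSequence), the LEN constant and collect_like are all hypothetical names chosen for illustration, and the real trait uses a type-level length rather than a const generic.

```rust
use std::iter::FromIterator;

/// Hypothetical stand-in for `GenericArray<T, N>`; a const generic length
/// is used only to keep the sketch dependency-free.
#[derive(Debug, PartialEq)]
pub struct Arr<T, const N: usize>(pub [T; N]);

/// Sketch of the trait shape: every sequence names an owned `Sequence`
/// type that can be rebuilt from an iterator.
pub trait Seq<T>: Sized + IntoIterator {
    const LEN: usize;
    type Sequence: Seq<T> + FromIterator<T>;
}

// For the array itself, `Sequence` is simply `Self`.
impl<T, const N: usize> Seq<T> for Arr<T, N> {
    const LEN: usize = N;
    type Sequence = Self;
}

// Shared references carry over the `Sequence` of the type they point at.
impl<'a, T, S: Seq<T>> Seq<T> for &'a S
where
    &'a S: IntoIterator,
{
    const LEN: usize = S::LEN;
    type Sequence = S::Sequence;
}

// Mutable references do the same.
impl<'a, T, S: Seq<T>> Seq<T> for &'a mut S
where
    &'a mut S: IntoIterator,
{
    const LEN: usize = S::LEN;
    type Sequence = S::Sequence;
}

impl<T, const N: usize> IntoIterator for Arr<T, N> {
    type Item = T;
    type IntoIter = std::array::IntoIter<T, N>;
    fn into_iter(self) -> Self::IntoIter {
        IntoIterator::into_iter(self.0)
    }
}

// The reference impls go through plain slice iterators.
impl<'a, T, const N: usize> IntoIterator for &'a Arr<T, N> {
    type Item = &'a T;
    type IntoIter = std::slice::Iter<'a, T>;
    fn into_iter(self) -> Self::IntoIter {
        self.0.iter()
    }
}

impl<'a, T, const N: usize> IntoIterator for &'a mut Arr<T, N> {
    type Item = &'a mut T;
    type IntoIter = std::slice::IterMut<'a, T>;
    fn into_iter(self) -> Self::IntoIter {
        self.0.iter_mut()
    }
}

impl<T, const N: usize> FromIterator<T> for Arr<T, N> {
    fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self {
        let mut it = iter.into_iter();
        // Panics if the iterator runs short instead of padding with defaults.
        Arr(std::array::from_fn(|_| {
            it.next().expect("iterator produced too few elements")
        }))
    }
}

/// Builds the owned sequence named by `S`, whether `S` is owned or a reference.
pub fn collect_like<T, S: Seq<T>>(_shape: S, items: impl IntoIterator<Item = T>) -> S::Sequence {
    items.into_iter().collect()
}
```

In this sketch, calling collect_like(&a, ...) with a of type Arr<i32, 3> yields an owned Arr<i32, 3>, because the reference impl forwards Sequence to the underlying type.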
Furthermore, IntoIterator is now implemented for &'a GenericArray<T, N> and &'a mut GenericArray<T, N>, where both of those implementations use slice iterators, and each reference type automatically implements GenericSequence<T>.
Next, I've added a new trait called MappedGenericSequence, which looks like:
and the implementation of that for GenericArray is just:
As you can see, it just defines another arbitrary GenericArray with the same length. The transformation allows for proving one GenericArray can be created from another, which leads into the FunctionalSequence trait.
You can see the default implementation for it in src/functional.rs, which uses the fact that any GenericSequence is IntoIterator and the associated Sequence is FromIterator to map/zip sequences using only simple iterators.
FunctionalSequence is also automatically implemented for &'a S and &'a mut S where S: GenericSequence<T>, so they automatically work with &GenericArray as well.
Furthermore, it's implemented directly on GenericArray as well, which uses the ArrayConsumer system to provide a lightweight and optimizable implementation, rather than relying on GenericArrayIter, which cannot be optimized.
As a result, code like in the assembly test:
will correctly be optimized into a single VPADDD instruction, just as desired.
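As a rough illustration of the kind of code this refers to, here is a dependency-free sketch that zips two fixed-size arrays by reference and adds them element-wise. It uses plain [i32; N] arrays and the hypothetical helpers zip_with and add_i32x4 rather than the crate's actual types and methods. Whether it compiles down to a single vector add depends on the compiler version, target features and optimization level.

```rust
/// Hypothetical helper: combines two same-length arrays, taken by
/// reference, element by element. The shared `N` is checked at compile
/// time, so there is no runtime length comparison.
pub fn zip_with<T, U, V, F, const N: usize>(a: &[T; N], b: &[U; N], mut f: F) -> [V; N]
where
    F: FnMut(&T, &U) -> V,
{
    std::array::from_fn(|i| f(&a[i], &b[i]))
}

/// Element-wise addition of two 4-lane arrays; wrapping arithmetic keeps
/// the loop body free of overflow panics.
pub fn add_i32x4(a: &[i32; 4], b: &[i32; 4]) -> [i32; 4] {
    zip_with(a, b, |l, r| l.wrapping_add(*r))
}
```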
The downside of this is that non-reference RHS arguments will kill this optimization, because it will use .into_iter() and GenericArrayIter. There really isn't a good way around this currently. EDIT: I found a good way around this.
The upside of all of this is that passing any random GenericSequence without knowing the length is finally feasible, as shown in tests/std.rs, and here:
Which still has zero runtime length checking, but we've avoided having to know the length of the sequence. Furthermore, now test_generic can work for GenericArray, &GenericArray and &mut GenericArray with no problems.
BREAKING CHANGES:
- FromIterator for GenericArray now panics when the given iterator doesn't produce enough elements, whereas before it padded the result with defaults.
- map_ref and zip_ref are gone, replaced with the new functional system.
- Fixed: map/zip can fail to optimize unless used with references. vec::IntoIter in the worst case.
What do you think? Perhaps I should write up some examples for the docs, too?
If I failed to explain anything, made a mistake or could improve on anything, please let me know. I just want to make the best things I can.