Functional system redesign #57
Whoa. I need to take some time to digest this, but I admire your commitment ;) Looks like a nice idea, though - making the crate cooperate with the compiler optimizations better will certainly be a plus.
Oh, I did forget to mention two things:
Also,
I think I need to take some extra time to optimize the case where both
This is done via the only kind of specialization Rust currently supports, which is a default implementation of a hidden function on The added hidden function, However, when the This is in addition to the default implementation of
EDIT: After I posted this I realized there is still that one mix of arguments that cannot be optimized, even with the new technique. Rust's lack of real specialization makes it even more difficult. The only one that isn't optimized is I'll try to find a way to do that one, too.
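The "default implementation of a hidden function" trick mentioned above can be sketched in isolation. This is a minimal, std-only illustration and not the crate's code: the trait `Plus` and the methods `plus` and `plus_impl` are hypothetical names invented for this example. The public method forwards to a `#[doc(hidden)]` hook whose default body works for every implementor, and a concrete type overrides only the hook when it has a cheaper path.

```rust
/// Hypothetical trait for illustration; none of these names come from the crate.
pub trait Plus: Sized + IntoIterator<Item = i32> {
    /// Public entry point; it only forwards to the hidden hook.
    fn plus(self, rhs: &[i32]) -> Vec<i32> {
        self.plus_impl(rhs)
    }

    /// Hidden hook with a generic default that consumes `self` by value.
    #[doc(hidden)]
    fn plus_impl(self, rhs: &[i32]) -> Vec<i32> {
        self.into_iter().zip(rhs).map(|(a, b)| a + b).collect()
    }
}

/// `Vec<i32>` keeps the generic default.
impl Plus for Vec<i32> {}

/// `[i32; 4]` overrides the hook and walks both sides with slice iterators.
impl Plus for [i32; 4] {
    fn plus_impl(self, rhs: &[i32]) -> Vec<i32> {
        self.iter().zip(rhs).map(|(a, b)| a + b).collect()
    }
}
```

Callers only ever see `plus`; which body runs is decided per implementing type at compile time, which is about as close to specialization as stable Rust gets.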
Added missing MappedGenericSequence implementation for &mut S
So it turns out that I could use that exact same trick to specialize the default implementation of
and their mutable equivalents is correctly optimized. No compromises for real this time, I hope.
Is there anything else you'd like me to add? Perhaps a EDIT: It was so simple I went ahead and added it anyway. I can't think of anything else to add or improve, but if you have any thoughts I'd love to hear them!
This is awesome. I don't have any other ideas, except maybe
Yeah, those wouldn’t be possible because of the compile time length. However, if it sounds like a good idea to you, after this is merged I can open a new PR to add standard library support with a cargo feature. Or rather, disable it with a Then we can have
Alright, this is ready to merge. I also have the
Awesome! I'll merge it. Regarding the Anyway, we can discuss it separately - for now, thanks a lot for this PR, it is a piece of awesome work :)
So while browsing the auto-generated documentation I noticed that So either you can do it or I can open a new pull request, but Sorry about that.
No worries, I'll add this. Nice catch!
This explanation is going to be a bit wild.
Basically, my entire motivation for this and the previous work with `zip`, `map` and so forth was to organize safe operations in a way conducive to compiler optimizations, specifically auto-vectorization.
Unfortunately, this only seems to work on slices with a known size at compile-time. I guess because they are an intrinsic type. Any and all attempts to get a custom iterator to optimize like that have failed, even with unstable features.
Even though they technically worked, I wasn't happy with how the previous work did functional operations, with `map`/`map_ref` and `zip`/`zip_ref`. It felt a bit unintuitive. They were also strictly attached to the `GenericArray` type, so they were useless with `GenericSequence` by itself, like with generics.
So, I've redefined `GenericSequence` like this:
where `Sequence` is defined as `Self` for the `GenericArray` implementation.
That may seem redundant, but now `GenericSequence` is broadly implemented for `&'a S` and `&'a mut S`, and carries over the same `Sequence`.
So:
Furthermore, `IntoIterator` is now implemented for `&'a GenericArray<T, N>` and `&'a mut GenericArray<T, N>`, where both of those implementations use slice iterators, and each reference type automatically implements `GenericSequence<T>`.
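The shape of the redesign described above (an associated `Sequence` that is `Self` for the owned array and is carried over by `&'a S` and `&'a mut S`) can be sketched roughly as follows. This is a simplified, std-only sketch under stated assumptions, not the crate's definition: `Arr` is a hypothetical const-generic stand-in for `GenericArray<T, N>`, the length type is omitted entirely, and `same_sequence` and `cloned` are helper functions invented here to exercise the traits.

```rust
use std::iter::FromIterator;

/// Hypothetical const-generic stand-in for `GenericArray<T, N>`.
pub struct Arr<T, const N: usize>(pub [T; N]);

impl<T, const N: usize> FromIterator<T> for Arr<T, N> {
    fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self {
        let mut it = iter.into_iter();
        // Panics if the iterator runs out before `N` elements are produced.
        Arr(std::array::from_fn(|_| it.next().expect("too few elements")))
    }
}

impl<T, const N: usize> IntoIterator for Arr<T, N> {
    type Item = T;
    type IntoIter = std::array::IntoIter<T, N>;
    fn into_iter(self) -> Self::IntoIter {
        IntoIterator::into_iter(self.0)
    }
}

// Both reference types iterate with plain slice iterators.
impl<'a, T, const N: usize> IntoIterator for &'a Arr<T, N> {
    type Item = &'a T;
    type IntoIter = std::slice::Iter<'a, T>;
    fn into_iter(self) -> Self::IntoIter {
        self.0.iter()
    }
}

impl<'a, T, const N: usize> IntoIterator for &'a mut Arr<T, N> {
    type Item = &'a mut T;
    type IntoIter = std::slice::IterMut<'a, T>;
    fn into_iter(self) -> Self::IntoIter {
        self.0.iter_mut()
    }
}

/// Simplified trait: every sequence names the owned sequence it corresponds to.
pub trait GenericSequence<T>: Sized + IntoIterator {
    type Sequence: GenericSequence<T> + FromIterator<T>;
}

/// For the owned array, `Sequence` is just `Self`.
impl<T, const N: usize> GenericSequence<T> for Arr<T, N> {
    type Sequence = Self;
}

/// References carry over the `Sequence` of the type they point to.
impl<'a, T, S: GenericSequence<T>> GenericSequence<T> for &'a S
where
    &'a S: IntoIterator,
{
    type Sequence = S::Sequence;
}

impl<'a, T, S: GenericSequence<T>> GenericSequence<T> for &'a mut S
where
    &'a mut S: IntoIterator,
{
    type Sequence = S::Sequence;
}

/// Compiles only when `A` and `B` name the same owned `Sequence`.
pub fn same_sequence<T, A, B>()
where
    A: GenericSequence<T>,
    B: GenericSequence<T, Sequence = A::Sequence>,
{
}

/// Rebuilds the owned sequence from any by-reference sequence.
pub fn cloned<'a, T, S>(seq: S) -> <S as GenericSequence<T>>::Sequence
where
    T: Clone + 'a,
    S: GenericSequence<T> + IntoIterator<Item = &'a T>,
{
    seq.into_iter().cloned().collect()
}
```

With this in place, `cloned(&a)` on an `Arr<i32, 4>` yields an `Arr<i32, 4>` again, because the reference implementation forwards `Sequence` to the owned type.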
Next, I've added a new trait called `MappedGenericSequence`, which looks like:
and the implementation of that for `GenericArray` is just:
As you can see, it just defines another arbitrary `GenericArray` with the same length. The transformation allows for proving one `GenericArray` can be created from another, which leads into the `FunctionalSequence` trait.
You can see the default implementation for it in `src/functional.rs`, which uses the fact that any `GenericSequence` is `IntoIterator` and the associated `Sequence` is `FromIterator` to map/zip sequences using only simple iterators.
`FunctionalSequence` is also automatically implemented for `&'a S` and `&'a mut S` where `S: GenericSequence<T>`, so they automatically work with `&GenericArray` as well.
Furthermore, it's implemented directly on `GenericArray` as well, which uses the `ArrayConsumer` system to provide a lightweight and optimizable implementation, rather than relying on `GenericArrayIter`, which cannot be optimized.
As a result, code like in the assembly test:
will correctly be optimized into a single VPADDD instruction, just as desired.
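To make the "mapped sequence" idea and the iterator-only default `map`/`zip` concrete, here is a minimal std-only sketch. It is not the contents of `src/functional.rs`: `Arr`, `Len`, `Seq`, `MappedSeq`, `FunctionalSeq` and `add` are hypothetical stand-ins, the traits are implemented only for `&Arr`, and whether a given compiler vectorizes `add` depends on the target and flags and has not been verified here.

```rust
use std::iter::FromIterator;

/// Hypothetical const-generic stand-in for `GenericArray<T, N>`.
pub struct Arr<T, const N: usize>(pub [T; N]);

/// Hypothetical type-level length marker.
pub struct Len<const N: usize>;

impl<T, const N: usize> FromIterator<T> for Arr<T, N> {
    fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self {
        let mut it = iter.into_iter();
        Arr(std::array::from_fn(|_| it.next().expect("too few elements")))
    }
}

impl<'a, T, const N: usize> IntoIterator for &'a Arr<T, N> {
    type Item = &'a T;
    type IntoIter = std::slice::Iter<'a, T>;
    fn into_iter(self) -> Self::IntoIter {
        self.0.iter()
    }
}

/// A sequence of `T` with a length known at the type level.
pub trait Seq<T>: Sized + IntoIterator {
    type Length;
}

/// Names the sequence with the same length but element type `U`.
pub trait MappedSeq<T, U>: Seq<T> {
    type Mapped: FromIterator<U>;
}

/// Default `map`/`zip` built only from `IntoIterator` and `FromIterator`.
pub trait FunctionalSeq<T>: Seq<T> {
    fn map<U, F>(self, f: F) -> <Self as MappedSeq<T, U>>::Mapped
    where
        Self: MappedSeq<T, U>,
        F: FnMut(Self::Item) -> U,
    {
        self.into_iter().map(f).collect()
    }

    fn zip<B, Rhs, U, F>(self, rhs: Rhs, mut f: F) -> <Self as MappedSeq<T, U>>::Mapped
    where
        Self: MappedSeq<T, U>,
        // Equal lengths are required by the types, so nothing is checked at runtime here.
        Rhs: Seq<B, Length = Self::Length>,
        F: FnMut(Self::Item, Rhs::Item) -> U,
    {
        self.into_iter().zip(rhs).map(|(l, r)| f(l, r)).collect()
    }
}

impl<'a, T, const N: usize> Seq<T> for &'a Arr<T, N> {
    type Length = Len<N>;
}

impl<'a, T, U, const N: usize> MappedSeq<T, U> for &'a Arr<T, N> {
    type Mapped = Arr<U, N>;
}

impl<'a, T, const N: usize> FunctionalSeq<T> for &'a Arr<T, N> {}

/// Element-wise addition through `zip`, using references on both sides.
pub fn add<const N: usize>(a: &Arr<i32, N>, b: &Arr<i32, N>) -> Arr<i32, N> {
    a.zip(b, |l, r| l + r)
}
```

Because both operands are references, both sides of the `zip` are slice iterators over a length that is fixed at compile time, which is the situation the description says optimizes well.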
The downside of this is that non-reference RHS arguments will kill this optimization, because it will use `.into_iter()` and `GenericArrayIter`. There really isn't a good way around this currently. (I found a good way around this currently.)
The upside of all of this is that passing any random `GenericSequence` without knowing the length is finally feasible, as shown in `tests/std.rs`, and here:
Which still has zero runtime length checking, but we've avoided having to know the length of the sequence. Furthermore, now `test_generic` can work for `GenericArray`, `&GenericArray` and `&mut GenericArray` with no problems.
BREAKING CHANGES:
- `FromIterator` for `GenericArray` now panics when the given iterator doesn't produce enough elements, whereas before it padded it with defaults.
- `map_ref` and `zip_ref` are gone, replaced with the new functional system.
- Fixed: `map`/`zip` can fail to optimize unless used with references. `vec::IntoIter` in the worst case.
What do you think? Perhaps I should write up some examples for the docs, too?
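The `FromIterator` behaviour change listed under BREAKING CHANGES can be illustrated with plain arrays. This is only a sketch of the two policies, not the crate's implementation: `collect_padded` and `collect_strict` are hypothetical helper names, and `std::array::from_fn` is used here as a convenient way to build a fixed-size array.

```rust
/// Old policy as described above: missing elements are filled with `T::default()`.
pub fn collect_padded<T: Default, const N: usize>(iter: impl IntoIterator<Item = T>) -> [T; N] {
    let mut it = iter.into_iter();
    std::array::from_fn(|_| it.next().unwrap_or_default())
}

/// New policy as described above: too few elements is a panic.
pub fn collect_strict<T, const N: usize>(iter: impl IntoIterator<Item = T>) -> [T; N] {
    let mut it = iter.into_iter();
    std::array::from_fn(|_| it.next().expect("iterator produced too few elements"))
}
```

Code that relied on the padding has to supply the defaults itself, for example by chaining `std::iter::repeat_with(T::default)` onto the iterator before collecting.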
If I failed to explain anything, made a mistake or could improve on anything, please let me know. I just want to make the best things I can.
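The length-agnostic usage described earlier, where one generic function accepts an owned array, a shared reference or a mutable reference without naming the length, can be sketched like this. It is a std-only illustration with hypothetical names (`Arr`, `Len`, `Seq`, `dot`) rather than the `test_generic` function from the PR, and it assumes a type-level length marker in place of the crate's length type.

```rust
use std::borrow::Borrow;

/// Hypothetical const-generic stand-in for `GenericArray<T, N>`.
pub struct Arr<T, const N: usize>(pub [T; N]);

/// Hypothetical type-level length marker.
pub struct Len<const N: usize>;

/// A sequence of `T` whose length is known only as a type.
pub trait Seq<T>: Sized + IntoIterator {
    type Length;
}

impl<T, const N: usize> IntoIterator for Arr<T, N> {
    type Item = T;
    type IntoIter = std::array::IntoIter<T, N>;
    fn into_iter(self) -> Self::IntoIter {
        IntoIterator::into_iter(self.0)
    }
}

impl<'a, T, const N: usize> IntoIterator for &'a Arr<T, N> {
    type Item = &'a T;
    type IntoIter = std::slice::Iter<'a, T>;
    fn into_iter(self) -> Self::IntoIter {
        self.0.iter()
    }
}

impl<'a, T, const N: usize> IntoIterator for &'a mut Arr<T, N> {
    type Item = &'a mut T;
    type IntoIter = std::slice::IterMut<'a, T>;
    fn into_iter(self) -> Self::IntoIter {
        self.0.iter_mut()
    }
}

impl<T, const N: usize> Seq<T> for Arr<T, N> {
    type Length = Len<N>;
}

impl<'a, T, S: Seq<T>> Seq<T> for &'a S
where
    &'a S: IntoIterator,
{
    type Length = S::Length;
}

impl<'a, T, S: Seq<T>> Seq<T> for &'a mut S
where
    &'a mut S: IntoIterator,
{
    type Length = S::Length;
}

/// Dot product over any two sequences of the same type-level length.
/// The function never names the length and never compares lengths at runtime.
pub fn dot<A, B>(a: A, b: B) -> i32
where
    A: Seq<i32>,
    B: Seq<i32, Length = A::Length>,
    A::Item: Borrow<i32>,
    B::Item: Borrow<i32>,
{
    a.into_iter()
        .zip(b)
        .map(|(x, y)| {
            let x: &i32 = x.borrow();
            let y: &i32 = y.borrow();
            x * y
        })
        .sum()
}
```

Passing two sequences of different lengths is rejected at compile time, because their `Length` types do not match.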