Are non-SIMD fallbacks and automatic multiple-target support planned? #9

Closed
@pedrocr

I'm the author of rawloader and am trying to figure out the best way to speed up image operations without a lot of code duplication. Ideally one could write a function once and have the best SIMD implementation used on each of several architectures. For this, a few things would need to happen:

  1. When SIMD isn't available at all, fall back to a non-SIMD implementation of the same instructions
  2. Have the same function call the ideal implementation depending on whether the target has SSE/AVX/etc.
  3. Auto-generate all the target variations (e.g., with and without AVX on x86-64) and dispatch between them at runtime
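
To make the three points concrete, here is a minimal hand-written sketch of what I have in mind, assuming only what the standard library offers (`#[target_feature]` and `is_x86_feature_detected!`). The names `scale`, `scale_body`, `scale_fallback` and `scale_avx2` are hypothetical and are not part of this crate; the idea is that the variants would be generated automatically rather than written by hand like this.

```rust
// Hypothetical kernel: multiply every sample by a gain.
// The body is written once and inlined into each target-specific wrapper,
// so the compiler can vectorize it separately for each feature set.
#[inline(always)]
fn scale_body(data: &mut [f32], gain: f32) {
    for v in data.iter_mut() {
        *v *= gain;
    }
}

// Point 1: portable fallback, compiled for the baseline target.
fn scale_fallback(data: &mut [f32], gain: f32) {
    scale_body(data, gain)
}

// Point 3: the same body compiled again with AVX2 enabled.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn scale_avx2(data: &mut [f32], gain: f32) {
    scale_body(data, gain)
}

// Point 2: a single entry point that picks the best variant at runtime.
pub fn scale(data: &mut [f32], gain: f32) {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 availability was checked on the line above.
            unsafe { scale_avx2(data, gain) };
            return;
        }
    }
    scale_fallback(data, gain)
}
```

Callers only ever see `scale(&mut buf, 2.0)`. On non-x86 targets the `cfg` block disappears and the fallback is always used.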

Having 1) would make it much easier to add SIMD support to applications without adding special cases everywhere, although for some applications it isn't strictly needed, since you would never want to run them on a CPU that basic. Having 2) would take a lot of the effort out of writing SIMD implementations, but it only makes sense if the performance cost of mixing SIMD with non-SIMD code doesn't make the result slower than the fully non-SIMD version. I've proposed the equivalent of 3) for normal LLVM code generation here:

rust-lang/rust#42432

I'm curious what the general opinion on this is. At least for basic operations like doing an FMA on a bunch of values, it would be great to be able to write the code once and target most architectures efficiently, with good fallbacks. Having a way to also use OpenCL with the same code would be even nicer, and is probably possible for a few simple operations.
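
As an illustration of the FMA case, this is roughly the kind of kernel I would like to write once. It is only a sketch using the standard library's `f32::mul_add`; the function name `fma_slices` is hypothetical. Whether this compiles to a hardware FMA depends on the features enabled for the target, and without them `mul_add` may fall back to a slower software routine, which is exactly why automatic per-target variants would help.

```rust
/// Computes `acc[i] = a[i] * b[i] + acc[i]` for every element.
/// Hypothetical write-once kernel; all three slices must have the same length.
pub fn fma_slices(acc: &mut [f32], a: &[f32], b: &[f32]) {
    assert_eq!(acc.len(), a.len());
    assert_eq!(acc.len(), b.len());
    for ((acc, &x), &y) in acc.iter_mut().zip(a).zip(b) {
        // `x.mul_add(y, z)` evaluates x * y + z with a single rounding step.
        *acc = x.mul_add(y, *acc);
    }
}
```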
