Open
Description
I tried this code: https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=f7612dfa44bc15ef44bd7d759dce1b87
pub fn demo(x: Box<i32>) -> Box<f32> {
    let i = *x;
    drop(x);
    let f = i as f32;
    Box::new(f)
}
Given that `i32` and `f32` have the same layout (at least on my target) I was hoping that the compiler would be able to -- as an optimization -- avoid the dealloc+alloc dance here.
But it looks like, right now at least, it doesn't reuse the allocation even when the optimizer puts the `__rust_dealloc` and `__rust_alloc` calls right next to each other with matching layout arguments. Obviously this wouldn't be guaranteed, but hopefully in a bunch of simple cases it could just work.
; playground::demo
; Function Attrs: nounwind nonlazybind uwtable
define noalias nonnull align 4 float* @_ZN10playground4demo17h6a3fa9b52448831aE(i32* noalias nonnull align 4 %x) unnamed_addr #0 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {
start:
  %i = load i32, i32* %x, align 4
  %_2.i.i.i.i = bitcast i32* %x to i8*
  tail call void @__rust_dealloc(i8* nonnull %_2.i.i.i.i, i64 4, i64 4) #4 // <-- HERE
  %0 = tail call dereferenceable_or_null(4) i8* @__rust_alloc(i64 4, i64 4) #4 // <-- HERE
  %1 = icmp eq i8* %0, null
  br i1 %1, label %bb3.i.i, label %"_ZN5alloc5boxed12Box$LT$T$GT$3new17hdd831c84a3ba3a35E.exit"

bb3.i.i: ; preds = %start
; call alloc::alloc::handle_alloc_error
  tail call void @_ZN5alloc5alloc18handle_alloc_error17he10e441498789810E(i64 4, i64 4) #5
  unreachable

"_ZN5alloc5boxed12Box$LT$T$GT$3new17hdd831c84a3ba3a35E.exit": ; preds = %start
  %f = sitofp i32 %i to float
  %2 = bitcast i8* %0 to float*
  store float %f, float* %2, align 4
  ret float* %2
}
cc #93653, about APIs to do this manually
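For comparison, here is a rough sketch of what reusing the allocation by hand can look like on stable Rust today, using only `Box::into_raw` and `Box::from_raw`. The function name `demo_reuse` is made up for illustration, and this is not necessarily the API shape discussed in #93653. It assumes the default global allocator and relies on `i32` and `f32` having identical `Layout`s, which the assertion checks at run time.

```rust
use std::alloc::Layout;

/// Hypothetical hand-written variant of `demo` that keeps the original
/// heap allocation instead of freeing it and allocating a new one.
pub fn demo_reuse(x: Box<i32>) -> Box<f32> {
    // Reusing the allocation is only sound if both types have the same
    // size and alignment, because the block will later be freed using
    // the layout of `f32`.
    assert_eq!(Layout::new::<i32>(), Layout::new::<f32>());

    // `i32` is `Copy`, so this reads the value without consuming the box.
    let i = *x;

    // Give up ownership of the allocation and reinterpret the pointer.
    let raw = Box::into_raw(x).cast::<f32>();

    unsafe {
        // Overwrite the old bits; `i32` has no destructor to run first.
        raw.write(i as f32);
        // Take ownership again, now as a `Box<f32>`.
        Box::from_raw(raw)
    }
}
```

Calling `demo_reuse(Box::new(42))` should yield a box holding `42.0` without going through `__rust_dealloc` or `__rust_alloc`. The cost is an `unsafe` block and a layout check that the optimization requested here would make unnecessary in simple cases.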
Metadata
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
Area: Our favorite opsem complication
Area: MIR optimizations
Category: An issue highlighting optimization opportunities or PRs implementing such
Issue: Problems and improvements with respect to performance of generated code.
Relevant to the compiler team, which will review and decide on the PR/issue.