-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Rewrite local labels in inline assembly #81570
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
70f627c
4bb2d5d
bce67de
e78a1bf
b72835f
be00fa4
1e27711
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -200,6 +200,7 @@ impl AsmBuilderMethods<'tcx> for Builder<'a, 'll, 'tcx> { | |||||
|
||||||
// Build the template string | ||||||
let mut template_str = String::new(); | ||||||
let mut labels = vec![]; | ||||||
for piece in template { | ||||||
match *piece { | ||||||
InlineAsmTemplatePiece::String(ref s) => { | ||||||
|
@@ -212,6 +213,10 @@ impl AsmBuilderMethods<'tcx> for Builder<'a, 'll, 'tcx> { | |||||
} | ||||||
} | ||||||
} else { | ||||||
if let Some(label) = get_llvm_label_from_str(s) { | ||||||
labels.push(label.to_owned()); | ||||||
} | ||||||
|
||||||
template_str.push_str(s) | ||||||
} | ||||||
} | ||||||
|
@@ -244,6 +249,30 @@ impl AsmBuilderMethods<'tcx> for Builder<'a, 'll, 'tcx> { | |||||
} | ||||||
} | ||||||
|
||||||
// Labels should be made local to prevent issues with inlined `asm!` and duplicate labels | ||||||
// Fixes https://github.com/rust-lang/rust/issues/74262 | ||||||
for label in labels.iter() { | ||||||
let ranges: Vec<_> = template_str | ||||||
.match_indices(label) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think this checks if the character before the match is a non-identifier character. This means that if you have a label |
||||||
.filter_map(|(idx, _)| { | ||||||
let label_end_idx = idx + label.len(); | ||||||
let next_char = template_str[label_end_idx..].chars().nth(0); | ||||||
if next_char.is_none() || !llvm_ident_continuation(next_char.unwrap()) { | ||||||
Some(idx..label_end_idx) | ||||||
} else { | ||||||
None | ||||||
} | ||||||
}) | ||||||
.collect(); | ||||||
|
||||||
// We add a constant length of 18 to the string every time | ||||||
// from ${:private} and ${:uid} | ||||||
for (i, range) in ranges.iter().enumerate() { | ||||||
let real_range = range.start + (i * 18)..range.end + (i * 18); | ||||||
template_str.replace_range(real_range, &format!("${{:private}}{}${{:uid}}", label)) | ||||||
} | ||||||
} | ||||||
|
||||||
if !options.contains(InlineAsmOptions::PRESERVES_FLAGS) { | ||||||
match asm_arch { | ||||||
InlineAsmArch::AArch64 | InlineAsmArch::Arm => { | ||||||
|
@@ -874,3 +903,35 @@ fn llvm_fixup_output_type( | |||||
_ => layout.llvm_type(cx), | ||||||
} | ||||||
} | ||||||
|
||||||
// Valid llvm symbols: https://llvm.org/docs/AMDGPUOperandSyntax.html#symbols | ||||||
// First char is [a-zA-Z_.] | ||||||
fn llvm_ident_start(c: char) -> bool { | ||||||
('a'..='z').contains(&c) || ('A'..='Z').contains(&c) || '_' == c || '.' == c | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||||||
} | ||||||
|
||||||
// All subsequent chars are [a-zA-Z0-9_$.@] | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would prefer if we disallowed There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That should happen elsewhere, possibly in the same location parsing for formatting specifiers happens. Should also be a concern for a separate PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My main concern here is that these may not be valid characters in symbol names for all targets and may have other special meanings on those targets. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, I agree. I still think this is the the wrong place in the codebase to enforce what characters we allow in asm labels. Probably should happen as a separate PR too. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We don't actually prevent the user from using such labels. We just don't provide automatic rewriting into local labels for them. Also we're not handling all possible label definitions. For example labels can be defined after a We're not trying to handle all possible cases here since that would require us to effectively parse the asm, which we really want to leave to LLVM. We're just trying to make the most common types of labels work correctly. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Aha, I see the distinction you're trying to make now.
That seems like a recipe for a pretty poor experience with a very non-obvious cliff at which things may start failing in non-obvious ways. For instance, user with passing familiarity with Rust's
and it might very well work out in basic tests, but start failing once the function is released to the field. If we do go the route of making things automagic, its better they're magic all the time, not only sometimes.
I believe just lexing would be sufficient, but I do agree that we'd rather not do that at all. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. At the very least I would like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I want to confirm what we want to accept as a valid identifier for the purposes of replacing labels. Denying symbols outright should be properly implemented somewhere else, preferably with a compiler warning or error. This code can reflect what symbols are accepted, but it will ignore other things. For example, Is the set of symbols we will accept in a label Additionally, do we want to accept labels that are strings like nagisa commented here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that for label renaming we should only support |
||||||
fn llvm_ident_continuation(c: char) -> bool { | ||||||
('a'..='z').contains(&c) | ||||||
|| ('A'..='Z').contains(&c) | ||||||
|| ('0'..='9').contains(&c) | ||||||
|| '_' == c | ||||||
|| '$' == c | ||||||
|| '.' == c | ||||||
|| '@' == c | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A |
||||||
} | ||||||
|
||||||
/// Gets a LLVM label from a &str if it exists | ||||||
fn get_llvm_label_from_str(s: &str) -> Option<&str> { | ||||||
if let Some(colon_idx) = s.find(':') { | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Use |
||||||
let substr = &s[..colon_idx]; | ||||||
let mut chars = substr.chars(); | ||||||
if let Some(c) = chars.next() { | ||||||
if llvm_ident_start(c) && chars.all(llvm_ident_continuation) { | ||||||
return Some(substr); | ||||||
} | ||||||
} | ||||||
} | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One way to get more arbitrary stuff into a asm label is to quote it as a string, like this: void test(void) {
__asm__(".intel_syntax noprefix\n"
"\"01☺☺@banana.?@!\": jmp \"01☺☺@banana.?@!\"");
} or in Rust #![feature(asm)]
pub fn test() {
unsafe {
asm!(r#"
"☺helloworld?@!$%^&": jmp "☺helloworld?@!$%^&"
"#);
}
} So that's something we should handle or forbid too. |
||||||
|
||||||
None | ||||||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
// compile-flags: -g | ||
// build-pass | ||
|
||
#![feature(asm)] | ||
|
||
#[inline(always)] | ||
fn asm() { | ||
unsafe { | ||
asm!("duplabel: nop",); | ||
} | ||
} | ||
|
||
fn main() { | ||
asm(); | ||
asm(); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
// compile-flags: -g | ||
// build-pass | ||
|
||
#![feature(asm)] | ||
|
||
fn asm1() { | ||
unsafe { | ||
asm!("duplabel: nop",); | ||
} | ||
} | ||
|
||
fn asm2() { | ||
unsafe { | ||
asm!("nop", "duplabel: nop",); | ||
} | ||
} | ||
|
||
fn main() { | ||
asm1(); | ||
asm2(); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
// compile-flags: -g | ||
// build-pass | ||
|
||
#![feature(asm)] | ||
|
||
fn main() { | ||
unsafe { | ||
asm!("jmp j", "nop", "j: nop"); | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
// compile-flags: -g | ||
// build-pass | ||
|
||
#![feature(asm)] | ||
|
||
fn main() { | ||
unsafe { | ||
asm!("jmp l", "l: nop", "jmp l2", "l2: nop"); | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
// compile-flags: -g | ||
// build-pass | ||
|
||
#![feature(asm)] | ||
|
||
fn main() { | ||
unsafe { | ||
asm!("jmp l", "nop", "l: nop"); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A template string piece may consist of multiple lines (e.g. raw string literals). These strings could define more than one label.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks, I didn't think of this case