
[LLVM][WebAssembly] Implement branch hinting proposal #146230

Open: wants to merge 4 commits into base: main

@Lukasdoe commented Jun 28, 2025

This PR implements the WebAssembly branch hinting proposal, as detailed at https://webassembly.github.io/branch-hinting/metadata/code/binary.html. This proposal introduces a mechanism to convey branch likelihood information to the WebAssembly engine, allowing for more effective performance optimizations.

The proposal specifies a new custom section named metadata.code.branch_hint. This section can contain a sequence of hints, where each hint is a single byte that applies to a corresponding br_if or if instruction. The hint values are:

  • 0x00 (unlikely): The branch is unlikely to be taken.
  • 0x01 (likely): The branch is likely to be taken.

This implementation includes the following changes:

  • Addition of the "branch-hinting" feature (flag)
  • Collection of edge probabilities in CFGStackify pass
  • Outputting of metadata.code.branch_hint section in WebAssemblyAsmPrinter
  • Addition of the WebAssembly::Specifier::S_DEBUG_REF symbol ref specifier
  • Custom relaxation of leb128 fragments for storage of uleb128 encoded function indices and instruction offsets
  • Custom handling of code metadata sections in lld. This is needed because the proposal requires a code metadata section to start with a combined count of function hints, followed by an ordered list of function hints.

This change is purely an optimization and does not alter the semantics of WebAssembly programs.
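
As a minimal sketch of the section payload described above, assuming the layout this patch emits (a function count, then per function an index, a hint count, and for each hint a byte offset, a size of 1, and the hint byte): the names `FunctionHints`, `writeULEB128`, and `encodeBranchHintPayload` are hypothetical and are not LLVM APIs. The count and the function index are padded to 5 bytes here, mirroring the fixed-width encoding the patch uses for values that are rewritten later.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: append Value as ULEB128, padded to at least PadTo bytes.
void writeULEB128(std::vector<uint8_t> &Out, uint64_t Value, unsigned PadTo = 0) {
  unsigned Count = 0;
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    ++Count;
    if (Value != 0 || Count < PadTo)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
  if (Count < PadTo) {
    // Fill with empty continuation bytes and end with a terminating zero.
    for (; Count < PadTo - 1; ++Count)
      Out.push_back(0x80);
    Out.push_back(0x00);
  }
}

// Hypothetical in-memory form of the hints for one function.
struct FunctionHints {
  uint32_t FuncIndex;
  std::vector<std::pair<uint32_t, uint8_t>> Hints; // (byte offset, hint value)
};

std::vector<uint8_t>
encodeBranchHintPayload(const std::vector<FunctionHints> &Funcs) {
  std::vector<uint8_t> Out;
  writeULEB128(Out, Funcs.size(), 5); // number of function entries
  for (const FunctionHints &F : Funcs) {
    writeULEB128(Out, F.FuncIndex, 5);
    writeULEB128(Out, F.Hints.size());
    for (const auto &H : F.Hints) {
      assert(H.second == 0x00 || H.second == 0x01);
      writeULEB128(Out, H.first); // offset from function start
      Out.push_back(1);           // hint size in bytes
      Out.push_back(H.second);    // 0x00 unlikely, 0x01 likely
    }
  }
  return Out;
}
```

A single function with index 1 and one "unlikely" hint at offset 5 encodes to 14 bytes: two 5-byte padded values followed by `01 05 01 00`.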


Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot added the clang, lld, backend:WebAssembly, clang:frontend, mc, and lld:wasm labels Jun 28, 2025
@llvmbot
Member

llvmbot commented Jun 28, 2025

@llvm/pr-subscribers-objectyaml
@llvm/pr-subscribers-backend-webassembly
@llvm/pr-subscribers-lld
@llvm/pr-subscribers-mc
@llvm/pr-subscribers-lld-wasm

@llvm/pr-subscribers-clang

Author: Lukas Döllerer (Lukasdoe)

Patch is 30.66 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/146230.diff

20 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+2)
  • (modified) clang/lib/Basic/Targets/WebAssembly.cpp (+12)
  • (modified) clang/lib/Basic/Targets/WebAssembly.h (+1)
  • (added) lld/test/wasm/Inputs/branch-hints.ll (+29)
  • (added) lld/test/wasm/code-metadata-branch-hints.ll (+37)
  • (modified) lld/wasm/OutputSections.cpp (+56)
  • (modified) lld/wasm/OutputSections.h (+9)
  • (modified) lld/wasm/Writer.cpp (+8-5)
  • (modified) llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyAsmBackend.cpp (+26)
  • (modified) llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCExpr.h (+1)
  • (modified) llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyWasmObjectWriter.cpp (+4-1)
  • (modified) llvm/lib/Target/WebAssembly/WebAssembly.td (+5-1)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp (+69)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.h (+9)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp (+19-1)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td (+4)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.h (+2)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h (+2)
  • (added) llvm/test/MC/WebAssembly/branch-hints-custom-high-low-thresholds.ll (+79)
  • (added) llvm/test/MC/WebAssembly/branch-hints.ll (+66)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index bd8df8f6a749a..8cb3a875a5c82 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5224,6 +5224,8 @@ def mtail_call : Flag<["-"], "mtail-call">, Group<m_wasm_Features_Group>;
 def mno_tail_call : Flag<["-"], "mno-tail-call">, Group<m_wasm_Features_Group>;
 def mwide_arithmetic : Flag<["-"], "mwide-arithmetic">, Group<m_wasm_Features_Group>;
 def mno_wide_arithmetic : Flag<["-"], "mno-wide-arithmetic">, Group<m_wasm_Features_Group>;
+def mbranch_hinting : Flag<["-"], "mbranch-hinting">, Group<m_wasm_Features_Group>;
+def mno_branch_hinting : Flag<["-"], "mno-branch-hinting">, Group<m_wasm_Features_Group>;
 def mexec_model_EQ : Joined<["-"], "mexec-model=">, Group<m_wasm_Features_Driver_Group>,
                      Values<"command,reactor">,
                      HelpText<"Execution model (WebAssembly only)">,
diff --git a/clang/lib/Basic/Targets/WebAssembly.cpp b/clang/lib/Basic/Targets/WebAssembly.cpp
index f19c57f1a3a50..14c9d501bc1fa 100644
--- a/clang/lib/Basic/Targets/WebAssembly.cpp
+++ b/clang/lib/Basic/Targets/WebAssembly.cpp
@@ -69,6 +69,7 @@ bool WebAssemblyTargetInfo::hasFeature(StringRef Feature) const {
       .Case("simd128", SIMDLevel >= SIMD128)
       .Case("tail-call", HasTailCall)
       .Case("wide-arithmetic", HasWideArithmetic)
+      .Case("branch-hinting", HasBranchHinting)
       .Default(false);
 }
 
@@ -116,6 +117,8 @@ void WebAssemblyTargetInfo::getTargetDefines(const LangOptions &Opts,
     Builder.defineMacro("__wasm_tail_call__");
   if (HasWideArithmetic)
     Builder.defineMacro("__wasm_wide_arithmetic__");
+  if (HasBranchHinting)
+    Builder.defineMacro("__wasm_branch_hinting__");
 
   Builder.defineMacro("__GCC_HAVE_SYNC_COMPARE_AND_SWAP_1");
   Builder.defineMacro("__GCC_HAVE_SYNC_COMPARE_AND_SWAP_2");
@@ -194,6 +197,7 @@ bool WebAssemblyTargetInfo::initFeatureMap(
     Features["multimemory"] = true;
     Features["tail-call"] = true;
     Features["wide-arithmetic"] = true;
+    Features["branch-hinting"] = true;
     setSIMDLevel(Features, RelaxedSIMD, true);
   };
   if (CPU == "generic") {
@@ -347,6 +351,14 @@ bool WebAssemblyTargetInfo::handleTargetFeatures(
       HasWideArithmetic = false;
       continue;
     }
+    if (Feature == "+branch-hinting") {
+      HasBranchHinting = true;
+      continue;
+    }
+    if (Feature == "-branch-hinting") {
+      HasBranchHinting = false;
+      continue;
+    }
 
     Diags.Report(diag::err_opt_not_valid_with_opt)
         << Feature << "-target-feature";
diff --git a/clang/lib/Basic/Targets/WebAssembly.h b/clang/lib/Basic/Targets/WebAssembly.h
index 04f0cb5df4601..666da09e61636 100644
--- a/clang/lib/Basic/Targets/WebAssembly.h
+++ b/clang/lib/Basic/Targets/WebAssembly.h
@@ -71,6 +71,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public TargetInfo {
   bool HasSignExt = false;
   bool HasTailCall = false;
   bool HasWideArithmetic = false;
+  bool HasBranchHinting = false;
 
   std::string ABI;
 
diff --git a/lld/test/wasm/Inputs/branch-hints.ll b/lld/test/wasm/Inputs/branch-hints.ll
new file mode 100644
index 0000000000000..1a92259707171
--- /dev/null
+++ b/lld/test/wasm/Inputs/branch-hints.ll
@@ -0,0 +1,29 @@
+target triple = "wasm32-unknown-unknown"
+
+define i32 @test0(i32 %a) {
+entry:
+  %cmp0 = icmp eq i32 %a, 0
+  ; This metadata hints that the true branch is overwhelmingly likely.
+  br i1 %cmp0, label %if.then, label %ret1, !prof !0
+if.then:
+  %cmp1 = icmp eq i32 %a, 1
+  br i1 %cmp1, label %ret1, label %ret2, !prof !1
+ret1:
+  ret i32 2
+ret2:
+  ret i32 1
+}
+
+define i32 @test1(i32 %a) {
+entry:
+  %cmp = icmp eq i32 %a, 0
+  br i1 %cmp, label %if.then, label %if.else, !prof !1
+if.then:
+  ret i32 1
+if.else:
+  ret i32 2
+}
+
+; the resulting branch hint is actually reversed, since llvm-br is turned into br_unless, inverting branch probs
+!0 = !{!"branch_weights", i32 2000, i32 1}
+!1 = !{!"branch_weights", i32 1, i32 2000}
\ No newline at end of file
diff --git a/lld/test/wasm/code-metadata-branch-hints.ll b/lld/test/wasm/code-metadata-branch-hints.ll
new file mode 100644
index 0000000000000..076aa40881152
--- /dev/null
+++ b/lld/test/wasm/code-metadata-branch-hints.ll
@@ -0,0 +1,37 @@
+; RUN: llc -filetype=obj %s -o %t1.o -mattr=+branch-hinting
+; RUN: llc -filetype=obj %S/Inputs/branch-hints.ll -o %t2.o -mattr=+branch-hinting
+; RUN: wasm-ld --export-all -o %t.wasm %t1.o %t2.o
+; RUN: obj2yaml %t.wasm | FileCheck %s
+
+define i32 @_start(i32 %a) {
+entry:
+  %cmp = icmp eq i32 %a, 0
+  br i1 %cmp, label %if.then, label %if.else, !prof !0
+if.then:
+  ret i32 1
+if.else:
+  ret i32 2
+}
+
+define i32 @test_func1(i32 %a) {
+entry:
+  %cmp = icmp eq i32 %a, 0
+  br i1 %cmp, label %if.then, label %if.else, !prof !1
+if.then:
+  ret i32 1
+if.else:
+  ret i32 2
+}
+
+!0 = !{!"branch_weights", i32 2000, i32 1}
+!1 = !{!"branch_weights", i32 1, i32 2000}
+
+; CHECK:        - Type:            CUSTOM
+; CHECK-NEXT:     Name:            metadata.code.branch_hint
+; CHECK-NEXT:     Payload:         '84808080008180808000010501008280808000010501018380808000020701000E0101848080800001050101'
+
+; CHECK:        - Type:            CUSTOM
+; CHECK:          Name:            target_features
+; CHECK-NEXT:     Features:
+; CHECK:            - Prefix:          USED
+; CHECK-NEXT:         Name:            branch-hinting
\ No newline at end of file
diff --git a/lld/wasm/OutputSections.cpp b/lld/wasm/OutputSections.cpp
index cd80254a18d5c..cd65dd4a3855b 100644
--- a/lld/wasm/OutputSections.cpp
+++ b/lld/wasm/OutputSections.cpp
@@ -271,6 +271,62 @@ void CustomSection::writeTo(uint8_t *buf) {
     section->writeTo(buf);
 }
 
+void CodeMetaDataSection::writeTo(uint8_t *buf) {
+  log("writing " + toString(*this) + " offset=" + Twine(offset) +
+      " size=" + Twine(getSize()) + " chunks=" + Twine(inputSections.size()));
+
+  assert(offset);
+  buf += offset;
+
+  // Write section header
+  memcpy(buf, header.data(), header.size());
+  buf += header.size();
+  memcpy(buf, nameData.data(), nameData.size());
+  buf += nameData.size();
+
+  uint32_t TotalNumHints = 0;
+  for (const InputChunk *section :
+       make_range(inputSections.rbegin(), inputSections.rend())) {
+    section->writeTo(buf);
+    unsigned EncodingSize;
+    uint32_t NumHints =
+        decodeULEB128(buf + section->outSecOff, &EncodingSize, nullptr);
+    if (EncodingSize != 5) {
+      fatal("Unexpected encoding size for function hint vec size in " + name +
+            ": must be exactly 5 bytes.");
+    }
+    TotalNumHints += NumHints;
+  }
+  encodeULEB128(TotalNumHints, buf, 5);
+}
+
+void CodeMetaDataSection::finalizeContents() {
+  finalizeInputSections();
+
+  raw_string_ostream os(nameData);
+  encodeULEB128(name.size(), os);
+  os << name;
+
+  bool firstSection = true;
+  for (InputChunk *section : inputSections) {
+    assert(!section->discarded);
+    payloadSize = alignTo(payloadSize, section->alignment);
+    if (firstSection) {
+      section->outSecOff = payloadSize;
+      payloadSize += section->getSize();
+      firstSection = false;
+    } else {
+      // adjust output offset so that each section write overwrites exactly the
+      // subsequent section's function hint vector size (which deduplicates)
+      section->outSecOff = payloadSize - 5;
+      // payload size should not include the hint vector size, which is deduped
+      payloadSize += section->getSize() - 5;
+    }
+  }
+
+  createHeader(payloadSize + nameData.size());
+}
+
 uint32_t CustomSection::getNumRelocations() const {
   uint32_t count = 0;
   for (const InputChunk *inputSect : inputSections)
diff --git a/lld/wasm/OutputSections.h b/lld/wasm/OutputSections.h
index 4b0329dd16cf2..6580c71ab6f5a 100644
--- a/lld/wasm/OutputSections.h
+++ b/lld/wasm/OutputSections.h
@@ -132,6 +132,15 @@ class CustomSection : public OutputSection {
   std::string nameData;
 };
 
+class CodeMetaDataSection : public CustomSection {
+public:
+  CodeMetaDataSection(std::string name, ArrayRef<InputChunk *> inputSections)
+      : CustomSection(name, inputSections) {}
+
+  void writeTo(uint8_t *buf) override;
+  void finalizeContents() override;
+};
+
 } // namespace wasm
 } // namespace lld
 
diff --git a/lld/wasm/Writer.cpp b/lld/wasm/Writer.cpp
index 3cc3e0d498979..e9638bce8e86b 100644
--- a/lld/wasm/Writer.cpp
+++ b/lld/wasm/Writer.cpp
@@ -170,14 +170,17 @@ void Writer::createCustomSections() {
   for (auto &pair : customSectionMapping) {
     StringRef name = pair.first;
     LLVM_DEBUG(dbgs() << "createCustomSection: " << name << "\n");
-
-    OutputSection *sec = make<CustomSection>(std::string(name), pair.second);
+    OutputSection *Sec;
+    if (name == "metadata.code.branch_hint")
+      Sec = make<CodeMetaDataSection>(std::string(name), pair.second);
+    else
+      Sec = make<CustomSection>(std::string(name), pair.second);
     if (ctx.arg.relocatable || ctx.arg.emitRelocs) {
-      auto *sym = make<OutputSectionSymbol>(sec);
+      auto *sym = make<OutputSectionSymbol>(Sec);
       out.linkingSec->addToSymtab(sym);
-      sec->sectionSym = sym;
+      Sec->sectionSym = sym;
     }
-    addSection(sec);
+    addSection(Sec);
   }
 }
 
diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyAsmBackend.cpp b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyAsmBackend.cpp
index 91a1db80deb3c..309954199325a 100644
--- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyAsmBackend.cpp
+++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyAsmBackend.cpp
@@ -13,6 +13,7 @@
 
 #include "MCTargetDesc/WebAssemblyFixupKinds.h"
 #include "MCTargetDesc/WebAssemblyMCTargetDesc.h"
+#include "WebAssemblyMCExpr.h"
 #include "llvm/MC/MCAsmBackend.h"
 #include "llvm/MC/MCAssembler.h"
 #include "llvm/MC/MCExpr.h"
@@ -21,6 +22,7 @@
 #include "llvm/MC/MCSubtargetInfo.h"
 #include "llvm/MC/MCSymbol.h"
 #include "llvm/MC/MCWasmObjectWriter.h"
+#include "llvm/Support/Casting.h"
 #include "llvm/Support/raw_ostream.h"
 
 using namespace llvm;
@@ -46,6 +48,9 @@ class WebAssemblyAsmBackend final : public MCAsmBackend {
   std::unique_ptr<MCObjectTargetWriter>
   createObjectTargetWriter() const override;
 
+  std::pair<bool, bool> relaxLEB128(const MCAssembler &Asm, MCLEBFragment &LF,
+                                    int64_t &Value) const override;
+
   bool writeNopData(raw_ostream &OS, uint64_t Count,
                     const MCSubtargetInfo *STI) const override;
 };
@@ -72,6 +77,27 @@ WebAssemblyAsmBackend::getFixupKindInfo(MCFixupKind Kind) const {
   return Infos[Kind - FirstTargetFixupKind];
 }
 
+std::pair<bool, bool>
+WebAssemblyAsmBackend::relaxLEB128(const MCAssembler &assembler,
+                                   MCLEBFragment &LF, int64_t &Value) const {
+  const MCExpr &Expr = LF.getValue();
+  if (Expr.getKind() == MCExpr::ExprKind::SymbolRef) {
+    const MCSymbolRefExpr &SymExpr = llvm::cast<MCSymbolRefExpr>(Expr);
+    if (static_cast<WebAssembly::Specifier>(SymExpr.getSpecifier()) ==
+        WebAssembly::S_DEBUG_REF) {
+      Value = assembler.getSymbolOffset(SymExpr.getSymbol());
+      return std::make_pair(true, false);
+    }
+  }
+  // currently, this is only used for leb128 encoded function indices
+  // that require relocations
+  LF.getFixups().push_back(
+      MCFixup::create(0, &Expr, WebAssembly::fixup_uleb128_i32, Expr.getLoc()));
+  // ensure that the stored placeholder is large enough to hold any 32-bit val
+  Value = UINT32_MAX;
+  return std::make_pair(true, false);
+}
+
 bool WebAssemblyAsmBackend::writeNopData(raw_ostream &OS, uint64_t Count,
                                          const MCSubtargetInfo *STI) const {
   for (uint64_t I = 0; I < Count; ++I)
diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCExpr.h b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCExpr.h
index f74af06fb84fa..8276fad49baae 100644
--- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCExpr.h
+++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCExpr.h
@@ -23,6 +23,7 @@ enum Specifier {
   S_TBREL,     // Table index relative to __table_base
   S_TLSREL,    // Memory address relative to __tls_base
   S_TYPEINDEX, // Reference to a symbol's type (signature)
+  S_DEBUG_REF, // Marker placed for generation of metadata.code.* section
 };
 } // namespace llvm::WebAssembly
 
diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyWasmObjectWriter.cpp b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyWasmObjectWriter.cpp
index 33cf12e59870c..4a7bb2f4acc1a 100644
--- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyWasmObjectWriter.cpp
+++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyWasmObjectWriter.cpp
@@ -93,10 +93,13 @@ unsigned WebAssemblyWasmObjectWriter::getRelocType(
   case WebAssembly::S_None:
     break;
   case WebAssembly::S_FUNCINDEX:
+    if (static_cast<unsigned>(Fixup.getKind()) ==
+        WebAssembly::fixup_uleb128_i32)
+      return wasm::R_WASM_FUNCTION_INDEX_LEB;
     return wasm::R_WASM_FUNCTION_INDEX_I32;
   }
 
-  switch (unsigned(Fixup.getKind())) {
+  switch (static_cast<unsigned>(Fixup.getKind())) {
   case WebAssembly::fixup_sleb128_i32:
     if (SymA.isFunction())
       return wasm::R_WASM_TABLE_INDEX_SLEB;
diff --git a/llvm/lib/Target/WebAssembly/WebAssembly.td b/llvm/lib/Target/WebAssembly/WebAssembly.td
index 13603f8181198..ec3889e2037e4 100644
--- a/llvm/lib/Target/WebAssembly/WebAssembly.td
+++ b/llvm/lib/Target/WebAssembly/WebAssembly.td
@@ -90,6 +90,10 @@ def FeatureWideArithmetic :
       SubtargetFeature<"wide-arithmetic", "HasWideArithmetic", "true",
                        "Enable wide-arithmetic instructions">;
 
+def FeatureBranchHinting :
+      SubtargetFeature<"branch-hinting", "HasBranchHinting", "true",
+                       "Enable branch hints for branch instructions">;
+
 //===----------------------------------------------------------------------===//
 // Architectures.
 //===----------------------------------------------------------------------===//
@@ -142,7 +146,7 @@ def : ProcessorModel<"bleeding-edge", NoSchedModel,
                        FeatureMultivalue, FeatureMutableGlobals,
                        FeatureNontrappingFPToInt, FeatureRelaxedSIMD,
                        FeatureReferenceTypes, FeatureSIMD128, FeatureSignExt,
-                       FeatureTailCall]>;
+                       FeatureTailCall, FeatureBranchHinting]>;
 
 //===----------------------------------------------------------------------===//
 // Target Declaration
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
index c61ed3c7d5d81..6eaab5939163d 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
@@ -54,6 +54,15 @@ using namespace llvm;
 #define DEBUG_TYPE "asm-printer"
 
 extern cl::opt<bool> WasmKeepRegisters;
+// values are divided by 1<<31 to calculate the probability
+static cl::opt<uint32_t> WasmHighBranchProb(
+    "wasm-branch-prob-high", cl::Hidden,
+    cl::desc("lowest branch probability to not be annotated as likely taken"),
+    cl::init(0x40000000));
+static cl::opt<uint32_t> WasmLowBranchProb(
+    "wasm-branch-prob-low", cl::Hidden,
+    cl::desc("highest branch probability to be annotated as unlikely taken"),
+    cl::init(0x40000000));
 
 //===----------------------------------------------------------------------===//
 // Helpers.
@@ -441,6 +450,38 @@ void WebAssemblyAsmPrinter::emitEndOfAsmFile(Module &M) {
   EmitProducerInfo(M);
   EmitTargetFeatures(M);
   EmitFunctionAttributes(M);
+
+  // Subtarget may be null if no functions have been defined in file
+  if (Subtarget && Subtarget->hasBranchHinting())
+    EmitBranchHintSection();
+}
+
+void WebAssemblyAsmPrinter::EmitBranchHintSection() const {
+  MCSectionWasm *BranchHintsSection = OutContext.getWasmSection(
+      "metadata.code.branch_hint", SectionKind::getMetadata());
+  OutStreamer->pushSection();
+  OutStreamer->switchSection(BranchHintsSection);
+  // should we emit empty branch hints section?
+  OutStreamer->emitULEB128IntValue(branchHints.size(), 5);
+  for (const auto &BHR : branchHints) {
+    if (!BHR)
+      continue;
+    // emit relocatable function index for the function symbol
+    OutStreamer->emitULEB128Value(MCSymbolRefExpr::create(
+        BHR->func_sym, WebAssembly::S_FUNCINDEX, OutContext));
+    // emit the number of hints for this function (is constant -> does not need
+    // handling by target streamer for reloc)
+    OutStreamer->emitULEB128IntValue(BHR->hints.size());
+    for (const auto &[instrSym, hint] : BHR->hints) {
+      assert(hint == 0 || hint == 1);
+      // offset from function start
+      OutStreamer->emitULEB128Value(MCSymbolRefExpr::create(
+          instrSym, WebAssembly::S_DEBUG_REF, OutContext));
+      OutStreamer->emitULEB128IntValue(1); // hint size
+      OutStreamer->emitULEB128IntValue(hint);
+    }
+  }
+  OutStreamer->popSection();
 }
 
 void WebAssemblyAsmPrinter::EmitProducerInfo(Module &M) {
@@ -696,6 +737,34 @@ void WebAssemblyAsmPrinter::emitInstruction(const MachineInstr *MI) {
     WebAssemblyMCInstLower MCInstLowering(OutContext, *this);
     MCInst TmpInst;
     MCInstLowering.lower(MI, TmpInst);
+    if (Subtarget->hasBranchHinting() &&
+        MI->getOpcode() == WebAssembly::BR_IF && MFI &&
+        MFI->BranchProbabilities.contains(MI)) {
+      MCSymbol *BrIfSym = OutContext.createTempSymbol();
+      OutStreamer->emitLabel(BrIfSym);
+
+      constexpr uint8_t HintLikely = 0x01;
+      constexpr uint8_t HintUnlikely = 0x00;
+      const BranchProbability &Prob = MFI->BranchProbabilities[MI];
+      uint8_t HintValue;
+      if (Prob > BranchProbability::getRaw(WasmHighBranchProb.getValue()))
+        HintValue = HintLikely;
+      else if (Prob <= BranchProbability::getRaw(WasmLowBranchProb.getValue()))
+        HintValue = HintUnlikely;
+      else
+        goto emit; // Don't emit branch hint between thresholds
+
+      // we know that we only emit branch hints for internal functions,
+      // therefore we can directly cast and don't need getMCSymbolForFunction
+      MCSymbol *FuncSym = cast<MCSymbolWasm>(getSymbol(&MF->getFunction()));
+      uint32_t LocalFuncIdx = MF->getFunctionNumber();
+      if (branchHints.size() <= LocalFuncIdx) {
+        branchHints.resize(LocalFuncIdx + 1);
+        branchHints[LocalFuncIdx] = BranchHintRecord{FuncSym, {}};
+      }
+      branchHints[LocalFuncIdx]->hints.emplace_back(BrIfSym, HintValue);
+    }
+  emit:
     EmitToStreamer(*OutStreamer, TmpInst);
     break;
   }
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.h b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.h
index 46063bbe0fba1..b6595492cf26c 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.h
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.h
@@ -18,6 +18,11 @@
 namespace llvm {
 class WebAssemblyTargetStreamer;
 
+struct BranchHintRecord {
+  MCSymbol *func_sym;
+  SmallVector<std::pair<MCSymbol *, uint8_t>, 0> hints;
+};
+
 class LLVM_LIBRARY_VISIBILITY WebAssemblyAsmPrinter final : public AsmPrinter {
 public:
   static char ID;
@@ -28,6 +33,9 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyAsmPrinter final : public AsmPrinter {
   WebAssemblyFunctionInfo *MFI;
   bool signaturesEmitted = false;
 
+  // vec idx == local func_idx
+  std::vector<std::optional<BranchHintRecord>> branchHints;
+
 public:
   explicit WebAssemblyAsmPrinter(TargetMachine &TM,
                                  std::unique_ptr<MCStreamer> Streamer)
@@ -59,6 +67,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyAsmPrinter final : public AsmPrinter {
   void EmitProducerInfo(Module &M);
   void EmitTargetFeatures(Module &M);
   void EmitFunctionAttributes(Module &M);
+  void EmitBranchHintSection() const;
   void emitSymbolType(const MCSymbolWasm *Sym);
   void emitGlobalVariable(const GlobalVariable *GV) override;
   void emitJumpTableInfo() override;
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
index 640be5fe8e8c9..84611030e448d 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
@@ -31,6 +31,7 @@
 #include "WebAssemblyUtilities.h"
 #include "llvm/...
[truncated]
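
To make the `CHECK-NEXT: Payload:` line in lld/test/wasm/code-metadata-branch-hints.ll easier to read, here is a small decoder sketch. It assumes the layout emitted by this patch (count, then per function: index, hint count, and per hint: offset, size 1, value). The names `fromHex`, `readULEB128`, and `decodeBranchHints` are hypothetical helpers, not part of LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct Hint {
  uint32_t Offset; // byte offset of the branch from the function start
  uint8_t Value;   // 0x00 unlikely, 0x01 likely
};

struct FunctionHints {
  uint32_t FuncIndex;
  std::vector<Hint> Hints;
};

// Convert a hex string such as "0A0B" into raw bytes.
std::vector<uint8_t> fromHex(const std::string &Hex) {
  std::vector<uint8_t> Bytes;
  for (size_t I = 0; I + 1 < Hex.size(); I += 2)
    Bytes.push_back(
        static_cast<uint8_t>(std::stoul(Hex.substr(I, 2), nullptr, 16)));
  return Bytes;
}

// Read one ULEB128 value (padded encodings are accepted) and advance Pos.
uint32_t readULEB128(const std::vector<uint8_t> &Buf, size_t &Pos) {
  uint32_t Result = 0;
  unsigned Shift = 0;
  while (true) {
    assert(Pos < Buf.size() && Shift < 32);
    uint8_t Byte = Buf[Pos++];
    Result |= static_cast<uint32_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      break;
    Shift += 7;
  }
  return Result;
}

std::vector<FunctionHints> decodeBranchHints(const std::vector<uint8_t> &Buf) {
  size_t Pos = 0;
  std::vector<FunctionHints> Funcs(readULEB128(Buf, Pos));
  for (FunctionHints &F : Funcs) {
    F.FuncIndex = readULEB128(Buf, Pos);
    F.Hints.resize(readULEB128(Buf, Pos));
    for (Hint &H : F.Hints) {
      H.Offset = readULEB128(Buf, Pos);
      uint32_t Size = readULEB128(Buf, Pos);
      assert(Size == 1);
      (void)Size;
      assert(Pos < Buf.size());
      H.Value = Buf[Pos++];
    }
  }
  assert(Pos == Buf.size());
  return Funcs;
}
```

Applied to the expected payload from the test, this yields four function entries with indices 1 through 4. Function 3 carries two hints (offset 7 unlikely, offset 14 likely); the other three carry one hint each at offset 5.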

@Lukasdoe Lukasdoe marked this pull request as draft June 28, 2025 18:15
@Lukasdoe Lukasdoe marked this pull request as ready for review June 28, 2025 20:20
@Lukasdoe
Author

edit: rebased

@Lukasdoe
Author

@yuri91 @sbc100

Contributor

@aengelke left a comment

Some comments on the LLVM MC part. I'm not too familiar with the Wasm back-end or LLD.

// therefore we can directly cast and don't need getMCSymbolForFunction
MCSymbol *FuncSym = cast<MCSymbolWasm>(getSymbol(&MF->getFunction()));
uint32_t LocalFuncIdx = MF->getFunctionNumber();
if (branchHints.size() <= LocalFuncIdx) {
Contributor

When is the compilation order of functions different than the numbering?

Author

Please check whether I understood correctly what you meant here. The resizing is now done once for each function.

@Lukasdoe
Author

Changes:

  • Added branch hinting section parser to wasm2yaml
  • Threshold setting parameters are now floats
  • Fixed all other things aengelke found
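
The threshold logic can be sketched as follows. This assumes probabilities given as floating-point values in [0, 1]; the patch as first posted stores raw numerators over `1 << 31` instead, with both defaults at `0x40000000`, which corresponds to 0.5. `BranchHint` and `classifyBranch` are hypothetical names, and the comparison operators follow the ones in the posted diff.

```cpp
#include <cassert>
#include <cstdint>

enum class BranchHint : uint8_t { Unlikely = 0x00, Likely = 0x01, None = 0xff };

// Map the probability that a branch is taken to a hint, or to no hint at all
// when it falls strictly between the two thresholds.
BranchHint classifyBranch(double Prob, double Low = 0.5, double High = 0.5) {
  assert(0.0 <= Low && Low <= High && High <= 1.0);
  if (Prob > High)
    return BranchHint::Likely;
  if (Prob <= Low)
    return BranchHint::Unlikely;
  return BranchHint::None;
}
```

With the default thresholds both set to 0.5, every branch receives a hint; widening the gap (for example `Low = 0.25`, `High = 0.75`) leaves branches in the middle unannotated.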

OutputSection *sec = make<CustomSection>(std::string(name), pair.second);
OutputSection *Sec;
if (name == "metadata.code.branch_hint")
Sec = make<CodeMetaDataSection>(std::string(name), pair.second);
Member

I believe that creating the branch hint section here is too late: this will emit it after the code section, but according to the spec it must be before (unlike pretty much all other custom sections).

(Apologies if you handle this somewhere else, or I am misreading this code.)

I have been experimenting with branch hinting myself, and I handled this ordering issue like this:

https://github.com/kripken/llvm-project/pull/1/files#diff-e826be2acc8b58c5d040525dc8a509e90810d3edcd93190d4810e476919ef9aaR669-R672

Basically I added a call before the code section just for branch hints, and then erased it from the map, so that createCustomSections does not emit it.

Author

You are entirely correct: the Overview.md file of the branch-hinting proposal mentions "The branch hints section should appear only once in a module, and only before the code section." However, the final code metadata specification, which I based the implementation on, no longer mentions this requirement.

It does not specify any other placement requirement though and I see the appeal of having the branch hints before the code section for fast baseline compilation in a streaming setup.

So, following your suggestion, I'll amend lld to insert the combined branch-hint section at the appropriate place. Thanks for spotting this!
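
For context, the idea behind the combined section can be sketched like this. It assumes, as the posted patch does, that every input chunk starts with a function count padded to exactly 5 ULEB128 bytes, and that the chunks already arrive in function-index order. This is an illustration only, not lld's code: `readPaddedCount`, `writePaddedCount`, and `mergeHintChunks` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr size_t CountSize = 5; // fixed width of the padded function count

// Decode the 5-byte padded ULEB128 count at the start of a chunk.
uint32_t readPaddedCount(const std::vector<uint8_t> &Chunk) {
  assert(Chunk.size() >= CountSize);
  uint32_t Value = 0;
  for (size_t I = 0; I < CountSize; ++I) {
    // Every byte except the last must have the continuation bit set.
    assert(((Chunk[I] & 0x80) != 0) == (I + 1 < CountSize));
    Value |= static_cast<uint32_t>(Chunk[I] & 0x7f) << (7 * I);
  }
  return Value;
}

// Append Value as a ULEB128 that always occupies exactly 5 bytes.
void writePaddedCount(std::vector<uint8_t> &Out, uint32_t Value) {
  for (size_t I = 0; I < CountSize; ++I) {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (I + 1 < CountSize)
      Byte |= 0x80;
    Out.push_back(Byte);
  }
}

// Sum the per-chunk counts and concatenate the chunk bodies behind one count.
std::vector<uint8_t>
mergeHintChunks(const std::vector<std::vector<uint8_t>> &Chunks) {
  uint32_t Total = 0;
  std::vector<uint8_t> Bodies;
  for (const std::vector<uint8_t> &Chunk : Chunks) {
    Total += readPaddedCount(Chunk);
    Bodies.insert(Bodies.end(), Chunk.begin() + CountSize, Chunk.end());
  }
  std::vector<uint8_t> Out;
  writePaddedCount(Out, Total);
  Out.insert(Out.end(), Bodies.begin(), Bodies.end());
  return Out;
}
```

The fixed 5-byte width is what lets the total be patched in without shifting any of the bytes that follow.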

Member

Interesting, that might be a missing point in the spec, then - @yuri91 , should the spec mention that the branch hint section goes before the code?

Or, if the order is intentionally not constrained in the spec then this would be a bug in V8, as atm it only works if branch hints appear earlier.

@yuri91 Jul 1, 2025

Yes, it's a missing point in the spec. The constraint is there to help streaming compilation. I will add it to the spec text, thanks for pointing this out!
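
The ordering constraint discussed in this thread can be checked mechanically. The sketch below walks the top-level sections of a binary and reports whether the `metadata.code.branch_hint` custom section appears at most once and before the code section (section id 10). `HintPlacement`, `checkBranchHintPlacement`, and `makeModule` are hypothetical names; `makeModule` only builds a skeleton for demonstration and does not produce a valid module.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

enum class HintPlacement { Ok, AfterCode, Duplicate, Malformed };

static bool readULEB128(const std::vector<uint8_t> &Buf, size_t &Pos,
                        uint32_t &Out) {
  Out = 0;
  for (unsigned Shift = 0; Shift < 35; Shift += 7) {
    if (Pos >= Buf.size())
      return false;
    uint8_t Byte = Buf[Pos++];
    Out |= static_cast<uint32_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      return true;
  }
  return false;
}

HintPlacement checkBranchHintPlacement(const std::vector<uint8_t> &Module) {
  const std::string HintName = "metadata.code.branch_hint";
  size_t Pos = 8; // skip the magic number and the version
  if (Module.size() < Pos)
    return HintPlacement::Malformed;
  bool SeenCode = false, SeenHints = false;
  while (Pos < Module.size()) {
    uint8_t Id = Module[Pos++];
    uint32_t Size = 0;
    if (!readULEB128(Module, Pos, Size) || Size > Module.size() - Pos)
      return HintPlacement::Malformed;
    size_t End = Pos + Size;
    assert(End <= Module.size());
    if (Id == 10) {
      SeenCode = true;
    } else if (Id == 0) { // custom section: payload starts with its name
      size_t NamePos = Pos;
      uint32_t NameLen = 0;
      if (!readULEB128(Module, NamePos, NameLen) || NamePos > End ||
          NameLen > End - NamePos)
        return HintPlacement::Malformed;
      std::string Name(Module.begin() + NamePos,
                       Module.begin() + NamePos + NameLen);
      if (Name == HintName) {
        if (SeenHints)
          return HintPlacement::Duplicate;
        if (SeenCode)
          return HintPlacement::AfterCode;
        SeenHints = true;
      }
    }
    Pos = End;
  }
  return HintPlacement::Ok;
}

// Demo only: a header plus an empty hint section and an empty code section.
std::vector<uint8_t> makeModule(bool HintsFirst) {
  std::vector<uint8_t> M = {0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00};
  const std::string Name = "metadata.code.branch_hint";
  std::vector<uint8_t> Hints = {static_cast<uint8_t>(Name.size())};
  Hints.insert(Hints.end(), Name.begin(), Name.end());
  Hints.push_back(0x00);                    // zero function entries
  const std::vector<uint8_t> Code = {0x00}; // zero function bodies
  auto Append = [&M](uint8_t Id, const std::vector<uint8_t> &Payload) {
    M.push_back(Id);
    M.push_back(static_cast<uint8_t>(Payload.size())); // fits one LEB byte
    M.insert(M.end(), Payload.begin(), Payload.end());
  };
  if (HintsFirst) {
    Append(0, Hints);
    Append(10, Code);
  } else {
    Append(10, Code);
    Append(0, Hints);
  }
  return M;
}
```

With this, `checkBranchHintPlacement(makeModule(false))` reports `AfterCode`, which is the situation described above for the original `createCustomSections` placement.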

WebAssemblyFunctionInfo *MFI = MF.getInfo<WebAssemblyFunctionInfo>();
assert(!MFI->BranchProbabilities.contains(&MI));
MFI->BranchProbabilities[&MI] = Prob;
}
Member

An alternative to adding code inside WebAssemblyCFGStackify is to compute branch targets and probabilities on MachineBasicBlocks, which I experimented with here, with guidance from @aheejin:

https://github.com/kripken/llvm-project/pull/1/files#diff-e894be52bbaa145e9b032dd0cedfcb8f9a59ebf9ff73ba2d19caf046eda51f86R679

I don't know if that is better in any way; I just wanted to mention it in case it is of interest.

Member

@aheejin Jul 2, 2025

  1. How often do br_ifs usually have branch probability data attached? If every br_if has it, it may increase the in-memory size of WebAssemblyMachineFunctionInfo. That may or may not be negligible; I'm just not sure.

  2. If you'd like to do this in CFGStackify, I think it is better to create a separate function for it and call it before rewriteImmediates, because this task is technically not rewriting immediates.

  3. If you do it here, you won't be able to handle br_unless, a codegen-only instruction:

    let isCodeGenOnly = 1 in
    defm BR_UNLESS : I<(outs), (ins bb_op:$dst, I32:$cond),
    (outs), (ins bb_op:$dst), []>;

    We convert them to br_ifs in https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/WebAssembly/WebAssemblyLowerBrUnless.cpp, which runs after CFGStackify. And saving br_unlesses to WebAssemblyMachineFunctionInfo does not work because they will be removed and replaced by newly created br_ifs in that pass.
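
The third point can be modeled with a toy example. This is not LLVM code: `Instr`, `Function`, `lowerBrUnless`, and `countMissingHints` are hypothetical stand-ins, and an integer id plays the role of a MachineInstr address. The sketch only shows that a side table keyed by instruction identity goes stale once a later pass replaces the instruction.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Toy model: every instruction gets a fresh identity when it is created,
// standing in for the address of a MachineInstr.
enum class Opcode { BrIf, BrUnless };
struct Instr {
  uint32_t Id;
  Opcode Op;
};

struct Function {
  std::vector<Instr> Body;
  std::map<uint32_t, double> BranchProbabilities; // side table keyed by identity
  uint32_t NextId = 0;

  uint32_t append(Opcode Op) {
    Body.push_back({NextId, Op});
    return NextId++;
  }
};

// Models a later pass that replaces every br_unless with a newly created
// br_if. The side table is deliberately left untouched.
void lowerBrUnless(Function &F) {
  for (Instr &I : F.Body)
    if (I.Op == Opcode::BrUnless)
      I = Instr{F.NextId++, Opcode::BrIf};
}

// Counts final br_if instructions that have no recorded probability.
unsigned countMissingHints(const Function &F) {
  unsigned Missing = 0;
  for (const Instr &I : F.Body)
    if (I.Op == Opcode::BrIf && !F.BranchProbabilities.count(I.Id))
      ++Missing;
  return Missing;
}
```

After `lowerBrUnless` runs, the new br_if has no entry in the table, while the entry for the removed br_unless is still present under its old identity.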

Contributor

1- I don't think that's much of an issue, ~16 bytes/basic block shouldn't be noticeable.
3- That seems to be a big issue, it might also cause dangling pointers. I think we need some other way to store the information.

@Lukasdoe Lukasdoe Jul 2, 2025

These are some very good points you're raising there.

  1. I quickly collected some numbers in the CFGStackify pass for the very branchy fd_decode function of fadec:
    • Num branch hints: 8464
    • Num br_if instrs: 8464 (about 2.9% of instructions)
    • Num br_unless instrs: 4600 (about 1.58% of instructions)
    • Branches without edge probability: 0
    • Total instruction count: 291088

These results definitely show that it's important for us to also handle br_unless instrs!

  2. True, I'll reconsider this if we actually keep this in CFGStackify. However, since it currently looks like this needs to be moved to after LowerBrUnless anyway, I won't fix this just yet.

  3. I think this is the most crucial point here. I'll have a look at the implementation of @kripken and how they recover branch hints in the asm printer, since this seems like the superior / more stable method with less memory overhead.

Author

Update: The changes in the CFGStackify pass are now completely reverted. Branch hints are now generated only from the br_if instructions encountered during asm printing, mostly based on @kripken's earlier contribution. This significantly reduces the memory overhead (and probably the runtime overhead), since we no longer store branch hints for branches that are removed by passes after CFGStackify.
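
A minimal sketch of the idea of deriving hints at emission time, rather than storing them per instruction, might look as follows. This is a toy model and not the code in this PR: `BrIf`, `EdgeProbs`, and `hintFor` are hypothetical, probabilities are plain fractions, and treating exactly 50% (or an unknown edge) as "no hint" is an assumption of the sketch. The hint byte values 0x00 and 0x01 are the ones listed in the PR description.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <utility>

// Toy model: blocks are identified by index.
using BlockId = uint32_t;

struct BrIf {
  BlockId Parent; // block containing the branch
  BlockId Target; // block reached when the branch is taken
};

// Edge probabilities as a fraction in [0, 1], keyed by (from, to).
using EdgeProbs = std::map<std::pair<BlockId, BlockId>, double>;

// Returns 0x01 (likely) or 0x00 (unlikely) for a br_if seen while printing.
// Assumption: an unknown edge or an exact 50% probability yields no hint.
std::optional<uint8_t> hintFor(const BrIf &Br, const EdgeProbs &Probs) {
  auto It = Probs.find({Br.Parent, Br.Target});
  if (It == Probs.end() || It->second == 0.5)
    return std::nullopt;
  return It->second > 0.5 ? uint8_t{1} : uint8_t{0};
}
```

Because the lookup is keyed by the edge rather than by the instruction, it does not matter whether the br_if existed from the start or was created later from a br_unless, as long as it branches to the same target.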

@Lukasdoe Lukasdoe Jul 2, 2025

[Image: fadec_normalized]
Preliminary benchmark with the new code. Workload: decoding the first 1,000,000 bytes of libLLVM.so (llvm-18) 500 times using fadec's fd_decode. The wamrc-aot-wasi-branch-hints configuration uses a patched WAMR version that supports branch hints.

D8: V8 version 13.9.0 (candidate)
WAMR_IWASM: iwasm 2.3.1
CLANG_NATIVE: clang version 21.0.0git (https://github.com/Lukasdoe/llvm-project 7cca0fe)
GCC_NATIVE: gcc-14 (Ubuntu 14.2.0-4ubuntu2~24.04) 14.2.0
Intel Xeon Gold 6430, Ubuntu 24.04.2 LTS x86_64

@aheejin aheejin left a comment

Thanks! So far I've only read the CFG parts and feature-adding parts, but some comments:


Lukasdoe commented Jul 2, 2025

  • lld now inserts the combined branch hint section directly in front of the code section
  • fixed failing tests
  • reordered the branch hints option to fit into the alphabetically sorted option declarations
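
For reference, a minimal sketch of what producing the combined section payload could look like, assuming the layout described in the proposal: a count of function entries, then for each function its index, a count of hints, and for each hint the instruction offset, a size byte of 0x01, and the hint value. All names (`BranchHint`, `FunctionHints`, `encodeBranchHints`) are hypothetical and this is not the lld code from this PR. The sketch also assumes the hints within a function are already sorted by offset.

```cpp
#include <cstdint>
#include <map>
#include <vector>

struct BranchHint {
  uint32_t Offset; // byte offset of the instruction within the function
  uint8_t Value;   // 0x00 = unlikely, 0x01 = likely
};

// Hints per final function index; std::map keeps the functions ordered.
using FunctionHints = std::map<uint32_t, std::vector<BranchHint>>;

void writeULEB128(uint32_t V, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = V & 0x7f;
    V >>= 7;
    if (V != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (V != 0);
}

// Encodes the section payload that follows the custom section name.
std::vector<uint8_t> encodeBranchHints(const FunctionHints &Hints) {
  std::vector<uint8_t> Out;
  writeULEB128(static_cast<uint32_t>(Hints.size()), Out); // combined count
  for (const auto &[FuncIdx, List] : Hints) {
    writeULEB128(FuncIdx, Out);
    writeULEB128(static_cast<uint32_t>(List.size()), Out);
    for (const BranchHint &H : List) {
      writeULEB128(H.Offset, Out);
      Out.push_back(0x01); // hint payload size: one byte
      Out.push_back(H.Value);
    }
  }
  return Out;
}
```

Merging the contributions of several object files then amounts to inserting each file's hints into one `FunctionHints` map under the final function indices before encoding, which yields a single leading count and an ordered list.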

kripken commented Jul 2, 2025

@Lukasdoe You may be interested in this Binaryen PR which adds passes for branch hint debugging:

WebAssembly/binaryen#7695

For example you could use it to verify that the code you emit here has correct branch hints in practice (assuming your inputs have correct branch hints).

Labels
backend:WebAssembly clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category lld:wasm lld llvm:binary-utilities mc Machine (object) code objectyaml