[clang] Extend SourceLocation to 64 bits. #147292

Open: hokein wants to merge 9 commits into main

@hokein hokein commented Jul 7, 2025

This patch extends the SourceLocation class from 4 bytes to 8 bytes, as discussed here https://discourse.llvm.org/t/revisiting-64-bit-source-locations/86556

Key Changes

  • For now, only the lower 40 bits are in use, leaving 24 bits reserved for future use.
    In theory, we can use up to 48 bits within the current encoding scheme, where a SourceLocation is represented as a 64-bit pair <ModuleFileIndex, SourceLocationOffset>, with ModuleFileIndex taking 16 bits.
    Starting with 40 bits should be sufficient; most AST nodes can still store locations within their StmtBitFields. If needed, we can easily adjust this limit in the future.

  • AST Changes:

    • Moved SourceLocation fields out of StmtBitFields in some AST nodes where the bitfields no longer had enough space.
    • Moved the source range of C++ operator names (DeclarationNameLoc::CXXOpName) to the heap, in a separately allocated CXXOperatorSourceInfo, to avoid increasing the size of DeclarationNameLoc. This minimizes the size growth of frequently used nodes such as DeclRefExpr (which now grows from 32 to 40 bytes).
    • Moved NumArgs into CallExpr's bitfields (using 20 bits, which should be sufficient).
    • The remaining changes are mostly mechanical adjustments.
  • libclang: to maintain ABI compatibility in libclang, we continue using 32-bit source locations there. These are converted from 64-bit locations with bounds checking. If the value does not fit, an invalid source location is returned.
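
The libclang narrowing described above can be pictured roughly as follows. This is a minimal sketch rather than the code in the patch: `WideLoc`, `NarrowLoc`, `InvalidNarrowLoc`, and `narrowLocation` are hypothetical names, and the only behaviour it assumes is what the description states (bounds-check, then fall back to an invalid location, encoded here as 0).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-ins: these names are illustrative, not libclang's API.
using WideLoc = std::uint64_t;   // 64-bit raw encoding used inside clang
using NarrowLoc = std::uint32_t; // 32-bit raw encoding kept for ABI stability

constexpr NarrowLoc InvalidNarrowLoc = 0; // assumed: 0 encodes "invalid"

// Narrow a 64-bit raw location to 32 bits, yielding the invalid location
// when the value cannot be represented in 32 bits.
constexpr NarrowLoc narrowLocation(WideLoc Raw) {
  if (Raw > std::numeric_limits<NarrowLoc>::max())
    return InvalidNarrowLoc;
  return static_cast<NarrowLoc>(Raw);
}

static_assert(narrowLocation(42) == 42, "small values pass through");
static_assert(narrowLocation(std::uint64_t{1} << 32) == InvalidNarrowLoc,
              "values beyond 32 bits become invalid");
```

Under this scheme a libclang client sees an invalid location instead of a silently wrapped one when a translation unit exceeds the 32-bit range.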

Performance Impact

hokein added 8 commits July 7, 2025 14:16
fix

Reduce the Stmt size back to 8 bytes.

Reduce the CallExpr size

Fix the ObjCContainerDecl bit field

Change the SourceLocation::UIntTy to uint64_t

Update other SourceManager's getDecomposedSpellingLoc APIs, and fix many
failing tests.

Remaining failures:

  Clang :: Index/IBOutletCollection.m
  Clang :: Index/annotate-macro-args.m
  Clang :: Index/annotate-module.m
  Clang :: Index/annotate-tokens-pp.c
  Clang :: Index/annotate-tokens.c
  Clang :: Index/annotate-toplevel-in-objccontainer.m
  Clang :: Index/hidden-redecls.m
  Clang :: Index/index-module-with-vfs.m
  Clang :: Index/index-module.m
  Clang :: Index/index-pch-objc.m
  Clang :: Index/index-pch-with-module.m
  Clang :: Index/index-pch.cpp
  Clang :: Index/targeted-annotation.c
  Clang :: Lexer/SourceLocationsOverflow.c
  Clang-Unit :: ./AllClangUnitTests/PPMemoryAllocationsTest/PPMacroDefinesAllocations
  Clang-Unit :: ./AllClangUnitTests/SourceLocationEncoding/Individual
  Clang-Unit :: ./AllClangUnitTests/SourceLocationEncoding/Sequence
  Clang-Unit :: libclang/./libclangTests/14/53
  Clang-Unit :: libclang/./libclangTests/45/53
  Clang-Unit :: libclang/./libclangTests/47/53
  Clang-Unit :: libclang/./libclangTests/48/53
  Clang-Unit :: libclang/./libclangTests/49/53
  Clang-Unit :: libclang/./libclangTests/50/53
  Clang-Unit :: libclang/./libclangTests/52/53

Fix libclang failures

Fix Rewrite APIs

Fix PPMemoryAllocationsTest

Fix SourceLocationEncodingTest

More unsigned -> SourceLocation::UIntTy changes in the SourceManager APIs

Update the type of std::pair<FileID, unsigned> in CIndex.cpp

Fix SourceLocationEncodingTest

Tweak the SourceLocation Implementation.

The source location has a Bits constant which specifies the number of bits
used for the offset (40 by default).

Make MathExtra templates constexpr

Test Bits=64 perf

Try 48 bits

No bitfields

Fix CallExpr optimization.

Test Bits=64 perf

Switch Bits back to 40.

Reduce SubstNonTypeTemplateParmExpr size: 48 -> 40 bytes

Reduce OpaqueValueExpr: 32 -> 24 bytes

Reduce CXXDependentScopeMemberExpr size: 88 -> 80 bytes

Reduce DeclRefExpr size: 48 -> 40 bytes.

by moving out the two source locations for CXXOpName from DeclarationNameLoc

Fix some merge conflicts.

Move the Loc back to the StmtBitFields if possible to save AST size.

Improve getFileIDLocal binary search.

Optimize binary search by using a dedicated offset table

improve the cache performance

Revert the static_assert change for ObjCContainerDeclBitfields.

Fix the compile failures for include-cleaner.

Fix clang-tidy build.

Fix clangd unittest

Fix windows build failures.

unsigned long is 32 bits on MSVC
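
To illustrate the portability issue this commit message refers to, here is a small sketch (not taken from the patch) of building a 40-bit mask from a fixed-width type. `OffsetBits`, `OffsetMask`, and `truncateToOffset` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative constant; the PR description uses 40 bits by default.
constexpr unsigned OffsetBits = 40;

// Build the mask from a fixed-width 64-bit one so the shift is well defined
// on every platform. Writing `1UL << 40` would shift past the width of
// `unsigned long` where that type is 32 bits (e.g. 64-bit Windows).
constexpr std::uint64_t OffsetMask = (std::uint64_t{1} << OffsetBits) - 1;

// Keep only the low OffsetBits bits of a raw value.
constexpr std::uint64_t truncateToOffset(std::uint64_t Raw) {
  return Raw & OffsetMask;
}

static_assert(OffsetMask == 0xFFFFFFFFFFull, "40 one-bits");
```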

More windows fix

Change the underlying StmtBitField type to uint64_t, fix windows
failures.

This keeps sizeof(Stmt) at 8 bytes.
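
The effect of using a single 64-bit underlying type can be sketched as below. `PackedHeader`, its field names, and `roundTrip` are hypothetical and much simpler than the real StmtBitFields; the sketch only shows why declaring every bit-field as `uint64_t` lets a 40-bit location share one 8-byte word with other flags.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned LocBits = 40; // illustrative; mirrors the 40-bit default

// Hypothetical node header: every bit-field is declared with the same
// 64-bit underlying type, so compilers pack them into a single 8-byte unit.
// (MSVC starts a new allocation unit when adjacent bit-fields have
// underlying types of different sizes, which would grow the struct.)
struct PackedHeader {
  std::uint64_t Kind : 8;
  std::uint64_t Flags : 16;
  std::uint64_t Loc : LocBits;
};

static_assert(sizeof(PackedHeader) == 8, "header stays one 64-bit word");

// Store and reload a raw location through the bit-field.
inline std::uint64_t roundTrip(std::uint64_t RawLoc) {
  PackedHeader H{};
  H.Loc = RawLoc; // values wider than LocBits are truncated
  return H.Loc;
}
```

This is also why accessors in the diff convert with getRawEncoding() and getFromRawEncoding(): a bit-field member holds an integer, not a SourceLocation object.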

More windows fix

Fix merge failures

Update comments for SourceLocation.

clang-format

revert the Rewrite change.

Don't change the FileIDAndOffset type.

Revert the change in ObjCContainerDeclBitfields

Revert the change in HTMLReport.cpp

Revert the unsigned -> UIntTy change in Diagnostic.h

Revert the unsigned->UIntTy change in SourceManager.

revert the binary optimization change.
@llvmbot added the labels clang, clang-tools-extra, clangd, clang-tidy, clang-format, clang:frontend, clang:modules, clang:codegen, clang:as-a-library on Jul 7, 2025
llvmbot commented Jul 7, 2025

@llvm/pr-subscribers-lldb
@llvm/pr-subscribers-clang-modules
@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang-tools-extra

@llvm/pr-subscribers-clangd

Author: Haojian Wu (hokein)

Performance Impact

  • Compile time overhead has been offset by optimizations in #146510, #146782, #146604

  • Peak memory overhead is up to 2.8%. I believe this is a reasonable trade-off.


Patch is 118.38 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/147292.diff

45 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/unittests/SourceCodeTests.cpp (+3-1)
  • (modified) clang/include/clang/AST/ASTContext.h (+2)
  • (modified) clang/include/clang/AST/DeclBase.h (+3-7)
  • (modified) clang/include/clang/AST/DeclObjC.h (+4-2)
  • (modified) clang/include/clang/AST/DeclarationName.h (+22-24)
  • (modified) clang/include/clang/AST/Expr.h (+40-26)
  • (modified) clang/include/clang/AST/ExprCXX.h (+52-29)
  • (modified) clang/include/clang/AST/ExprConcepts.h (+3-2)
  • (modified) clang/include/clang/AST/ExternalASTSource.h (+1-1)
  • (modified) clang/include/clang/AST/Stmt.h (+200-139)
  • (modified) clang/include/clang/Basic/SourceLocation.h (+22-6)
  • (modified) clang/include/clang/Basic/SourceManager.h (+3-3)
  • (modified) clang/include/clang/Sema/MultiplexExternalSemaSource.h (+1-1)
  • (modified) clang/include/clang/Serialization/ASTBitCodes.h (+3-1)
  • (modified) clang/include/clang/Serialization/SourceLocationEncoding.h (+21-12)
  • (modified) clang/lib/AST/ASTContext.cpp (+10-1)
  • (modified) clang/lib/AST/ASTImporter.cpp (+2-1)
  • (modified) clang/lib/AST/DeclarationName.cpp (+1-1)
  • (modified) clang/lib/AST/Expr.cpp (+16-15)
  • (modified) clang/lib/AST/ExprCXX.cpp (+4-4)
  • (modified) clang/lib/AST/ExprConcepts.cpp (+1-1)
  • (modified) clang/lib/AST/ExternalASTSource.cpp (+1-1)
  • (modified) clang/lib/AST/Stmt.cpp (+1-1)
  • (modified) clang/lib/Basic/SourceLocation.cpp (+38)
  • (modified) clang/lib/CodeGen/CoverageMappingGen.cpp (+2-1)
  • (modified) clang/lib/Format/FormatTokenLexer.cpp (+2-1)
  • (modified) clang/lib/Lex/Lexer.cpp (+5-2)
  • (modified) clang/lib/Parse/ParseStmtAsm.cpp (+2-1)
  • (modified) clang/lib/Sema/MultiplexExternalSemaSource.cpp (+1-1)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+8-5)
  • (modified) clang/lib/Sema/SemaLambda.cpp (+2-2)
  • (modified) clang/lib/Sema/SemaOverload.cpp (+6-3)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+2-1)
  • (modified) clang/lib/Serialization/ASTReaderStmt.cpp (+19-14)
  • (modified) clang/lib/Serialization/ASTWriterDecl.cpp (+1-2)
  • (modified) clang/lib/Serialization/ASTWriterStmt.cpp (+2-2)
  • (removed) clang/test/Lexer/SourceLocationsOverflow.c (-38)
  • (modified) clang/tools/libclang/CIndex.cpp (+47-25)
  • (modified) clang/tools/libclang/CXIndexDataConsumer.cpp (+6-4)
  • (modified) clang/tools/libclang/CXSourceLocation.cpp (+44-25)
  • (modified) clang/tools/libclang/CXSourceLocation.h (+19-5)
  • (modified) clang/tools/libclang/Indexing.cpp (+16-9)
  • (modified) clang/unittests/Lex/PPMemoryAllocationsTest.cpp (+1-1)
  • (modified) clang/unittests/Serialization/SourceLocationEncodingTest.cpp (+7-3)
diff --git a/clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp b/clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp
index e1fb42b8210e2..ecde1f7c90080 100644
--- a/clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp
+++ b/clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp
@@ -127,12 +127,12 @@ struct CognitiveComplexity final {
   // https://sonarcloud.io/projects?languages=c%2Ccpp&size=5   we can estimate:
   // value ~20 would result in no allocs for 98% of functions, ~12 for 96%, ~10
   // for 91%, ~8 for 88%, ~6 for 84%, ~4 for 77%, ~2 for 64%, and ~1 for 37%.
-  static_assert(sizeof(Detail) <= 8,
+  static_assert(sizeof(Detail) <= 16,
                 "Since we use SmallVector to minimize the amount of "
                 "allocations, we also need to consider the price we pay for "
                 "that in terms of stack usage. "
                 "Thus, it is good to minimize the size of the Detail struct.");
-  SmallVector<Detail, DefaultLimit> Details; // 25 elements is 200 bytes.
+  SmallVector<Detail, DefaultLimit> Details; // 25 elements is 400 bytes.
   // Yes, 25 is a magic number. This is the seemingly-sane default for the
   // upper limit for function cognitive complexity. Thus it would make sense
   // to avoid allocations for any function that does not violate the limit.
diff --git a/clang-tools-extra/clangd/unittests/SourceCodeTests.cpp b/clang-tools-extra/clangd/unittests/SourceCodeTests.cpp
index 801d535c1b9d0..931241845c54a 100644
--- a/clang-tools-extra/clangd/unittests/SourceCodeTests.cpp
+++ b/clang-tools-extra/clangd/unittests/SourceCodeTests.cpp
@@ -829,7 +829,9 @@ TEST(SourceCodeTests, isSpelledInSource) {
   // FIXME: Should it return false on SourceLocation()? Does it matter?
   EXPECT_TRUE(isSpelledInSource(SourceLocation(), SM));
   EXPECT_FALSE(isSpelledInSource(
-      SourceLocation::getFromRawEncoding(SourceLocation::UIntTy(1 << 31)), SM));
+      SourceLocation::getFromRawEncoding(
+          SourceLocation::UIntTy(1ULL << (SourceLocation::Bits - 1))),
+      SM));
 }
 
 struct IncrementalTestStep {
diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index 2b9cd035623cc..8ae212cc6cc94 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -3356,6 +3356,8 @@ class ASTContext : public RefCountedBase<ASTContext> {
   getTrivialTypeSourceInfo(QualType T,
                            SourceLocation Loc = SourceLocation()) const;
 
+  CXXOperatorSourceInfo *getCXXOperatorSourceInfo(SourceRange R) const;
+
   /// Add a deallocation callback that will be invoked when the
   /// ASTContext is destroyed.
   ///
diff --git a/clang/include/clang/AST/DeclBase.h b/clang/include/clang/AST/DeclBase.h
index dd67ebc9873ff..a76d54ccd8387 100644
--- a/clang/include/clang/AST/DeclBase.h
+++ b/clang/include/clang/AST/DeclBase.h
@@ -1952,17 +1952,13 @@ class DeclContext {
     friend class ObjCContainerDecl;
     /// For the bits in DeclContextBitfields
     LLVM_PREFERRED_TYPE(DeclContextBitfields)
-    uint32_t : NumDeclContextBits;
+    uint64_t : NumDeclContextBits;
 
-    // Not a bitfield but this saves space.
-    // Note that ObjCContainerDeclBitfields is full.
-    SourceLocation AtStart;
+    uint64_t AtStart : SourceLocation::Bits;
   };
 
   /// Number of inherited and non-inherited bits in ObjCContainerDeclBitfields.
-  /// Note that here we rely on the fact that SourceLocation is 32 bits
-  /// wide. We check this with the static_assert in the ctor of DeclContext.
-  enum { NumObjCContainerDeclBits = 64 };
+  enum { NumObjCContainerDeclBits = NumDeclContextBits + SourceLocation::Bits };
 
   /// Stores the bits used by LinkageSpecDecl.
   /// If modified NumLinkageSpecDeclBits and the accessor
diff --git a/clang/include/clang/AST/DeclObjC.h b/clang/include/clang/AST/DeclObjC.h
index 9014d76f8433b..794059012ae9e 100644
--- a/clang/include/clang/AST/DeclObjC.h
+++ b/clang/include/clang/AST/DeclObjC.h
@@ -1090,10 +1090,12 @@ class ObjCContainerDecl : public NamedDecl, public DeclContext {
   /// Note, the superclass's properties are not included in the list.
   virtual void collectPropertiesToImplement(PropertyMap &PM) const {}
 
-  SourceLocation getAtStartLoc() const { return ObjCContainerDeclBits.AtStart; }
+  SourceLocation getAtStartLoc() const {
+    return SourceLocation::getFromRawEncoding(ObjCContainerDeclBits.AtStart);
+  }
 
   void setAtStartLoc(SourceLocation Loc) {
-    ObjCContainerDeclBits.AtStart = Loc;
+    ObjCContainerDeclBits.AtStart = Loc.getRawEncoding();
   }
 
   // Marks the end of the container.
diff --git a/clang/include/clang/AST/DeclarationName.h b/clang/include/clang/AST/DeclarationName.h
index 284228dc0ee47..bbb91fc14fdce 100644
--- a/clang/include/clang/AST/DeclarationName.h
+++ b/clang/include/clang/AST/DeclarationName.h
@@ -682,6 +682,11 @@ class DeclarationNameTable {
   DeclarationName getCXXLiteralOperatorName(const IdentifierInfo *II);
 };
 
+struct CXXOperatorSourceInfo {
+  SourceLocation BeginOpNameLoc;
+  SourceLocation EndOpNameLoc;
+};
+
 /// DeclarationNameLoc - Additional source/type location info
 /// for a declaration name. Needs a DeclarationName in order
 /// to be interpreted correctly.
@@ -698,8 +703,7 @@ class DeclarationNameLoc {
 
   // The location (if any) of the operator keyword is stored elsewhere.
   struct CXXOpName {
-    SourceLocation BeginOpNameLoc;
-    SourceLocation EndOpNameLoc;
+    CXXOperatorSourceInfo *OInfo;
   };
 
   // The location (if any) of the operator keyword is stored elsewhere.
@@ -719,11 +723,6 @@ class DeclarationNameLoc {
 
   void setNamedTypeLoc(TypeSourceInfo *TInfo) { NamedType.TInfo = TInfo; }
 
-  void setCXXOperatorNameRange(SourceRange Range) {
-    CXXOperatorName.BeginOpNameLoc = Range.getBegin();
-    CXXOperatorName.EndOpNameLoc = Range.getEnd();
-  }
-
   void setCXXLiteralOperatorNameLoc(SourceLocation Loc) {
     CXXLiteralOperatorName.OpNameLoc = Loc;
   }
@@ -739,12 +738,16 @@ class DeclarationNameLoc {
 
   /// Return the beginning location of the getCXXOperatorNameRange() range.
   SourceLocation getCXXOperatorNameBeginLoc() const {
-    return CXXOperatorName.BeginOpNameLoc;
+    if (!CXXOperatorName.OInfo)
+      return {};
+    return CXXOperatorName.OInfo->BeginOpNameLoc;
   }
 
   /// Return the end location of the getCXXOperatorNameRange() range.
   SourceLocation getCXXOperatorNameEndLoc() const {
-    return CXXOperatorName.EndOpNameLoc;
+    if (!CXXOperatorName.OInfo)
+      return {};
+    return CXXOperatorName.OInfo->EndOpNameLoc;
   }
 
   /// Return the range of the operator name (without the operator keyword).
@@ -771,15 +774,10 @@ class DeclarationNameLoc {
   }
 
   /// Construct location information for a non-literal C++ operator.
-  static DeclarationNameLoc makeCXXOperatorNameLoc(SourceLocation BeginLoc,
-                                                   SourceLocation EndLoc) {
-    return makeCXXOperatorNameLoc(SourceRange(BeginLoc, EndLoc));
-  }
-
-  /// Construct location information for a non-literal C++ operator.
-  static DeclarationNameLoc makeCXXOperatorNameLoc(SourceRange Range) {
+  static DeclarationNameLoc
+  makeCXXOperatorNameLoc(CXXOperatorSourceInfo *OInfo) {
     DeclarationNameLoc DNL;
-    DNL.setCXXOperatorNameRange(Range);
+    DNL.CXXOperatorName.OInfo = OInfo;
     return DNL;
   }
 
@@ -849,6 +847,13 @@ struct DeclarationNameInfo {
     LocInfo = DeclarationNameLoc::makeNamedTypeLoc(TInfo);
   }
 
+  /// Sets the range of the operator name (without the operator keyword).
+  /// Assumes it is a C++ operator.
+  void setCXXOperatorNameInfo(CXXOperatorSourceInfo *OInfo) {
+    assert(Name.getNameKind() == DeclarationName::CXXOperatorName);
+    LocInfo = DeclarationNameLoc::makeCXXOperatorNameLoc(OInfo);
+  }
+
   /// getCXXOperatorNameRange - Gets the range of the operator name
   /// (without the operator keyword). Assumes it is a (non-literal) operator.
   SourceRange getCXXOperatorNameRange() const {
@@ -857,13 +862,6 @@ struct DeclarationNameInfo {
     return LocInfo.getCXXOperatorNameRange();
   }
 
-  /// setCXXOperatorNameRange - Sets the range of the operator name
-  /// (without the operator keyword). Assumes it is a C++ operator.
-  void setCXXOperatorNameRange(SourceRange R) {
-    assert(Name.getNameKind() == DeclarationName::CXXOperatorName);
-    LocInfo = DeclarationNameLoc::makeCXXOperatorNameLoc(R);
-  }
-
   /// getCXXLiteralOperatorNameLoc - Returns the location of the literal
   /// operator name (not the operator keyword).
   /// Assumes it is a literal operator.
diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index d95396fd59b95..483522547ea77 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -1182,7 +1182,7 @@ class OpaqueValueExpr : public Expr {
                   ExprObjectKind OK = OK_Ordinary, Expr *SourceExpr = nullptr)
       : Expr(OpaqueValueExprClass, T, VK, OK), SourceExpr(SourceExpr) {
     setIsUnique(false);
-    OpaqueValueExprBits.Loc = Loc;
+    OpaqueValueExprBits.Loc = Loc.getRawEncoding();
     setDependence(computeDependence(this));
   }
 
@@ -1195,7 +1195,9 @@ class OpaqueValueExpr : public Expr {
     : Expr(OpaqueValueExprClass, Empty) {}
 
   /// Retrieve the location of this expression.
-  SourceLocation getLocation() const { return OpaqueValueExprBits.Loc; }
+  SourceLocation getLocation() const {
+    return SourceLocation::getFromRawEncoding(OpaqueValueExprBits.Loc);
+  }
 
   SourceLocation getBeginLoc() const LLVM_READONLY {
     return SourceExpr ? SourceExpr->getBeginLoc() : getLocation();
@@ -1270,6 +1272,9 @@ class DeclRefExpr final
   friend class ASTStmtWriter;
   friend TrailingObjects;
 
+  /// The location of the declaration name itself.
+  SourceLocation Loc;
+
   /// The declaration that we are referencing.
   ValueDecl *D;
 
@@ -1341,13 +1346,13 @@ class DeclRefExpr final
     return DeclarationNameInfo(getDecl()->getDeclName(), getLocation(), DNLoc);
   }
 
-  SourceLocation getLocation() const { return DeclRefExprBits.Loc; }
-  void setLocation(SourceLocation L) { DeclRefExprBits.Loc = L; }
+  SourceLocation getLocation() const { return Loc; }
+  void setLocation(SourceLocation L) { Loc = L; }
 
   SourceLocation getBeginLoc() const {
     if (hasQualifier())
       return getQualifierLoc().getBeginLoc();
-    return DeclRefExprBits.Loc;
+    return Loc;
   }
 
   SourceLocation getEndLoc() const LLVM_READONLY;
@@ -2004,6 +2009,9 @@ class PredefinedExpr final
   friend class ASTStmtReader;
   friend TrailingObjects;
 
+  /// The location of this PredefinedExpr.
+  SourceLocation Loc;
+
   // PredefinedExpr is optionally followed by a single trailing
   // "Stmt *" for the predefined identifier. It is present if and only if
   // hasFunctionName() is true and is always a "StringLiteral *".
@@ -2041,8 +2049,8 @@ class PredefinedExpr final
 
   bool isTransparent() const { return PredefinedExprBits.IsTransparent; }
 
-  SourceLocation getLocation() const { return PredefinedExprBits.Loc; }
-  void setLocation(SourceLocation L) { PredefinedExprBits.Loc = L; }
+  SourceLocation getLocation() const { return Loc; }
+  void setLocation(SourceLocation L) { Loc = L; }
 
   StringLiteral *getFunctionName() {
     return hasFunctionName()
@@ -2240,6 +2248,7 @@ class ParenExpr : public Expr {
 class UnaryOperator final
     : public Expr,
       private llvm::TrailingObjects<UnaryOperator, FPOptionsOverride> {
+  SourceLocation Loc;
   Stmt *Val;
 
   FPOptionsOverride &getTrailingFPFeatures() {
@@ -2284,8 +2293,8 @@ class UnaryOperator final
   void setSubExpr(Expr *E) { Val = E; }
 
   /// getOperatorLoc - Return the location of the operator.
-  SourceLocation getOperatorLoc() const { return UnaryOperatorBits.Loc; }
-  void setOperatorLoc(SourceLocation L) { UnaryOperatorBits.Loc = L; }
+  SourceLocation getOperatorLoc() const { return Loc; }
+  void setOperatorLoc(SourceLocation L) { Loc = L; }
 
   /// Returns true if the unary operator can cause an overflow. For instance,
   ///   signed int i = INT_MAX; i++;
@@ -2728,7 +2737,7 @@ class ArraySubscriptExpr : public Expr {
       : Expr(ArraySubscriptExprClass, t, VK, OK) {
     SubExprs[LHS] = lhs;
     SubExprs[RHS] = rhs;
-    ArrayOrMatrixSubscriptExprBits.RBracketLoc = rbracketloc;
+    ArrayOrMatrixSubscriptExprBits.RBracketLoc = rbracketloc.getRawEncoding();
     setDependence(computeDependence(this));
   }
 
@@ -2765,10 +2774,11 @@ class ArraySubscriptExpr : public Expr {
   SourceLocation getEndLoc() const { return getRBracketLoc(); }
 
   SourceLocation getRBracketLoc() const {
-    return ArrayOrMatrixSubscriptExprBits.RBracketLoc;
+    return SourceLocation::getFromRawEncoding(
+        ArrayOrMatrixSubscriptExprBits.RBracketLoc);
   }
   void setRBracketLoc(SourceLocation L) {
-    ArrayOrMatrixSubscriptExprBits.RBracketLoc = L;
+    ArrayOrMatrixSubscriptExprBits.RBracketLoc = L.getRawEncoding();
   }
 
   SourceLocation getExprLoc() const LLVM_READONLY {
@@ -2806,7 +2816,7 @@ class MatrixSubscriptExpr : public Expr {
     SubExprs[BASE] = Base;
     SubExprs[ROW_IDX] = RowIdx;
     SubExprs[COLUMN_IDX] = ColumnIdx;
-    ArrayOrMatrixSubscriptExprBits.RBracketLoc = RBracketLoc;
+    ArrayOrMatrixSubscriptExprBits.RBracketLoc = RBracketLoc.getRawEncoding();
     setDependence(computeDependence(this));
   }
 
@@ -2847,10 +2857,11 @@ class MatrixSubscriptExpr : public Expr {
   }
 
   SourceLocation getRBracketLoc() const {
-    return ArrayOrMatrixSubscriptExprBits.RBracketLoc;
+    return SourceLocation::getFromRawEncoding(
+        ArrayOrMatrixSubscriptExprBits.RBracketLoc);
   }
   void setRBracketLoc(SourceLocation L) {
-    ArrayOrMatrixSubscriptExprBits.RBracketLoc = L;
+    ArrayOrMatrixSubscriptExprBits.RBracketLoc = L.getRawEncoding();
   }
 
   static bool classof(const Stmt *T) {
@@ -2875,9 +2886,6 @@ class MatrixSubscriptExpr : public Expr {
 class CallExpr : public Expr {
   enum { FN = 0, PREARGS_START = 1 };
 
-  /// The number of arguments in the call expression.
-  unsigned NumArgs;
-
   /// The location of the right parentheses. This has a different meaning for
   /// the derived classes of CallExpr.
   SourceLocation RParenLoc;
@@ -2904,7 +2912,7 @@ class CallExpr : public Expr {
   // the begin source location, which has a significant impact on perf as
   // getBeginLoc is assumed to be cheap.
   // The layourt is as follow:
-  // CallExpr | Begin | 4 bytes left | Trailing Objects
+  // CallExpr | Begin  | Trailing Objects
   // CXXMemberCallExpr | Trailing Objects
   // A bit in CallExprBitfields indicates if source locations are present.
 
@@ -3063,7 +3071,7 @@ class CallExpr : public Expr {
   }
 
   /// getNumArgs - Return the number of actual arguments to this call.
-  unsigned getNumArgs() const { return NumArgs; }
+  unsigned getNumArgs() const { return CallExprBits.NumArgs; }
 
   /// Retrieve the call arguments.
   Expr **getArgs() {
@@ -3111,13 +3119,15 @@ class CallExpr : public Expr {
   void shrinkNumArgs(unsigned NewNumArgs) {
     assert((NewNumArgs <= getNumArgs()) &&
            "shrinkNumArgs cannot increase the number of arguments!");
-    NumArgs = NewNumArgs;
+    CallExprBits.NumArgs = NewNumArgs;
   }
 
   /// Bluntly set a new number of arguments without doing any checks whatsoever.
   /// Only used during construction of a CallExpr in a few places in Sema.
   /// FIXME: Find a way to remove it.
-  void setNumArgsUnsafe(unsigned NewNumArgs) { NumArgs = NewNumArgs; }
+  void setNumArgsUnsafe(unsigned NewNumArgs) {
+    CallExprBits.NumArgs = NewNumArgs;
+  }
 
   typedef ExprIterator arg_iterator;
   typedef ConstExprIterator const_arg_iterator;
@@ -3303,6 +3313,8 @@ class MemberExpr final
   /// MemberLoc - This is the location of the member name.
   SourceLocation MemberLoc;
 
+  SourceLocation OperatorLoc;
+
   size_t numTrailingObjects(OverloadToken<NestedNameSpecifierLoc>) const {
     return hasQualifier();
   }
@@ -3464,7 +3476,7 @@ class MemberExpr final
                                MemberLoc, MemberDNLoc);
   }
 
-  SourceLocation getOperatorLoc() const { return MemberExprBits.OperatorLoc; }
+  SourceLocation getOperatorLoc() const { return OperatorLoc; }
 
   bool isArrow() const { return MemberExprBits.IsArrow; }
   void setArrow(bool A) { MemberExprBits.IsArrow = A; }
@@ -3958,6 +3970,7 @@ class CStyleCastExpr final
 class BinaryOperator : public Expr {
   enum { LHS, RHS, END_EXPR };
   Stmt *SubExprs[END_EXPR];
+  SourceLocation OpLoc;
 
 public:
   typedef BinaryOperatorKind Opcode;
@@ -3997,8 +4010,8 @@ class BinaryOperator : public Expr {
                                 ExprObjectKind OK, SourceLocation opLoc,
                                 FPOptionsOverride FPFeatures);
   SourceLocation getExprLoc() const { return getOperatorLoc(); }
-  SourceLocation getOperatorLoc() const { return BinaryOperatorBits.OpLoc; }
-  void setOperatorLoc(SourceLocation L) { BinaryOperatorBits.OpLoc = L; }
+  SourceLocation getOperatorLoc() const { return OpLoc; }
+  void setOperatorLoc(SourceLocation L) { OpLoc = L; }
 
   Opcode getOpcode() const {
     return static_cast<Opcode>(BinaryOperatorBits.Opc);
@@ -6449,7 +6462,8 @@ class GenericSelectionExpr final
   }
 
   SourceLocation getGenericLoc() const {
-    return GenericSelectionExprBits.GenericLoc;
+    return SourceLocation::getFromRawEncoding(
+        GenericSelectionExprBits.GenericLoc);
   }
   SourceLocation getDefaultLoc() const { return DefaultLoc; }
   SourceLocation getRParenLoc() const { return RParenLoc; }
diff --git a/clang/include/clang/AST/ExprCXX.h b/clang/include/clang/AST/ExprCXX.h
index 477373f07f25d..1eac5f80608f3 100644
--- a/clang/include/clang/AST/ExprCXX.h
+++ b/clang/include/clang/AST/ExprCXX.h
@@ -84,7 +84,7 @@ class CXXOperatorCallExpr final : public CallExpr {
   friend class ASTStmtReader;
   friend class ASTStmtWriter;
 
-  SourceRange Range;
+  SourceLocation BeginLoc;
 
   // CXXOperatorCallExpr has some trailing objects belonging
   // to CallExpr. See CallExpr for the details.
@@ -158,9 +158,9 @@ class CXXOperatorCallExpr final : public CallExpr {
                : getOperatorLoc();
   }
 
-  SourceLocation getBeginLoc() const { return Range.getBegin(); }
-  SourceLocation getEndLoc() const { return Range.getEnd(); }
-  SourceRange getSourceRange() const { return Range; }
+  SourceLocation getBeginLoc() const { return BeginLoc; }
+  SourceLocation getEndLoc() const { return getSourceRangeImpl().getEnd(); }
+  SourceRange getSourceRange() const { return getSourceRangeImpl(); }
 
   static bool classof(const Stmt *T) {
     return T->getStmtClass() == CXXOperatorCallExprClass;
@@ -724,7 +724,7 @@ class CXXBoolLiteralExpr : public Expr {
   CXXBoolLiteralExpr(bool Val, QualType Ty, SourceLocation Loc)
       : Expr(CXXBoolLiteralExprClass, Ty, VK_PRValue, OK_Ordinary) {
     CXXBoolLiteralExprBits.Value = Val;
-    CXXBoolLiteralExprBits.Loc = Loc;
+    CXXBoolLiteralExprBits.Loc = Loc.getRawEncoding();
     setDependence(ExprDependence::None);
   }
 
@@ -742,8 +742,12 @@ class CXXBoolLiteralExpr : public Expr {
   SourceLocation getBeginLoc() const { return getLocation(); }
   SourceLocation getEndLoc() const { return getLocation(); }
 
-  SourceLocation getLocation() const { return CXXBoolLiteralExprBits.Loc; }
-  void setLocation(SourceLocation L) { CXXBoolLiteralExprBits.Loc = L; }
+  SourceLocation getLocation() const {
+    return SourceLocation::getFromRawEncoding(CXXBoolLiteralExprBits.Loc);
+  }
+  void setLocation(SourceLocation L) {
+    CXXBoolLiteralExprBits.Loc = L.getRawEncoding();
+  }
 
   static bool classof(const Stmt *T) {
     return T->getStmtClass() == CXXBoolLiteralExprClass;
@@ -768,7 +772,7 @@ class CXXNullPtrLiteralExpr : public Expr {
 public:
   CXXNullPtrLiteralExpr(QualType Ty, SourceLocation Loc)
       : Expr(CXXNullPtrLiteralExprClass, Ty, VK_PRValue, OK_Ordinary) {
-    CXXNullPtrLiteralExprBits.Loc = Loc;
+    CXXNullPtrLiteralExprBits.Loc = Loc.getRawEncoding();
     setDependence(ExprDependence::None);
   }
 
@@ -778,8 +782,12 @@ class CXXNullPtrLiteralExpr : public Expr {
   SourceLocation getBeginLoc() const { return getLocation(); }
   SourceLocation getEndLoc() const { return getLocation(); }
 
-  SourceLocation getLocation() const { return CXXNullPtrLiteralExprBits.Loc; }...
[truncated]
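
The DeclarationName.h hunk above replaces two inline locations with one pointer. A simplified sketch of that indirection follows; `Loc`, `OperatorSourceInfo`, and `OperatorNameLoc` are stand-ins invented for this example, not the real clang classes, and the sketch ignores where the pointed-to object is allocated (the diff adds an ASTContext::getCXXOperatorSourceInfo declaration for that).

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for the types in the diff; not the real clang classes.
struct Loc {
  std::uint64_t Raw = 0; // 0 means "invalid"
  bool isValid() const { return Raw != 0; }
};

struct OperatorSourceInfo { // plays the role of CXXOperatorSourceInfo
  Loc Begin;
  Loc End;
};

// Holds one pointer instead of two 8-byte locations.
struct OperatorNameLoc {
  const OperatorSourceInfo *Info = nullptr;

  // A null pointer yields an invalid location, as in the patched getters.
  Loc getBegin() const { return Info ? Info->Begin : Loc{}; }
  Loc getEnd() const { return Info ? Info->End : Loc{}; }
};

static_assert(sizeof(OperatorNameLoc) == sizeof(void *),
              "one pointer, regardless of location width");
```

The trade-off is an extra pointer dereference on access in exchange for keeping the enclosing union at pointer size.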

@llvmbot
Copy link
Member

llvmbot commented Jul 7, 2025

@llvm/pr-subscribers-clang-tidy

Author: Haojian Wu (hokein)

Changes

This patch extends the SourceLocation class from 4 bytes to 8 bytes, as discussed here https://discourse.llvm.org/t/revisiting-64-bit-source-locations/86556

Key Changes

  • For now, only the lower 40 bits are in use, leaving 24 bits reserved for future use.
    In theory, we can use up to 48 bits within the current encoding scheme, where a SourceLocation is represented as a 64-bit pair &lt;ModuleFileIndex, SourceLocationOffset&gt;, with ModuleFileIndex taking 16 bits.
    Starting with 40 bits should be sufficient; most AST nodes can still store locations within their StmtBitFields. If needed, we can easily adjust this limit in the future.

  • AST Changes:

    • Moved SourceLocation fields out of StmtBitFields in some AST nodes where the bitfields no longer had enough space.
    • Moved the source range of CXXOperatorNameExpr to the heap to avoid increasing the size of DeclarationNameLoc. This minimizes the size growth of frequently used nodes such as DeclRefExpr (which now grows from 32 to 40 bytes).
    • Moved NumArgs into CallExpr's bitfields (using 20 bits, which should be sufficient).
    • The remaining changes are mostly mechanical adjustments.
  • libclang: to maintain ABI compatibility in libclang, we continue using 32-bit source locations there. These are converted from 64-bit locations with bounds checking. If the value does not fit, an invalid source location is returned.
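
The bitfield storage described in the list above can be sketched as follows. This is a minimal illustration, not code from the patch: `Loc` and `BoolLiteralNode` are hypothetical stand-ins for `SourceLocation` and an AST node, and the 8/1/15-bit flag split is an assumption chosen so that the flags plus a 40-bit raw location fill exactly one 64-bit word.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for clang::SourceLocation. Only the raw-encoding
// round trip is modelled; the real class has a much larger interface.
class Loc {
public:
  using UIntTy = std::uint64_t;
  // Number of bits in use, taken from the PR description (40 of 64).
  static constexpr unsigned Bits = 40;

  Loc() = default;

  UIntTy getRawEncoding() const { return ID; }

  static Loc getFromRawEncoding(UIntTy Raw) {
    assert((Raw >> Bits) == 0 && "raw encoding does not fit in Loc::Bits");
    Loc L;
    L.ID = Raw;
    return L;
  }

  // A raw encoding of 0 denotes an invalid location.
  bool isValid() const { return ID != 0; }

private:
  UIntTy ID = 0;
};

// Hypothetical literal node. 24 bits of per-node flags and the 40-bit raw
// location share a single 64-bit word, so the node needs no separate
// location member.
class BoolLiteralNode {
  struct Bitfields {
    std::uint64_t StmtClass : 8;
    std::uint64_t Value : 1;
    std::uint64_t Unused : 15;
    std::uint64_t RawLoc : Loc::Bits;
  };
  Bitfields Data{};

public:
  BoolLiteralNode(bool Value, Loc L) {
    Data.Value = Value;
    Data.RawLoc = L.getRawEncoding();
  }

  bool getValue() const { return Data.Value; }

  // Accessors convert explicitly, because the bitfield holds an integer.
  Loc getLocation() const { return Loc::getFromRawEncoding(Data.RawLoc); }
  void setLocation(Loc L) { Data.RawLoc = L.getRawEncoding(); }
};

// Holds on mainstream 64-bit ABIs, where the four fields pack into one word.
static_assert(sizeof(BoolLiteralNode) == 8, "flags and location share a word");
```

The explicit conversions mirror the pattern visible in the diff below, for example `CXXBoolLiteralExprBits.Loc = Loc.getRawEncoding()` on the write side and `SourceLocation::getFromRawEncoding(...)` on the read side.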

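The libclang rule in the Key Changes list (keep 32-bit locations at the ABI boundary, and return an invalid location when a value does not fit) could look roughly like the sketch below. The helper names `narrowRawLocation` and `widenRawLocation` are hypothetical; the actual conversion code lives in the libclang files listed under "Files Affected" and is not visible in the truncated diff. The sketch relies only on the fact that a raw encoding of 0 denotes an invalid location.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: narrow a 64-bit raw location to the 32-bit field of a
// stable C ABI struct. Returns 0 (the invalid location) when the value does
// not fit, instead of silently truncating it.
inline std::uint32_t narrowRawLocation(std::uint64_t Raw64) {
  if (Raw64 > std::numeric_limits<std::uint32_t>::max())
    return 0;
  return static_cast<std::uint32_t>(Raw64);
}

// Hypothetical helper: widening is always lossless.
inline std::uint64_t widenRawLocation(std::uint32_t Raw32) { return Raw32; }

// Small self-check showing the intended behaviour.
inline void checkNarrowing() {
  // Values that fit survive the round trip unchanged.
  assert(widenRawLocation(narrowRawLocation(42)) == 42);
  // Values above the 32-bit range degrade to the invalid location.
  assert(narrowRawLocation(1ULL << 32) == 0);
}
```

Returning an invalid location rather than truncating means a client of the 32-bit API sees "no location" for positions beyond 4 GiB of source, never a wrong one.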
Performance Impact

  • Compile-time overhead has been offset by the optimizations in #146510, #146782, and #146604.

  • Peak memory overhead is up to 2.8%. I believe this is a reasonable trade-off.


Patch is 118.38 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/147292.diff

45 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/unittests/SourceCodeTests.cpp (+3-1)
  • (modified) clang/include/clang/AST/ASTContext.h (+2)
  • (modified) clang/include/clang/AST/DeclBase.h (+3-7)
  • (modified) clang/include/clang/AST/DeclObjC.h (+4-2)
  • (modified) clang/include/clang/AST/DeclarationName.h (+22-24)
  • (modified) clang/include/clang/AST/Expr.h (+40-26)
  • (modified) clang/include/clang/AST/ExprCXX.h (+52-29)
  • (modified) clang/include/clang/AST/ExprConcepts.h (+3-2)
  • (modified) clang/include/clang/AST/ExternalASTSource.h (+1-1)
  • (modified) clang/include/clang/AST/Stmt.h (+200-139)
  • (modified) clang/include/clang/Basic/SourceLocation.h (+22-6)
  • (modified) clang/include/clang/Basic/SourceManager.h (+3-3)
  • (modified) clang/include/clang/Sema/MultiplexExternalSemaSource.h (+1-1)
  • (modified) clang/include/clang/Serialization/ASTBitCodes.h (+3-1)
  • (modified) clang/include/clang/Serialization/SourceLocationEncoding.h (+21-12)
  • (modified) clang/lib/AST/ASTContext.cpp (+10-1)
  • (modified) clang/lib/AST/ASTImporter.cpp (+2-1)
  • (modified) clang/lib/AST/DeclarationName.cpp (+1-1)
  • (modified) clang/lib/AST/Expr.cpp (+16-15)
  • (modified) clang/lib/AST/ExprCXX.cpp (+4-4)
  • (modified) clang/lib/AST/ExprConcepts.cpp (+1-1)
  • (modified) clang/lib/AST/ExternalASTSource.cpp (+1-1)
  • (modified) clang/lib/AST/Stmt.cpp (+1-1)
  • (modified) clang/lib/Basic/SourceLocation.cpp (+38)
  • (modified) clang/lib/CodeGen/CoverageMappingGen.cpp (+2-1)
  • (modified) clang/lib/Format/FormatTokenLexer.cpp (+2-1)
  • (modified) clang/lib/Lex/Lexer.cpp (+5-2)
  • (modified) clang/lib/Parse/ParseStmtAsm.cpp (+2-1)
  • (modified) clang/lib/Sema/MultiplexExternalSemaSource.cpp (+1-1)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+8-5)
  • (modified) clang/lib/Sema/SemaLambda.cpp (+2-2)
  • (modified) clang/lib/Sema/SemaOverload.cpp (+6-3)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+2-1)
  • (modified) clang/lib/Serialization/ASTReaderStmt.cpp (+19-14)
  • (modified) clang/lib/Serialization/ASTWriterDecl.cpp (+1-2)
  • (modified) clang/lib/Serialization/ASTWriterStmt.cpp (+2-2)
  • (removed) clang/test/Lexer/SourceLocationsOverflow.c (-38)
  • (modified) clang/tools/libclang/CIndex.cpp (+47-25)
  • (modified) clang/tools/libclang/CXIndexDataConsumer.cpp (+6-4)
  • (modified) clang/tools/libclang/CXSourceLocation.cpp (+44-25)
  • (modified) clang/tools/libclang/CXSourceLocation.h (+19-5)
  • (modified) clang/tools/libclang/Indexing.cpp (+16-9)
  • (modified) clang/unittests/Lex/PPMemoryAllocationsTest.cpp (+1-1)
  • (modified) clang/unittests/Serialization/SourceLocationEncodingTest.cpp (+7-3)
diff --git a/clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp b/clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp
index e1fb42b8210e2..ecde1f7c90080 100644
--- a/clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp
+++ b/clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp
@@ -127,12 +127,12 @@ struct CognitiveComplexity final {
   // https://sonarcloud.io/projects?languages=c%2Ccpp&size=5   we can estimate:
   // value ~20 would result in no allocs for 98% of functions, ~12 for 96%, ~10
   // for 91%, ~8 for 88%, ~6 for 84%, ~4 for 77%, ~2 for 64%, and ~1 for 37%.
-  static_assert(sizeof(Detail) <= 8,
+  static_assert(sizeof(Detail) <= 16,
                 "Since we use SmallVector to minimize the amount of "
                 "allocations, we also need to consider the price we pay for "
                 "that in terms of stack usage. "
                 "Thus, it is good to minimize the size of the Detail struct.");
-  SmallVector<Detail, DefaultLimit> Details; // 25 elements is 200 bytes.
+  SmallVector<Detail, DefaultLimit> Details; // 25 elements is 400 bytes.
   // Yes, 25 is a magic number. This is the seemingly-sane default for the
   // upper limit for function cognitive complexity. Thus it would make sense
   // to avoid allocations for any function that does not violate the limit.
diff --git a/clang-tools-extra/clangd/unittests/SourceCodeTests.cpp b/clang-tools-extra/clangd/unittests/SourceCodeTests.cpp
index 801d535c1b9d0..931241845c54a 100644
--- a/clang-tools-extra/clangd/unittests/SourceCodeTests.cpp
+++ b/clang-tools-extra/clangd/unittests/SourceCodeTests.cpp
@@ -829,7 +829,9 @@ TEST(SourceCodeTests, isSpelledInSource) {
   // FIXME: Should it return false on SourceLocation()? Does it matter?
   EXPECT_TRUE(isSpelledInSource(SourceLocation(), SM));
   EXPECT_FALSE(isSpelledInSource(
-      SourceLocation::getFromRawEncoding(SourceLocation::UIntTy(1 << 31)), SM));
+      SourceLocation::getFromRawEncoding(
+          SourceLocation::UIntTy(1ULL << (SourceLocation::Bits - 1))),
+      SM));
 }
 
 struct IncrementalTestStep {
diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index 2b9cd035623cc..8ae212cc6cc94 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -3356,6 +3356,8 @@ class ASTContext : public RefCountedBase<ASTContext> {
   getTrivialTypeSourceInfo(QualType T,
                            SourceLocation Loc = SourceLocation()) const;
 
+  CXXOperatorSourceInfo *getCXXOperatorSourceInfo(SourceRange R) const;
+
   /// Add a deallocation callback that will be invoked when the
   /// ASTContext is destroyed.
   ///
diff --git a/clang/include/clang/AST/DeclBase.h b/clang/include/clang/AST/DeclBase.h
index dd67ebc9873ff..a76d54ccd8387 100644
--- a/clang/include/clang/AST/DeclBase.h
+++ b/clang/include/clang/AST/DeclBase.h
@@ -1952,17 +1952,13 @@ class DeclContext {
     friend class ObjCContainerDecl;
     /// For the bits in DeclContextBitfields
     LLVM_PREFERRED_TYPE(DeclContextBitfields)
-    uint32_t : NumDeclContextBits;
+    uint64_t : NumDeclContextBits;
 
-    // Not a bitfield but this saves space.
-    // Note that ObjCContainerDeclBitfields is full.
-    SourceLocation AtStart;
+    uint64_t AtStart : SourceLocation::Bits;
   };
 
   /// Number of inherited and non-inherited bits in ObjCContainerDeclBitfields.
-  /// Note that here we rely on the fact that SourceLocation is 32 bits
-  /// wide. We check this with the static_assert in the ctor of DeclContext.
-  enum { NumObjCContainerDeclBits = 64 };
+  enum { NumObjCContainerDeclBits = NumDeclContextBits + SourceLocation::Bits };
 
   /// Stores the bits used by LinkageSpecDecl.
   /// If modified NumLinkageSpecDeclBits and the accessor
diff --git a/clang/include/clang/AST/DeclObjC.h b/clang/include/clang/AST/DeclObjC.h
index 9014d76f8433b..794059012ae9e 100644
--- a/clang/include/clang/AST/DeclObjC.h
+++ b/clang/include/clang/AST/DeclObjC.h
@@ -1090,10 +1090,12 @@ class ObjCContainerDecl : public NamedDecl, public DeclContext {
   /// Note, the superclass's properties are not included in the list.
   virtual void collectPropertiesToImplement(PropertyMap &PM) const {}
 
-  SourceLocation getAtStartLoc() const { return ObjCContainerDeclBits.AtStart; }
+  SourceLocation getAtStartLoc() const {
+    return SourceLocation::getFromRawEncoding(ObjCContainerDeclBits.AtStart);
+  }
 
   void setAtStartLoc(SourceLocation Loc) {
-    ObjCContainerDeclBits.AtStart = Loc;
+    ObjCContainerDeclBits.AtStart = Loc.getRawEncoding();
   }
 
   // Marks the end of the container.
diff --git a/clang/include/clang/AST/DeclarationName.h b/clang/include/clang/AST/DeclarationName.h
index 284228dc0ee47..bbb91fc14fdce 100644
--- a/clang/include/clang/AST/DeclarationName.h
+++ b/clang/include/clang/AST/DeclarationName.h
@@ -682,6 +682,11 @@ class DeclarationNameTable {
   DeclarationName getCXXLiteralOperatorName(const IdentifierInfo *II);
 };
 
+struct CXXOperatorSourceInfo {
+  SourceLocation BeginOpNameLoc;
+  SourceLocation EndOpNameLoc;
+};
+
 /// DeclarationNameLoc - Additional source/type location info
 /// for a declaration name. Needs a DeclarationName in order
 /// to be interpreted correctly.
@@ -698,8 +703,7 @@ class DeclarationNameLoc {
 
   // The location (if any) of the operator keyword is stored elsewhere.
   struct CXXOpName {
-    SourceLocation BeginOpNameLoc;
-    SourceLocation EndOpNameLoc;
+    CXXOperatorSourceInfo *OInfo;
   };
 
   // The location (if any) of the operator keyword is stored elsewhere.
@@ -719,11 +723,6 @@ class DeclarationNameLoc {
 
   void setNamedTypeLoc(TypeSourceInfo *TInfo) { NamedType.TInfo = TInfo; }
 
-  void setCXXOperatorNameRange(SourceRange Range) {
-    CXXOperatorName.BeginOpNameLoc = Range.getBegin();
-    CXXOperatorName.EndOpNameLoc = Range.getEnd();
-  }
-
   void setCXXLiteralOperatorNameLoc(SourceLocation Loc) {
     CXXLiteralOperatorName.OpNameLoc = Loc;
   }
@@ -739,12 +738,16 @@ class DeclarationNameLoc {
 
   /// Return the beginning location of the getCXXOperatorNameRange() range.
   SourceLocation getCXXOperatorNameBeginLoc() const {
-    return CXXOperatorName.BeginOpNameLoc;
+    if (!CXXOperatorName.OInfo)
+      return {};
+    return CXXOperatorName.OInfo->BeginOpNameLoc;
   }
 
   /// Return the end location of the getCXXOperatorNameRange() range.
   SourceLocation getCXXOperatorNameEndLoc() const {
-    return CXXOperatorName.EndOpNameLoc;
+    if (!CXXOperatorName.OInfo)
+      return {};
+    return CXXOperatorName.OInfo->EndOpNameLoc;
   }
 
   /// Return the range of the operator name (without the operator keyword).
@@ -771,15 +774,10 @@ class DeclarationNameLoc {
   }
 
   /// Construct location information for a non-literal C++ operator.
-  static DeclarationNameLoc makeCXXOperatorNameLoc(SourceLocation BeginLoc,
-                                                   SourceLocation EndLoc) {
-    return makeCXXOperatorNameLoc(SourceRange(BeginLoc, EndLoc));
-  }
-
-  /// Construct location information for a non-literal C++ operator.
-  static DeclarationNameLoc makeCXXOperatorNameLoc(SourceRange Range) {
+  static DeclarationNameLoc
+  makeCXXOperatorNameLoc(CXXOperatorSourceInfo *OInfo) {
     DeclarationNameLoc DNL;
-    DNL.setCXXOperatorNameRange(Range);
+    DNL.CXXOperatorName.OInfo = OInfo;
     return DNL;
   }
 
@@ -849,6 +847,13 @@ struct DeclarationNameInfo {
     LocInfo = DeclarationNameLoc::makeNamedTypeLoc(TInfo);
   }
 
+  /// Sets the range of the operator name (without the operator keyword).
+  /// Assumes it is a C++ operator.
+  void setCXXOperatorNameInfo(CXXOperatorSourceInfo *OInfo) {
+    assert(Name.getNameKind() == DeclarationName::CXXOperatorName);
+    LocInfo = DeclarationNameLoc::makeCXXOperatorNameLoc(OInfo);
+  }
+
   /// getCXXOperatorNameRange - Gets the range of the operator name
   /// (without the operator keyword). Assumes it is a (non-literal) operator.
   SourceRange getCXXOperatorNameRange() const {
@@ -857,13 +862,6 @@ struct DeclarationNameInfo {
     return LocInfo.getCXXOperatorNameRange();
   }
 
-  /// setCXXOperatorNameRange - Sets the range of the operator name
-  /// (without the operator keyword). Assumes it is a C++ operator.
-  void setCXXOperatorNameRange(SourceRange R) {
-    assert(Name.getNameKind() == DeclarationName::CXXOperatorName);
-    LocInfo = DeclarationNameLoc::makeCXXOperatorNameLoc(R);
-  }
-
   /// getCXXLiteralOperatorNameLoc - Returns the location of the literal
   /// operator name (not the operator keyword).
   /// Assumes it is a literal operator.
diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index d95396fd59b95..483522547ea77 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -1182,7 +1182,7 @@ class OpaqueValueExpr : public Expr {
                   ExprObjectKind OK = OK_Ordinary, Expr *SourceExpr = nullptr)
       : Expr(OpaqueValueExprClass, T, VK, OK), SourceExpr(SourceExpr) {
     setIsUnique(false);
-    OpaqueValueExprBits.Loc = Loc;
+    OpaqueValueExprBits.Loc = Loc.getRawEncoding();
     setDependence(computeDependence(this));
   }
 
@@ -1195,7 +1195,9 @@ class OpaqueValueExpr : public Expr {
     : Expr(OpaqueValueExprClass, Empty) {}
 
   /// Retrieve the location of this expression.
-  SourceLocation getLocation() const { return OpaqueValueExprBits.Loc; }
+  SourceLocation getLocation() const {
+    return SourceLocation::getFromRawEncoding(OpaqueValueExprBits.Loc);
+  }
 
   SourceLocation getBeginLoc() const LLVM_READONLY {
     return SourceExpr ? SourceExpr->getBeginLoc() : getLocation();
@@ -1270,6 +1272,9 @@ class DeclRefExpr final
   friend class ASTStmtWriter;
   friend TrailingObjects;
 
+  /// The location of the declaration name itself.
+  SourceLocation Loc;
+
   /// The declaration that we are referencing.
   ValueDecl *D;
 
@@ -1341,13 +1346,13 @@ class DeclRefExpr final
     return DeclarationNameInfo(getDecl()->getDeclName(), getLocation(), DNLoc);
   }
 
-  SourceLocation getLocation() const { return DeclRefExprBits.Loc; }
-  void setLocation(SourceLocation L) { DeclRefExprBits.Loc = L; }
+  SourceLocation getLocation() const { return Loc; }
+  void setLocation(SourceLocation L) { Loc = L; }
 
   SourceLocation getBeginLoc() const {
     if (hasQualifier())
       return getQualifierLoc().getBeginLoc();
-    return DeclRefExprBits.Loc;
+    return Loc;
   }
 
   SourceLocation getEndLoc() const LLVM_READONLY;
@@ -2004,6 +2009,9 @@ class PredefinedExpr final
   friend class ASTStmtReader;
   friend TrailingObjects;
 
+  /// The location of this PredefinedExpr.
+  SourceLocation Loc;
+
   // PredefinedExpr is optionally followed by a single trailing
   // "Stmt *" for the predefined identifier. It is present if and only if
   // hasFunctionName() is true and is always a "StringLiteral *".
@@ -2041,8 +2049,8 @@ class PredefinedExpr final
 
   bool isTransparent() const { return PredefinedExprBits.IsTransparent; }
 
-  SourceLocation getLocation() const { return PredefinedExprBits.Loc; }
-  void setLocation(SourceLocation L) { PredefinedExprBits.Loc = L; }
+  SourceLocation getLocation() const { return Loc; }
+  void setLocation(SourceLocation L) { Loc = L; }
 
   StringLiteral *getFunctionName() {
     return hasFunctionName()
@@ -2240,6 +2248,7 @@ class ParenExpr : public Expr {
 class UnaryOperator final
     : public Expr,
       private llvm::TrailingObjects<UnaryOperator, FPOptionsOverride> {
+  SourceLocation Loc;
   Stmt *Val;
 
   FPOptionsOverride &getTrailingFPFeatures() {
@@ -2284,8 +2293,8 @@ class UnaryOperator final
   void setSubExpr(Expr *E) { Val = E; }
 
   /// getOperatorLoc - Return the location of the operator.
-  SourceLocation getOperatorLoc() const { return UnaryOperatorBits.Loc; }
-  void setOperatorLoc(SourceLocation L) { UnaryOperatorBits.Loc = L; }
+  SourceLocation getOperatorLoc() const { return Loc; }
+  void setOperatorLoc(SourceLocation L) { Loc = L; }
 
   /// Returns true if the unary operator can cause an overflow. For instance,
   ///   signed int i = INT_MAX; i++;
@@ -2728,7 +2737,7 @@ class ArraySubscriptExpr : public Expr {
       : Expr(ArraySubscriptExprClass, t, VK, OK) {
     SubExprs[LHS] = lhs;
     SubExprs[RHS] = rhs;
-    ArrayOrMatrixSubscriptExprBits.RBracketLoc = rbracketloc;
+    ArrayOrMatrixSubscriptExprBits.RBracketLoc = rbracketloc.getRawEncoding();
     setDependence(computeDependence(this));
   }
 
@@ -2765,10 +2774,11 @@ class ArraySubscriptExpr : public Expr {
   SourceLocation getEndLoc() const { return getRBracketLoc(); }
 
   SourceLocation getRBracketLoc() const {
-    return ArrayOrMatrixSubscriptExprBits.RBracketLoc;
+    return SourceLocation::getFromRawEncoding(
+        ArrayOrMatrixSubscriptExprBits.RBracketLoc);
   }
   void setRBracketLoc(SourceLocation L) {
-    ArrayOrMatrixSubscriptExprBits.RBracketLoc = L;
+    ArrayOrMatrixSubscriptExprBits.RBracketLoc = L.getRawEncoding();
   }
 
   SourceLocation getExprLoc() const LLVM_READONLY {
@@ -2806,7 +2816,7 @@ class MatrixSubscriptExpr : public Expr {
     SubExprs[BASE] = Base;
     SubExprs[ROW_IDX] = RowIdx;
     SubExprs[COLUMN_IDX] = ColumnIdx;
-    ArrayOrMatrixSubscriptExprBits.RBracketLoc = RBracketLoc;
+    ArrayOrMatrixSubscriptExprBits.RBracketLoc = RBracketLoc.getRawEncoding();
     setDependence(computeDependence(this));
   }
 
@@ -2847,10 +2857,11 @@ class MatrixSubscriptExpr : public Expr {
   }
 
   SourceLocation getRBracketLoc() const {
-    return ArrayOrMatrixSubscriptExprBits.RBracketLoc;
+    return SourceLocation::getFromRawEncoding(
+        ArrayOrMatrixSubscriptExprBits.RBracketLoc);
   }
   void setRBracketLoc(SourceLocation L) {
-    ArrayOrMatrixSubscriptExprBits.RBracketLoc = L;
+    ArrayOrMatrixSubscriptExprBits.RBracketLoc = L.getRawEncoding();
   }
 
   static bool classof(const Stmt *T) {
@@ -2875,9 +2886,6 @@ class MatrixSubscriptExpr : public Expr {
 class CallExpr : public Expr {
   enum { FN = 0, PREARGS_START = 1 };
 
-  /// The number of arguments in the call expression.
-  unsigned NumArgs;
-
   /// The location of the right parentheses. This has a different meaning for
   /// the derived classes of CallExpr.
   SourceLocation RParenLoc;
@@ -2904,7 +2912,7 @@ class CallExpr : public Expr {
   // the begin source location, which has a significant impact on perf as
   // getBeginLoc is assumed to be cheap.
   // The layourt is as follow:
-  // CallExpr | Begin | 4 bytes left | Trailing Objects
+  // CallExpr | Begin  | Trailing Objects
   // CXXMemberCallExpr | Trailing Objects
   // A bit in CallExprBitfields indicates if source locations are present.
 
@@ -3063,7 +3071,7 @@ class CallExpr : public Expr {
   }
 
   /// getNumArgs - Return the number of actual arguments to this call.
-  unsigned getNumArgs() const { return NumArgs; }
+  unsigned getNumArgs() const { return CallExprBits.NumArgs; }
 
   /// Retrieve the call arguments.
   Expr **getArgs() {
@@ -3111,13 +3119,15 @@ class CallExpr : public Expr {
   void shrinkNumArgs(unsigned NewNumArgs) {
     assert((NewNumArgs <= getNumArgs()) &&
            "shrinkNumArgs cannot increase the number of arguments!");
-    NumArgs = NewNumArgs;
+    CallExprBits.NumArgs = NewNumArgs;
   }
 
   /// Bluntly set a new number of arguments without doing any checks whatsoever.
   /// Only used during construction of a CallExpr in a few places in Sema.
   /// FIXME: Find a way to remove it.
-  void setNumArgsUnsafe(unsigned NewNumArgs) { NumArgs = NewNumArgs; }
+  void setNumArgsUnsafe(unsigned NewNumArgs) {
+    CallExprBits.NumArgs = NewNumArgs;
+  }
 
   typedef ExprIterator arg_iterator;
   typedef ConstExprIterator const_arg_iterator;
@@ -3303,6 +3313,8 @@ class MemberExpr final
   /// MemberLoc - This is the location of the member name.
   SourceLocation MemberLoc;
 
+  SourceLocation OperatorLoc;
+
   size_t numTrailingObjects(OverloadToken<NestedNameSpecifierLoc>) const {
     return hasQualifier();
   }
@@ -3464,7 +3476,7 @@ class MemberExpr final
                                MemberLoc, MemberDNLoc);
   }
 
-  SourceLocation getOperatorLoc() const { return MemberExprBits.OperatorLoc; }
+  SourceLocation getOperatorLoc() const { return OperatorLoc; }
 
   bool isArrow() const { return MemberExprBits.IsArrow; }
   void setArrow(bool A) { MemberExprBits.IsArrow = A; }
@@ -3958,6 +3970,7 @@ class CStyleCastExpr final
 class BinaryOperator : public Expr {
   enum { LHS, RHS, END_EXPR };
   Stmt *SubExprs[END_EXPR];
+  SourceLocation OpLoc;
 
 public:
   typedef BinaryOperatorKind Opcode;
@@ -3997,8 +4010,8 @@ class BinaryOperator : public Expr {
                                 ExprObjectKind OK, SourceLocation opLoc,
                                 FPOptionsOverride FPFeatures);
   SourceLocation getExprLoc() const { return getOperatorLoc(); }
-  SourceLocation getOperatorLoc() const { return BinaryOperatorBits.OpLoc; }
-  void setOperatorLoc(SourceLocation L) { BinaryOperatorBits.OpLoc = L; }
+  SourceLocation getOperatorLoc() const { return OpLoc; }
+  void setOperatorLoc(SourceLocation L) { OpLoc = L; }
 
   Opcode getOpcode() const {
     return static_cast<Opcode>(BinaryOperatorBits.Opc);
@@ -6449,7 +6462,8 @@ class GenericSelectionExpr final
   }
 
   SourceLocation getGenericLoc() const {
-    return GenericSelectionExprBits.GenericLoc;
+    return SourceLocation::getFromRawEncoding(
+        GenericSelectionExprBits.GenericLoc);
   }
   SourceLocation getDefaultLoc() const { return DefaultLoc; }
   SourceLocation getRParenLoc() const { return RParenLoc; }
diff --git a/clang/include/clang/AST/ExprCXX.h b/clang/include/clang/AST/ExprCXX.h
index 477373f07f25d..1eac5f80608f3 100644
--- a/clang/include/clang/AST/ExprCXX.h
+++ b/clang/include/clang/AST/ExprCXX.h
@@ -84,7 +84,7 @@ class CXXOperatorCallExpr final : public CallExpr {
   friend class ASTStmtReader;
   friend class ASTStmtWriter;
 
-  SourceRange Range;
+  SourceLocation BeginLoc;
 
   // CXXOperatorCallExpr has some trailing objects belonging
   // to CallExpr. See CallExpr for the details.
@@ -158,9 +158,9 @@ class CXXOperatorCallExpr final : public CallExpr {
                : getOperatorLoc();
   }
 
-  SourceLocation getBeginLoc() const { return Range.getBegin(); }
-  SourceLocation getEndLoc() const { return Range.getEnd(); }
-  SourceRange getSourceRange() const { return Range; }
+  SourceLocation getBeginLoc() const { return BeginLoc; }
+  SourceLocation getEndLoc() const { return getSourceRangeImpl().getEnd(); }
+  SourceRange getSourceRange() const { return getSourceRangeImpl(); }
 
   static bool classof(const Stmt *T) {
     return T->getStmtClass() == CXXOperatorCallExprClass;
@@ -724,7 +724,7 @@ class CXXBoolLiteralExpr : public Expr {
   CXXBoolLiteralExpr(bool Val, QualType Ty, SourceLocation Loc)
       : Expr(CXXBoolLiteralExprClass, Ty, VK_PRValue, OK_Ordinary) {
     CXXBoolLiteralExprBits.Value = Val;
-    CXXBoolLiteralExprBits.Loc = Loc;
+    CXXBoolLiteralExprBits.Loc = Loc.getRawEncoding();
     setDependence(ExprDependence::None);
   }
 
@@ -742,8 +742,12 @@ class CXXBoolLiteralExpr : public Expr {
   SourceLocation getBeginLoc() const { return getLocation(); }
   SourceLocation getEndLoc() const { return getLocation(); }
 
-  SourceLocation getLocation() const { return CXXBoolLiteralExprBits.Loc; }
-  void setLocation(SourceLocation L) { CXXBoolLiteralExprBits.Loc = L; }
+  SourceLocation getLocation() const {
+    return SourceLocation::getFromRawEncoding(CXXBoolLiteralExprBits.Loc);
+  }
+  void setLocation(SourceLocation L) {
+    CXXBoolLiteralExprBits.Loc = L.getRawEncoding();
+  }
 
   static bool classof(const Stmt *T) {
     return T->getStmtClass() == CXXBoolLiteralExprClass;
@@ -768,7 +772,7 @@ class CXXNullPtrLiteralExpr : public Expr {
 public:
   CXXNullPtrLiteralExpr(QualType Ty, SourceLocation Loc)
       : Expr(CXXNullPtrLiteralExprClass, Ty, VK_PRValue, OK_Ordinary) {
-    CXXNullPtrLiteralExprBits.Loc = Loc;
+    CXXNullPtrLiteralExprBits.Loc = Loc.getRawEncoding();
     setDependence(ExprDependence::None);
   }
 
@@ -778,8 +782,12 @@ class CXXNullPtrLiteralExpr : public Expr {
   SourceLocation getBeginLoc() const { return getLocation(); }
   SourceLocation getEndLoc() const { return getLocation(); }
 
-  SourceLocation getLocation() const { return CXXNullPtrLiteralExprBits.Loc; }...
[truncated]

@llvmbot

llvmbot commented Jul 7, 2025

@llvm/pr-subscribers-clang

Author: Haojian Wu (hokein)


Patch is 118.38 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/147292.diff

diff --git a/clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp b/clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp
index e1fb42b8210e2..ecde1f7c90080 100644
--- a/clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp
+++ b/clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp
@@ -127,12 +127,12 @@ struct CognitiveComplexity final {
   // https://sonarcloud.io/projects?languages=c%2Ccpp&size=5   we can estimate:
   // value ~20 would result in no allocs for 98% of functions, ~12 for 96%, ~10
   // for 91%, ~8 for 88%, ~6 for 84%, ~4 for 77%, ~2 for 64%, and ~1 for 37%.
-  static_assert(sizeof(Detail) <= 8,
+  static_assert(sizeof(Detail) <= 16,
                 "Since we use SmallVector to minimize the amount of "
                 "allocations, we also need to consider the price we pay for "
                 "that in terms of stack usage. "
                 "Thus, it is good to minimize the size of the Detail struct.");
-  SmallVector<Detail, DefaultLimit> Details; // 25 elements is 200 bytes.
+  SmallVector<Detail, DefaultLimit> Details; // 25 elements is 400 bytes.
   // Yes, 25 is a magic number. This is the seemingly-sane default for the
   // upper limit for function cognitive complexity. Thus it would make sense
   // to avoid allocations for any function that does not violate the limit.
diff --git a/clang-tools-extra/clangd/unittests/SourceCodeTests.cpp b/clang-tools-extra/clangd/unittests/SourceCodeTests.cpp
index 801d535c1b9d0..931241845c54a 100644
--- a/clang-tools-extra/clangd/unittests/SourceCodeTests.cpp
+++ b/clang-tools-extra/clangd/unittests/SourceCodeTests.cpp
@@ -829,7 +829,9 @@ TEST(SourceCodeTests, isSpelledInSource) {
   // FIXME: Should it return false on SourceLocation()? Does it matter?
   EXPECT_TRUE(isSpelledInSource(SourceLocation(), SM));
   EXPECT_FALSE(isSpelledInSource(
-      SourceLocation::getFromRawEncoding(SourceLocation::UIntTy(1 << 31)), SM));
+      SourceLocation::getFromRawEncoding(
+          SourceLocation::UIntTy(1ULL << (SourceLocation::Bits - 1))),
+      SM));
 }
 
 struct IncrementalTestStep {
diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index 2b9cd035623cc..8ae212cc6cc94 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -3356,6 +3356,8 @@ class ASTContext : public RefCountedBase<ASTContext> {
   getTrivialTypeSourceInfo(QualType T,
                            SourceLocation Loc = SourceLocation()) const;
 
+  CXXOperatorSourceInfo *getCXXOperatorSourceInfo(SourceRange R) const;
+
   /// Add a deallocation callback that will be invoked when the
   /// ASTContext is destroyed.
   ///
diff --git a/clang/include/clang/AST/DeclBase.h b/clang/include/clang/AST/DeclBase.h
index dd67ebc9873ff..a76d54ccd8387 100644
--- a/clang/include/clang/AST/DeclBase.h
+++ b/clang/include/clang/AST/DeclBase.h
@@ -1952,17 +1952,13 @@ class DeclContext {
     friend class ObjCContainerDecl;
     /// For the bits in DeclContextBitfields
     LLVM_PREFERRED_TYPE(DeclContextBitfields)
-    uint32_t : NumDeclContextBits;
+    uint64_t : NumDeclContextBits;
 
-    // Not a bitfield but this saves space.
-    // Note that ObjCContainerDeclBitfields is full.
-    SourceLocation AtStart;
+    uint64_t AtStart : SourceLocation::Bits;
   };
 
   /// Number of inherited and non-inherited bits in ObjCContainerDeclBitfields.
-  /// Note that here we rely on the fact that SourceLocation is 32 bits
-  /// wide. We check this with the static_assert in the ctor of DeclContext.
-  enum { NumObjCContainerDeclBits = 64 };
+  enum { NumObjCContainerDeclBits = NumDeclContextBits + SourceLocation::Bits };
 
   /// Stores the bits used by LinkageSpecDecl.
   /// If modified NumLinkageSpecDeclBits and the accessor
diff --git a/clang/include/clang/AST/DeclObjC.h b/clang/include/clang/AST/DeclObjC.h
index 9014d76f8433b..794059012ae9e 100644
--- a/clang/include/clang/AST/DeclObjC.h
+++ b/clang/include/clang/AST/DeclObjC.h
@@ -1090,10 +1090,12 @@ class ObjCContainerDecl : public NamedDecl, public DeclContext {
   /// Note, the superclass's properties are not included in the list.
   virtual void collectPropertiesToImplement(PropertyMap &PM) const {}
 
-  SourceLocation getAtStartLoc() const { return ObjCContainerDeclBits.AtStart; }
+  SourceLocation getAtStartLoc() const {
+    return SourceLocation::getFromRawEncoding(ObjCContainerDeclBits.AtStart);
+  }
 
   void setAtStartLoc(SourceLocation Loc) {
-    ObjCContainerDeclBits.AtStart = Loc;
+    ObjCContainerDeclBits.AtStart = Loc.getRawEncoding();
   }
 
   // Marks the end of the container.
diff --git a/clang/include/clang/AST/DeclarationName.h b/clang/include/clang/AST/DeclarationName.h
index 284228dc0ee47..bbb91fc14fdce 100644
--- a/clang/include/clang/AST/DeclarationName.h
+++ b/clang/include/clang/AST/DeclarationName.h
@@ -682,6 +682,11 @@ class DeclarationNameTable {
   DeclarationName getCXXLiteralOperatorName(const IdentifierInfo *II);
 };
 
+struct CXXOperatorSourceInfo {
+  SourceLocation BeginOpNameLoc;
+  SourceLocation EndOpNameLoc;
+};
+
 /// DeclarationNameLoc - Additional source/type location info
 /// for a declaration name. Needs a DeclarationName in order
 /// to be interpreted correctly.
@@ -698,8 +703,7 @@ class DeclarationNameLoc {
 
   // The location (if any) of the operator keyword is stored elsewhere.
   struct CXXOpName {
-    SourceLocation BeginOpNameLoc;
-    SourceLocation EndOpNameLoc;
+    CXXOperatorSourceInfo *OInfo;
   };
 
   // The location (if any) of the operator keyword is stored elsewhere.
@@ -719,11 +723,6 @@ class DeclarationNameLoc {
 
   void setNamedTypeLoc(TypeSourceInfo *TInfo) { NamedType.TInfo = TInfo; }
 
-  void setCXXOperatorNameRange(SourceRange Range) {
-    CXXOperatorName.BeginOpNameLoc = Range.getBegin();
-    CXXOperatorName.EndOpNameLoc = Range.getEnd();
-  }
-
   void setCXXLiteralOperatorNameLoc(SourceLocation Loc) {
     CXXLiteralOperatorName.OpNameLoc = Loc;
   }
@@ -739,12 +738,16 @@ class DeclarationNameLoc {
 
   /// Return the beginning location of the getCXXOperatorNameRange() range.
   SourceLocation getCXXOperatorNameBeginLoc() const {
-    return CXXOperatorName.BeginOpNameLoc;
+    if (!CXXOperatorName.OInfo)
+      return {};
+    return CXXOperatorName.OInfo->BeginOpNameLoc;
   }
 
   /// Return the end location of the getCXXOperatorNameRange() range.
   SourceLocation getCXXOperatorNameEndLoc() const {
-    return CXXOperatorName.EndOpNameLoc;
+    if (!CXXOperatorName.OInfo)
+      return {};
+    return CXXOperatorName.OInfo->EndOpNameLoc;
   }
 
   /// Return the range of the operator name (without the operator keyword).
@@ -771,15 +774,10 @@ class DeclarationNameLoc {
   }
 
   /// Construct location information for a non-literal C++ operator.
-  static DeclarationNameLoc makeCXXOperatorNameLoc(SourceLocation BeginLoc,
-                                                   SourceLocation EndLoc) {
-    return makeCXXOperatorNameLoc(SourceRange(BeginLoc, EndLoc));
-  }
-
-  /// Construct location information for a non-literal C++ operator.
-  static DeclarationNameLoc makeCXXOperatorNameLoc(SourceRange Range) {
+  static DeclarationNameLoc
+  makeCXXOperatorNameLoc(CXXOperatorSourceInfo *OInfo) {
     DeclarationNameLoc DNL;
-    DNL.setCXXOperatorNameRange(Range);
+    DNL.CXXOperatorName.OInfo = OInfo;
     return DNL;
   }
 
@@ -849,6 +847,13 @@ struct DeclarationNameInfo {
     LocInfo = DeclarationNameLoc::makeNamedTypeLoc(TInfo);
   }
 
+  /// Sets the range of the operator name (without the operator keyword).
+  /// Assumes it is a C++ operator.
+  void setCXXOperatorNameInfo(CXXOperatorSourceInfo *OInfo) {
+    assert(Name.getNameKind() == DeclarationName::CXXOperatorName);
+    LocInfo = DeclarationNameLoc::makeCXXOperatorNameLoc(OInfo);
+  }
+
   /// getCXXOperatorNameRange - Gets the range of the operator name
   /// (without the operator keyword). Assumes it is a (non-literal) operator.
   SourceRange getCXXOperatorNameRange() const {
@@ -857,13 +862,6 @@ struct DeclarationNameInfo {
     return LocInfo.getCXXOperatorNameRange();
   }
 
-  /// setCXXOperatorNameRange - Sets the range of the operator name
-  /// (without the operator keyword). Assumes it is a C++ operator.
-  void setCXXOperatorNameRange(SourceRange R) {
-    assert(Name.getNameKind() == DeclarationName::CXXOperatorName);
-    LocInfo = DeclarationNameLoc::makeCXXOperatorNameLoc(R);
-  }
-
   /// getCXXLiteralOperatorNameLoc - Returns the location of the literal
   /// operator name (not the operator keyword).
   /// Assumes it is a literal operator.
diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index d95396fd59b95..483522547ea77 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -1182,7 +1182,7 @@ class OpaqueValueExpr : public Expr {
                   ExprObjectKind OK = OK_Ordinary, Expr *SourceExpr = nullptr)
       : Expr(OpaqueValueExprClass, T, VK, OK), SourceExpr(SourceExpr) {
     setIsUnique(false);
-    OpaqueValueExprBits.Loc = Loc;
+    OpaqueValueExprBits.Loc = Loc.getRawEncoding();
     setDependence(computeDependence(this));
   }
 
@@ -1195,7 +1195,9 @@ class OpaqueValueExpr : public Expr {
     : Expr(OpaqueValueExprClass, Empty) {}
 
   /// Retrieve the location of this expression.
-  SourceLocation getLocation() const { return OpaqueValueExprBits.Loc; }
+  SourceLocation getLocation() const {
+    return SourceLocation::getFromRawEncoding(OpaqueValueExprBits.Loc);
+  }
 
   SourceLocation getBeginLoc() const LLVM_READONLY {
     return SourceExpr ? SourceExpr->getBeginLoc() : getLocation();
@@ -1270,6 +1272,9 @@ class DeclRefExpr final
   friend class ASTStmtWriter;
   friend TrailingObjects;
 
+  /// The location of the declaration name itself.
+  SourceLocation Loc;
+
   /// The declaration that we are referencing.
   ValueDecl *D;
 
@@ -1341,13 +1346,13 @@ class DeclRefExpr final
     return DeclarationNameInfo(getDecl()->getDeclName(), getLocation(), DNLoc);
   }
 
-  SourceLocation getLocation() const { return DeclRefExprBits.Loc; }
-  void setLocation(SourceLocation L) { DeclRefExprBits.Loc = L; }
+  SourceLocation getLocation() const { return Loc; }
+  void setLocation(SourceLocation L) { Loc = L; }
 
   SourceLocation getBeginLoc() const {
     if (hasQualifier())
       return getQualifierLoc().getBeginLoc();
-    return DeclRefExprBits.Loc;
+    return Loc;
   }
 
   SourceLocation getEndLoc() const LLVM_READONLY;
@@ -2004,6 +2009,9 @@ class PredefinedExpr final
   friend class ASTStmtReader;
   friend TrailingObjects;
 
+  /// The location of this PredefinedExpr.
+  SourceLocation Loc;
+
   // PredefinedExpr is optionally followed by a single trailing
   // "Stmt *" for the predefined identifier. It is present if and only if
   // hasFunctionName() is true and is always a "StringLiteral *".
@@ -2041,8 +2049,8 @@ class PredefinedExpr final
 
   bool isTransparent() const { return PredefinedExprBits.IsTransparent; }
 
-  SourceLocation getLocation() const { return PredefinedExprBits.Loc; }
-  void setLocation(SourceLocation L) { PredefinedExprBits.Loc = L; }
+  SourceLocation getLocation() const { return Loc; }
+  void setLocation(SourceLocation L) { Loc = L; }
 
   StringLiteral *getFunctionName() {
     return hasFunctionName()
@@ -2240,6 +2248,7 @@ class ParenExpr : public Expr {
 class UnaryOperator final
     : public Expr,
       private llvm::TrailingObjects<UnaryOperator, FPOptionsOverride> {
+  SourceLocation Loc;
   Stmt *Val;
 
   FPOptionsOverride &getTrailingFPFeatures() {
@@ -2284,8 +2293,8 @@ class UnaryOperator final
   void setSubExpr(Expr *E) { Val = E; }
 
   /// getOperatorLoc - Return the location of the operator.
-  SourceLocation getOperatorLoc() const { return UnaryOperatorBits.Loc; }
-  void setOperatorLoc(SourceLocation L) { UnaryOperatorBits.Loc = L; }
+  SourceLocation getOperatorLoc() const { return Loc; }
+  void setOperatorLoc(SourceLocation L) { Loc = L; }
 
   /// Returns true if the unary operator can cause an overflow. For instance,
   ///   signed int i = INT_MAX; i++;
@@ -2728,7 +2737,7 @@ class ArraySubscriptExpr : public Expr {
       : Expr(ArraySubscriptExprClass, t, VK, OK) {
     SubExprs[LHS] = lhs;
     SubExprs[RHS] = rhs;
-    ArrayOrMatrixSubscriptExprBits.RBracketLoc = rbracketloc;
+    ArrayOrMatrixSubscriptExprBits.RBracketLoc = rbracketloc.getRawEncoding();
     setDependence(computeDependence(this));
   }
 
@@ -2765,10 +2774,11 @@ class ArraySubscriptExpr : public Expr {
   SourceLocation getEndLoc() const { return getRBracketLoc(); }
 
   SourceLocation getRBracketLoc() const {
-    return ArrayOrMatrixSubscriptExprBits.RBracketLoc;
+    return SourceLocation::getFromRawEncoding(
+        ArrayOrMatrixSubscriptExprBits.RBracketLoc);
   }
   void setRBracketLoc(SourceLocation L) {
-    ArrayOrMatrixSubscriptExprBits.RBracketLoc = L;
+    ArrayOrMatrixSubscriptExprBits.RBracketLoc = L.getRawEncoding();
   }
 
   SourceLocation getExprLoc() const LLVM_READONLY {
@@ -2806,7 +2816,7 @@ class MatrixSubscriptExpr : public Expr {
     SubExprs[BASE] = Base;
     SubExprs[ROW_IDX] = RowIdx;
     SubExprs[COLUMN_IDX] = ColumnIdx;
-    ArrayOrMatrixSubscriptExprBits.RBracketLoc = RBracketLoc;
+    ArrayOrMatrixSubscriptExprBits.RBracketLoc = RBracketLoc.getRawEncoding();
     setDependence(computeDependence(this));
   }
 
@@ -2847,10 +2857,11 @@ class MatrixSubscriptExpr : public Expr {
   }
 
   SourceLocation getRBracketLoc() const {
-    return ArrayOrMatrixSubscriptExprBits.RBracketLoc;
+    return SourceLocation::getFromRawEncoding(
+        ArrayOrMatrixSubscriptExprBits.RBracketLoc);
   }
   void setRBracketLoc(SourceLocation L) {
-    ArrayOrMatrixSubscriptExprBits.RBracketLoc = L;
+    ArrayOrMatrixSubscriptExprBits.RBracketLoc = L.getRawEncoding();
   }
 
   static bool classof(const Stmt *T) {
@@ -2875,9 +2886,6 @@ class MatrixSubscriptExpr : public Expr {
 class CallExpr : public Expr {
   enum { FN = 0, PREARGS_START = 1 };
 
-  /// The number of arguments in the call expression.
-  unsigned NumArgs;
-
   /// The location of the right parentheses. This has a different meaning for
   /// the derived classes of CallExpr.
   SourceLocation RParenLoc;
@@ -2904,7 +2912,7 @@ class CallExpr : public Expr {
   // the begin source location, which has a significant impact on perf as
   // getBeginLoc is assumed to be cheap.
   // The layourt is as follow:
-  // CallExpr | Begin | 4 bytes left | Trailing Objects
+  // CallExpr | Begin  | Trailing Objects
   // CXXMemberCallExpr | Trailing Objects
   // A bit in CallExprBitfields indicates if source locations are present.
 
@@ -3063,7 +3071,7 @@ class CallExpr : public Expr {
   }
 
   /// getNumArgs - Return the number of actual arguments to this call.
-  unsigned getNumArgs() const { return NumArgs; }
+  unsigned getNumArgs() const { return CallExprBits.NumArgs; }
 
   /// Retrieve the call arguments.
   Expr **getArgs() {
@@ -3111,13 +3119,15 @@ class CallExpr : public Expr {
   void shrinkNumArgs(unsigned NewNumArgs) {
     assert((NewNumArgs <= getNumArgs()) &&
            "shrinkNumArgs cannot increase the number of arguments!");
-    NumArgs = NewNumArgs;
+    CallExprBits.NumArgs = NewNumArgs;
   }
 
   /// Bluntly set a new number of arguments without doing any checks whatsoever.
   /// Only used during construction of a CallExpr in a few places in Sema.
   /// FIXME: Find a way to remove it.
-  void setNumArgsUnsafe(unsigned NewNumArgs) { NumArgs = NewNumArgs; }
+  void setNumArgsUnsafe(unsigned NewNumArgs) {
+    CallExprBits.NumArgs = NewNumArgs;
+  }
 
   typedef ExprIterator arg_iterator;
   typedef ConstExprIterator const_arg_iterator;
@@ -3303,6 +3313,8 @@ class MemberExpr final
   /// MemberLoc - This is the location of the member name.
   SourceLocation MemberLoc;
 
+  SourceLocation OperatorLoc;
+
   size_t numTrailingObjects(OverloadToken<NestedNameSpecifierLoc>) const {
     return hasQualifier();
   }
@@ -3464,7 +3476,7 @@ class MemberExpr final
                                MemberLoc, MemberDNLoc);
   }
 
-  SourceLocation getOperatorLoc() const { return MemberExprBits.OperatorLoc; }
+  SourceLocation getOperatorLoc() const { return OperatorLoc; }
 
   bool isArrow() const { return MemberExprBits.IsArrow; }
   void setArrow(bool A) { MemberExprBits.IsArrow = A; }
@@ -3958,6 +3970,7 @@ class CStyleCastExpr final
 class BinaryOperator : public Expr {
   enum { LHS, RHS, END_EXPR };
   Stmt *SubExprs[END_EXPR];
+  SourceLocation OpLoc;
 
 public:
   typedef BinaryOperatorKind Opcode;
@@ -3997,8 +4010,8 @@ class BinaryOperator : public Expr {
                                 ExprObjectKind OK, SourceLocation opLoc,
                                 FPOptionsOverride FPFeatures);
   SourceLocation getExprLoc() const { return getOperatorLoc(); }
-  SourceLocation getOperatorLoc() const { return BinaryOperatorBits.OpLoc; }
-  void setOperatorLoc(SourceLocation L) { BinaryOperatorBits.OpLoc = L; }
+  SourceLocation getOperatorLoc() const { return OpLoc; }
+  void setOperatorLoc(SourceLocation L) { OpLoc = L; }
 
   Opcode getOpcode() const {
     return static_cast<Opcode>(BinaryOperatorBits.Opc);
@@ -6449,7 +6462,8 @@ class GenericSelectionExpr final
   }
 
   SourceLocation getGenericLoc() const {
-    return GenericSelectionExprBits.GenericLoc;
+    return SourceLocation::getFromRawEncoding(
+        GenericSelectionExprBits.GenericLoc);
   }
   SourceLocation getDefaultLoc() const { return DefaultLoc; }
   SourceLocation getRParenLoc() const { return RParenLoc; }
diff --git a/clang/include/clang/AST/ExprCXX.h b/clang/include/clang/AST/ExprCXX.h
index 477373f07f25d..1eac5f80608f3 100644
--- a/clang/include/clang/AST/ExprCXX.h
+++ b/clang/include/clang/AST/ExprCXX.h
@@ -84,7 +84,7 @@ class CXXOperatorCallExpr final : public CallExpr {
   friend class ASTStmtReader;
   friend class ASTStmtWriter;
 
-  SourceRange Range;
+  SourceLocation BeginLoc;
 
   // CXXOperatorCallExpr has some trailing objects belonging
   // to CallExpr. See CallExpr for the details.
@@ -158,9 +158,9 @@ class CXXOperatorCallExpr final : public CallExpr {
                : getOperatorLoc();
   }
 
-  SourceLocation getBeginLoc() const { return Range.getBegin(); }
-  SourceLocation getEndLoc() const { return Range.getEnd(); }
-  SourceRange getSourceRange() const { return Range; }
+  SourceLocation getBeginLoc() const { return BeginLoc; }
+  SourceLocation getEndLoc() const { return getSourceRangeImpl().getEnd(); }
+  SourceRange getSourceRange() const { return getSourceRangeImpl(); }
 
   static bool classof(const Stmt *T) {
     return T->getStmtClass() == CXXOperatorCallExprClass;
@@ -724,7 +724,7 @@ class CXXBoolLiteralExpr : public Expr {
   CXXBoolLiteralExpr(bool Val, QualType Ty, SourceLocation Loc)
       : Expr(CXXBoolLiteralExprClass, Ty, VK_PRValue, OK_Ordinary) {
     CXXBoolLiteralExprBits.Value = Val;
-    CXXBoolLiteralExprBits.Loc = Loc;
+    CXXBoolLiteralExprBits.Loc = Loc.getRawEncoding();
     setDependence(ExprDependence::None);
   }
 
@@ -742,8 +742,12 @@ class CXXBoolLiteralExpr : public Expr {
   SourceLocation getBeginLoc() const { return getLocation(); }
   SourceLocation getEndLoc() const { return getLocation(); }
 
-  SourceLocation getLocation() const { return CXXBoolLiteralExprBits.Loc; }
-  void setLocation(SourceLocation L) { CXXBoolLiteralExprBits.Loc = L; }
+  SourceLocation getLocation() const {
+    return SourceLocation::getFromRawEncoding(CXXBoolLiteralExprBits.Loc);
+  }
+  void setLocation(SourceLocation L) {
+    CXXBoolLiteralExprBits.Loc = L.getRawEncoding();
+  }
 
   static bool classof(const Stmt *T) {
     return T->getStmtClass() == CXXBoolLiteralExprClass;
@@ -768,7 +772,7 @@ class CXXNullPtrLiteralExpr : public Expr {
 public:
   CXXNullPtrLiteralExpr(QualType Ty, SourceLocation Loc)
       : Expr(CXXNullPtrLiteralExprClass, Ty, VK_PRValue, OK_Ordinary) {
-    CXXNullPtrLiteralExprBits.Loc = Loc;
+    CXXNullPtrLiteralExprBits.Loc = Loc.getRawEncoding();
     setDependence(ExprDependence::None);
   }
 
@@ -778,8 +782,12 @@ class CXXNullPtrLiteralExpr : public Expr {
   SourceLocation getBeginLoc() const { return getLocation(); }
   SourceLocation getEndLoc() const { return getLocation(); }
 
-  SourceLocation getLocation() const { return CXXNullPtrLiteralExprBits.Loc; }...
[truncated]
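
Several hunks above replace a `SourceLocation` member of a `*Bits` struct with a `uint64_t` bitfield of width `SourceLocation::Bits`, and convert through `getRawEncoding()` / `getFromRawEncoding()` in the accessors. Below is a minimal standalone sketch of that pattern. `Loc`, `NodeBits`, and `Node` are hypothetical stand-ins rather than clang types, the 40-bit width is assumed from the PR description, and the range check in the setter is an illustration, not something taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for clang::SourceLocation; only the raw-encoding
// round trip is modelled here.
class Loc {
public:
  using UIntTy = std::uint64_t;
  static constexpr unsigned Bits = 40; // assumed width, per the PR description

  Loc() = default;
  static Loc getFromRawEncoding(UIntTy Raw) {
    Loc L;
    L.ID = Raw;
    return L;
  }
  UIntTy getRawEncoding() const { return ID; }
  bool isValid() const { return ID != 0; }

private:
  UIntTy ID = 0;
};

// Hypothetical node bitfields: 24 bits of flags share one 64-bit word with
// the location, so the location is stored as a raw integer.
struct NodeBits {
  std::uint64_t Flags : 24;
  std::uint64_t RawLoc : Loc::Bits;
};
// Holds on mainstream ABIs; bitfield layout is implementation-defined.
static_assert(sizeof(NodeBits) == 8, "flags and location share one word");

class Node {
  NodeBits Bits{};

public:
  // Rebuild the location object from the stored raw bits.
  Loc getLocation() const { return Loc::getFromRawEncoding(Bits.RawLoc); }

  // Store only the raw encoding; the value must fit in Loc::Bits bits.
  void setLocation(Loc L) {
    assert((L.getRawEncoding() >> Loc::Bits) == 0 && "location does not fit");
    Bits.RawLoc = L.getRawEncoding();
  }
};
```

The point of the pattern is that the node keeps its size: the location costs `Loc::Bits` bits inside an existing word instead of a separate 8-byte member.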

Collaborator

@erichkeane erichkeane left a comment

Why is this a new review? What changed from the last one I looked at?

Member

@ChuanqiXu9 ChuanqiXu9 left a comment

Have you tested this with google's internal modules test? If yes, then LG.

}

public:
using RawLocEncoding = uint64_t;
// 16 bits should be sufficient to store the module file index.
constexpr static unsigned ModuleFileIndexBits = 16;
Member

This is inconsistent with the comment above, which says we use 24 (or 23) bits to store the module file index. Personally, I'd prefer more bits, since 2^16 = 65536 looks like a number we could plausibly reach in practice. How about using 20 bits for the module file index?

Collaborator Author

This reflects the "only the lower 16 bits of A are used to store the module file index" comment.

I’m fine with using 20 bits for the module file index (which would still leave us with 4 spare bits that could potentially be used for the offset). However, I’d prefer to make that change in a follow-up patch to keep this one as minimal as possible.

Member

SGTM
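
The trade-off in this thread (16 vs. 20 bits for the module file index, with the rest available for the offset) can be sketched as a simple pack/unpack pair. This is only an illustration: `PackedLoc` is a hypothetical name, and placing the index in the top bits is an assumption made for the sketch, not necessarily how the patch lays out the value.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

using RawLocEncoding = std::uint64_t;

// Hypothetical layout: the module file index occupies the top IndexBits bits
// and the offset the remaining low bits. The real clang encoding may differ.
template <unsigned IndexBits> struct PackedLoc {
  static_assert(IndexBits > 0 && IndexBits < 64,
                "index must leave room for an offset");
  static constexpr unsigned OffsetBits = 64 - IndexBits;
  static constexpr RawLocEncoding OffsetMask =
      (RawLocEncoding(1) << OffsetBits) - 1;
  static constexpr RawLocEncoding MaxIndex =
      (RawLocEncoding(1) << IndexBits) - 1;

  // Pack <ModuleFileIndex, Offset> into one 64-bit value.
  static RawLocEncoding encode(RawLocEncoding ModuleFileIndex,
                               RawLocEncoding Offset) {
    assert(ModuleFileIndex <= MaxIndex && "module file index out of range");
    assert(Offset <= OffsetMask && "offset out of range");
    return (ModuleFileIndex << OffsetBits) | Offset;
  }

  // Split a packed value back into <ModuleFileIndex, Offset>.
  static std::pair<RawLocEncoding, RawLocEncoding> decode(RawLocEncoding Raw) {
    return {Raw >> OffsetBits, Raw & OffsetMask};
  }
};
```

With `PackedLoc<16>` the index tops out at 65535 and 48 bits remain for the offset; with `PackedLoc<20>` the index reaches 1048575 and 44 bits remain, which still covers the 40-bit offsets this patch uses.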

@hokein
Collaborator Author

hokein commented Jul 8, 2025

> Why is this a new review? What changed from the last one I looked at?

Ah, right -- this patch is actually derived from the previous one, and it is simpler. That older patch became a bit outdated since some of its changes were split out and already landed in main. I found it easier to start fresh with a new patch rather than rebasing the old one.

That said, if you’d prefer to stick with the old review (to preserve the initial comments there), I can force-push this patch to the old review instead.

@hokein hokein force-pushed the perf/64-sloc-new4 branch from bc259ea to f7a2219 Compare July 8, 2025 09:52
@hokein hokein requested a review from JDevlieghere as a code owner July 8, 2025 09:52
@llvmbot llvmbot added the lldb label Jul 8, 2025
Labels
clang:as-a-library (libclang and C++ API)
clang:codegen (IR generation bugs: mangling, exceptions, etc.)
clang:frontend (Language frontend issues, e.g. anything involving "Sema")
clang:modules (C++20 modules and Clang Header Modules)
clang (Clang issues not falling into any other category)
clang-format
clang-tidy
clang-tools-extra
clangd
lldb
4 participants