inconsistency alignment of store instructions #1204

Closed
PikachuHyA opened this issue Dec 4, 2024 · 4 comments · May be fixed by #1433
Labels
IR difference A difference in ClangIR-generated LLVM IR that could complicate reusing original CodeGen tests

Comments

@PikachuHyA
Collaborator

While testing #1076 with clang/test/CodeGen/tbaa.cpp, I noticed an inconsistency in the alignment of store instructions between the LLVM IR generated by Clang and the LLVM IR produced by ClangIR.

Here is a simplified example that demonstrates the difference:

// demo.cc
typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned int uint32_t;
typedef unsigned long long uint64_t;

typedef struct {
   uint16_t f16;
   uint32_t f32;
   uint16_t f16_2;
   uint32_t f32_2;
} StructA;

uint32_t g2(uint32_t *s, StructA *A, uint64_t count) {
    *s = 1;
    A->f16 = 4;
    return *s;
}

LLVM IR Generated by Clang

./bin/clang++ -c demo.cc -Xclang -emit-llvm -o demo.orig.ll

The following is the LLVM IR output when compiled with clang:

; demo.orig.ll

define dso_local noundef i32 @_Z2g2PjP7StructAy(ptr noundef %s, ptr noundef %A, i64 noundef %count) #0 {
entry:
  %s.addr = alloca ptr, align 8
  %A.addr = alloca ptr, align 8
  %count.addr = alloca i64, align 8
  store ptr %s, ptr %s.addr, align 8
  store ptr %A, ptr %A.addr, align 8
  store i64 %count, ptr %count.addr, align 8
  %0 = load ptr, ptr %s.addr, align 8
  store i32 1, ptr %0, align 4
  %1 = load ptr, ptr %A.addr, align 8
  %f16 = getelementptr inbounds nuw %struct.StructA, ptr %1, i32 0, i32 0
  ; note: Clang emits align 4 for this store
  store i16 4, ptr %f16, align 4
  %2 = load ptr, ptr %s.addr, align 8
  %3 = load i32, ptr %2, align 4
  ret i32 %3
}

LLVM IR Generated by ClangIR

./bin/clang++ -c demo.cc -Xclang -emit-llvm -o demo.ll -fclangir

In contrast, the LLVM IR produced by ClangIR is as follows:

; demo.ll

define dso_local i32 @_Z2g2PjP7StructAy(ptr %0, ptr %1, i64 %2) #0 {
  %4 = alloca ptr, i64 1, align 8
  %5 = alloca ptr, i64 1, align 8
  %6 = alloca i64, i64 1, align 8
  %7 = alloca i32, i64 1, align 4
  store ptr %0, ptr %4, align 8
  store ptr %1, ptr %5, align 8
  store i64 %2, ptr %6, align 8
  %8 = load ptr, ptr %4, align 8
  store i32 1, ptr %8, align 4
  %9 = load ptr, ptr %5, align 8
  %10 = getelementptr %struct.StructA, ptr %9, i32 0, i32 0
  ; note: ClangIR emits align 2 for this store
  store i16 4, ptr %10, align 2
  %11 = load ptr, ptr %4, align 8
  %12 = load i32, ptr %11, align 4
  store i32 %12, ptr %7, align 4
  %13 = load i32, ptr %7, align 4
  ret i32 %13
}

Comparison

The significant difference lies in this line:

  • In the Clang-generated LLVM IR: store i16 4, ptr %f16, align 4
  • In the ClangIR-generated LLVM IR: store i16 4, ptr %10, align 2
@smeenai
Collaborator

smeenai commented Dec 4, 2024

Interesting. align 2 is valid here, but I guess Clang is taking advantage of the fact that the struct pointer must be 4-byte aligned?
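
The reasoning can be sketched as compile-time checks. This is only an illustration, assuming a typical ABI where uint32_t is 4-byte aligned (for example x86-64); it reuses StructA from the example above and says nothing about how either code generator is implemented.

```cpp
#include <cstddef>
#include <cstdint>

struct StructA {
    std::uint16_t f16;
    std::uint32_t f32;
    std::uint16_t f16_2;
    std::uint32_t f32_2;
};

// The struct takes the strictest alignment among its members (uint32_t here).
// ABI assumption: uint32_t is 4-byte aligned.
static_assert(alignof(StructA) == 4, "assumes uint32_t is 4-byte aligned");

// f16 is the first member, so it lives at the same address as the struct.
static_assert(offsetof(StructA, f16) == 0, "first member sits at offset 0");

// Looked at in isolation, the field type only guarantees 2-byte alignment.
static_assert(alignof(std::uint16_t) == 2, "natural alignment of uint16_t");
```

Since a valid StructA* is 4-byte aligned and f16 sits at offset 0, a store to f16 may be annotated with align 4. The align 2 annotation is the more conservative bound derived from the field type alone.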

@Lancern
Member

Lancern commented Dec 4, 2024

I guess Clang is taking advantage of the fact that the struct pointer must be 4-byte aligned?

Indeed, another interesting example to showcase this:

struct Foo {
  short a;
  short b;
  short c;
  short d;
  int e;   // Make the struct 4-byte aligned
};

void test(Foo *ptr) {
  ptr->a = 1;  // align 4
  ptr->b = 2;  // align 2
  ptr->c = 3;  // align 4
  ptr->d = 4;  // align 2
}

The alignments that the original Clang emits for the four stores in test are 4, 2, 4, 2.

@Lancern Lancern added the IR difference A difference in ClangIR-generated LLVM IR that could complicate reusing original CodeGen tests label Dec 4, 2024
@smeenai
Collaborator

smeenai commented Dec 4, 2024

Yup, and it's also able to do clever things like (https://godbolt.org/z/hP8nbhara):

struct Foo {
  short a;
  short b;
  short c;
  short d;
  long e;   // Make the struct 8-byte aligned
};

void test(Foo *ptr) {
  ptr->a = 1;  // align 8
  ptr->b = 2;  // align 2
  ptr->c = 3;  // align 4
  ptr->d = 4;  // align 2
}
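
Both patterns (4, 2, 4, 2 and 8, 2, 4, 2) are consistent with a simple rule: the alignment known for a member is the largest power of two that divides its offset, capped at the alignment of the struct. The sketch below checks that rule at compile time. The names FooInt, FooLong, lowestSetBit and alignAtOffset are hypothetical, introduced only for this illustration, and the expected values assume an LP64 target where int is 4-byte aligned and long is 8-byte aligned.

```cpp
#include <cstddef>

struct FooInt  { short a, b, c, d; int  e; };  // assumed 4-byte aligned
struct FooLong { short a, b, c, d; long e; };  // assumed 8-byte aligned (LP64)

// Isolate the lowest set bit of x, i.e. the largest power of two dividing x.
constexpr std::size_t lowestSetBit(std::size_t x) { return x & (~x + 1); }

// Alignment that can be assumed at `offset` bytes past a pointer that is
// known to be `base`-aligned (hypothetical helper, not Clang's code).
constexpr std::size_t alignAtOffset(std::size_t base, std::size_t offset) {
    return (offset == 0 || lowestSetBit(offset) > base) ? base
                                                        : lowestSetBit(offset);
}

// int variant: 4, 2, 4, 2
static_assert(alignAtOffset(alignof(FooInt), offsetof(FooInt, a)) == 4, "a");
static_assert(alignAtOffset(alignof(FooInt), offsetof(FooInt, b)) == 2, "b");
static_assert(alignAtOffset(alignof(FooInt), offsetof(FooInt, c)) == 4, "c");
static_assert(alignAtOffset(alignof(FooInt), offsetof(FooInt, d)) == 2, "d");

// long variant: 8, 2, 4, 2
static_assert(alignAtOffset(alignof(FooLong), offsetof(FooLong, a)) == 8, "a");
static_assert(alignAtOffset(alignof(FooLong), offsetof(FooLong, b)) == 2, "b");
static_assert(alignAtOffset(alignof(FooLong), offsetof(FooLong, c)) == 4, "c");
static_assert(alignAtOffset(alignof(FooLong), offsetof(FooLong, d)) == 2, "d");
```

In the long variant, c sits at offset 4, so it can only be assumed 4-byte aligned even though the struct itself is 8-byte aligned.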

@FantasqueX FantasqueX self-assigned this Mar 1, 2025
andykaylor added a commit that referenced this issue May 22, 2025
This change corrects the alignment of store operations and fixes a
related problem with the calculation of member offsets (we weren't
accounting for the alignment of the field whose offset we were
calculating).

Many tests are affected by this, but most just needed a wildcard match
to ignore the explicit alignment which wasn't present before. In cases
where I updated a check for a specific alignment value, I compared
against classic codegen to verify that we are now producing the same
alignment.

Two new tests are added: align-store.c and alignment.cpp. The second of
these partially copies a test of the same name from clang/test/CodeGen.
It's testing globals and isn't directly related to the code changes
here, but we didn't seem to have a test for this. I put the store
alignment tests in a different file because inconsistency between CIR
and LLVM IR in placement of globals would have made a combined test
difficult to follow.

This addresses #1204
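
As a general illustration of why field alignment matters when computing member offsets (this is not the ClangIR code, only the usual layout rule under a typical ABI): a field starts at the end of the preceding members, rounded up to the field's own alignment. The helpers alignTo and fieldOffset below are hypothetical names for this sketch.

```cpp
#include <cstddef>
#include <cstdint>

// Round `value` up to the next multiple of `align` (align must be non-zero).
constexpr std::size_t alignTo(std::size_t value, std::size_t align) {
    return (value + align - 1) / align * align;
}

// Offset of a field placed after `prevEnd` bytes of earlier members.
constexpr std::size_t fieldOffset(std::size_t prevEnd, std::size_t fieldAlign) {
    return alignTo(prevEnd, fieldAlign);
}

struct StructA {
    std::uint16_t f16;
    std::uint32_t f32;
    std::uint16_t f16_2;
    std::uint32_t f32_2;
};

// f16 ends at byte 2; ignoring alignment would put f32 at offset 2,
// but rounding up to alignof(uint32_t) matches the real layout.
static_assert(fieldOffset(sizeof(std::uint16_t), alignof(std::uint32_t))
                  == offsetof(StructA, f32),
              "f32 offset must account for its alignment");
```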
@andykaylor
Collaborator

Fixed by #1637
