llvm/include/llvm/Analysis/TargetTransformInfo.h (12 additions, 0 deletions)

@@ -499,6 +499,18 @@ class TargetTransformInfo {

LLVM_ABI bool isNoopAddrSpaceCast(unsigned FromAS, unsigned ToAS) const;

// Given an address space cast of the given pointer value, calculate the known
// bits of the source pointer in the source addrspace and the destination
// pointer in the destination addrspace.
LLVM_ABI std::pair<KnownBits, KnownBits>
computeKnownBitsAddrSpaceCast(unsigned ToAS, const Value &PtrOp) const;

// Given an address space cast, calculate the known bits of the resulting ptr
// in the destination addrspace using the known bits of the source pointer in
// the source addrspace.
LLVM_ABI KnownBits computeKnownBitsAddrSpaceCast(
Contributor:
Why do these two functions have to be in TTI?

Contributor Author:
Isn't that the right place? Targets can override the semantics that way if they like. And since the function is split into two parts, it's easier to override only the part you need.

Contributor:
I mean, yes and no. The no part is that I don't see any target-dependent part here that can't be handled with information from the data layout.

Contributor Author:
The addrspacecast. For example, in the case where I cast from a larger address space to a smaller one, I just trunc. This might be valid for all practical cases (given integral pointers), but as far as I understand it is in no way guaranteed.
Also, the anyext for smaller->larger casts is a conservative choice that could be improved with target knowledge, I guess.
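To illustrate the difference, a minimal self-contained sketch (assuming a 64-bit source and a 32-bit destination address space; not part of the patch):

```cpp
// Illustration only: what the default trunc/anyext handling means for
// KnownBits across differently sized address spaces.
#include <cassert>
#include "llvm/Support/KnownBits.h"
using namespace llvm;

void illustrateDefaultCastBits() {
  // A 64-bit pointer whose low 12 bits are known zero (e.g. page aligned).
  KnownBits Ptr64(64);
  Ptr64.Zero.setLowBits(12);

  // Larger -> smaller: truncation keeps the low-bit facts.
  KnownBits Ptr32 = Ptr64.anyextOrTrunc(32);
  assert(Ptr32.countMinTrailingZeros() == 12);

  // Smaller -> larger: anyext says nothing about the 32 new high bits,
  // which is the conservative choice discussed in this thread.
  KnownBits Wide = Ptr32.anyextOrTrunc(64);
  assert(Wide.countMinLeadingZeros() == 0 && Wide.countMinLeadingOnes() == 0);
}
```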

Contributor:
> The addrspacecast.

Then I'd just leave the cast-between-two-address-spaces part in TTI, which makes sense, and leave no default implementation. Even for us, with GAS, AS5->AS0 is not a zero extension.

Contributor Author:
I already replaced the zext with an anyext in a previous commit, so as far as I can see I don't assume anything specific about the casts, except that larger->smaller casts truncate. But even that may be too much to assume.

Contributor Author:
At least in general. For AMDGPU, this assumption was already in the original code, so I guess I can add an override for computeKnownBitsAddrSpaceCast in AMDGPUTargetTransformInfo.

Contributor Author:
The best option would be to add a case for Instruction::AddrSpaceCast to computeKnownBitsFromOperator in ValueTracking.cpp, but since we cannot access TTI functions from there, I guess that's not an option.
This means, however, that computeKnownBitsAddrSpaceCast should stay recursive. Otherwise, addrspacecasts as source values will never be handled correctly, and I don't think the user of computeKnownBitsAddrSpaceCast should be required to build a loop around it.

Contributor Author:
Ping. If there is no really good solution, I guess the best option would be to remove the target implementation of computeKnownBits for AddrSpaceCast that Matt suggested, so that we can at least fix the original bug our fuzzer detected.

Contributor:
There isn't a better solution right now without pulling a dependence on TTI into ValueTracking, which is a bigger change.

Really, TTI doesn't feel like the right place for this. It belongs in some kind of other target-IR information that doesn't depend on codegen, but we do not have such a place today.

unsigned FromAS, unsigned ToAS, const KnownBits &FromPtrBits) const;

/// Return true if globals in this address space can have initializers other
/// than `undef`.
LLVM_ABI bool
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h (47 additions, 0 deletions)

@@ -16,6 +16,7 @@

#include "llvm/Analysis/ScalarEvolutionExpressions.h"
#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/Analysis/VectorUtils.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/GetElementPtrTypeIterator.h"
@@ -151,6 +152,52 @@ class TargetTransformInfoImplBase {
}

virtual bool isNoopAddrSpaceCast(unsigned, unsigned) const { return false; }

virtual std::pair<KnownBits, KnownBits>
computeKnownBitsAddrSpaceCast(unsigned ToAS, const Value &PtrOp) const {
const Type *PtrTy = PtrOp.getType();
assert(PtrTy->isPtrOrPtrVectorTy() &&
"expected pointer or pointer vector type");
unsigned FromAS = PtrTy->getPointerAddressSpace();

if (DL.isNonIntegralAddressSpace(FromAS))
return std::pair(KnownBits(DL.getPointerSizeInBits(FromAS)),
KnownBits(DL.getPointerSizeInBits(ToAS)));

KnownBits FromPtrBits;
if (const AddrSpaceCastInst *CastI = dyn_cast<AddrSpaceCastInst>(&PtrOp)) {
std::pair<KnownBits, KnownBits> KB = computeKnownBitsAddrSpaceCast(
CastI->getDestAddressSpace(), *CastI->getPointerOperand());
FromPtrBits = KB.second;
} else if (FromAS == 0 &&
Member:
It would be great if we could avoid adding new checks for AS==0. I think this can easily be handled even without waiting for #131557, by adding something like an isNullPointerAllZeroes(unsigned AS) function to DataLayout.h. That function can then do the AS==0 check and will make it much easier to extend this to other address spaces in the future.
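A rough sketch of what such a helper could look like (illustration only; isNullPointerAllZeroes does not exist in DataLayout today, so both the free function and the rewritten branch below are hypothetical):

```cpp
// Sketch only: one possible shape for the suggested helper. The name comes
// from the comment above; a real version would likely be a DataLayout member
// driven by target information.
static bool isNullPointerAllZeroes(unsigned AS) {
  // Today only address space 0 is assumed to use the all-zero bit pattern
  // for its null pointer.
  return AS == 0;
}

// The branch in the default implementation could then read:
//   } else if (isNullPointerAllZeroes(FromAS) &&
//              PatternMatch::match(&PtrOp, PatternMatch::m_Zero())) {
// so that future address spaces only need the helper extended, not new
// AS == 0 checks at every call site.
```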

Member:
Thanks!

PatternMatch::match(&PtrOp, PatternMatch::m_Zero())) {
// For addrspace 0, we know that a null pointer has the value 0.
FromPtrBits = KnownBits::makeConstant(
APInt::getZero(DL.getPointerSizeInBits(FromAS)));
} else {
FromPtrBits = computeKnownBits(&PtrOp, DL, nullptr);
}

KnownBits ToPtrBits =
computeKnownBitsAddrSpaceCast(FromAS, ToAS, FromPtrBits);

return std::pair(FromPtrBits, ToPtrBits);
}

virtual KnownBits
computeKnownBitsAddrSpaceCast(unsigned FromAS, unsigned ToAS,
const KnownBits &FromPtrBits) const {
unsigned ToASBitSize = DL.getPointerSizeInBits(ToAS);

if (DL.isNonIntegralAddressSpace(FromAS))
return KnownBits(ToASBitSize);

// By default, we assume that all valid "larger" (e.g. 64-bit) to "smaller"
// (e.g. 32-bit) casts work by chopping off the high bits.
// By default, we do not assume that null results in null again.
return FromPtrBits.anyextOrTrunc(ToASBitSize);
}

virtual bool
canHaveNonUndefGlobalInitializerInAddressSpace(unsigned AS) const {
return AS == 0;
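For reference, a hedged sketch of what a target override of the three-argument hook might look like. MyTTIImpl and isZExtAddrSpaceCast are hypothetical names, and the zero-extension property is an assumed target trait rather than anything claimed for AMDGPU (the thread above notes that AS5->AS0 is not a zero extension there):

```cpp
// Hypothetical override (sketch only): a target whose smaller->larger casts
// are known to zero-extend could refine the conservative anyext default.
KnownBits
MyTTIImpl::computeKnownBitsAddrSpaceCast(unsigned FromAS, unsigned ToAS,
                                         const KnownBits &FromPtrBits) const {
  unsigned ToBits = DL.getPointerSizeInBits(ToAS);
  // Assumed target property, exposed through a hypothetical helper.
  if (ToBits > FromPtrBits.getBitWidth() && isZExtAddrSpaceCast(FromAS, ToAS))
    return FromPtrBits.zext(ToBits);
  // Otherwise keep the conservative default behavior (anyext or trunc).
  return TargetTransformInfoImplBase::computeKnownBitsAddrSpaceCast(
      FromAS, ToAS, FromPtrBits);
}
```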
llvm/lib/Analysis/TargetTransformInfo.cpp (11 additions, 0 deletions)

@@ -330,6 +330,17 @@ bool TargetTransformInfo::isNoopAddrSpaceCast(unsigned FromAS,
return TTIImpl->isNoopAddrSpaceCast(FromAS, ToAS);
}

std::pair<KnownBits, KnownBits>
TargetTransformInfo::computeKnownBitsAddrSpaceCast(unsigned ToAS,
const Value &PtrOp) const {
return TTIImpl->computeKnownBitsAddrSpaceCast(ToAS, PtrOp);
}

KnownBits TargetTransformInfo::computeKnownBitsAddrSpaceCast(
unsigned FromAS, unsigned ToAS, const KnownBits &FromPtrBits) const {
return TTIImpl->computeKnownBitsAddrSpaceCast(FromAS, ToAS, FromPtrBits);
}

bool TargetTransformInfo::canHaveNonUndefGlobalInitializerInAddressSpace(
unsigned AS) const {
return TTIImpl->canHaveNonUndefGlobalInitializerInAddressSpace(AS);
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp (0 additions, 35 deletions)

@@ -1150,41 +1150,6 @@ Value *GCNTTIImpl::rewriteIntrinsicWithAddressSpace(IntrinsicInst *II,
ConstantInt::getTrue(Ctx) : ConstantInt::getFalse(Ctx);
return NewVal;
}
case Intrinsic::ptrmask: {
unsigned OldAS = OldV->getType()->getPointerAddressSpace();
unsigned NewAS = NewV->getType()->getPointerAddressSpace();
Value *MaskOp = II->getArgOperand(1);
Type *MaskTy = MaskOp->getType();

bool DoTruncate = false;

const GCNTargetMachine &TM =
static_cast<const GCNTargetMachine &>(getTLI()->getTargetMachine());
if (!TM.isNoopAddrSpaceCast(OldAS, NewAS)) {
// All valid 64-bit to 32-bit casts work by chopping off the high
// bits. Any masking only clearing the low bits will also apply in the new
// address space.
if (DL.getPointerSizeInBits(OldAS) != 64 ||
DL.getPointerSizeInBits(NewAS) != 32)
return nullptr;

// TODO: Do we need to thread more context in here?
KnownBits Known = computeKnownBits(MaskOp, DL, nullptr, II);
if (Known.countMinLeadingOnes() < 32)
return nullptr;

DoTruncate = true;
}

IRBuilder<> B(II);
if (DoTruncate) {
MaskTy = B.getInt32Ty();
MaskOp = B.CreateTrunc(MaskOp, MaskTy);
}

return B.CreateIntrinsic(Intrinsic::ptrmask, {NewV->getType(), MaskTy},
{NewV, MaskOp});
}
case Intrinsic::amdgcn_flat_atomic_fmax_num:
case Intrinsic::amdgcn_flat_atomic_fmin_num: {
Type *DestTy = II->getType();
llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp (75 additions, 15 deletions)

@@ -206,6 +206,12 @@ class InferAddressSpacesImpl {

bool isSafeToCastConstAddrSpace(Constant *C, unsigned NewAS) const;

Value *clonePtrMaskWithNewAddressSpace(
IntrinsicInst *I, unsigned NewAddrSpace,
const ValueToValueMapTy &ValueWithNewAddrSpace,
const PredicatedAddrSpaceMapTy &PredicatedAS,
SmallVectorImpl<const Use *> *PoisonUsesToFix) const;

Value *cloneInstructionWithNewAddressSpace(
Instruction *I, unsigned NewAddrSpace,
const ValueToValueMapTy &ValueWithNewAddrSpace,
@@ -651,6 +657,69 @@ static Value *operandWithNewAddressSpaceOrCreatePoison(
return PoisonValue::get(NewPtrTy);
}

// A helper function for cloneInstructionWithNewAddressSpace. Handles the
// conversion of a ptrmask intrinsic instruction.
Value *InferAddressSpacesImpl::clonePtrMaskWithNewAddressSpace(
IntrinsicInst *I, unsigned NewAddrSpace,
const ValueToValueMapTy &ValueWithNewAddrSpace,
const PredicatedAddrSpaceMapTy &PredicatedAS,
SmallVectorImpl<const Use *> *PoisonUsesToFix) const {
const Use &PtrOpUse = I->getArgOperandUse(0);
unsigned OldAddrSpace = PtrOpUse->getType()->getPointerAddressSpace();
Value *MaskOp = I->getArgOperand(1);
Type *MaskTy = MaskOp->getType();

std::optional<KnownBits> OldPtrBits;
std::optional<KnownBits> NewPtrBits;
if (!TTI->isNoopAddrSpaceCast(OldAddrSpace, NewAddrSpace)) {
if (std::optional<std::pair<KnownBits, KnownBits>> KB =
TTI->computeKnownBitsAddrSpaceCast(NewAddrSpace, *PtrOpUse.get())) {
OldPtrBits = KB->first;
NewPtrBits = KB->second;
}
}

// If the pointers in both addrspaces have a bitwise representation and if the
// representation of the new pointer is smaller (fewer bits) than the old one,
// check if the mask is applicable to the ptr in the new addrspace. Any
// masking only clearing the low bits will also apply in the new addrspace.
// Note: checking if the mask clears high bits is not sufficient as those
// might have already been 0 in the old ptr.
if (NewPtrBits && OldPtrBits->getBitWidth() > NewPtrBits->getBitWidth()) {
KnownBits MaskBits =
computeKnownBits(MaskOp, *DL, /*AssumptionCache=*/nullptr, I);
// Set all unknown bits of the old ptr to 1, so that we are conservative in
// checking which bits are cleared by the mask.
OldPtrBits->One |= ~OldPtrBits->Zero;
// Check which bits are cleared by the mask in the old ptr.
KnownBits ClearedBits = KnownBits::sub(*OldPtrBits, *OldPtrBits & MaskBits);

// If the mask isn't applicable to the new ptr, leave the ptrmask as-is and
// insert an addrspacecast after it.
if (ClearedBits.countMaxActiveBits() > NewPtrBits->countMaxActiveBits()) {
std::optional<BasicBlock::iterator> InsertPoint =
I->getInsertionPointAfterDef();
assert(InsertPoint && "insertion after ptrmask should be possible");
Type *NewPtrType = getPtrOrVecOfPtrsWithNewAS(I->getType(), NewAddrSpace);
Instruction *AddrSpaceCast =
new AddrSpaceCastInst(I, NewPtrType, "", *InsertPoint);
AddrSpaceCast->setDebugLoc(I->getDebugLoc());
return AddrSpaceCast;
}
}

IRBuilder<> B(I);
if (NewPtrBits) {
MaskTy = MaskTy->getWithNewBitWidth(NewPtrBits->getBitWidth());
MaskOp = B.CreateTrunc(MaskOp, MaskTy);
}
Value *NewPtr = operandWithNewAddressSpaceOrCreatePoison(
PtrOpUse, NewAddrSpace, ValueWithNewAddrSpace, PredicatedAS,
PoisonUsesToFix);
return B.CreateIntrinsic(Intrinsic::ptrmask, {NewPtr->getType(), MaskTy},
{NewPtr, MaskOp});
}

// Returns a clone of `I` with its operands converted to those specified in
// ValueWithNewAddrSpace. Due to potential cycles in the data flow graph, an
// operand whose address space needs to be modified might not exist in
@@ -660,9 +729,6 @@ static Value *operandWithNewAddressSpaceOrCreatePoison(
// Note that we do not necessarily clone `I`, e.g., if it is an addrspacecast
// from a pointer whose type already matches. Therefore, this function returns a
// Value* instead of an Instruction*.
//
// This may also return nullptr in the case the instruction could not be
// rewritten.
Value *InferAddressSpacesImpl::cloneInstructionWithNewAddressSpace(
Instruction *I, unsigned NewAddrSpace,
const ValueToValueMapTy &ValueWithNewAddrSpace,
Expand All @@ -683,17 +749,8 @@ Value *InferAddressSpacesImpl::cloneInstructionWithNewAddressSpace(
// Technically the intrinsic ID is a pointer typed argument, so specially
// handle calls early.
assert(II->getIntrinsicID() == Intrinsic::ptrmask);
Value *NewPtr = operandWithNewAddressSpaceOrCreatePoison(
II->getArgOperandUse(0), NewAddrSpace, ValueWithNewAddrSpace,
PredicatedAS, PoisonUsesToFix);
Value *Rewrite =
TTI->rewriteIntrinsicWithAddressSpace(II, II->getArgOperand(0), NewPtr);
if (Rewrite) {
assert(Rewrite != II && "cannot modify this pointer operation in place");
return Rewrite;
}

return nullptr;
return clonePtrMaskWithNewAddressSpace(
II, NewAddrSpace, ValueWithNewAddrSpace, PredicatedAS, PoisonUsesToFix);
}

unsigned AS = TTI->getAssumedAddrSpace(I);
@@ -1331,7 +1388,10 @@ bool InferAddressSpacesImpl::rewriteWithNewAddressSpaces(

unsigned OperandNo = PoisonUse->getOperandNo();
assert(isa<PoisonValue>(NewV->getOperand(OperandNo)));
NewV->setOperand(OperandNo, ValueWithNewAddrSpace.lookup(PoisonUse->get()));
WeakTrackingVH NewOp = ValueWithNewAddrSpace.lookup(PoisonUse->get());
assert(NewOp &&
"poison replacements in ValueWithNewAddrSpace shouldn't be null");
NewV->setOperand(OperandNo, NewOp);
}

SmallVector<Instruction *, 16> DeadInstructions;
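To make the mask-applicability check in clonePtrMaskWithNewAddressSpace concrete, a small self-contained sketch (assumed widths: 64-bit old pointer, 32-bit new pointer; not part of the patch):

```cpp
// Worked example of the applicability check: with nothing known about the
// old pointer, only a mask that clears low bits survives the transition to
// the narrower address space.
#include "llvm/ADT/APInt.h"
#include "llvm/Support/KnownBits.h"
using namespace llvm;

bool maskStillApplies() {
  // Old pointer: 64 bits, nothing known; conservatively treat it as all ones.
  KnownBits OldPtr(64);
  OldPtr.One |= ~OldPtr.Zero;

  // Mask that only clears the low 12 bits (the constant -4096).
  KnownBits Mask = KnownBits::makeConstant(APInt::getHighBitsSet(64, 52));

  // Bits the mask can clear from the (assumed all-ones) old pointer.
  KnownBits Cleared = KnownBits::sub(OldPtr, OldPtr & Mask);

  // Only the low 12 bits can be cleared; 12 <= 32, so the same masking is
  // valid on the 32-bit pointer and the mask can simply be truncated.
  return Cleared.countMaxActiveBits() <= 32; // true in this example
}
```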