forked from llvm/llvm-project
Clang xtensa target #2
Closed
Structs with delicate packing are often larger in MSVC than Itanium. 099a52f did not make sizeof(InputSection) smaller for MSVC. Just exclude MSVC.
Glue operand is only present if there are variadic register operands, which makes it optional. Also, change the number of fixed operands to 1 (the trap ID).
Wide shift nodes produce two results, not one. Reuse the added type profile to define the standard "shift parts" nodes.
It's needed for llvm#116409, which hangs with slow unwind.
…read can hold (llvm#116409) I've run into an issue where TSan can't be used on some code without turning off deadlock detection because a thread tries to hold too many mutexes. It would be preferable to be able to use deadlock detection, as that is a major benefit of TSan. It's mentioned in google/sanitizers#950 that the 64 mutex limit was an arbitrary number. I've increased it to 128 and all the tests still pass. Considering the increasing number of cores on CPUs and how programs can now use more threads to take advantage of them, I think raising the limit to 128 would be some good future proofing. --------- Co-authored-by: Vitaly Buka <[email protected]>
…ies (llvm#115930) This fixes a bug where variadic segment properties would not be elided when printing `prop-dict`.
…6281) Inferring the ARM64EC target can lead to errors. The `-machine:arm64ec` option may include x86_64 input files, and any valid ARM64EC input is also valid for `-machine:arm64x`. MSVC requires an explicit `-machine` argument with informative diagnostics; this patch adopts the same behavior.
…pInterface` interfaces (llvm#99566) This patch adds the `ConvertToLLVMAttrInterface` and `ConvertToLLVMOpInterface` interfaces. It also modifies the `convert-to-llvm` pass to use these interfaces when available. The `ConvertToLLVMAttrInterface` interface allows attributes to configure conversion to LLVM, including the conversion target, LLVM type converter, and populating conversion patterns. See the `NVVMTargetAttr` implementation of this interface for an example of how this interface can be used to configure conversion to LLVM. The `ConvertToLLVMOpInterface` interface collects all convert-to-LLVM attributes stored in an operation. Finally, the `convert-to-llvm` pass was modified to use these interfaces when available. This allows applying `convert-to-llvm` to GPU modules and letting the `NVVMTargetAttr` decide which patterns to populate.
…version script." (llvm#117444) Commit llvm@eaa0a21 has fixed the build problem already so the change in llvm#117342 does not make sense any more. I am reverting it.
Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.
This test fails on the `clang-x64-windows-msvc` builder:
.---command stderr------------
| C:\b\slave\clang-x64-windows-msvc\llvm-project\llvm\test\CodeGen\Hexagon\widen-not-load.ll:7:16: error: CHECK-LABEL: expected string not found in input
| ; CHECK-LABEL: test1
| ^
| <stdin>:1:1: note: scanning from here
| llc.exe: Unknown command line argument '-debug-only=hexagon-load-store-widening'. Try: 'c:\b\slave\clang-x64-windows-msvc\build\stage1\bin\llc.exe --help'
| ^
| <stdin>:1:35: note: possible intended match here
| llc.exe: Unknown command line argument '-debug-only=hexagon-load-store-widening'. Try: 'c:\b\slave\clang-x64-windows-msvc\build\stage1\bin\llc.exe --help'
| ^
The folded load variants almost never require Port5 for length changing conversions (just for SNB ymm cases), and don't typically use an extra uop for the load. Confirmed with a mixture of Agner + uops.info comparisons.
Add complete IvyBridge schedule (which is included in the SandyBridge model, IvyBridge was the first to support F16C) - split rr/rm schedules as they usually have very different port usage. Haswell/Broadwell use Port1 not Port0. Confirmed with a mixture of Agner + uops.info comparisons.
ELF core debugging fix llvm#117070 broke TestLoadUnload.py tests because of the GetModuleSpec call: ProcessGDBRemote fetches modules from the remote. This revises the original PR and renames FindBuildId to FindModuleUUID.
Restructure and slightly simplify code to re-use existing basic blocks.
Allow setting the name to use for the generated IR value of the derived IV in preparations for llvm#112145. This is analogous to VPInstruction::Name.
The existing implementation may trigger infinite cycles when collecting effects above or below the current block after wrapping around a loop-like construct. Limit this case to only looking at the immediate block (loop body). This is correct because wrap-around is intended to consider effects of different iterations of the same loop and shouldn't be exiting the loop block. Reported-by: Fabian Mora <[email protected]> Co-authored-by: Fabian Mora <[email protected]>
…117481) Detected by misc-use-internal-linkage
…ctions with inputs not signed-extended. (llvm#116764) Two options for clang:
-mdiv32: Use div.w[u] and mod.w[u] instructions with input not sign-extended.
-mno-div32: Do not use div.w[u] and mod.w[u] instructions with input not sign-extended.
The default is -mno-div32.
… segments (llvm#92815)" This caused test failures, see comment on the PR:
Failed Tests (2):
BOLT-Unit :: Core/./CoreTests/AArch64/MemoryMapsTester/MultipleSegmentsMismatchedBaseAddress/0
BOLT-Unit :: Core/./CoreTests/X86/MemoryMapsTester/MultipleSegmentsMismatchedBaseAddress/0
> When a binary has multiple text segments, the Size is computed as the
> difference of the last address of these segments from the BaseAddress.
> The base addresses of all text segments must be the same.
>
> Introduces flag 'perf-script-events' for testing. It allows passing perf events
> without BOLT having to parse them using 'perf script'. The flag is used to
> pass a mock perf profile that has two memory mappings for a mock binary
> that has two text segments. The size of the mapping is updated as this
> change `parseMMapEvents` processes all text segments.
This reverts commit 4b71b37.
…y is FixedVectorType. (llvm#117536)
…lvm#116673) Drop commas from split barrier operations assembly format. Signed-off-by: Victor Perez <[email protected]> Depends on llvm#116648, review ec8d354 only. --------- Signed-off-by: Victor Perez <[email protected]>
The pattern `select %x, true, false => %x` is only valid in case that the return type is identical to the type of `%x` (i.e., i1). Hence, the check `isInteger(1)` was replaced with `isSignlessInteger(1)`. Fixes: llvm#117554
This got recently added to SmallVectorExtras: llvm#117460.
This should act like range. Previously ConstantRangeList assumed a 64-bit range. Now query from the actual entries. This also means that the empty range has no bitwidth, so move asserts to avoid checking the bitwidth of empty ranges.
) llvm#116220 clarified that violations of aliasing metadata are UB. Only set the AA metadata after hoisting a load if it is guaranteed to execute in the original loop. PR: llvm#117204
…dFieldReferenceExpr (llvm#116965) The original code assumed that only special methods might be defined as defaulted. Since C++20 comparison operators might be defaulted too, and we *do* want to consider those as using the fields of the class. Fixes: llvm#116961
…ot (llvm#117320) The optimiser will produce empty blocks that are unconditionally executed according to the CFG -- while it may not be meaningful code, and won't get a prologue_end position, we need to not crash on this input. The fault comes from assuming that there's always a next block with some instructions in it, that will eventually produce some meaningful control flow to stop at -- in the given reproducer in issue llvm#117206 this isn't true, because the function terminates with `unreachable`. Thus, I've refactored the "get next instruction logic" into a helper that'll step through all blocks and terminate if there aren't any more. Reproducer from aeubanks
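The "step through all blocks and terminate if there aren't any more" idea can be sketched outside LLVM as follows. This is a simplified model, not the code from the patch: `Block`, `Function`, and `nextInstruction` are hypothetical names, and the sketch walks blocks in layout order rather than following the CFG.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for a function's basic blocks; not LLVM's types.
using Block = std::vector<std::string>; // instructions, possibly empty
using Function = std::vector<Block>;    // blocks in layout order

// Returns the first instruction at or after (block, index), skipping empty
// blocks, or nullptr when no instructions remain. The nullptr case is the
// one the crash came from: nothing guarantees a later non-empty block.
const std::string *nextInstruction(const Function &fn, std::size_t block,
                                   std::size_t index) {
  for (; block < fn.size(); ++block, index = 0) {
    if (index < fn[block].size())
      return &fn[block][index];
  }
  return nullptr;
}
```

A caller would check the result for `nullptr` before dereferencing, instead of assuming that some later block always contains an instruction.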
…es with mismatched streaming attributes (llvm#116391) If `__attribute__((flatten))` is used on a function, or `[[clang::always_inline]]` on a statement, don't inline any callees with incompatible streaming attributes. Without this check, clang may produce incorrect code when these attributes are used in code with streaming functions. Note: The docs for flatten say it can be ignored when inlining is impossible: "causes calls within the attributed function to be inlined unless it is impossible to do so". Similarly, the (clang-only) `[[clang::always_inline]]` statement attribute is more relaxed than the GNU `__attribute__((always_inline))` (which says it should error if it can't inline), saying only "If a statement is marked [[clang::always_inline]] and contains calls, the compiler attempts to inline those calls.". The docs also go on to show an example of where `[[clang::always_inline]]` has no effect.
Currently, the Vector dialect TD file includes the following "vector" type definitions:
```mlir
def AnyVector : VectorOf<[AnyType]>;
def AnyVectorOfAnyRank : VectorOfAnyRankOf<[AnyType]>;
def AnyFixedVector : FixedVectorOf<[AnyType]>;
def AnyScalableVector : ScalableVectorOf<[AnyType]>;
```
In short:
* `AnyVector` _excludes_ 0-D vectors.
* `AnyVectorOfAnyRank`, `AnyFixedVector`, and `AnyScalableVector` _include_ 0-D vectors.

The naming for "groups" that include 0-D vectors is inconsistent and can be misleading, and `AnyVector` implies that 0-D vectors are included, which is not the case. This patch renames these definitions for clarity:
```mlir
def AnyVectorOfNonZeroRank : VectorOfNonZeroRankOf<[AnyType]>;
def AnyVectorOfAnyRank : VectorOfAnyRankOf<[AnyType]>;
def AnyFixedVectorOfAnyRank : FixedVectorOfAnyRank<[AnyType]>;
def AnyScalableVectorOfAnyRank : ScalableVectorOfAnyRank<[AnyType]>;
```
Rationale:
* The updated names are more explicit about 0-D vector support.
* It becomes clearer that scalable vectors currently allow 0-D vectors - this might warrant a revisit.
* The renaming paves the way for adding a new group for "fixed-width vectors excluding 0-D vectors" (e.g., AnyFixedVector), which I plan to introduce in a follow-up patch.
I noticed while working on another test that I never used the PCH trickery to get this to validate that serialization/deserialization works correctly. It DOES, but we weren't testing it with this test like the others.
llvm#117700) This MR fixes the failing test `CodeGen/RISCV/compress-opt-select.ll`. It failed because of the previously merged commit `[TTI][RISCV] Unconditionally break critical edges to sink ADDI (PR llvm#108889)`, so the `compress-opt-select` test was regenerated.
…lvm#117727) This reverts commit 4866447 as requested by the commit author. Buildbots fail: * https://lab.llvm.org/buildbot/#/builders/164/builds/4945 * https://lab.llvm.org/buildbot/#/builders/52/builds/4021
…` is used (llvm#91524) As described in issue llvm#91518, a previous PR llvm#78484 introduced the `defaultMemorySpaceFn` into bufferization options, allowing one to inform OneShotBufferize that it should use a specified function to derive the memory space attribute from the encoding attribute attached to tensor types. However, introducing this feature exposed unhandled edge cases, examples of which are introduced by this change in the new test under `test/Dialect/Bufferization/Transforms/one-shot-bufferize-encodings.mlir`.

Fixing the inconsistencies introduced by `defaultMemorySpaceFn` is pretty simple. This change:
- Updates the `bufferization.to_memref` and `bufferization.to_tensor` operations to explicitly include operand and destination types, whereas previously they relied on type inference to deduce the tensor types. Since the type inference cannot recover the correct tensor encoding/memory space, the operand and result types must be explicitly included. This is a small assembly format change, but it touches a large number of test files.
- Makes minor updates to other bufferization functions to handle the changes in building the above ops.
- Updates bufferization of `tensor.from_elements` to handle memory space.

Integration/upgrade guide: In downstream projects, if you have tests or MLIR files that explicitly use `bufferization.to_tensor` or `bufferization.to_memref`, then update them to the new assembly format as follows:
```
%1 = bufferization.to_memref %0 : memref<10xf32>
%2 = bufferization.to_tensor %1 : memref<10xf32>
```
becomes
```
%1 = bufferization.to_memref %0 : tensor<10xf32> to memref<10xf32>
%2 = bufferization.to_tensor %1 : memref<10xf32> to tensor<10xf32>
```
In order to align with `svext` and NEON `vext`/`vextq`, this patch changes the immediate argument in `svextq` such that it refers to elements of the size of those of the source vector, rather than bytes. The [spec for this intrinsic](https://github.com/ARM-software/acle/blob/main/main/acle.md#extq) is ambiguous about the meaning of this argument; the issue was raised after the implementers of the ACLE in GCC interpreted it differently.

For example (with our current implementation): `svextq_f64(zn_f64, zm_f64, 1)` would, for each 128-bit segment of `zn_f64`, concatenate the highest 15 bytes of this segment with the first byte of the corresponding segment of `zm_f64`. After this patch, the behavior of `svextq_f64(zn_f64, zm_f64, 1)` would be, for each 128-bit vector segment of `zn_f64`, to concatenate the higher doubleword of this segment with the lower doubleword of the corresponding segment of `zm_f64`.

The range of the immediate argument in `svextq` would be modified such that it is:
- [0,15] for `svextq_{s8,u8}`
- [0,7] for `svextq_{s16,u16,f16,bf16}`
- [0,3] for `svextq_{s32,u32,f32}`
- [0,1] for `svextq_{s64,u64,f64}`
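The element-indexed interpretation can be illustrated with a scalar model of a single 128-bit segment. This is only a sketch in plain C++, not the ACLE intrinsic or its lowering; `extqSegment` is a hypothetical name, and the model assumes the result takes elements `imm` onward from the first operand followed by the lowest `imm` elements of the second.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar model of one 128-bit segment (assumption: element-indexed
// immediate). The result is elements imm..N-1 of zn followed by elements
// 0..imm-1 of zm, so the valid immediate range is [0, N-1].
template <typename T, std::size_t N>
std::array<T, N> extqSegment(const std::array<T, N> &zn,
                             const std::array<T, N> &zm, std::size_t imm) {
  static_assert(N * sizeof(T) == 16, "a segment is 128 bits wide");
  assert(imm < N && "immediate must be in [0, N-1]");
  std::array<T, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    std::size_t src = i + imm;
    out[i] = src < N ? zn[src] : zm[src - N];
  }
  return out;
}
```

With two `double` elements per segment and an immediate of 1, the model yields the higher doubleword of the first operand followed by the lower doubleword of the second, which is the post-patch behavior described above. Instantiating it with 16 one-byte elements gives the earlier byte-indexed reading.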
compress is intended to match vcompress from the ISA manual. Note that deinterleave is a subset of this, and is already tested elsewhere. decompress is the synthetic pattern defined in same - though we can often do better than the mentioned iota/vrgather. Note that some of these can also be expressed as interleave with at least one undef source, and are already tested elsewhere. repeat repeats each input element N times in the output. It can be described as an interleave operation, but we can sometimes do better lowering wise.
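A scalar model may make the three shuffle families easier to compare. The sketch below is plain C++ rather than the RVV lowering; the function names, the `passthru` parameter, and the choice to return only the packed elements from `compress` (ignoring tail elements) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// compress: pack the elements whose mask bit is set into the low positions.
std::vector<int> compress(const std::vector<int> &src,
                          const std::vector<bool> &mask) {
  assert(src.size() == mask.size());
  std::vector<int> out;
  for (std::size_t i = 0; i < src.size(); ++i)
    if (mask[i])
      out.push_back(src[i]);
  return out;
}

// decompress: the inverse pattern. Consecutive source elements are written
// to the positions whose mask bit is set; other positions keep passthru.
std::vector<int> decompress(const std::vector<int> &src,
                            const std::vector<bool> &mask,
                            const std::vector<int> &passthru) {
  assert(mask.size() == passthru.size());
  std::vector<int> out = passthru;
  std::size_t next = 0;
  for (std::size_t i = 0; i < mask.size(); ++i) {
    if (mask[i]) {
      assert(next < src.size());
      out[i] = src[next++];
    }
  }
  return out;
}

// repeat: each input element appears n times in a row in the output.
std::vector<int> repeat(const std::vector<int> &src, std::size_t n) {
  std::vector<int> out;
  out.reserve(src.size() * n);
  for (int v : src)
    out.insert(out.end(), n, v);
  return out;
}
```

In this model a deinterleave corresponds to `compress` with a regularly strided mask, which is why it is described above as a subset.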
We should leave these for EXPENSIVE_CHECKS builds. Some of these were near the top of slowest tests.
…non build dependent size (llvm#117604) In llvm#110065 the changes to the LinuxSigInfo struct introduced some variables whose size differs between 32-bit and 64-bit builds. I've rectified this by setting them all to build-independent types.
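The general technique can be sketched as below. The struct is hypothetical and does not reproduce the real LinuxSigInfo layout; it only shows how fixed-width types from `<cstdint>` keep a record's size the same on 32-bit and 64-bit builds, where `long` or pointer members would not.

```cpp
#include <cstdint>

// Hypothetical record, not the actual LinuxSigInfo layout.
struct SigInfoRecord {
  std::int32_t signo;          // fixed width, unlike `long`
  std::int32_t code;
  std::uint64_t fault_address; // wide enough for either pointer size
};

// The size is the same regardless of the build's pointer width.
static_assert(sizeof(SigInfoRecord) == 16,
              "record layout must not depend on the build");
```

A `static_assert` like this turns an accidental reintroduction of a build-dependent member into a compile error rather than a silent layout mismatch.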
Support SV_GroupID attribute. Translate it into dx.group.id in clang codeGen. Fixes: llvm#70120
This is another clause where the parsing does all the required enforcement besides the construct it appertains to, so this patch removes the restriction and adds sufficient test coverage for combined constructs.
This PR implements support for a generic Xtensa target in Clang.