The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github pull requests at this moment. Please submit your patches at http://reviews.llvm.org.
Extension that adds support for method calls on primitive types in PHP
[ValueTracking] Fix incorrect computeConstantRange() arguments
The second argument is ForSigned, not UseInstrInfo.
[InstCombine] Add extra test for non-overflowing usub.sat (NFC)
Same as the existing one, but with both nuw and nsw on the add.
[InstCombine] Fold more intrinsics over selects
Move this handling to a centralized place and extend it to handle saturating add/sub intrinsics.
I originally wanted to make this fully generic rather than whitelist based, because this is legal and likely profitable for all speculatable intrinsics. The caveat is that for vector selects, the intrinsic can't perform cross-lane operations like a shuffle or reduction, which we don't really expose as a generic property right now. So for now I'm just extending the list.
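The cross-lane caveat can be illustrated with a small model in plain C++. This is only a sketch: `uadd_sat`, `uadd_sat_lanes`, `select_lanes`, and `reduce_add` are hypothetical helpers invented for this example, not LLVM APIs, and 4-lane 8-bit vectors are an arbitrary choice.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<std::uint8_t, 4>;
using Mask4 = std::array<bool, 4>;

// Hypothetical scalar model of an unsigned saturating add on 8-bit values.
inline std::uint8_t uadd_sat(std::uint8_t a, std::uint8_t b) {
    unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
    return static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
}

// Lane-wise operation: lane i of the result depends only on lane i of the inputs.
inline Vec4 uadd_sat_lanes(const Vec4& a, const Vec4& b) {
    Vec4 r{};
    for (std::size_t i = 0; i < r.size(); ++i) r[i] = uadd_sat(a[i], b[i]);
    return r;
}

// Vector select: every lane has its own condition bit.
inline Vec4 select_lanes(const Mask4& c, const Vec4& a, const Vec4& b) {
    Vec4 r{};
    for (std::size_t i = 0; i < r.size(); ++i) r[i] = c[i] ? a[i] : b[i];
    return r;
}

// Cross-lane operation: the result mixes all lanes, so no per-lane condition
// can choose between reduce_add(a) and reduce_add(b) after the fact.
inline unsigned reduce_add(const Vec4& v) {
    unsigned sum = 0;
    for (std::uint8_t x : v) sum += x;
    return sum;
}
```

For the lane-wise op, `uadd_sat_lanes(select_lanes(c, a, b), x)` equals `select_lanes(c, uadd_sat_lanes(a, x), uadd_sat_lanes(b, x))` for any mask. For `reduce_add`, a mixed mask generally yields a sum that matches neither `reduce_add(a)` nor `reduce_add(b)`, so the same rewrite is not available.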
[InstCombine] Regenerate test checks (NFC)
[InstCombine] Add additional test cases for folding intrinsic into select (NFC)
Test cross-lane intrinsics with vector selects.
[mlir][Transform] NFC - Fix spurious reflows
Revert "[AMDGPU] Select v_sat_pk_u8_i16"
This reverts commit 64b45db34a0cd979dae9ca3016e9da517e57b987.
Reason: the patterns are wrong which can result in a miscompilation. However, fixing the pattern is not trivial due to how i8 values are handled, and due to the additional type-checking performed by D147127: trunc/smax/smin are all defined as int ops in the DAG despite them working on vectors too.
As this is not a much-needed pattern, I prefer reverting for now until I can find time to properly rewrite the pattern.
[mlir] Use GenericAdaptor to simplify 1:N type conversion API.
For 1:N type conversion, there is a 1:N relationship between the original operands and the converted operands. The same is true for the results. The previous design passed an instance of a "mapping" class into each pattern that helped with handling this 1:N correspondence. However, this was still rather manual and, in particular, it required the use of magic constants for the indices of the different operands.
This commit uses the GenericAdaptor class that is generated for each op class in order to simplify this relationship further. The GenericAdaptor can wrap a list of arbitrary types for each operand (via templating); for 1:N type conversion, this allows the operand accessors of the adaptor class to return a ValueRange that corresponds to the N values in the converted types. Patterns can thus use the named accessors instead of magic constants, which eliminates a common class of errors.
This commit further simplifies the API that patterns need to implement by making the operand and result type mappings part of the adaptor. Since many patterns only need one of the two (or even neither), this reduces the number of unnecessary arguments in many cases.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D147225
[MLIR][OpenMP][Flang] Set OpenMP target attributes in MLIR module
Scope of changes:
Differential Revision: https://reviews.llvm.org/D146612
Reviewed By: kiranchandramohan
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
[Orc] Drop arch check in the DebugObjectManagerPlugin for ELF
Tested this with the new AArch32 backend on armv7l and it works without issues in GDB. The size of the load-address field is only 32-bit here, but we implicitly account for it by writing an ELFT::uint, which is: https://github.com/llvm/llvm-project/blob/release/16.x/llvm/include/llvm/Object/ELFTypes.h#L57
So, instead of adding a newly supported machine type, let's just drop this restriction altogether.
[clang][Interp] Fix record initialization via CallExpr subclasses
We can't just use VisitCallExpr() here, as that doesn't handle CallExpr subclasses such as CXXMemberCallExpr.
Differential Revision: https://reviews.llvm.org/D141772
[Matrix] Update most dot tests using vXi64 to vXi32.
Update dot-product-int.ll tests to use mostly i32 instead of i64; there's no mul.2d instruction, so vector versions of v2i64 cannot be lowered efficiently.
[InstCombine] Regenerate test checks (NFC)
[Assignment Tracking][SROA] Handle DIArgList in migrateDebugInfo
If the to-be-split dbg.assign has a DIArgList and a new Value has been requested then use a kill-location for the new dbg.assign. We can't simply replace the value component (a DIArgList) with the new Value as that would leave the DIExpression in an invalid state (DW_OP_LLVM_arg operands with no arglist).
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D147312
[Assignment Tracking] Enable by default
See https://discourse.llvm.org/t/rfc-enable-assignment-tracking/69399
This sets the -Xclang -fexperimental-assignment-tracking flag to the value enabled, which means it will be enabled so long as none of the following are true: it's an LTO build, LLDB debugger tuning has been specified, or it's an O0 build (no work is done in any case if -g is not specified or -gmlt is used).
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D146987
[Matrix] Add special case dot product lowering
Add special case to matrix lowering for dot products. Normal matrix lowering is optimized for either row-major or column-major, which results in many shufflevector instructions being generated for one vector. We work around this in our special case. We can also use vector-reduce adds instead of sequential adds to sum the result of the element-wise multiplication, which takes advantage of SIMD instructions.
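The contrast between sequential adds and a single reduction can be sketched in plain C++ as follows. This is not the lowering code itself: `dot_sequential` and `dot_reduce` are hypothetical names, and `std::accumulate` merely stands in for a vector-reduce add.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <numeric>

// Sequential form: one multiply and one dependent add per element.
template <std::size_t N>
std::int32_t dot_sequential(const std::array<std::int32_t, N>& a,
                            const std::array<std::int32_t, N>& b) {
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < N; ++i) acc += a[i] * b[i];
    return acc;
}

// Special-case form: one element-wise multiply over the whole vector,
// followed by a single reduction over the products.
template <std::size_t N>
std::int32_t dot_reduce(const std::array<std::int32_t, N>& a,
                        const std::array<std::int32_t, N>& b) {
    std::array<std::int32_t, N> prod{};
    for (std::size_t i = 0; i < N; ++i) prod[i] = a[i] * b[i];
    // Stand-in for a vector-reduce add over the product vector.
    return std::accumulate(prod.begin(), prod.end(), std::int32_t{0});
}
```

Both functions compute the same value; the second shape is the one that maps naturally onto a vector multiply plus one reduction.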
Reviewed By: fhahn, thegameg
Differential Revision: https://reviews.llvm.org/D131125
[InstCombine] Remove min/max special case when folding into select
Now that we canonicalize to min/max intrinsics, we no longer need to guard against this here.
In fact, it seems like the issue from PR46271 was the final push for introducing the intrinsics in the first place...
[mlir][llvm] Import pointer data layout specification.
The revision moves the data layout parsing into a separate file and extends it to support pointer data layout specifications. Additionally, it also produces more precise warnings and error messages.
Reviewed By: Dinistro, definelicht
Differential Revision: https://reviews.llvm.org/D147170
[mlir] Fix casting of leading unit dims for vector.insert
When dropping leading unit dims of vector.insert's operands and creating a new vector.insert, its new position rank should be computed explicitly in two steps: first based on the number of leading unit dims dropped from the vector.insert's destination, then based on the number of leading unit dims dropped from its source.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D147280
[flang] Don't fold operation when shapes differ
When folding a binary operation between two array constructors, it is necessary to check if each value contained in the left operand has the same rank and shape as the one on the right. Otherwise, lowering would end up with an operation between values of different ranks/shapes, which could result in a crash.
For instance, the following code was crashing the compiler:
integer :: x(4), y(2, 2), z(4)
z = (/x/) + (/y/)
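The kind of guard described above might look roughly like this in C++. It is a simplified sketch under stated assumptions, not flang's folding code: `Shape`, `same_shape`, and `can_fold_elementwise` are hypothetical names, and each array-constructor value is represented only by its shape.

```cpp
#include <cstddef>
#include <vector>

// A shape is the list of extents; its length is the rank.
using Shape = std::vector<std::size_t>;

// Two values conform only when rank and every extent match.
inline bool same_shape(const Shape& lhs, const Shape& rhs) {
    return lhs == rhs;
}

// Hypothetical guard: fold only if every value on the left has the same
// rank and shape as the corresponding value on the right.
inline bool can_fold_elementwise(const std::vector<Shape>& left,
                                 const std::vector<Shape>& right) {
    if (left.size() != right.size()) return false;
    for (std::size_t i = 0; i < left.size(); ++i) {
        if (!same_shape(left[i], right[i])) return false;
    }
    return true;
}
```

With the shapes from the example, `x(4)` against `y(2, 2)` fails the check, so the fold would be skipped rather than producing an operation between values of different ranks.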
Fixes #60229
Reviewed By: klausler, jeanPerier
Differential Revision: https://reviews.llvm.org/D147181
[bazel] Port 9d2b84ef6232
Fix a simple think-o; NFC
This was using a bitwise OR of two boolean member variables, now it's using a logical OR instead.
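A small C++ sketch of the distinction, using hypothetical helpers (`touch`, `count_bitwise`, `count_logical`) that are not part of the patch: on plain boolean variables both operators produce the same value, which is why the change is NFC, but only the logical form short-circuits.

```cpp
// Records that an operand was evaluated, then returns its value.
inline bool touch(bool value, int& evaluations) {
    ++evaluations;
    return value;
}

// Bitwise OR: both operands are always evaluated.
inline int count_bitwise(bool a, bool b) {
    int n = 0;
    bool r = touch(a, n) | touch(b, n);
    (void)r;
    return n;
}

// Logical OR: the right operand is skipped when the left one is true.
inline int count_logical(bool a, bool b) {
    int n = 0;
    bool r = touch(a, n) || touch(b, n);
    (void)r;
    return n;
}
```

The difference only becomes observable when the operands have side effects or are expensive to compute.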
[libc][NFC] Adjust some CMake messages for the GPU build
Summary: This disables the MPFR warning on the GPU since we can't support it anyway. Also fixes a misspelled message.
[clang][Interp] Fix binary comma operators
We left the result of RHS on the stack in case DiscardResult was true.
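The intended behavior can be modelled with a toy stack evaluator. This is a hypothetical sketch and not the clang interpreter's code: `eval_operand` and `eval_comma` are invented names, and operands are reduced to plain integers.

```cpp
#include <vector>

// Hypothetical operand evaluation: the value is pushed unless the caller
// asked for the result to be discarded.
inline void eval_operand(std::vector<int>& stack, int value, bool discard_result) {
    if (!discard_result) stack.push_back(value);
}

// Comma operator: the LHS value is never needed, and the RHS value is kept
// only when the caller wants the result of the whole expression.
inline void eval_comma(std::vector<int>& stack, int lhs, int rhs, bool discard_result) {
    eval_operand(stack, lhs, /*discard_result=*/true);
    eval_operand(stack, rhs, discard_result);
}
```

In this model, the bug described above corresponds to always evaluating the RHS with `discard_result == false`, which leaves a stray value on the stack.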
Differential Revision: https://reviews.llvm.org/D141784
[Assignment Tracking] Remove assertion from DbgAssignIntrinsic::setAddress
Follow up to https://reviews.llvm.org/D146987.
Remove assertion that the Value must be a pointer type. This fires in real-world examples e.g. by codegenprepare introducing ptrtoint conversions.
The buildbots have not caught up yet but without this change the test compiler-rt/test/ubsan/TestCases/TypeCheck/vptr.cpp fails with an ICE.
[InstCombine] Add additional test for folding intrinsic into select (NFC)
This is just a difference between PIC and non-PIC code.
Given the following code (godbolt):
#define ENUM() \
ENUM_I(A) \
ENUM_I(B) \
ENUM_I(C) \
ENUM_I(D) \
ENUM_I(E) \
ENUM_I(F) \
ENUM_I(G) \
ENUM_I(H) \
ENUM_I(I) \
ENUM_I(J) \
ENUM_I(K) \
ENUM_I(L) \
ENUM_I(M) \
ENUM_I(N)
enum class E {
#define ENUM_I(name) name,
ENUM()
#undef ENUM_I
COUNT,
};
const char* enum_name(E e) noexcept {
switch (e) {
#define ENUM_I(name) \
case E::name: \
return #name;
ENUM()
#undef ENUM_I
default:
__builtin_unreachable();
}
}
Running clang with -O3 -march=skylake generates:
enum_name(E): # @enum_name(E)
movsxd rax, edi
lea rcx, [rip + .Lreltable.enum_name(E)]
movsxd rax, dword ptr [rcx + 4*rax]
add rax, rcx
ret
(strings...)
while gcc with -O3 -march=skylake generates:
enum_name(E):
mov edi, edi
mov rax, QWORD PTR CSWTCH.1[0+rdi*8]
ret
(strings...)
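The two listings differ in how the lookup table is encoded: the clang output loads a 32-bit offset and adds it to a base address, while the gcc output loads an 8-byte pointer directly. A rough C++ model of the two encodings follows; `kBlob`, `kAbsolute`, `kRelative`, and the accessor functions are hypothetical names, and the offsets here are relative to the string storage rather than to the table itself.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical string storage: "A", "B", "C" packed back to back.
const char kBlob[] = "A\0B\0C";

// Absolute table: one full pointer per entry (8 bytes on x86-64). In
// position-independent code each entry needs a relocation at load time.
const char* const kAbsolute[] = {kBlob + 0, kBlob + 2, kBlob + 4};

// Relative table: one 32-bit offset per entry, resolved against a base
// address at run time, so the table needs no relocations.
const std::int32_t kRelative[] = {0, 2, 4};

const char* name_absolute(std::size_t i) { return kAbsolute[i]; }

const char* name_relative(std::size_t i) { return kBlob + kRelative[i]; }
```

The relative form costs an extra add per lookup but halves the table size and keeps the data read-only in PIC builds.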
[InstCombine] Regenerate test checks (NFC)
Probably related to the LLVM upgrade, but I was unable to reduce this to a pure LLVM reproducer.
Upstream fix: https://github.com/llvm/llvm-project/commit/fc6e91fe8184129d2395b79ce42f4495b95b0d0d
/cherry-pick fc6e91fe8184129d2395b79ce42f4495b95b0d0d
Fixes LLVM 16 regression reported in https://github.com/rust-lang/rust/issues/109775.
[Local] Handle size mismatch between pointer/int in copyRangeMetadata()
SROA may convert a wide integer load into a narrow pointer load, make sure we don't crash. It would not be legal to transfer the metadata in this case.
Minimized:
define i128 @test() {
%a = alloca i128
store ptr null, ptr %a
%v = load i128, ptr %a, !range !0
ret i128 %v
}
!0 = !{i128 1, i128 0}
Preliminary reduction for opt -passes=sroa: https://gist.github.com/nikic/1c606a180808a42806526ec00cee947c
Code to reproduce:
class Ptr {
int* _r;
inline void m_set(int* p){
_r = p + 1;
}
inline int* m_get(){
return _r - 1;
}
public:
Ptr(int* p) { this->m_set(p); };
inline operator bool() {
return !!m_get();
}
};
int main()
{
Ptr p((int*)0);
while(p);
return 0;
}
When compiled with -O0 or -O1, the program returns 0 as expected.
When compiled with -O2 or higher, the program loops forever.
This issue first appeared in clang 3.3 and affects all architectures (those I can test with godbolt); godbolt link below: https://godbolt.org/z/59j8GT1KG
gcc has shared the same issue since gcc 9, FYI.
Adding a non-zero value to a null pointer (in C++) or adding any value to a null pointer (in C) is undefined behavior.
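Since the reproducer applies `+ 1` to a null pointer, one way to keep the same biased-pointer scheme without undefined behavior is to special-case null. The following is only a sketch of that idea, with `SafePtr` as a hypothetical replacement for the `Ptr` class above; it assumes a non-null argument points to a valid `int` object.

```cpp
class SafePtr {
    int* r_ = nullptr;  // holds p + 1 for a non-null p, nullptr otherwise

    // No arithmetic is ever performed on a null pointer.
    void set(int* p) { r_ = p ? p + 1 : nullptr; }
    int* get() const { return r_ ? r_ - 1 : nullptr; }

public:
    explicit SafePtr(int* p) { set(p); }
    explicit operator bool() const { return get() != nullptr; }
};
```

With this version, a `SafePtr` built from a null pointer converts to `false` at every optimization level, because the compiler is no longer entitled to assume the stored pointer is non-null.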
Oh right, I confused the outputs.
It looks like the only real difference here is that previously the "vscale * 4" increment was repeated once for each GEP (in the form of a constexpr) and now it appears only once, outside the loop, which prevents it from being folded into the addressing mode.
I believe CGP is responsible for sinking such instructions to exploit addressing modes. I'm not particularly familiar with that code though.
/cherry-pick e7c35d71007fab6e6729a0cfa821023128de2f74
[AArch64] Extend icmp bitcast to vecreduce fold to comparison with -1
D130163 added support for folding setcc (iN (bitcast (vNi1 X))), 0, (eq|ne) to setcc (iN (zext (i1 (vecreduce_or (vNi1 X))))), 0, (eq|ne).
There is a conjugate fold for comparison with -1 which uses vecreduce_and and sext instead.
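Both folds rest on a simple equivalence that can be checked with a scalar model. The sketch below uses 8 lanes and hypothetical helper names (`bitcast_to_i8`, `vecreduce_or`, `vecreduce_and`, and the `is_*` predicates); it models the semantics only, not the DAG combine.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Mask8 = std::array<bool, 8>;

// Models bitcasting <8 x i1> to i8: lane i becomes bit i.
inline std::uint8_t bitcast_to_i8(const Mask8& m) {
    std::uint8_t bits = 0;
    for (std::size_t i = 0; i < m.size(); ++i) {
        if (m[i]) bits = static_cast<std::uint8_t>(bits | (1u << i));
    }
    return bits;
}

inline bool vecreduce_or(const Mask8& m) {
    bool r = false;
    for (bool b : m) r = r || b;
    return r;
}

inline bool vecreduce_and(const Mask8& m) {
    bool r = true;
    for (bool b : m) r = r && b;
    return r;
}

// Unfolded forms: compare the bitcast integer against 0 or -1 (0xFF as i8).
inline bool is_zero_via_bitcast(const Mask8& m) { return bitcast_to_i8(m) == 0; }
inline bool is_all_ones_via_bitcast(const Mask8& m) { return bitcast_to_i8(m) == 0xFF; }

// Folded forms: a single reduction answers the same question.
inline bool is_zero_via_reduce(const Mask8& m) { return !vecreduce_or(m); }
inline bool is_all_ones_via_reduce(const Mask8& m) { return vecreduce_and(m); }
```

The integer is zero exactly when no lane is set (OR-reduction is false), and it is all ones exactly when every lane is set (AND-reduction is true).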
Proof: https://alive2.llvm.org/ce/z/Zz--xy
Differential Revision: https://reviews.llvm.org/D146518
Right, the check improvements should be due to optimization improvements. What's pretty surprising is that relatively minor instruction count improvements map to such a large wall time improvement (based on the detailed statistics, apparently mostly in typeck?)
@bors r+ rollup
The post-LSR output contains vscale GEP constant expressions, which are supposed to be forbidden since recently, so something is going very wrong here.
This is another attempt to work around https://github.com/rust-lang/rust/issues/108227.
By limiting to one link job, we should be able to avoid file name clashes in mkstemp().