nikic
Repos
94
Followers
5671
Following
25

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github pull requests at this moment. Please submit your patches at http://reviews.llvm.org.

16149
5214

A PHP parser written in PHP

15681
842

Fast request router for PHP

4699
397

Extension that adds support for method calls on primitive types in PHP

1080
42

Iteration primitives using generators

1073
65

Extension exposing PHP 7 abstract syntax tree

863
71

Events

issue comment
included range in for loop gives more assembly code than excluded range

Simpler example without the unnecessary memory indirections: https://rust.godbolt.org/z/GPvrbvbYY

External iteration over inclusive ranges is well known to optimize badly. Optimization may be viable in this particular case though.
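To illustrate the distinction, here is a minimal sketch of the same sum written three ways: with external iteration over an inclusive range, with internal iteration, and with an exclusive range plus a peeled final step. The function names are hypothetical and are not taken from the linked godbolt example; the sketch makes no claim about the exact assembly produced.

```rust
/// External iteration: a `for` loop drives the inclusive range step by step.
pub fn sum_external(n: u32) -> u64 {
    let mut total = 0u64;
    for i in 0..=n {
        total += u64::from(i);
    }
    total
}

/// Internal iteration: the range drives the closure itself via `fold`.
pub fn sum_internal(n: u32) -> u64 {
    (0..=n).fold(0u64, |acc, i| acc + u64::from(i))
}

/// Exclusive range with the last element handled outside the loop.
pub fn sum_exclusive_plus_last(n: u32) -> u64 {
    let mut total = 0u64;
    for i in 0..n {
        total += u64::from(i);
    }
    total + u64::from(n)
}
```

All three return the same value for any `n`; they differ only in the loop shape the optimizer sees.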

Created at 7 hours ago
issue comment
Mark Vec::drop() as #[inline]

@bors try @rust-timer queue

Created at 8 hours ago
pull request opened
Mark Vec::drop() as #[inline]

As seen in https://github.com/rust-lang/rust/issues/102539#issuecomment-1264398807, Vec::drop() is commonly a no-op, if the underlying type does not need drop. Checking to see what the impact of this would be, I'm not sure the change can be done in this form (an alternative would be to do a needs_drop check and only instantiate that part in each CGU).
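A rough sketch of the needs_drop alternative mentioned above, assuming a hypothetical free function rather than the real Vec::drop() implementation: the cheap check is inlinable, and the per-element work sits behind an out-of-line helper. The names `dispose` and `dispose_slow` are invented for illustration.

```rust
use std::mem;

/// Hypothetical wrapper: returns true if per-element drop work was needed.
#[inline]
pub fn dispose<T>(items: Vec<T>) -> bool {
    if mem::needs_drop::<T>() {
        // Elements have drop glue: take the out-of-line path.
        dispose_slow(items);
        true
    } else {
        // No per-element work; dropping only frees the allocation.
        drop(items);
        false
    }
}

/// Out-of-line path, kept separate so the inlined check stays tiny.
#[inline(never)]
fn dispose_slow<T>(items: Vec<T>) {
    drop(items);
}
```

For a type such as `u8` the check is expected to fold to a constant, so only the deallocation remains in the caller.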

r? @ghost

Created at 8 hours ago
issue comment
Use linker plugin LTO for compiling `rustc`

Somehow, this perf run made it to https://perf.rust-lang.org/index.html, as if it landed on the master branch...

Created at 8 hours ago
create branch
nikic create branch inline-vec-drop
Created at 8 hours ago
issue comment
Tentatively inline(always) the trivial Into::into

I think we should reopen this one (the #[inline] variant, not #[inline(always)]). The perf result was mostly positive, including on bootstrap timings. I believe we do want this from() wrapper to be instantiated in all CGUs, because in most cases the from() implementation is trivial as well, and this is important for looking through abstractions. into() shows up a lot in https://github.com/rust-lang/rust/issues/102539#issuecomment-1264398807.

Created at 9 hours ago
pull request opened
Mark Array::size() as #[inline]

Array::size() is a trivial function -- in fact, it always returns a constant. Mark it as #[inline] so it is instantiated in each CGU, and can be optimized without relying on crate-local LTO. See https://github.com/rust-lang/rust/issues/102539 for context.

It looks like SmallVec already marks pretty much all other trivial functions as #[inline], so this seems like an oversight.
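As a sketch of the kind of function in question, assuming a hypothetical `BackingArray` trait that stands in for the real smallvec trait (its actual definition is not reproduced here):

```rust
/// Hypothetical stand-in for a fixed-capacity backing-array trait.
pub trait BackingArray {
    type Item;
    /// Number of elements the array can hold.
    fn size() -> usize;
}

impl<T, const N: usize> BackingArray for [T; N] {
    type Item = T;

    // Always returns a compile-time constant, so it is a natural
    // candidate for #[inline].
    #[inline]
    fn size() -> usize {
        N
    }
}

/// Generic caller that only sees the trait method.
pub fn inline_capacity<A: BackingArray>() -> usize {
    A::size()
}
```

Calling `inline_capacity::<[u8; 8]>()` yields 8; the attribute does not change the result, only where the function body is available to the optimizer.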

Created at 10 hours ago
create branch
nikic create branch inline-size
Created at 10 hours ago
issue comment
Rebased: Mark drop calls in landing pads cold instead of noinline

@lqd I don't know if anything changed in serde, but the intention here is indeed to potentially regress compile-time for certain types of excessively large machine-generated code in favor of fixing a class of common optimization failures.

Created at 11 hours ago
Created at 13 hours ago
issue comment
Mark Cell::replace() as #[inline]

@bors try @rust-timer queue

Created at 14 hours ago
pull request opened
Mark Cell::replace() as #[inline]

Giving this a try based on https://github.com/rust-lang/rust/issues/102539#issuecomment-1264398807.

Created at 14 hours ago
create branch
nikic create branch inline-cell-replace
Created at 14 hours ago
issue comment
#[inline] on generic functions

I did a bit of crude data analysis, based on calls to non-inlined small functions in ThinLTO inputs. Here's the result: https://gist.github.com/nikic/c1d441ca6468b8c2ee114ff4a11f5667

Some of these are not supposed to be inlined (various panic functions) and there are a lot of drop_in_place functions that really should be inlined but aren't due to noinline annotations (fixed by https://github.com/rust-lang/rust/pull/102099).

But there are also some obvious candidates where per CGU instantiation is likely beneficial, and we'd probably want to add an #[inline] annotation. Cell::replace() looks like one of the worst offenders. From the ecosystem, SmallVec::size() stands out to me.

# Third 30 entries excluding drop_in_place and various panic helpers.
n=20649 s=3  <core::cell::Cell<isize>>::replace
n=18672 s=1  <alloc::vec::Vec<u8> as core::ops::drop::Drop>::drop
n=8691  s=1  <core::ptr::unique::Unique<u8> as core::convert::Into<core::ptr::non_null::NonNull<u8>>>::into
n=6194  s=7  <[u8] as core::cmp::PartialEq>::eq
n=5741  s=3  <core::cell::Cell<usize>>::replace
n=5350  s=1  <rustc_query_system::dep_graph::graph::DepNodeIndex as core::convert::Into<rustc_data_structures::profiling::QueryInvocationId>>::into
n=4693  s=2  alloc::alloc::box_free::<rustc_ast::ast::Ty, alloc::alloc::Global>
n=1950  s=1  <[rustc_middle::ty::subst::GenericArg; 8] as smallvec::Array>::size
n=1930  s=2  <rustc_ast::ptr::P<rustc_ast::ast::Expr> as core::ops::deref::Deref>::deref
n=1760  s=1  <[rustc_middle::ty::Ty; 8] as smallvec::Array>::size
n=1740  s=2  <rustc_errors::diagnostic_builder::DiagnosticBuilder<rustc_errors::ErrorGuaranteed>>::emit
n=1699  s=1  rustc_data_structures::sync::assert_sync::<rustc_middle::ty::context::tls::ImplicitCtxt>
n=1560  s=1  <rustc_hir_analysis::check::inherited::Inherited as core::ops::deref::Deref>::deref
n=1509  s=1  <core::hash::BuildHasherDefault<rustc_hash::FxHasher> as core::default::Default>::default
n=1498  s=2  alloc::alloc::box_free::<rustc_ast::ast::Pat, alloc::alloc::Global>
n=1481  s=6  <&str as core::convert::Into<alloc::borrow::Cow<str>>>::into
n=1119  s=2  <rustc_ast::ptr::P<rustc_ast::ast::Ty> as core::ops::deref::Deref>::deref
n=1112  s=1  <[rustc_middle::ty::sty::Binder<rustc_middle::ty::sty::ExistentialPredicate>; 8] as smallvec::Array>::size
n=1047  s=6  smallvec::infallible::<()>
n=923   s=2  <rustc_target::abi::TyAndLayout<rustc_middle::ty::Ty> as core::ops::deref::Deref>::deref
n=874   s=6  <std::path::Path>::new::<std::ffi::os_str::OsString>
n=850   s=1  <alloc::vec::Vec<rustc_span::span_encoding::Span> as core::ops::drop::Drop>::drop
n=848   s=6  _$LT$alloc..string..String$u20$as$u20$core..clone..Clone$GT$::clone::h3a6e79de4ee6835e
n=825   s=2  alloc::alloc::box_free::<syn::expr::Expr, alloc::alloc::Global>
n=779   s=2  <tracing_core::metadata::Metadata>::fields
n=761   s=3  <hashbrown::set::HashSet<rustc_target::asm::InlineAsmReg, core::hash::BuildHasherDefault<rustc_hash::FxHasher>>>::insert
n=706   s=1  <rustc_span::def_id::DefId as core::borrow::Borrow<rustc_span::def_id::DefId>>::borrow
n=696   s=3  <rustc_data_structures::atomic_ref::AtomicRef<fn(rustc_span::def_id::LocalDefId)> as core::ops::deref::Deref>::deref
n=694   s=5  alloc::alloc::box_free::<dyn core::error::Error + core::marker::Send + core::marker::Sync, alloc::alloc::Global>
Created at 14 hours ago
issue comment
Tentatively #[inline] Option::from

Because this is a recurring problem, I've spent some time trying to understand just why #[inline] makes a difference for generic functions, even though it ostensibly shouldn't. This is my conclusion: https://github.com/rust-lang/rust/issues/102539

Basically, if we ignore the opt-for-size case (where -Z share-generics causes extra issues), the relevant distinction is whether the function is instantiated per-crate or per-CGU. Generic functions are instantiated per-crate, while #[inline] functions are instantiated per-CGU.

Created at 19 hours ago
opened issue
#[inline] on generic functions

Common Rust wisdom says that #[inline] does not need to be placed on small generic functions. This is because generic functions will get monomorphized in each crate anyway, so the attribute is not necessary for cross-crate inlining.

However, we also know that in practice placing #[inline] on generic functions does help optimization, even for tiny functions where the additional inlinehint this gives to LLVM really shouldn't be relevant. What gives? I believe there are two complications:

The main problem is that #[inline] forces an instantiation of the function in each CGU, while generic functions are normally only instantiated once per crate. This means that a definition of generic functions is available to crate-local LTO, but not to the pre-link optimization pipeline. Especially for trivial generic functions, this may significantly hamper pre-link optimization, and post-link optimization may not be able to recover from this.

The second complication occurs when optimizing for size. In this case, we currently enable -Z share-generics by default, which means that generic functions only get monomorphized once and are exported for downstream crates. This means that the function definition is not available even to crate-local LTO. It only becomes available during full cross-crate LTO.

The second point is something we can fix: We probably should not be enabling -Z share-generics by default in any optimized builds, including size-optimized builds.

The first one is trickier, as instantiating monomorphizations in each CGU by default is likely not desirable. Possibly we should just stop considering whether a function is generic or not when it comes to #[inline] placement.
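For concreteness, a minimal sketch of the two spellings being compared. Both helpers are hypothetical; they behave identically at runtime, and per the description above the attribute only changes whether the instantiation happens once per crate or in each CGU.

```rust
use std::cell::Cell;

/// Generic without the attribute: per the issue text, normally
/// instantiated once per crate.
pub fn replace_plain<T>(cell: &Cell<T>, value: T) -> T {
    cell.replace(value)
}

/// Same body with #[inline]: per the issue text, instantiated in
/// each CGU that uses it.
#[inline]
pub fn replace_inline<T>(cell: &Cell<T>, value: T) -> T {
    cell.replace(value)
}
```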

Created at 19 hours ago
issue comment
Add `inline(always)` to a few functions that end up in `opt-level=z` compiled output

I just came back to this, and my analysis of why this happens wasn't quite right. The actual culprit is share-generics, which is enabled by default when optimizing for size: https://github.com/rust-lang/rust/blob/877877a19a408421486a5077d36e1bfef090e42e/compiler/rustc_session/src/config.rs#L778

This can be easily verified using -Z share-generics=no: https://rust.godbolt.org/z/fx67dqW84

So apparently this is working as intended? Can't optimize it away if it's exported for use by downstream crates. Presumably this does not happen when building an executable.

Created at 1 day ago
issue comment
Rebased: Mark drop calls in landing pads cold instead of noinline

Probably fixes #97217 as well.

Created at 1 day ago

early-exit

Created at 1 day ago
pull request opened
Update LLVM submodule

This merges in the current upstream release/15.x branch.

Fixes #102402.

Created at 1 day ago
create branch
nikic create branch update-llvm-8
Created at 1 day ago

[RLEV] Pick a correct insert point when incoming instruction is itself a phi node

This fixes https://github.com/llvm/llvm-project/issues/57336. It was exposed by a recent SCEV change, but appears to have been a long standing issue.

Note that the whole insert into the loop instead of a split exit edge is slightly contrived to begin with; it's there solely because IndVarSimplify preserves the CFG.

Differential Revision: https://reviews.llvm.org/D132571

(cherry picked from commit c37b1a5f764380f83ba08ae0cebca2b162123eb6)

[DOCS] Minor fixes and removals of WIP warnings

[RISCV][ReleaseNotes] Added LLVM and Clang release notes for RISC-V 15.0.0

AMDGPU: mbcnt allow for non-zero src1 for known-bits

Src1 for mbcnt can be a non-zero literal or register. Take this into account when calculating known bits.

Differential Revision: https://reviews.llvm.org/D131478

(cherry picked from commit 1d1cc05539e275ae7666fc4b44bf725ec335078a)

[Symbolizer] Implement data symbolizer markup element.

This connects the Symbolizer to the markup filter and enables the first working end-to-end flow using the filter.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D130187

(cherry picked from commit 22df238d4a642a4553ebf7b91325189be48b139d)

[Symbolizer] Implement pc element in symbolizing filter.

Implements the pc element for the symbolizing filter, including its "ra" and "pc" modes. Return addresses ("ra") are adjusted by decrementing one. By default, {{{pc}}} elements are assumed to point to precise code ("pc") locations. Backtrace elements will adopt the opposite convention.

Along the way, some minor refactors of value printing and colorization.
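The address adjustment described above can be sketched as follows. This is an illustration only, not the llvm-symbolizer code; the `PcMode` enum and `lookup_address` function are hypothetical, and saturating at zero is an assumption of this sketch.

```rust
/// Hypothetical model of the two pc-element modes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PcMode {
    /// "pc": the address points at a precise code location.
    Precise,
    /// "ra": the address is a return address.
    ReturnAddress,
}

/// Address to use for the symbol lookup.
pub fn lookup_address(addr: u64, mode: PcMode) -> u64 {
    match mode {
        PcMode::Precise => addr,
        // Decrement by one so the lookup lands inside the call
        // instruction rather than on the instruction after it.
        // Saturating at zero is an assumption of this sketch.
        PcMode::ReturnAddress => addr.saturating_sub(1),
    }
}
```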

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D131115

(cherry picked from commit bf48b128b02813e53e0c8f6585db837d14c9358f)

[Symbolizer] Fix symbolizer-filter-markup-pc.test on Windows

(cherry picked from commit 0d6cf1e8b5fa8590f816d5330cb7c2dcc449ec24)

[Symbolizer] Handle {{{bt}}} symbolizer markup element.

This adds support for backtrace generation to the llvm-symbolizer markup filter, which is likely the largest use case.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D132706

(cherry picked from commit ea99225521cba6dec1ad4ca70a8665829e772fa9)

[compiler-rt] [test] Handle missing ld.gold gracefully

Fix the is_binutils_lto_supported() function to handle missing executables gracefully. Currently, the function does not catch exceptions from subprocess.Popen() and therefore causes lit to crash if config.gold_executable does not specify a valid executable:

lit: /usr/lib/python3.11/site-packages/lit/TestingConfig.py:136: fatal: unable to parse config file '/tmp/portage/sys-libs/compiler-rt-15.0.0/work/compiler-rt/test/lit.common.cfg.py', traceback: Traceback (most recent call last):
  File "/usr/lib/python3.11/site-packages/lit/TestingConfig.py", line 125, in load_from_path
    exec(compile(data, path, 'exec'), cfg_globals, None)
  File "/tmp/portage/sys-libs/compiler-rt-15.0.0/work/compiler-rt/test/lit.common.cfg.py", line 561, in <module>
    if is_binutils_lto_supported():
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/portage/sys-libs/compiler-rt-15.0.0/work/compiler-rt/test/lit.common.cfg.py", line 543, in is_binutils_lto_supported
    ld_cmd = subprocess.Popen([exe, '--help'], stdout=subprocess.PIPE, env={'LANG': 'C'})
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 1022, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.11/subprocess.py", line 1899, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'GOLD_EXECUTABLE-NOTFOUND'

Differential Revision: https://reviews.llvm.org/D133358

(cherry picked from commit ea953b9d9a65c202985a79f1f95da115829baef6)

[clang-format] Distinguish logical and after bracket from reference

Fix commit b646f0955574 and remove redundant code.

Differential Revision: https://reviews.llvm.org/D131750

(cherry picked from commit ef71383b0cfbacdbebf495015f6ead5294bf7759)

[DAG] extractShiftForRotate - replace assertion for shift opcode with an early-out

We feed the result from the first extractShiftForRotate call into the second, and that result might no longer be a shift op (usually due to constant folding).

NOTE: We REALLY need to stop creating nodes on the fly inside extractShiftForRotate!

Fixes Issue #57474

(cherry picked from commit eaede4b5b7cfc13ca0e484b4cb25b2f751d86fd9)

[clang] Skip re-building lambda expressions in parameters to consteval fns.

As discussed in this comment, we end up building the lambda twice: once while parsing the function calls and then again while handling the immediate invocation.

This happens especially when removing nested immediate invocations, e.g. when another consteval function appears as a parameter alongside the lambda expression: foo(bar([]{})), foo(bar(), []{})

While removing such nested immediate invocations, we should not rebuild this lambda. (IIUC, rebuilding a lambda would always generate a new type which will never match the original type from parsing)

Fixes: https://github.com/llvm/llvm-project/issues/56183 Fixes: https://github.com/llvm/llvm-project/issues/51695 Fixes: https://github.com/llvm/llvm-project/issues/50455 Fixes: https://github.com/llvm/llvm-project/issues/54872 Fixes: https://github.com/llvm/llvm-project/issues/54587

Differential Revision: https://reviews.llvm.org/D132945

(cherry picked from commit e7eec38246560781e0a4020b19c7eb038a8c5655)

[Clang] Fix crash in coverage of if consteval.

Clang crashes when encountering an if consteval statement. This is the minimum fix not to crash. The fix is consistent with the current behavior of if constexpr, which does generate coverage data for the discarded branches. This is of course not correct and a better solution is needed for both if constexpr and if consteval. See https://github.com/llvm/llvm-project/issues/54419.

Fixes #57377

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D132723

[DwarfEhPrepare] Assign dummy debug location for inserted _Unwind_Resume calls (PR57469)

DwarfEhPrepare inserts calls to _Unwind_Resume into landing pads. If _Unwind_Resume happens to be defined in the same module and debug info is used, then this leads to a verifier error:

inlinable function call in a function with debug info must have a !dbg location call void @_Unwind_Resume(ptr %exn.obj) #0

Fix this by assigning a dummy location to the call. (As this happens in the backend, inlining is not actually relevant here.)

Fixes https://github.com/llvm/llvm-project/issues/57469.

Differential Revision: https://reviews.llvm.org/D133095

(cherry picked from commit 5134bd432f8c35c87f4c4dc3bb744d396adcab58)

[mlir] Fix building CRunnerUtils on OpenBSD with 15.x

CRunnerUtils builds as C++11. 9c1d133c3a0256cce7f40e2e06966f84e8b99ffe broke the build on OpenBSD. aligned_alloc() was only introduced in C++17.

[LLD][COFF] Fix writing a map file when range extension thunks are inserted

Bug: An assertion fails:

Assertion failed: isa<To>(Val) && "cast<Ty>() argument of incompatible type!",
file C:\Users\<user>\prog\llvm\llvm-git-lld-bug\llvm\include\llvm/Support/Casting.h, line 578

Bug is triggered, if

- a map file is requested with /MAP, and
- Architecture is ARMv7, Thumb, and
- a relative jump (branch instruction) is greater than 16 MiB (2^24)

The reason for the Bug is:

- a Thunk is created for the jump
- a Symbol for the Thunk is created
    - of type `DefinedSynthetic`
    - in file `Writer.cpp`
    - in function `getThunk`
- the Symbol has no name
- when creating the map file, the name of the Symbol is queried
- the function `Symbol::computeName` of the base class `Symbol`
  casts the `this` pointer to type `DefinedCOFF` (a derived type),
  but the actual type is `DefinedSynthetic`
- Then an assertion fails in llvm::cast

Changes:

  • Modify regression test to trigger this bug
  • Give the symbol pointing to the thunk a name, to fix the bug
  • Add assertion, that only DefinedCOFF symbols are allowed to have an empty name, when the constructor of the base class Symbol is executed

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D133201

(cherry picked from commit 4e5a59a3839f54d928d37d49d4c4ddbb3f339b76)

[libc++][format] Updates feature-test macros.

During the discussion on the SG-10 mailing list regarding the format feature-test macros voted in during the last plenary, it turned out that libc++ can't mark the format feature-test macro as implemented.

According to https://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations#__cpp_lib_format the not yet implemented paper P1361R2 Integration of chrono with text formatting affects the feature test macro.

Note that P1361R2 doesn't mention the feature-test macro nor is there an LWG-issue to address the issue. The reporter of the issue didn't recall where this requirement exactly has been decided.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D133271

[MachO] Fix dead-stripping __eh_frame

This section is marked S_ATTR_LIVE_SUPPORT in input files, which meant that on arm64, we were unnecessarily preserving FDEs if we e.g. had multiple weak definitions for a function. Worse, we would actually produce an invalid __eh_frame section in that case, because the CIE associated with the unnecessary FDE would still get dead-stripped and we'd end up with a dangling FDE. We set up associations from functions to their FDEs, so dead-stripping will just work naturally, and we can clear S_ATTR_LIVE_SUPPORT from our input __eh_frame sections to fix dead-stripping.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D132489

(cherry picked from commit a745e47900dde15c180d5caea7a1d292ca809eb1)

[MachO] Don't fold compact unwind entries with LSDA

Folding them will cause the unwinder to compute the incorrect function start address for the folded entries, which in turn will cause the personality function to interpret the LSDA incorrectly and break exception handling.

You can verify the end-to-end flow by creating a simple C++ file:

void h();
int main() { h(); }

and then linking this file against the liblsda.dylib produced by the test case added here. Before this change, running the resulting program would result in a program termination with an uncaught exception. Afterwards, it works correctly.

Reviewed By: #lld-macho, thevinster

Differential Revision: https://reviews.llvm.org/D132845

(cherry picked from commit 56bd3185cdd8d79731acd6c75bf41869284a12ed)

Downgrade implicit int and implicit function declaration to warning only

The changes in Clang 15.0.0 which enabled these diagnostics as a warning which defaulted to an error caused disruption for people working on distributions such as Gentoo. There was an explicit request to downgrade these to be warning-only in Clang 15.0.1 with the expectation that Clang 16 will default the diagnostics to an error.

See https://discourse.llvm.org/t/configure-script-breakage-with-the-new-werror-implicit-function-declaration/65213 for more details on the discussion.

See https://reviews.llvm.org/D133800 for the public review of these changes.

Created at 1 day ago

AMDGPU: Add baseline test for expansion of 16-bit local atomics

The expansion is currently using the wrong pointer size.

AtomicExpand: Use correct pointer size for integer

This was using the default address space.

[GISel] TreeMatcher: always skip leaves if they don't care

In GIMatchTreeOpcodePartitioner::applyForPartition(), the loop over the possible leaves skips a leaf if the leaf does not care about the instruction. When processing the referenced operands in the next loop, the same leaves need to be skipped.

Later, when these leaves are added to all partitions, the bit vector must be resized first before the bit representing the leaf is set.

This fixes a crash in llvm-tblgen.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D134192

[HLSL] remove unnecessary abs attributes

remove abs non-elementwise attribute statements, stick to elementwise.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D134312

[flang] Add semantics test for atomic_add subroutine

Reviewed By: ktras

Differential Revision: https://reviews.llvm.org/D131535

[NFC] Fix typo in comment

[mlir][spirv] Lower max/min vector.reduction for OpenCL

Templatizing vector reduction to enable lowering from vector.reduction max/min to CL ops.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D134313

[mlir][linalg] Swap tensor.extract_slice(linalg.fill)

This commit adds a pattern to swap

tensor.extract_slice(linalg.fill(%cst, %init))

into

linalg.fill(%cst, tensor.extract_slice(%init))

when the linalg.fill op has no other users. This helps to reduce the fill footprint.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D134102

[LLD][COFF] Support /MAPINFO flag

This patch adds support for link.exe's /MAPINFO flag to lld-link.exe.

Here is a description of the flag from Microsoft (https://learn.microsoft.com/en-us/cpp/build/reference/mapinfo-include-information-in-mapfile?view=msvc-170):

The /MAPINFO option tells the linker to include the specified information in a mapfile, which is created if you specify the /MAP option. EXPORTS tells the linker to include exported functions.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D134247

[PS4] Always enable the .debug_aranges section when using LTO

This flag enables the .debug_aranges section by passing a flag to LLD and our internal linker. This also adds a new routine that will generate the correct flag for our internal linker or set of flags for LLD when given a list of LLVM options. That ensures multiple LLVM codegen options can be passed to either linker consistently.

Differential Revision: https://reviews.llvm.org/D134296

[mlir][spirv] Add casting ops to/from generic storage space

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D134217

[HLSL] Support PCH for cc1 mode

Add HLSLExternalSemaSource as ExternalSemaSource instead of ASTContext::ExternalSource when PCH is included.

This allows a different external source to be set for the AST context.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D132421

[mlir][spirv] Query target environment for mapping memory space

Checks spirv::TargetEnv from op to see if it contains either Kernel or Shader capabilities. If it does, then it will set the memory space mapping accordingly.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D134317

[clang] Fix missing template arguments in AST of access to member variable template

Signed-off-by: Matheus Izvekov mizvekov@gmail.com

Differential Revision: https://reviews.llvm.org/D134295

[DSE] Add value type info checks for masked store candidates in Dead Store Elimination.

The type information of the store values can diverge when checking for valid masked store candidates to eliminate via DSE. This patch checks for equivalence with respect to size and element count.

Reviewed By: fhahn, rui.zhang

Differential Revision: https://reviews.llvm.org/D132700

[mlir][arith] Add integration tests for addi emulation

This includes tests with the exact expected values and comparison-based tests.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D134321

[mlir][arith] Fix constant naming in integration tests. NFC.

Suggested by @antiagainst in D134321.

[libc] add strerror

Strerror maps error numbers to strings. Additionally, a utility for mapping errors to strings was added so that it could be reused for perror and similar.
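The split between a shared mapping utility and its callers might look like the sketch below. It is not the LLVM libc implementation; the table, the function names, and the "Unknown error N" fallback are assumptions for illustration. The three messages are the usual Linux texts for errno values 1, 2, and 13.

```rust
/// Small illustrative table of (error number, message) pairs.
const MESSAGES: &[(i32, &str)] = &[
    (1, "Operation not permitted"),
    (2, "No such file or directory"),
    (13, "Permission denied"),
];

/// Shared utility: map an error number to its message.
/// The fallback text for unknown numbers is an assumption of this sketch.
pub fn error_to_string(errnum: i32) -> String {
    MESSAGES
        .iter()
        .find(|(num, _)| *num == errnum)
        .map(|(_, msg)| (*msg).to_string())
        .unwrap_or_else(|| format!("Unknown error {errnum}"))
}

/// A perror-style caller that reuses the same utility.
pub fn perror_line(prefix: &str, errnum: i32) -> String {
    format!("{prefix}: {}", error_to_string(errnum))
}
```

Keeping the lookup in one place means a strerror-style and a perror-style entry point cannot drift apart in wording.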

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D134074

[mlir][tensor] Merge consecutive insert_slice/extract_slice ops

Consecutive tensor.insert_slice/tensor.extract_slice ops can be created in cases such as tiling a convolution and then downsizing 2-D convolutions into 1-D ones. They hinder further transformations, so this adds patterns to clean them up.

Given that bufferization is sensitive and has requirements over the IR structure (see https://reviews.llvm.org/D132666), these patterns are put in Transforms/ with separate entry points for explicit collection.

Reviewed By: ThomasRaoux, mravishankar

Differential Revision: https://reviews.llvm.org/D133871

Change isLittleEndian to follow llvm style and add an accessor

Differential Revision: https://reviews.llvm.org/D134290

Created at 1 day ago

[AST] Use BatchAA in aliasesPointer() (NFC)

Created at 1 day ago

[LICM] Collect more scalar promotion stats (NFC)

Collect more statistics for scalar promotion. In particular, keep track of how many promotion candidates there were, and whether it is a load or a load/store promotion.

Created at 1 day ago