forked from llvm/llvm-project
[pull] main from llvm:main #5641
Open: pull wants to merge 1,122 commits into Ericsson:main from llvm:main
+269,751
−87,136
…ynthesis (#162576) Don't emit a warning when an Objective-C property is defined using copy or strong semantics.
…te pattern. (#162700) Addresses my comment here #162036 (comment)
When `--use-old-text` fails, we emit all code meant for the original `.text` section into the new section. This can be more bytes than are emitted without `--use-old-text`, especially under `--lite`. As a result, `--use-old-text` produces a larger binary, not a smaller one, which could be confusing to the user. Add more information to the warning, including a recommendation to rebuild without `--use-old-text` for a smaller binary size.
…-around-statements check (#162698) The check 'readability-braces-around-statements' does offer fixes!
Observed in GCC-produced binary. Emit a warning for the user. Test Plan: added bolt/test/X86/fragment-alias.s
Removed all the caching maps (BB, Inst) in `Embedder` as we don't want to cache embeddings in general. Our earlier experiments on Symbolic embeddings show recomputation of embeddings is cheaper than cache lookups. OTOH, Flow-Aware embeddings would benefit from instruction level caching, as computing the embedding for an instruction would depend on the embeddings of other instructions in a function. So, retained instruction embedding caching logic only for Flow-Aware computation. This also necessitates an `invalidate` method that would clean up the cache when the embeddings would become invalid due to transformations.
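The caching trade-off described above can be sketched roughly as follows. This is an illustration only, not the actual `Embedder` code: the class name `FlowAwareEmbedder`, the `operand_map` argument, and the toy arithmetic are all hypothetical, and the sketch assumes the dependency graph is acyclic.

```python
class FlowAwareEmbedder:
    """Hypothetical stand-in for a flow-aware embedder (not the LLVM class)."""

    def __init__(self, operand_map):
        # operand_map: instruction name -> names of instructions it depends on.
        # Assumed acyclic here; a real function with PHI cycles needs more care.
        self.operand_map = operand_map
        self._inst_cache = {}
        self.computations = 0  # counts cache misses, for demonstration only

    def embed(self, inst):
        # An instruction's embedding depends on other instructions' embeddings,
        # so caching at instruction level avoids repeated recursive work.
        if inst in self._inst_cache:
            return self._inst_cache[inst]
        self.computations += 1
        # Toy embedding: 1 plus the sum of the operand embeddings.
        value = 1.0 + sum(self.embed(op) for op in self.operand_map.get(inst, []))
        self._inst_cache[inst] = value
        return value

    def invalidate(self):
        # Must be called after a transformation changes the function,
        # because cached embeddings may no longer be valid.
        self._inst_cache.clear()
```

After `invalidate()`, the next `embed` call recomputes every embedding it needs from scratch, which is the cost the cache otherwise avoids.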
…#162526) For a pattern like this: Pat<(MyOp $x, $x), (...), [(MyCheck $x)]>; The old implementation generates: Pat<(MyOp $x0, $x1), (...), [(MyCheck $x0), ($x0 == $x1)]>; This is not very straightforward, because the $x name appears in the source pattern; it's attempting to assume equality check will be performed as part of the source pattern matching. This commit moves the equality checks before the other constraints, i.e.: Pat<(MyOp $x0, $x1), (...), [($x0 == $x1), (MyCheck $x0)]>;
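To make the new ordering concrete, here is a small sketch of the renaming step in Python. It is not the TableGen implementation: the function name `expand_repeated_operands`, the tuple encoding of constraints, and the choice to leave non-repeated names unchanged are assumptions made for illustration.

```python
from collections import Counter


def expand_repeated_operands(operands, constraints):
    """Rename repeated operand names and emit equality checks first.

    operands:    operand names of the source pattern, e.g. ["x", "x"]
    constraints: list of (check_name, operand_name) pairs
    """
    counts = Counter(operands)
    seen = Counter()
    renamed = []
    first_alias = {}   # original name -> alias of its first occurrence
    equalities = []
    for name in operands:
        if counts[name] == 1:
            # Assumption: names used once keep their original spelling.
            renamed.append(name)
            first_alias[name] = name
            continue
        alias = f"{name}{seen[name]}"
        seen[name] += 1
        renamed.append(alias)
        if name in first_alias:
            equalities.append(("==", first_alias[name], alias))
        else:
            first_alias[name] = alias
    # User constraints refer to the first occurrence of each name.
    rewritten = [(check, first_alias.get(name, name)) for check, name in constraints]
    # Equality checks come before the other constraints.
    return renamed, equalities + rewritten
```

Called with `["x", "x"]` and `[("MyCheck", "x")]`, the sketch yields operands `x0`, `x1` and places the `x0 == x1` check ahead of `MyCheck`, mirroring the ordering this commit describes.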
Based on review feedback in #160026. This makes the substitution a lot more clear now that there is no documentation around %T. --------- Co-authored-by: Louis Dionne <[email protected]>
This patch introduces some missing s.barrier instructions in the ROCDL dialect handling named barriers. Specifically:

```
@llvm.amdgcn.s.barrier.init - s_barrier_init
@llvm.amdgcn.s.barrier.join - s_barrier_join
@llvm.amdgcn.s.barrier.leave - s_barrier_leave
@llvm.amdgcn.s.barrier.signal.isfirst - s_barrier_signal_isfirst
@llvm.amdgcn.s.get.barrier.state - s_get_barrier_state
```
In gcc11 we're getting an error with a `using Req = Req` statement. This changes the name of the types in JSONTransportTest from `Req` > `Request`, `Evt` > `Event`, `Resp` > `Response`.
…ance (#162399) sifive-x390 and sifive-x280 both share the SiFive7 scheduling model, yet the former has limited FP64 vector performance. Right now we account for it by instantiating two separate scheduling models (throttled vs. non-throttled) from the base SiFive7 model. However, this approach (which is also used for other performance features like fast vrgather in SiFive7) does not scale if we add more of these performance features in the future: the number of scheduling models will simply become unmanageable. The new solution I've been working on is to let a _single_ scheduling model be configured through subtarget features for performance features like these, such that we no longer need to create those derived models. This patch creates the subtarget feature that'll ultimately replace the `isFP64Throttled` knob in the SiFive7 scheduling model mentioned earlier. There will be a follow-up patch to integrate this into the scheduling model.
In "Debugging C++ Coroutines", we provide a gdb script to aid with debugging C++ coroutines in gdb. This commit updates said script to make it easier to use and more robust.

The commit contains the following user-facing changes:

* `show-coro-frame` was replaced by a pretty-printer for `std::coroutine_handle`. This is much easier to use than a custom command since it works out-of-the-box with `p` and in my IDE's variable view (tested using VS-Code)
* the new `get_coro_{frame,promise}` functions can be called from expressions to access nested members. Example: `p get_coro_promise(fib.coro_hdl)->current_state`
* `async-bt` was replaced by a frame filter. This way, the builtin `bt` command directly shows all the async coroutine frames.

Under the covers, the script became more robust:

* For devirtualization, we now look up the `__coro_frame` variable in the resume function instead of relying on the `.coro_frame_ty` naming convention. Thereby, devirtualization works slightly better also on gcc-compiled binaries (however, there is still more work to be done).
* We use the LLVM-generated `__coro_resume_<N>` labels to get the exact line at which a coroutine was suspended.
* The continuation handle is now looked up by name instead of via dereferencing a calculated pointer. Thereby, the script should be simpler to adjust for various coroutine libraries without requiring pointer arithmetic hacks.

Other sections of the documentation were adjusted accordingly to reflect the newly added features of the gdb script.
…h subtarget feature (#162400) This patch teaches the SiFive7 scheduling model to configure / toggle the throttled FP64 vector feature with a subtarget feature rather than a hard-coded TableGen parameter, which inevitably forces us to instantiate a new scheduling model for every performance feature like this.
This test, with a corefile created via yaml2macho-core plus an ObjectFileJSON binary with symbol addresses and ranges, was failing on some machines/CI because the wrong ABI was being picked.

The bytes of the functions were not included in the yaml or .json binary. The unwind falls back to using the ABI plugin default unwind plans. We have two armv7 ABIs - the Darwin ABI that always uses r7 as the frame pointer, and the AAPCS ABI which uses r11 code. In reality, armv7 code uses r11 in arm mode, r7 in thumb code. But the ABI ArchDefaultUnwindPlan doesn't have any access to the Target's ArchSpec or Process register state, to determine the correct processor state (arm or thumb). And in fact, on Cortex-M targets, the instructions are always thumb, so the arch default unwind plan (hardcoded r11) is always wrong.

The corefile doesn't specify a vendor/os, only a cpu. The object file json specifies the armv7m-apple-* triple, which will select the correct ABI plugin, and the test runs. In some cases, it looks like the Process ABI was fetched after opening the corefile, but before the binary.json was loaded and corrected the Target's ArchSpec. And we never re-evaluate the ABI once it is set, in a Process. When we picked the AAPCS armv7 ABI, we would try to use r11 as frame pointer, and the unwind would stop after one stack frame.

I'm stepping around this problem by (1) adding the register bytes of the prologues of every test function in the backtrace, and (2) shortening the function ranges (in binary.json) to specify that the functions are all just long enough for the prologue where execution is stopped. The instruction emulation plugin will fail if it can't get all of the bytes from the function instructions, so I hacked the function sizes in the .json to cover the prologue plus one and changed the addresses in the backtrace to fit within those ranges.
[ updated this commit to keep the @skipIfRemote on the API test because two remote CI bots are failing for reasons I don't quite see. ]
Releases of Ubuntu that do not support the GNU hash style are long unsupported.
This patch adds support for fneg/fabs operations. For other bit manipulation operations (select/copysign), we don't need new APIs.
Avoids exposing the implementation detail of uintptr_t to the constructor. This is a replacement of b738f63 which avoids needing tablegen to know the underlying storage type.
Make sure to apply the option+number of register logic from the selection pattern.
These patterns are for setcc with scalar result type and vector operands or shifts with vector result and scalar shift amount.
The shift amount may have a different scalar size than the result, but they should have the same number of elements or they should both be scalar.
…161809) Check RegisterClassInfo if any registers of the new class are actually available for use. Currently AMDGPU overrides shouldCoalesce to avoid this situation. The target hook does not have access to the dynamic register class counts, but ideally the target hook would only be used for profitability concerns. The new test doesn't change, due to the AMDGPU shouldCoalesce override, but would be unallocatable if we dropped the override and switched to the default implementation. The existing limit-coalesce.mir already tests the behavior of this override, but it's too conservative and isn't checking the case where the new class is unallocatable. Add this check so it can be relaxed.
…162714) This renames some attribute list related functions, to make callers think about whether they want to append or prepend to the list, instead of defaulting to prepending which is often not the desired behaviour (for the cases where it matters, sometimes we're just adding to an empty list). Then it adjusts some of these calls to append where they were previously prepending. This has the effect of making `err_attributes_are_not_compatible` consistent in emitting diagnostics as `<new-attr> and <existing-attr> are not compatible`, regardless of the syntax used to apply the attributes.
…tions If the non-commutative user has several identical operands and at least one of them (but not the first) is copyable, we need to consider this opportunity when calculating the number of dependencies. Otherwise, the schedule bundle might not be scheduled correctly and cause a compiler crash. Fixes #162925
Corrects the spelling of 'IsGlobaLinkage' to 'IsGlobalLinkage' in XCOFF-related code, comments, and tests across the codebase.
Currently we cannot vectorize loops with latch blocks terminated by a switch. In the future this could be handled by materializing appropriate compares. Fixes #156894.
… is PHI Need to insert the vector value for the postponed gather/buildvector node after all uses not only if the vector value of the user node is a phi, but also if the user node itself is a PHI node, which may produce vector phi + shuffle. Fixes #162799
Generally G_UADDE, G_UADDO, G_USUBE, G_USUBO are used together and it was enough to simply define EFLAGS. But if extractvalue is used, we end up with a copy of EFLAGS into GPR. Always generate SETB instruction to put the carry bit on GPR and CMP to set the carry bit back. It gives the correct lowering in all the cases. Closes #120029
Otherwise debug-info is stripped, which influences the language of the current frame. Also, set explicit breakpoint because Windows seems to not obey the debugtrap.

Log from failing test on Windows:

```
(lldb) command source -s 0 'lit-lldb-init-quiet'
Executing commands in 'D:\test\lit-lldb-init-quiet'.
(lldb) command source -C --silent-run true lit-lldb-init
(lldb) target create "main.out"
Current executable set to 'D:\test\main.out' (x86_64).
(lldb) settings set interpreter.stop-command-source-on-error false
(lldb) command source -s 0 'with-target.input'
Executing commands in 'D:\test\with-target.input'.
(lldb) expr blah
            ^
error: use of undeclared identifier 'blah'
note: Falling back to default language. Ran expression as 'Objective C++'.
(lldb) run
Process 29404 launched: 'D:\test\main.out' (x86_64)
Process 29404 stopped
* thread #1, stop reason = Exception 0x80000003 encountered at address 0x7ff7b3df7189
    frame #0: 0x00007ff7b3df718a main.out
->  0x7ff7b3df718a: xorl %eax, %eax
    0x7ff7b3df718c: popq %rcx
    0x7ff7b3df718d: retq
    0x7ff7b3df718e: int3
(lldb) expr blah
            ^
error: use of undeclared identifier 'blah'
note: Falling back to default language. Ran expression as 'Objective C++'.
(lldb) expr -l objc -- blah
                       ^
error: use of undeclared identifier 'blah'
note: Expression evaluation in pure Objective-C not supported. Ran expression as 'Objective C++'.
(lldb) expr -l c -- blah
                    ^
error: use of undeclared identifier 'blah'
note: Expression evaluation in pure C not supported. Ran expression as 'ISO C++'.
```
Add unittest for `DataBreakpointInfoArguments`
…tializers (#163005) `UnwrappedLineParser::parseBracedList` had no explicit handling for the `requires` keyword, so it would just call `nextToken()` instead of properly parsing the `requires` expression. This fix adds a case for `tok::kw_requires` in `parseBracedList`, calling `parseRequiresExpression` to handle it correctly, matching the existing behavior in `parseParens`. Fixes #162984.
…163114) This commit renames the "finalize" operation to "initialize", and "deallocate" to "deinitialize". The new names are chosen to better fit the point of view of the ORC-runtime and executor-process: After memory is *reserved* it can be *initialized* with some content, and *deinitialized* to return that memory to the reserved region. This seems more understandable to me than the original scheme, which named these operations after the controller-side JITLinkMemoryManager operations that they partially implemented. I.e. SimpleNativeMemoryMap::finalize implemented the final step of JITLinkMemoryManager::finalize, initializing the memory in the executor; and SimpleNativeMemoryMap::deallocate implemented the final step of JITLinkMemoryManager::deallocate, running dealloc actions and releasing the finalized region. The proper way to think of the relationship between these operations now is that: 1. The final step of finalization is to initialize the memory in the executor. 2. The final step of deallocation is to deinitialize the memory in the executor.
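The reserved / initialized lifecycle described above can be modelled with a toy class. This is only a sketch of the naming scheme, not the ORC-runtime API: `ReservedRegion`, its fields, and its error handling are hypothetical, and overlap checks between initialized ranges are omitted.

```python
class ReservedRegion:
    """Hypothetical model of a reserved slab of executor memory."""

    def __init__(self, size):
        self.size = size        # number of bytes reserved
        self.initialized = {}   # offset -> content currently initialized there

    def initialize(self, offset, content):
        # Final step of finalization: give reserved memory some content.
        if offset < 0 or offset + len(content) > self.size:
            raise ValueError("content does not fit in the reserved region")
        if offset in self.initialized:
            raise ValueError("offset is already initialized")
        self.initialized[offset] = bytes(content)

    def deinitialize(self, offset):
        # Final step of deallocation: return the memory to the reserved
        # region. It stays reserved and can be initialized again.
        if offset not in self.initialized:
            raise KeyError("offset is not initialized")
        del self.initialized[offset]
```

The point of the sketch is that `deinitialize` does not give up the reservation; it only undoes `initialize`.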
We can add 's' or 'u' before hexadecimal constants to denote their signedness. See https://llvm.org/docs/LangRef.html#simple-constants for reference.
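As a rough illustration of what the prefix means, the sketch below interprets such a token in Python. It assumes the rule as I read it in the linked LangRef section, namely that a signed hexadecimal constant is sign-extended from its active bits (so leading zero digits do not make it positive); treat that reading, and the helper name `parse_hex_constant`, as assumptions rather than a reference implementation.

```python
def parse_hex_constant(token):
    """Interpret an 's0x...' or 'u0x...' token as a Python int.

    Assumption: signed values are sign-extended from the number of active
    bits (the bit length of the value without leading zeros).
    """
    prefix, digits = token[0], token[1:]
    if prefix not in ("s", "u") or not digits.startswith("0x"):
        raise ValueError(f"expected an s0x/u0x constant, got {token!r}")
    value = int(digits, 16)
    active_bits = value.bit_length()
    if prefix == "s" and active_bits > 0:
        # The highest active bit is set by definition, so treat it as the
        # sign bit and subtract 2**active_bits.
        value -= 1 << active_bits
    return value
```

Under that assumption, `u0x8000` and `s0x8000` read the same digits as an unsigned and a negative value respectively.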
…163043) This ensures that we are not including any branches on main that are not in the current user's branch in the diff. We can add this to the command now that --diff_from_common_commit (or at least the fixed version) has landed in a release (21.1.1).
The Unsupported case is special and doesn't have an entry in the vector, and is directly emitted as the 0 case. This should be harmless as it is, but could break if the right number of new libcalls is added.
FWIW, this [[nodiscard]] led to the discovery of #161625.
[[fallthrough]] is now part of C++17, so we don't need to use LLVM_FALLTHROUGH.
[[fallthrough]] is now part of C++17, so we don't need to use LLVM_FALLTHROUGH.
llvm::to_underlying, forward ported from C++23, conveniently packages static_cast and std::underlying_type_t like so: static_cast<std::underlying_type_t<EnumTy>>(E)
Identified with bugprone-unused-local-non-trivial-variable.
Identified with bugprone-unused-local-non-trivial-variable.
This would have failed later, when compiling the generated code; trim and use raw string literals to avoid such failures. There are probably a few more places where similar failures could occur, but this was an unexpected failure that a user ran into.
… patterns (#163080) This is a follow-up PR for #162699.

Currently, in the function where we define rewrite patterns, the `op` we receive is of type `ir.Operation` rather than a specific `OpView` type (such as `arith.AddIOp`). This means we can’t conveniently access certain parts of the operation; for example, we need to use `op.operands[0]` instead of `op.lhs`. The following example code illustrates this situation.

```python
def to_muli(op, rewriter):
    # op is typed ir.Operation instead of arith.AddIOp
    pass

patterns.add(arith.AddIOp, to_muli)
```

In this PR, we convert the operation to its corresponding `OpView` subclass before invoking the rewrite pattern callback, making it much easier to write patterns.

---------

Co-authored-by: Maksim Levental <[email protected]>
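The idea of converting to a typed view before invoking the callback can be sketched without MLIR at all. Everything below is hypothetical and stands in for the real bindings: `Operation`, `AddIOpView`, `VIEW_REGISTRY`, and `wrap_callback` are illustration-only names, not the API added by this PR.

```python
class Operation:
    """Hypothetical generic operation: a name plus positional operands."""

    def __init__(self, name, operands):
        self.name = name
        self.operands = operands


class AddIOpView:
    """Hypothetical typed view exposing named accessors over an Operation."""

    OPERATION_NAME = "arith.addi"

    def __init__(self, operation):
        self.operation = operation

    @property
    def lhs(self):
        return self.operation.operands[0]

    @property
    def rhs(self):
        return self.operation.operands[1]


# Hypothetical registry from operation name to its typed view class.
VIEW_REGISTRY = {AddIOpView.OPERATION_NAME: AddIOpView}


def wrap_callback(callback):
    """Return a callback that receives a typed view when one is registered."""

    def invoke(op, rewriter):
        view_cls = VIEW_REGISTRY.get(op.name)
        # Fall back to the generic operation if no view is known.
        typed_op = view_cls(op) if view_cls is not None else op
        return callback(typed_op, rewriter)

    return invoke
```

With such a wrapper, a pattern callback can use `op.lhs` directly instead of `op.operands[0]`, which is the convenience the commit message describes.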
…2547) SparseForwardDataFlowAnalysis, whose comments specify that StateT must subclass AbstractSparseLattice, also places a static assert to that effect in the class itself. This commit adds the same missing assert for SparseBackwardDataFlowAnalysis.
…write patterns in bindings (#163123) The MLIR Python bindings now support defining new passes, new rewrite patterns (through either `RewritePatternSet` or `PDLModule`), as well as new dialects using the IRDL bindings. Adding a dedicated section to document these features would make it easier for users to discover and understand the full capabilities of the Python bindings.
Add patterns which reduce `or` operations to register sequences when combining i16 values to i32. This removes many intermediate VGPRs and reduces register pressure.
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )