LLVM memory builtins #995

aykevl · 2020-03-26T15:31:21Z

Use LLVM intrinsics for memory operations (memset, memcpy, memmove). This gives more information to the optimizer and in the end lowers to libc functions instead of a custom TinyGo implementation. This is possible since #871 and #845.

Another motivation for this PR is to reduce the amount of code at the end of the compiler.Compile function, which should help with eventually breaking it down further into per-package compiles (#285).

This replaces the custom runtime.memcpy and runtime.memmove functions with calls to LLVM builtins that should hopefully allow LLVM to better optimize such calls. They will be lowered to regular libc memcpy/memmove when they can't be optimized away. When testing this change with some smoke tests, I found that many smoke tests resulted in slightly larger binary sizes with this commit applied. I looked into it and it appears that machine.sendUSBPacket was not inlined before while it is with this commit applied. Additionally, when I compared all driver smoke tests with -opt=1 I saw that many were reduced slightly in binary size and none increased in size.

This gives the optimizer a bit more information about what the calls do. This should result in slightly better generated code. Code size sometimes goes up and sometimes goes down. I blame the code size going up on the inliner which inlines more functions, because compiling the smoke tests in the drivers repository with -opt=1 results in a slight code size reduction in all cases.

deadprogram · 2020-03-27T18:32:35Z

This absolutely seems like a much better idea to me.

However, I am noticing that all of the programs are now a little larger, but a couple of them are now much larger than I would have expected.

Before this PR:

tinygo build -size short -o test.hex -target=itsybitsy-m0        examples/blinky1
   code    data     bss |   flash     ram
   9172       8    4312 |    9180    4320

After this PR:

tinygo build -size short -o test.hex -target=itsybitsy-m0        examples/blinky1
   code    data     bss |   flash     ram
   9548       8    4312 |    9556    4320

What do you think?

aykevl · 2020-03-27T18:56:44Z

It definitely is machine.sendUSBPacket.

Before (at c4fd19b):

   code    data     bss |   flash     ram
   9380       8    4312 |    9388    4320

With this PR:

   code    data     bss |   flash     ram
   9548       8    4312 |    9556    4320

With this PR and //go:noinline on machine.sendUSBPacket:

   code    data     bss |   flash     ram
   9308       8    4312 |    9316    4320

So it definitely can reduce code size, with a hint to the inliner. I think the inliner considers the LLVM builtin to be of a lower cost than a regular function call, although it is often translated to a function call in the end. Because of this lower cost, the inliner decided that machine.sendUSBPacket is just below the limit and decided to inline the function at all 11 places.
So, what do you think? I can add a commit to stop that function from being inlined.

Sidenote: Clang does this memory builtin translation too. When you call memcpy for example, it will emit a call to llvm.memcpy.<something> instead of a regular memcpy call.

deadprogram · 2020-03-27T19:15:03Z

Very good investigative work to figure out the root cause. Sounds like a good solution to me.

So I assume that be a separate PR, in which case should I should merge this one when that one is ready?

aykevl · 2020-03-27T19:24:13Z

Sounds good to me! It's not a functional regression and the fix is trivial.

deadprogram · 2020-03-27T20:01:47Z

Also now merging this. Good work!

aykevl added 2 commits March 26, 2020 16:09

aykevl mentioned this pull request Mar 27, 2020

machine: avoid binary size regression after LLVM memory intrinsics #998

Merged

deadprogram merged commit 91d1a23 into dev Mar 27, 2020

deadprogram deleted the llvm-mem-calls branch March 27, 2020 20:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

LLVM memory builtins #995

LLVM memory builtins #995

Uh oh!

aykevl commented Mar 26, 2020 •

edited

Loading

Uh oh!

deadprogram commented Mar 27, 2020

Uh oh!

aykevl commented Mar 27, 2020

Uh oh!

deadprogram commented Mar 27, 2020

Uh oh!

aykevl commented Mar 27, 2020 •

edited

Loading

Uh oh!

deadprogram commented Mar 27, 2020

Uh oh!

Uh oh!

LLVM memory builtins #995

LLVM memory builtins #995

Uh oh!

Conversation

aykevl commented Mar 26, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

deadprogram commented Mar 27, 2020

Uh oh!

aykevl commented Mar 27, 2020

Uh oh!

deadprogram commented Mar 27, 2020

Uh oh!

aykevl commented Mar 27, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

deadprogram commented Mar 27, 2020

Uh oh!

Uh oh!

aykevl commented Mar 26, 2020 •

edited

Loading

aykevl commented Mar 27, 2020 •

edited

Loading