Open
Description
Bugzilla Link | 8233 |
Version | trunk |
OS | All |
Reporter | LLVM Bugzilla Contributor |
CC | @topperc,@RKSimon,@Kojoley,@rotateright |
Extended Description
Sibcall optimization does not take place when 'sret' is used on X86-64. For example:
$ cat test.ll
; ModuleID = '/home/urxae/.ccache/test.tmp.s010625.15525.ii'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-unknown-linux-gnu"
%struct.Foo = type { [3 x i64] }
define void @_Z10tailcallerv(%struct.Foo* sret %agg.result) nounwind {
entry:
  tail call void @_Z10tailcalleev(%struct.Foo* sret %agg.result) nounwind
  ret void
}
declare void @_Z10tailcalleev(%struct.Foo* sret) nounwind
$ llc test.ll -o -
        .file   "test.ll"
        .text
        .globl  _Z10tailcallerv
        .align  16, 0x90
        .type   _Z10tailcallerv,@function
_Z10tailcallerv:                        # @_Z10tailcallerv
# BB#0:                                 # %entry
        pushq   %rbx
        movq    %rdi, %rbx
        callq   _Z10tailcalleev
        movq    %rbx, %rax
        popq    %rbx
        ret
.Ltmp0:
        .size   _Z10tailcallerv, .Ltmp0-_Z10tailcallerv
        .section        .note.GNU-stack,"",@progbits
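This happens even though the code for tailcaller() only needs to be a jump to tailcallee(): the callee receives the same sret pointer in %rdi and, under the x86-64 calling convention, already returns that pointer in %rax, so the caller has nothing left to do after the call.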
Removing sret from tailcaller() alone doesn't help, but removing it from both functions does enable the sibcall.
(Note: removing it only from tailcallee() is not expected to work, because tailcaller() would then have to copy %rdi to %rax itself after the call returns, which a plain jump cannot do.)
32-bit x86 code seems to have the same problem.
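For context, C++ source along the following lines would plausibly lower to the IR in test.ll. This is a reconstruction, not the reporter's actual code: the struct layout is inferred from `%struct.Foo = type { [3 x i64] }`, the function names from the mangled symbols, and the body of `tailcallee()` and the `check_forwarding()` helper are hypothetical stand-ins added only so the sketch compiles and links on its own. At 24 bytes, `Foo` is too large to be returned in registers on x86-64, which is why the return value becomes an `sret` pointer argument.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: three 64-bit integers (24 bytes), inferred from the IR.
struct Foo {
    std::int64_t data[3];
};

// Hypothetical stand-in body. In the report this function is only declared,
// so the compiler cannot inline it into the caller.
Foo tailcallee() {
    return Foo{{1, 2, 3}};
}

// The caller forwards the callee's result unchanged. Both functions return
// Foo in memory, so the hidden result pointer can be passed straight through.
Foo tailcaller() {
    return tailcallee();
}

// Hypothetical usage check: the forwarded value arrives intact.
inline void check_forwarding() {
    Foo f = tailcaller();
    assert(f.data[0] == 1 && f.data[1] == 2 && f.data[2] == 3);
}
```

To reproduce the codegen described above, `tailcallee()` should be left as a declaration only (as in test.ll). With the definition visible in the same translation unit, an optimizing compiler may inline the call and hide the issue.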