Segmentation fault on AArch64 release build with opcache.jit=1112 #12596

Closed
pfustc opened this issue Nov 2, 2023 · 22 comments

@pfustc
Contributor

pfustc commented Nov 2, 2023

Description

This appears on an AArch64 build with phpdbg disabled.

Build steps

$ bash buildconf
$ bash configure --disable-phpdbg
$ make -j 50

INI

opcache.enable_cli=1
opcache.jit=1112
opcache.jit_buffer_size=16M

The following code:

<?php
class test {
    function __construct() {
        if (empty($this->test[0][0])) { print "test1";}
        if (!isset($this->test[0][0])) { print "test2";}
    }
}

$test1 = new test();
?>

Resulted in this output:

test1test2Segmentation fault (core dumped)

But I expected this output instead:

test1test2

GDB

(gdb) r
Starting program: /mnt/local/php-src/sapi/cli/php -c /mnt/local/www index.php
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
test1test2
Program received signal SIGSEGV, Segmentation fault.
0x0000aaaaaaeea798 in execute_ex (ex=0x0) at /home/penli01/php-src/Zend/zend_vm_execute.h:57064
57064			EG(current_execute_data) = EX(prev_execute_data);
(gdb) bt
#0  0x0000aaaaaaeea798 in execute_ex (ex=0x0) at /home/penli01/php-src/Zend/zend_vm_execute.h:57064
#1  0x0000aaaaaaef5d18 in zend_execute (op_array=0xfffff5681800, return_value=<optimized out>)
    at /home/penli01/php-src/Zend/zend_vm_execute.h:61594
#2  0x0000aaaaaae71cf4 in zend_execute_scripts (type=type@entry=8, retval=retval@entry=0x0, file_count=file_count@entry=3)
    at /home/penli01/php-src/Zend/zend.c:1881
#3  0x0000aaaaaae06060 in php_execute_script (primary_file=0xfffff7a87200 <__GI__IO_doallocbuf+108>, primary_file@entry=0xfffffffdb300)
    at /home/penli01/php-src/main/main.c:2492
#4  0x0000aaaaaaf69570 in do_cli (argc=4, argv=0xaaaaab97bc60) at /home/penli01/php-src/sapi/cli/php_cli.c:966
#5  0x0000aaaaaabbc9bc in main (argc=4, argv=<optimized out>) at /home/penli01/php-src/sapi/cli/php_cli.c:1340
(gdb)

PHP Version

PHP master @ 95c8ad2

Operating System

Ubuntu 22.04

@dstogov
Member

dstogov commented Nov 2, 2023

I cannot reproduce this.
phpdbg shouldn't affect JIT directly. (I hope you didn't forget make install).

Could you please configure PHP with --with-capstone, then run the script with -d opcache.jit_debug=0x507 and provide the output.

@pfustc
Contributor Author

pfustc commented Nov 2, 2023

phpdbg shouldn't affect JIT directly. (I hope you didn't forget make install).

Yes, I just checked that the issue also exists with phpdbg enabled. But the SegFault disappears with --enable-debug when configuring the build. Have you tried without --enable-debug?

Another guess: Have you run the test on real AArch64 hardware, or just QEMU? Perhaps there's some difference.

I can also reproduce this with make test. A lot of unit tests fail on AArch64 with opcache.jit=1112.

Could you please configure PHP with --with-capstone, then run the script with -d opcache.jit_debug=0x507 and provide the output.

Sure, please see below. Another clue from GDB is that the SegFault is caused by (ex=0x0) at Zend/zend_vm_execute.h:57064, not in the JIT-ed code.

$_main:
     ; (lines=4, args=0, vars=1, tmps=1, ssa_vars=0)
     ; (JIT)
     ; /mnt/local/www/index.php:1-11
     ; return  [] RANGE[0..0]
BB0:
     ; start lines=[0-1]
     ; to=(BB1)
     ; level=0
     ; children=(BB1)
0000 V1 = NEW 0 string("test")
0001 DO_FCALL

BB1:
     ; follow exit entry lines=[2-3]
     ; from=(BB0)
     ; idom=BB0
     ; level=1
0002 ASSIGN CV0($test1) V1
0003 RETURN int(1)
JIT$/mnt/local/www/index.php:
    aaaac4801010:	sub sp, sp, #0x30
    aaaac4801014:	bl #0xaaaac5c8f924
    aaaac4801018:	movz x0, #0x48d0
    aaaac480101c:	movk x0, #0xc671, lsl #16
    aaaac4801020:	movk x0, #0xaaaa, lsl #32
    aaaac4801024:	ldr x0, [x0]
    aaaac4801028:	cbnz x0, JIT$$exception_handler
    aaaac480102c:	movz w0, #0x8c78
    aaaac4801030:	movk w0, #0xbd0d, lsl #16
    aaaac4801034:	cmp w28, w0
    aaaac4801038:	b.ne .L8
    aaaac480103c:	ldr x0, [x27, #8]
    aaaac4801040:	mov x28, x0
    aaaac4801044:	movz x0, #0x8c78
    aaaac4801048:	movk x0, #0xbd0d, lsl #16
    aaaac480104c:	movk x0, #0xaaaa, lsl #32
    aaaac4801050:	str x0, [x27]
    aaaac4801054:	str xzr, [x27, #8]
    aaaac4801058:	str x27, [x28, #0x30]
    aaaac480105c:	ldr x19, [x28, #0x18]
    aaaac4801060:	ldr w0, [x19, #4]
    aaaac4801064:	and w0, w0, #0x800
    aaaac4801068:	cmp w0, #0
    aaaac480106c:	b.ne .L5
.L1:
    aaaac4801070:	ldrb w0, [x19]
    aaaac4801074:	cmp w0, #2
    aaaac4801078:	b.ne .L9
    aaaac480107c:	str xzr, [x28, #8]
    aaaac4801080:	str xzr, [x28, #0x10]
    aaaac4801084:	ldr x0, [x19, #0x38]
    aaaac4801088:	and x1, x0, #1
    aaaac480108c:	cmp x1, #0
    aaaac4801090:	b.eq .L2
    aaaac4801094:	movz x1, #0x4e50
    aaaac4801098:	movk x1, #0xc671, lsl #16
    aaaac480109c:	movk x1, #0xaaaa, lsl #32
    aaaac48010a0:	ldr x1, [x1]
    aaaac48010a4:	add x0, x1, x0
    aaaac48010a8:	ldr x0, [x0]
.L2:
    aaaac48010ac:	str x0, [x28, #0x40]
    aaaac48010b0:	movz x0, #0x4758
    aaaac48010b4:	movk x0, #0xc671, lsl #16
    aaaac48010b8:	movk x0, #0xaaaa, lsl #32
    aaaac48010bc:	str x28, [x0]
    aaaac48010c0:	mov x27, x28
    aaaac48010c4:	ldr x0, [x19, #0x50]
    aaaac48010c8:	mov x28, x0
    aaaac48010cc:	ldr w20, [x27, #0x2c]
    aaaac48010d0:	ldr w0, [x19, #0x20]
    aaaac48010d4:	cmp w0, w20
    aaaac48010d8:	b.lt .L6
    aaaac48010dc:	ldr w0, [x19, #4]
    aaaac48010e0:	and w0, w0, #0x100
    aaaac48010e4:	cmp w0, #0
    aaaac48010e8:	b.ne .L3
    aaaac48010ec:	lsl w0, w20, #5
    aaaac48010f0:	mov w0, w0
    aaaac48010f4:	add x28, x28, x0
.L3:
    aaaac48010f8:	ldr w0, [x19, #0x48]
    aaaac48010fc:	sub w0, w0, w20
    aaaac4801100:	cmp w0, #0
    aaaac4801104:	b.le .L7
    aaaac4801108:	mov w1, w20
    aaaac480110c:	lsl x1, x1, #4
    aaaac4801110:	add x1, x1, x27
    aaaac4801114:	add x1, x1, #0x58
.L4:
    aaaac4801118:	str wzr, [x1]
    aaaac480111c:	sub w0, w0, #1
    aaaac4801120:	cmp w0, #0
    aaaac4801124:	b.eq .L7
    aaaac4801128:	add x1, x1, #0x10
    aaaac480112c:	b .L4
.L5:
    aaaac4801130:	movz x17, #0xd6a8
    aaaac4801134:	movk x17, #0xb35a, lsl #16
    aaaac4801138:	movk x17, #0xffff, lsl #32
    aaaac480113c:	blr x17
    aaaac4801140:	cbz w0, JIT$$exception_handler
    aaaac4801144:	b .L1
.L6:
    aaaac4801148:	movz x17, #0x68d4
    aaaac480114c:	movk x17, #0xb362, lsl #16
    aaaac4801150:	movk x17, #0xffff, lsl #32
    aaaac4801154:	blr x17
    aaaac4801158:	b .L3
.L7:
    aaaac480115c:	ldr x0, [x28]
    aaaac4801160:	add sp, sp, #0x30
    aaaac4801164:	br x0
.ENTRY_2:
    aaaac4801168:	sub sp, sp, #0x30
.L8:
    aaaac480116c:	bl #0xaaaac5cb7234
    aaaac4801170:	movz x0, #0x48d0
    aaaac4801174:	movk x0, #0xc671, lsl #16
    aaaac4801178:	movk x0, #0xaaaa, lsl #32
    aaaac480117c:	ldr x0, [x0]
    aaaac4801180:	cbnz x0, JIT$$exception_handler
    aaaac4801184:	add sp, sp, #0x30
    aaaac4801188:	b ZEND_RETURN_SPEC_CONST_LABEL
.L9:
    aaaac480118c:	movz x0, #0x4758
    aaaac4801190:	movk x0, #0xc671, lsl #16
    aaaac4801194:	movk x0, #0xaaaa, lsl #32
    aaaac4801198:	str x28, [x0]
    aaaac480119c:	mov x1, sp
    aaaac48011a0:	movz w0, #0x1
    aaaac48011a4:	str w0, [x1, #8]
    aaaac48011a8:	ldr x2, [x19, #0x48]
    aaaac48011ac:	mov x0, x28
    aaaac48011b0:	blr x2
    aaaac48011b4:	movz x0, #0x4758
    aaaac48011b8:	movk x0, #0xc671, lsl #16
    aaaac48011bc:	movk x0, #0xaaaa, lsl #32
    aaaac48011c0:	str x27, [x0]
    aaaac48011c4:	mov x0, x28
    aaaac48011c8:	movz x17, #0xd5d0
    aaaac48011cc:	movk x17, #0xb35c, lsl #16
    aaaac48011d0:	movk x17, #0xffff, lsl #32
    aaaac48011d4:	blr x17
    aaaac48011d8:	ldrb w0, [x28, #0x2a]
    aaaac48011dc:	and w0, w0, #0x20
    aaaac48011e0:	cmp w0, #0
    aaaac48011e4:	b.ne .L13
.L10:
    aaaac48011e8:	ldrb w0, [x28, #0x2a]
    aaaac48011ec:	and w0, w0, #4
    aaaac48011f0:	cmp w0, #0
    aaaac48011f4:	b.ne .L15
    aaaac48011f8:	movz x0, #0x4738
    aaaac48011fc:	movk x0, #0xc671, lsl #16
    aaaac4801200:	movk x0, #0xaaaa, lsl #32
    aaaac4801204:	str x28, [x0]
.L11:
    aaaac4801208:	ldrb w0, [sp, #9]
    aaaac480120c:	cmp w0, #0
    aaaac4801210:	b.eq .L12
    aaaac4801214:	ldr x0, [sp]
    aaaac4801218:	ldr w1, [x0]
    aaaac480121c:	sub w1, w1, #1
    aaaac4801220:	str w1, [x0]
    aaaac4801224:	cmp w1, #0
    aaaac4801228:	b.ne .L16
    aaaac480122c:	movz x1, #0x8c78
    aaaac4801230:	movk x1, #0xbd0d, lsl #16
    aaaac4801234:	movk x1, #0xaaaa, lsl #32
    aaaac4801238:	str x1, [x27]
    aaaac480123c:	bl rc_dtor_func
.L12:
    aaaac4801240:	movz x0, #0x48d0
    aaaac4801244:	movk x0, #0xc671, lsl #16
    aaaac4801248:	movk x0, #0xaaaa, lsl #32
    aaaac480124c:	ldr x0, [x0]
    aaaac4801250:	cbnz x0, JIT$$icall_throw
    aaaac4801254:	movz x0, #0x4786
    aaaac4801258:	movk x0, #0xc671, lsl #16
    aaaac480125c:	movk x0, #0xaaaa, lsl #32
    aaaac4801260:	ldrb w0, [x0]
    aaaac4801264:	cmp w0, #0
    aaaac4801268:	b.ne .L18
    aaaac480126c:	movz x28, #0x8c98
    aaaac4801270:	movk x28, #0xbd0d, lsl #16
    aaaac4801274:	movk x28, #0xaaaa, lsl #32
    aaaac4801278:	b .L8
.L13:
    aaaac480127c:	ldr x0, [x28, #0x20]
    aaaac4801280:	ldr w1, [x0]
    aaaac4801284:	sub w1, w1, #1
    aaaac4801288:	str w1, [x0]
    aaaac480128c:	cmp w1, #0
    aaaac4801290:	b.ne .L14
    aaaac4801294:	bl zend_objects_store_del
    aaaac4801298:	b .L10
.L14:
    aaaac480129c:	ldr w1, [x0, #4]
    aaaac48012a0:	movz w2, #0xfc10
    aaaac48012a4:	movk w2, #0xffff, lsl #16
    aaaac48012a8:	and w1, w1, w2
    aaaac48012ac:	cmp w1, #0
    aaaac48012b0:	b.ne .L10
    aaaac48012b4:	bl gc_possible_root
    aaaac48012b8:	b .L10
.L15:
    aaaac48012bc:	mov x0, x28
    aaaac48012c0:	movz x17, #0x6790
    aaaac48012c4:	movk x17, #0xb35d, lsl #16
    aaaac48012c8:	movk x17, #0xffff, lsl #32
    aaaac48012cc:	blr x17
    aaaac48012d0:	b .L11
.L16:
    aaaac48012d4:	ldrb w1, [sp, #8]
    aaaac48012d8:	cmp w1, #0xa
    aaaac48012dc:	b.ne .L17
    aaaac48012e0:	ldrb w1, [x0, #0x11]
    aaaac48012e4:	and w1, w1, #2
    aaaac48012e8:	cmp w1, #0
    aaaac48012ec:	b.eq .L12
    aaaac48012f0:	ldr x0, [x0, #8]
.L17:
    aaaac48012f4:	ldr w1, [x0, #4]
    aaaac48012f8:	movz w2, #0xfc10
    aaaac48012fc:	movk w2, #0xffff, lsl #16
    aaaac4801300:	and w1, w1, w2
    aaaac4801304:	cmp w1, #0
    aaaac4801308:	b.ne .L12
    aaaac480130c:	bl gc_possible_root
    aaaac4801310:	b .L12
.L18:
    aaaac4801314:	movz x28, #0x8c98
    aaaac4801318:	movk x28, #0xbd0d, lsl #16
    aaaac480131c:	movk x28, #0xaaaa, lsl #32
    aaaac4801320:	b JIT$$interrupt_handler


test::__construct:
     ; (lines=11, args=0, vars=0, tmps=2, ssa_vars=0)
     ; (JIT)
     ; /mnt/local/www/index.php:3-6
     ; return  [] RANGE[0..0]
BB0:
     ; start lines=[0-3]
     ; to=(BB2, BB1)
     ; level=0
     ; children=(BB1, BB2)
0000 T0 = FETCH_OBJ_IS THIS string("test")
0001 T1 = FETCH_DIM_IS T0 int(0)
0002 T0 = ISSET_ISEMPTY_DIM_OBJ (empty) T1 int(0)
0003 JMPZ T0 BB2

BB1:
     ; follow lines=[4-4]
     ; from=(BB0)
     ; to=(BB2)
     ; idom=BB0
     ; level=1
0004 ECHO string("test1")

BB2:
     ; follow target lines=[5-8]
     ; from=(BB0, BB1)
     ; to=(BB4, BB3)
     ; idom=BB0
     ; level=1
     ; children=(BB3, BB4)
0005 T0 = FETCH_OBJ_IS THIS string("test")
0006 T1 = FETCH_DIM_IS T0 int(0)
0007 T0 = ISSET_ISEMPTY_DIM_OBJ (isset) T1 int(0)
0008 JMPNZ T0 BB4

BB3:
     ; follow lines=[9-9]
     ; from=(BB2)
     ; to=(BB4)
     ; idom=BB2
     ; level=2
0009 ECHO string("test2")

BB4:
     ; follow target exit lines=[10-10]
     ; from=(BB2, BB3)
     ; idom=BB2
     ; level=2
0010 RETURN null
JIT$test::__construct:
    aaaac4801330:	sub sp, sp, #0x30
    aaaac4801334:	ldr x0, [x27, #0x20]
    aaaac4801338:	ldr x1, [x27, #0x40]
    aaaac480133c:	ldr x2, [x1, #8]
    aaaac4801340:	ldr x3, [x0, #0x10]
    aaaac4801344:	cmp x3, x2
    aaaac4801348:	b.ne .L9
    aaaac480134c:	ldr x1, [x1, #0x10]
    aaaac4801350:	cmp x1, #0
    aaaac4801354:	b.lt .L10
    aaaac4801358:	add x1, x1, x0
    aaaac480135c:	ldrb w2, [x1, #8]
    aaaac4801360:	cmp w2, #0
    aaaac4801364:	b.eq .L9
    aaaac4801368:	ldr w0, [x1, #8]
    aaaac480136c:	ldr x1, [x1]
    aaaac4801370:	and w2, w0, #0xff00
    aaaac4801374:	cmp w2, #0
    aaaac4801378:	b.eq .L2
    aaaac480137c:	cmp w0, #0x10a
    aaaac4801380:	b.ne .L1
    aaaac4801384:	ldr w0, [x1, #0x10]
    aaaac4801388:	ldr x1, [x1, #8]
    aaaac480138c:	and w2, w0, #0xff00
    aaaac4801390:	cmp w2, #0
    aaaac4801394:	b.eq .L2
.L1:
    aaaac4801398:	ldr w2, [x1]
    aaaac480139c:	add w2, w2, #1
    aaaac48013a0:	str w2, [x1]
.L2:
    aaaac48013a4:	str x1, [x27, #0x50]
    aaaac48013a8:	str w0, [x27, #0x58]
.L3:
    aaaac48013ac:	movz x0, #0x48d0
    aaaac48013b0:	movk x0, #0xc671, lsl #16
    aaaac48013b4:	movk x0, #0xaaaa, lsl #32
    aaaac48013b8:	ldr x0, [x0]
    aaaac48013bc:	cbnz x0, JIT$$exception_handler
    aaaac48013c0:	add x28, x28, #0x20
    aaaac48013c4:	bl #0xaaaac5c9a9a0
    aaaac48013c8:	movz x0, #0x48d0
    aaaac48013cc:	movk x0, #0xc671, lsl #16
    aaaac48013d0:	movk x0, #0xaaaa, lsl #32
    aaaac48013d4:	ldr x0, [x0]
    aaaac48013d8:	cbnz x0, JIT$$exception_handler
    aaaac48013dc:	bl #0xaaaac5c9d190
    aaaac48013e0:	movz x0, #0x48d0
    aaaac48013e4:	movk x0, #0xc671, lsl #16
    aaaac48013e8:	movk x0, #0xaaaa, lsl #32
    aaaac48013ec:	ldr x0, [x0]
    aaaac48013f0:	cbnz x0, JIT$$exception_handler
    aaaac48013f4:	movz w0, #0x8b10
    aaaac48013f8:	movk w0, #0xbd0d, lsl #16
    aaaac48013fc:	cmp w28, w0
    aaaac4801400:	b.ne .L4
    aaaac4801404:	str x28, [x27]
    aaaac4801408:	movz x0, #0xfca8
    aaaac480140c:	movk x0, #0xbca2, lsl #16
    aaaac4801410:	movk x0, #0xaaaa, lsl #32
    aaaac4801414:	movz x1, #0x5
    aaaac4801418:	bl php_output_write
    aaaac480141c:	movz x0, #0x48d0
    aaaac4801420:	movk x0, #0xc671, lsl #16
    aaaac4801424:	movk x0, #0xaaaa, lsl #32
    aaaac4801428:	ldr x0, [x0]
    aaaac480142c:	cbnz x0, JIT$$exception_handler
.L4:
    aaaac4801430:	ldr x0, [x27, #0x20]
    aaaac4801434:	ldr x1, [x27, #0x40]
    aaaac4801438:	ldr x2, [x1, #8]
    aaaac480143c:	ldr x3, [x0, #0x10]
    aaaac4801440:	cmp x3, x2
    aaaac4801444:	b.ne .L11
    aaaac4801448:	ldr x1, [x1, #0x10]
    aaaac480144c:	cmp x1, #0
    aaaac4801450:	b.lt .L12
    aaaac4801454:	add x1, x1, x0
    aaaac4801458:	ldrb w2, [x1, #8]
    aaaac480145c:	cmp w2, #0
    aaaac4801460:	b.eq .L11
    aaaac4801464:	ldr w0, [x1, #8]
    aaaac4801468:	ldr x1, [x1]
    aaaac480146c:	and w2, w0, #0xff00
    aaaac4801470:	cmp w2, #0
    aaaac4801474:	b.eq .L6
    aaaac4801478:	cmp w0, #0x10a
    aaaac480147c:	b.ne .L5
    aaaac4801480:	ldr w0, [x1, #0x10]
    aaaac4801484:	ldr x1, [x1, #8]
    aaaac4801488:	and w2, w0, #0xff00
    aaaac480148c:	cmp w2, #0
    aaaac4801490:	b.eq .L6
.L5:
    aaaac4801494:	ldr w2, [x1]
    aaaac4801498:	add w2, w2, #1
    aaaac480149c:	str w2, [x1]
.L6:
    aaaac48014a0:	str x1, [x27, #0x50]
    aaaac48014a4:	str w0, [x27, #0x58]
.L7:
    aaaac48014a8:	movz x0, #0x48d0
    aaaac48014ac:	movk x0, #0xc671, lsl #16
    aaaac48014b0:	movk x0, #0xaaaa, lsl #32
    aaaac48014b4:	ldr x0, [x0]
    aaaac48014b8:	cbnz x0, JIT$$exception_handler
    aaaac48014bc:	movz x28, #0x8b50
    aaaac48014c0:	movk x28, #0xbd0d, lsl #16
    aaaac48014c4:	movk x28, #0xaaaa, lsl #32
    aaaac48014c8:	bl #0xaaaac5c9a9a0
    aaaac48014cc:	movz x0, #0x48d0
    aaaac48014d0:	movk x0, #0xc671, lsl #16
    aaaac48014d4:	movk x0, #0xaaaa, lsl #32
    aaaac48014d8:	ldr x0, [x0]
    aaaac48014dc:	cbnz x0, JIT$$exception_handler
    aaaac48014e0:	bl #0xaaaac5c9d190
    aaaac48014e4:	movz x0, #0x48d0
    aaaac48014e8:	movk x0, #0xc671, lsl #16
    aaaac48014ec:	movk x0, #0xaaaa, lsl #32
    aaaac48014f0:	ldr x0, [x0]
    aaaac48014f4:	cbnz x0, JIT$$exception_handler
    aaaac48014f8:	movz w0, #0x8bb0
    aaaac48014fc:	movk w0, #0xbd0d, lsl #16
    aaaac4801500:	cmp w28, w0
    aaaac4801504:	b.ne .L8
    aaaac4801508:	str x28, [x27]
    aaaac480150c:	movz x0, #0xfcd0
    aaaac4801510:	movk x0, #0xbca2, lsl #16
    aaaac4801514:	movk x0, #0xaaaa, lsl #32
    aaaac4801518:	movz x1, #0x5
    aaaac480151c:	bl php_output_write
    aaaac4801520:	movz x0, #0x48d0
    aaaac4801524:	movk x0, #0xc671, lsl #16
    aaaac4801528:	movk x0, #0xaaaa, lsl #32
    aaaac480152c:	ldr x0, [x0]
    aaaac4801530:	cbnz x0, JIT$$exception_handler
.L8:
    aaaac4801534:	movz x28, #0x8bd0
    aaaac4801538:	movk x28, #0xbd0d, lsl #16
    aaaac480153c:	movk x28, #0xaaaa, lsl #32
    aaaac4801540:	add sp, sp, #0x30
    aaaac4801544:	b ZEND_RETURN_SPEC_CONST_LABEL
.L9:
    aaaac4801548:	str x28, [x27]
    aaaac480154c:	movz x17, #0x26f0
    aaaac4801550:	movk x17, #0xb35d, lsl #16
    aaaac4801554:	movk x17, #0xffff, lsl #32
    aaaac4801558:	blr x17
    aaaac480155c:	b .L3
.L10:
    aaaac4801560:	str x28, [x27]
    aaaac4801564:	movz x17, #0x2804
    aaaac4801568:	movk x17, #0xb35d, lsl #16
    aaaac480156c:	movk x17, #0xffff, lsl #32
    aaaac4801570:	blr x17
    aaaac4801574:	b .L3
.L11:
    aaaac4801578:	movz x1, #0x8b30
    aaaac480157c:	movk x1, #0xbd0d, lsl #16
    aaaac4801580:	movk x1, #0xaaaa, lsl #32
    aaaac4801584:	str x1, [x27]
    aaaac4801588:	movz x17, #0x26f0
    aaaac480158c:	movk x17, #0xb35d, lsl #16
    aaaac4801590:	movk x17, #0xffff, lsl #32
    aaaac4801594:	blr x17
    aaaac4801598:	b .L7
.L12:
    aaaac480159c:	movz x2, #0x8b30
    aaaac48015a0:	movk x2, #0xbd0d, lsl #16
    aaaac48015a4:	movk x2, #0xaaaa, lsl #32
    aaaac48015a8:	str x2, [x27]
    aaaac48015ac:	movz x17, #0x2804
    aaaac48015b0:	movk x17, #0xb35d, lsl #16
    aaaac48015b4:	movk x17, #0xffff, lsl #32
    aaaac48015b8:	blr x17
    aaaac48015bc:	b .L7

test1test2Segmentation fault (core dumped)

@dstogov
Member

dstogov commented Nov 2, 2023

Yes, I just checked that the issue also exists with phpdbg enabled. But the SegFault disappears with --enable-debug when configuring the build. Have you tried without --enable-debug?

Yes. I run with --enable-debug

Another guess: Have you run the test on real AArch64 hardware, or just QEMU? Perhaps there's some difference.

QEMU. I don't have a suitable AArch64 box.

I'll try the release build...

@dstogov
Member

dstogov commented Nov 2, 2023

I can't reproduce the crash. My assembler code looks identical, so I suspect this may be related to memory model differences between QEMU and a real CPU.

Can you try to debug this further?
The crash most probably occurs because the x27 register, which is reserved for execute_data, somehow got an invalid value.
I recommend running PHP under GDB with -d opcache.jit_debug=0x100 (this will expose JIT symbols to GDB), setting a breakpoint at zend_runtime_jit, continuing after the first break, finishing after the second, then stepping with si a few times (until you get into the JIT-ed test::__construct code), then record and continue. After the crash you may execute in reverse to understand where x27 got a wrong value, or what else went wrong.

@dstogov dstogov closed this as completed Nov 2, 2023
@dstogov dstogov reopened this Nov 2, 2023
@pfustc pfustc changed the title from "Segmentation fault on AArch64 with phpdbg disabled and opcache.jit=1112" to "Segmentation fault on AArch64 release build with opcache.jit=1112" Nov 3, 2023
@pfustc
Contributor Author

pfustc commented Nov 3, 2023

Thanks for your guidance. I followed your steps and got to the entry of the JIT-ed code.

=> 0xaaaaa9a01330 <JIT$test::__construct>:	sub	sp, sp, #0x30
   0xaaaaa9a01334 <JIT$test::__construct+4>:	ldr	x0, [x27, #32]
   0xaaaaa9a01338 <JIT$test::__construct+8>:	ldr	x1, [x27, #64]
   0xaaaaa9a0133c <JIT$test::__construct+12>:	ldr	x2, [x1, #8]
   0xaaaaa9a01340 <JIT$test::__construct+16>:	ldr	x3, [x0, #16]
   0xaaaaa9a01344 <JIT$test::__construct+20>:	cmp	x3, x2
   0xaaaaa9a01348 <JIT$test::__construct+24>:	b.ne	0xaaaaa9a01548 <JIT$test::__construct+536>  // b.any

Unfortunately, GDB record failed with record: failed to record execution log. I turned to watch instead, but still didn't see x27 change anywhere until the segmentation fault appeared.

From GDB I see x27 is still valid at the point of the SegFault. The instruction that causes the SegFault looks like

0xaaaaaaed0a18 <execute_ex+804>:	str	x2, [x0, #488]

where x0 holds zero. What does the previous ldr x0, [x19, #1392] access? I see the memory around [x19, #1392] is all zeros.

Please see the GDB output below.

(gdb) x/40i $pc-120
   0xaaaaaaed09a0 <execute_ex+684>:	bl	0xaaaaaaea3d24 <ZEND_RECV_VARIADIC_SPEC_UNUSED_HANDLER>
   0xaaaaaaed09a4 <execute_ex+688>:	ldr	x0, [x28]
   0xaaaaaaed09a8 <execute_ex+692>:	br	x0
   0xaaaaaaed09ac <execute_ex+696>:	bl	0xaaaaaaea1fd0 <ZEND_INIT_DYNAMIC_CALL_SPEC_CV_HANDLER>
   0xaaaaaaed09b0 <execute_ex+700>:	ldr	x0, [x28]
   0xaaaaaaed09b4 <execute_ex+704>:	br	x0
   0xaaaaaaed09b8 <execute_ex+708>:	bl	0xaaaaaae85300 <ZEND_ECHO_SPEC_CONST_HANDLER>
   0xaaaaaaed09bc <execute_ex+712>:	ldr	x0, [x28]
   0xaaaaaaed09c0 <execute_ex+716>:	br	x0
   0xaaaaaaed09c4 <execute_ex+720>:	ldr	x1, [x27, #16]
   0xaaaaaaed09c8 <execute_ex+724>:	ldr	w2, [x28, #8]
   0xaaaaaaed09cc <execute_ex+728>:	cbz	x1, 0xaaaaaaed09ec <execute_ex+760>
   0xaaaaaaed09d0 <execute_ex+732>:	add	x3, x28, w2, sxtw
   0xaaaaaaed09d4 <execute_ex+736>:	ldr	x0, [x28, w2, sxtw]
   0xaaaaaaed09d8 <execute_ex+740>:	ldr	w2, [x3, #8]
   0xaaaaaaed09dc <execute_ex+744>:	str	x0, [x1]
   0xaaaaaaed09e0 <execute_ex+748>:	str	w2, [x1, #8]
   0xaaaaaaed09e4 <execute_ex+752>:	tst	w2, #0xff00
   0xaaaaaaed09e8 <execute_ex+756>:	b.ne	0xaaaaaaed81a4 <execute_ex+31408>  // b.any
   0xaaaaaaed09ec <execute_ex+760>:	ldr	w24, [x27, #40]
   0xaaaaaaed09f0 <execute_ex+764>:	mov	w0, #0x81f0000             	// #136249344
   0xaaaaaaed09f4 <execute_ex+768>:	str	x28, [x27]
   0xaaaaaaed09f8 <execute_ex+772>:	tst	w24, w0
   0xaaaaaaed09fc <execute_ex+776>:	b.ne	0xaaaaaaed3040 <execute_ex+10572>  // b.any
   0xaaaaaaed0a00 <execute_ex+780>:	ldr	x1, [x27, #24]
   0xaaaaaaed0a04 <execute_ex+784>:	add	x21, x27, #0x50
   0xaaaaaaed0a08 <execute_ex+788>:	ldr	x0, [x19, #1392]
   0xaaaaaaed0a0c <execute_ex+792>:	mov	w25, #0xfffffc10            	// #-1008
   0xaaaaaaed0a10 <execute_ex+796>:	ldr	w23, [x1, #72]
   0xaaaaaaed0a14 <execute_ex+800>:	ldr	x2, [x27, #48]
=> 0xaaaaaaed0a18 <execute_ex+804>:	str	x2, [x0, #488]
   0xaaaaaaed0a1c <execute_ex+808>:	cbnz	w23, 0xaaaaaaed0a54 <execute_ex+864>
   0xaaaaaaed0a20 <execute_ex+812>:	b	0xaaaaaaed0a80 <execute_ex+908>
   0xaaaaaaed0a24 <execute_ex+816>:	ldr	w1, [x0, #4]
   0xaaaaaaed0a28 <execute_ex+820>:	cmp	w1, #0x1a
   0xaaaaaaed0a2c <execute_ex+824>:	b.ne	0xaaaaaaed0a3c <execute_ex+840>  // b.any
   0xaaaaaaed0a30 <execute_ex+828>:	ldrb	w1, [x0, #17]
   0xaaaaaaed0a34 <execute_ex+832>:	tbz	w1, #1, 0xaaaaaaed0a48 <execute_ex+852>
   0xaaaaaaed0a38 <execute_ex+836>:	ldr	x0, [x0, #8]
   0xaaaaaaed0a3c <execute_ex+840>:	ldr	w1, [x0, #4]
(gdb) p/x $x27
$7 = 0xfffff5614090
(gdb) p/x $x0
$8 = 0x0
(gdb) x/100x $x19+1350
0xaaaaa22d8e96:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8ea6:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8eb6:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8ec6:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8ed6:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8ee6:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8ef6:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8f06:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8f16:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8f26:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8f36:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8f46:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8f56:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8f66:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8f76:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8f86:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8f96:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8fa6:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8fb6:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8fc6:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8fd6:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8fe6:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d8ff6:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d9006:	0x00000000	0x00000000	0x00000000	0x00000000
0xaaaaa22d9016:	0x00000000	0x00000000	0x00000000	0x00000000

@dstogov
Member

dstogov commented Nov 3, 2023

Thank you for the debugging. My assumption about x27 was wrong.

It looks like execute_ex() uses x19 to store some temporary address (probably the address of the execute_global structure), and the JIT code clobbers this register.
The inline asm __asm__ __volatile__ (""::: "x19","x20","x21","x22","x23","x24","x25","x26"); should prevent the compiler from using the persistent registers across calls to JIT code, but it seems it doesn't. It probably should be moved to a better place (or places), e.g. directly before the JIT code call.
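For illustration, here is a minimal, self-contained sketch (plain C, AArch64 only; the names are made up and this is not php-src code) of what such a clobber-list statement guarantees and what it doesn't: the compiler must assume x19-x26 are destroyed at that exact program point, so live values cannot stay in those registers across it, but other paths into the JIT-ed code are not constrained.

#include <stdio.h>

long shared_state = 42;

long demo(void)
{
    long *p = &shared_state;   /* at -O2 this address may be cached in a
                                  callee-saved register such as x19 */
    long before = *p;

    /* Empty asm body with a clobber list: the compiler must assume that
     * x19-x26 are destroyed here, so any value it keeps live across this
     * point cannot stay in those registers.  The effect is local to this
     * statement. */
    __asm__ __volatile__("" :::
        "x19", "x20", "x21", "x22", "x23", "x24", "x25", "x26");

    long after = *p;           /* p can no longer live in x19-x26 here */
    return before + after;
}

int main(void)
{
    printf("%ld\n", demo());   /* prints 84; compile with `cc -O2 -S` to see
                                  how values are kept out of x19-x26 around
                                  the asm statement */
    return 0;
}

The suspicion above is exactly the gap in this guarantee: the statement only constrains the point where it appears, so the compiler may still keep a live pointer in x19 on the path that actually runs the JIT-ed code.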

I probably won't be able to dive deeper before November 7.

@pfustc
Contributor Author

pfustc commented Nov 8, 2023

Hi @dstogov,

I tracked all the points where x19 changes, but none of them are in JIT-ed code. All the change points are in the Zend runtime, php-cli, or system calls.

But I found two mistakes in the AArch64 register definitions (the 32-bit name for X19 should be w19, not w18, and the byte name for V19 should be b19, not b18):

_(X19, x19, w18) \
and
_(V19, d19, s19, h19, b18) \

Unfortunately, after fixing these locally, the segmentation fault issue still exists. Could you try running

make test TESTS="-d opcache.enable_cli=1 -d opcache.jit=0002 -d opcache.jit_buffer_size=16M"

with an AArch64 release build to reproduce the failures in your QEMU environment?

@dstogov
Member

dstogov commented Nov 8, 2023

Thanks for the fixes and tests. I'll try to debug this issue, but it's at the end of my TODO list.

@pfustc
Contributor Author

pfustc commented Nov 8, 2023

Thanks for the fixes and tests. I'll try to debug this issue, but it's at the end of my TODO list.

I haven't created any PR for the fix. Would you like to push a commit to fix them directly?

@dstogov
Member

dstogov commented Nov 8, 2023

I already committed your fix.

@dstogov
Member

dstogov commented Nov 8, 2023

It seems the difference in behaviour is caused by the C compiler. In the following fragment the difference is on the line marked with ???. In my case the address of the global variables page is loaded by an adrp instruction; your compiler seems to keep the address of current_execute_data stored at x19 + 1392.

                        my C compiler ( GCC 12.2.1)                      your C compiler
    =========================================================================================================
    <execute_ex+680>:   mov   w0, #0x81f0000                             mov   w0, #0x81f0000                       
    <execute_ex+684>:   str   x28, [x27]                                 str   x28, [x27]                           
    <execute_ex+688>:   tst   w21, w0                                    tst   w24, w0                              
    <execute_ex+692>:   b.ne  <execute_ex+105??>                         b.ne  <execute_ex+105??>
                                                                         ldr   x1, [x27, #24] 
                                                                         add   x21, x27, #0x50
??? <execute_ex+696>:   adrp  x0, 0x118a000 <core_globals+48>            ldr   x0, [x19, #1392]
    <execute_ex+700>:   add   x19, x27, #0x50
                                                                         mov   w25, #0xfffffc10
                                                                         ldr   w23, [x1, #72]  
    <execute_ex+704>:   ldr   x1, [x27, #48]                             ldr   x2, [x27, #48]
=>  <execute_ex+708>:   str   x1, [x0, #3512]                            str   x2, [x0, #488]

I have no idea why your compiler accesses global variables differently, but this difference, together with the x19 clobbering, causes the failure.

Do you use GCC and Linux?
Maybe some special options?
This may be related to resolution of external symbols in a DSO library, but I can't imagine what your compiler keeps in x19, or why it does this despite the inline asm clobber constraint (maybe it doesn't violate it, but I can't see that yet).
Does your compiler support GNU-style inline assembler?

@dstogov
Member

dstogov commented Nov 8, 2023

This may be related to resolution of external symbols in a DSO library

This is actually not a DSO library, and the global variable is defined in the same binary.
Maybe it's somehow related to linking.

@pfustc
Contributor Author

pfustc commented Nov 9, 2023

I just re-investigated this issue a bit.

Previously I was using GCC 11.4.0 on Ubuntu 22.04. After I upgraded GCC to 12.3.0, the segmentation fault no longer appears. I have reproduced some differences in the generated code between GCC 11.4.0 and 12.3.0, just like what you showed above. But I didn't investigate it further because I don't think it's a GCC bug.

The most important thing is why x19 holds a wrong value. Yes, the JIT-ed code clobbers it.

This time I set a breakpoint at zend_runtime_jit and ran fin at the first break. (Last time I continued at the first break, so I missed something.) After several instructions, I see

   0xaaaaa9a01050 <JIT$/mnt/local/www/index.php+64>:	str	x0, [x27]
   0xaaaaa9a01054 <JIT$/mnt/local/www/index.php+68>:	str	xzr, [x27, #8]
   0xaaaaa9a01058 <JIT$/mnt/local/www/index.php+72>:	str	x27, [x28, #48]
=> 0xaaaaa9a0105c <JIT$/mnt/local/www/index.php+76>:	ldr	x19, [x28, #24]
   0xaaaaa9a01060 <JIT$/mnt/local/www/index.php+80>:	ldr	w0, [x19, #4]
   0xaaaaa9a01064 <JIT$/mnt/local/www/index.php+84>:	and	w0, w0, #0x800
   0xaaaaa9a01068 <JIT$/mnt/local/www/index.php+88>:	cmp	w0, #0x0

After this, the value of x19 is

(gdb) p $x19
$3 = 187649842055504

It's exactly the same wrong value that x19 holds at the segmentation fault point.

Does the inline asm __asm__ __volatile__ (""::: "x19","x20","x21","x22","x23","x24","x25","x26"); work?

Yes, it does something. x19-x26 are callee-saved registers on AArch64, so GCC generates some stack push/pop instructions in execute_ex, like below.

   0xaaaaaaecccf4 <execute_ex>:	        stp	x29, x30, [sp, #-176]!
   0xaaaaaaecccf8 <execute_ex+4>:	adrp	x1, 0xaaaaab908000 <[email protected]>
   0xaaaaaaecccfc <execute_ex+8>:	mov	x29, sp
   0xaaaaaaeccd00 <execute_ex+12>:	ldr	x1, [x1, #928]
   0xaaaaaaeccd04 <execute_ex+16>:	stp	x19, x20, [sp, #16]
   0xaaaaaaeccd08 <execute_ex+20>:	stp	x21, x22, [sp, #32]
   0xaaaaaaeccd0c <execute_ex+24>:	stp	x23, x24, [sp, #48]
   0xaaaaaaeccd10 <execute_ex+28>:	stp	x25, x26, [sp, #64]
   0xaaaaaaeccd14 <execute_ex+32>:	ldr	x2, [x1]

   ... (skipped some instructions)

   0xaaaaaaeccdd0 <execute_ex+220>:	ldp	x19, x20, [sp, #16]
   0xaaaaaaeccdd4 <execute_ex+224>:	ldp	x21, x22, [sp, #32]
   0xaaaaaaeccdd8 <execute_ex+228>:	ldp	x23, x24, [sp, #48]
   0xaaaaaaeccddc <execute_ex+232>:	ldp	x25, x26, [sp, #64]
   0xaaaaaaeccde0 <execute_ex+236>:	ldp	x29, x30, [sp], #176
   0xaaaaaaeccde4 <execute_ex+240>:	ret

But it seems that they do not wrap the JIT-ed code correctly.

I ran the JIT-ed code instruction by instruction, and found it branched out at b ZEND_RETURN_SPEC_CONST_LABEL.

   0xaaaaa9a01534 <JIT$test::__construct+516>:	mov	x28, #0x8bd0                	// #35792
   0xaaaaa9a01538 <JIT$test::__construct+520>:	movk	x28, #0xa22d, lsl #16
   0xaaaaa9a0153c <JIT$test::__construct+524>:	movk	x28, #0xaaaa, lsl #32
   0xaaaaa9a01540 <JIT$test::__construct+528>:	add	sp, sp, #0x30
=> 0xaaaaa9a01544 <JIT$test::__construct+532>:	b	0xaaaaaaeccfc4 <execute_ex+720>

The jump target is around HYBRID_CASE(ZEND_RETURN_SPEC_CONST): in zend_vm_execute.h:57406. After several instructions, the execution reaches zend_vm_execute.h:57054 - the segmentation fault point.

So, x19 is clobbered but not restored anywhere between branching out of the JIT-ed code and the segmentation fault point. Perhaps we should move the inline asm to another place.

Something interesting:

  • This issue only appears with release build, not debug build
  • This issue only appears with CRTO=xxx2 - Inline VM handlers

@dstogov
Member

dstogov commented Nov 9, 2023

Thanks for the debugging.

But I didn't investigate it further because I don't think it's a GCC bug.

Right. This is not a GCC bug. The different GCC version just makes the bug visible.
It would be great to reproduce it with my GCC version.

The most important thing is why x19 holds a wrong value. Yes, the JIT-ed code clobbers it.

Yeah. x19 appears in the disassembly you already posted a week ago.

So, x19 is clobbered but not restored anywhere between branching out of the JIT-ed code and the segmentation fault point. Perhaps we should move the inline asm to another place.

I think the same.
If I can reproduce the bug, I will be able to fix it.
In the worst case, I may provide a patch for you to test.

Something interesting:
* This issue only appears with release build, not debug build

It's clear. The debug build uses -O0, and this prevents the compiler from using aggressive register allocation.

* This issue only appears with CRTO=xxx2 - Inline VM handlers

Do you mean opcache.jit=xxx2?
Probably this optimization level leads to calling the ZEND_RETURN VM handler, which causes the failure in case of register clobbering. A less aggressive JIT doesn't use persistent registers, and a more aggressive JIT inlines the ZEND_RETURN handler, but this is just a guess. I think the bug is general and may lead to failures with other JIT levels and on different platforms as well, so it would be great to fix it.

@dstogov dstogov self-assigned this Nov 28, 2023
dstogov added a commit to dstogov/php-src that referenced this issue Nov 28, 2023
@dstogov
Member

dstogov commented Nov 28, 2023

I did some more research and found that the difference in generated code may be caused by the -fpic or -fpie flags and by common sub-expression elimination for adrp x19, _GLOBAL_OFFSET_TABLE_. Unfortunately, I wasn't able to reproduce the CSE effect.

I hope you are still able to reproduce the failure with old GCC.
Could you please check if #12813 fixes this.

@pfustc
Contributor Author

pfustc commented Nov 29, 2023

I hope you are still able to reproduce the failure with old GCC.
Could you please check if #12813 fixes this.

I just did some tests with #12813. I don't think the issue is fixed.

The segmentation fault in my previous case does not appear now, but in other test cases I still see segmentation faults at the same point, caused by x19 being clobbered by JIT-ed code. Running the command below with a release build reproduces all the issues.

make test TESTS="-d opcache.protect_memory=1 -d opcache.enable_cli=1 -d opcache.jit_buffer_size=16M -d opcache.jit=1112"

Below is another case that still causes a segmentation fault with the same JIT options.

<?php
error_reporting(0);
$a = 10;

function Test()
{
    static $a=1;
    global $b;
    $c = 1;
    $b = 5;
    echo "$a $b ";
    $a++;
    $c++;
    echo "$a $c ";
}

Test();
echo "$a $b $c ";
Test();
echo "$a $b $c ";
Test();
?>

What do you think of making x19 to x26 not allocatable again? The old JIT implementation did so by defining the register set ZEND_REGSET_PRESERVED, and AArch64 still has enough allocatable GP registers even without them.

	/* TODO: Allow usage of preserved registers ???
	 * Their values have to be stored in prologue and restored in epilogue
	 */
	available = ZEND_REGSET_DIFFERENCE(available, ZEND_REGSET_PRESERVED);

@dstogov
Member

dstogov commented Nov 29, 2023

What do you think of making x19 to x26 not allocatable again? The old JIT implementation did so by defining the register set ZEND_REGSET_PRESERVED, and AArch64 still has enough allocatable GP registers even without them.

This is not a solution. Only x19-x26 may be used across function calls without spilling.

@pfustc
Contributor Author

pfustc commented Nov 30, 2023

This is not a solution. Only x19-x26 may be used across function calls without spilling.

Perhaps I don't quite understand your intention. What do you mean by "across function calls without spilling"?

AFAIK, if we conform to the C calling convention, we should save and restore x19-x26 if they are clobbered in a compiled function. I just tried the CALL VM and found it works exactly this way: it generates stp and ldp sequences in ir_emit_prologue and ir_emit_epilogue respectively, so it doesn't have this SegFault issue. But in the HYBRID VM we don't set ctx.fixed_save_regset in zend_jit_init_ctx(), hence these callee-saved registers are NOT actually protected by the JIT-ed code.
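For reference, here is a small generic-C sketch (illustrative names only, not php-src code) of what conforming to the calling convention means for a compiled function: anything that must survive a call either lives in a callee-saved register, which the function then saves and restores in its own prologue and epilogue, or is spilled to memory. The stp/ldp pairs emitted by ir_emit_prologue/ir_emit_epilogue play that role for the CALL VM's JIT-ed code.

#include <stdio.h>

static long value = 7;

static void callee(void) { value++; }   /* stand-in for arbitrary code */

/* The indirect call forces the compiler to assume all caller-saved
 * registers are clobbered by fn(), so to keep `v` and `p` alive across it
 * the function either uses callee-saved registers (x19-x28 on AArch64),
 * saving them with stp in its prologue and restoring them with ldp in its
 * epilogue, or spills the values to the stack. */
static long keep_across_call(long *p, void (*fn)(void))
{
    long v = *p;
    fn();
    return v + *p;
}

int main(void)
{
    printf("%ld\n", keep_across_call(&value, callee));   /* prints 15 */
    return 0;
}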

If I understand correctly, the goal of __asm__ __volatile__() is to protect these registers, but it doesn't. This inline asm makes the compiler save the clobbered registers before the asm statement and restore them afterwards. However, the real JIT-ed code is not at this place, so there is no code restoring the clobbered registers after the JIT-ed code. Your patch #12813 moves the inline asm into the big HYBRID while loop, but there is still no register restore if the JIT-ed code directly branches to another handler.

Perhaps it's not a good solution to protect the callee-saves by adding __asm__ __volatile__() anywhere. Why not save and restore these registers in the prologue and epilogue just like the CALL VM does?

@dstogov
Member

dstogov commented Nov 30, 2023

This is not a solution. Only x19-x26 may be used across function calls without spilling.

Perhaps I don't quite understand your intention. What do you mean by "across function calls without spilling"?

If we perform a CALL in JIT code and some value is used before and after the CALL, it can't be kept in scratch registers; only in x19-x26 or in memory. This would reduce the quality of the generated code.

AFAIK, if we conform to the C calling convention, we should save and restore x19-x26 if they are clobbered in a compiled function. I just tried the CALL VM and found it works exactly this way: it generates stp and ldp sequences in ir_emit_prologue and ir_emit_epilogue respectively, so it doesn't have this SegFault issue. But in the HYBRID VM we don't set ctx.fixed_save_regset in zend_jit_init_ctx(), hence these callee-saved registers are NOT actually protected by the JIT-ed code.

The HYBRID VM executes JIT code in the context of the VM's execute_ex(). It doesn't create an additional stack frame.

If I understand correctly, the goal of __asm__ __volatile__() is to protect these registers, but it doesn't. This inline asm makes the compiler save the clobbered registers before the asm statement and restore them afterwards. However, the real JIT-ed code is not at this place, so there is no code restoring the clobbered registers after the JIT-ed code. Your patch #12813 moves the inline asm into the big HYBRID while loop, but there is still no register restore if the JIT-ed code directly branches to another handler.

I see.

Another solution would be to prevent the C compiler from allocating registers across JIT transitions.
Somehow a register is used only to keep _GLOBAL_OFFSET_TABLE_, and this is done only by old GCC, even though the symbols may be linked statically (using ADRP).

Perhaps it's not a good solution to protect the callee-saves by adding __asm__ __volatile__() anywhere. Why not save and restore these registers in the prologue and epilogue just like the CALL VM does?

This will increase the cost of each JIT<->VM transition.

@pfustc
Contributor Author

pfustc commented Dec 1, 2023

Thank you so much for your explanation. I have learned a lot from your reply.

dstogov added a commit to dstogov/php-src that referenced this issue Dec 5, 2023
@dstogov
Member

dstogov commented Dec 5, 2023

@pfustc I understood the difference between GCC 11.4.0 from Ubuntu 22.04 and my GCC. GCC 11.4.0 was configured with --enable-default-pie, and as a result it silently adds the -fpie flag, which leads to PIC code for accessing external variables.

You may check whether your GCC 12.3.0 produces PIC code with echo | cc -dM -E - | grep -i pic
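For completeness, the same check can be done from C; this is just a sketch relying on GCC's standard predefined macros, nothing php-specific. Compiled with no extra flags, a toolchain built with --enable-default-pie typically reports __PIC__ = 2 and __PIE__ = 2.

#include <stdio.h>

int main(void)
{
#if defined(__PIC__)
    printf("__PIC__ = %d\n", __PIC__);   /* set by -fpic/-fPIC (or a PIE default) */
#endif
#if defined(__PIE__)
    printf("__PIE__ = %d\n", __PIE__);   /* set by -fpie/-fPIE */
#endif
#if !defined(__PIC__) && !defined(__PIE__)
    puts("compiler does not generate PIC/PIE code by default");
#endif
    return 0;
}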

Anyway, I updated #12813 and now it works for me even with GCC 11.4.0

@pfustc
Contributor Author

pfustc commented Dec 5, 2023

Anyway, I updated #12813 and now it works for me even with GCC 11.4.0

I have run unit tests with HYBRID VM and all JIT option combinations on AArch64. It looks good.

@dstogov dstogov closed this as completed in 8cc6b35 Dec 5, 2023