Skip to content

Expose methods for inspecting Micro JIT code blocks #24

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 21, 2021
Merged

Conversation

tenderlove
Copy link

This commit adds a module UJIT. The module allows you to insert the
initial Micro JIT instruction in to an arbitrary iseq like this:

def foo(x)
  if x < 1
    "less than one"
  else
    "something else"
  end
end

iseq = RubyVM::InstructionSequence.of(method(:foo))

UJIT.insert(iseq) # Add initial jump

After the initial jump is added, we can make Micro JIT do some work:

100.times { foo(0) }

The UJIT module also exposes a method for finding all compiled blocks
for a given iseq, like this:

blocks = UJIT.blocks_for(iseq)

We can sort the blocks by address and use the Crabstone gem (which is a
wrapper around capstone) to disassemble the generated code.

Here is the full code example:

def foo(x)
  if x < 1
    "less than one"
  else
    "something else"
  end
end

iseq = RubyVM::InstructionSequence.of(method(:foo))

UJIT.insert(iseq) # Add initial jump

100.times { foo(0) }

blocks = UJIT.blocks_for(iseq)

 # brew install capstone
 # gem install crabstone
require "crabstone"

cs = Crabstone::Disassembler.new(Crabstone::ARCH_X86, Crabstone::MODE_64)

puts iseq.disasm

blocks.sort_by(&:address).reverse.each do |block|
  puts "== ISEQ RANGE: #{block.iseq_start_index} -> #{block.iseq_end_index} ".ljust(80, "=")
  cs.disasm(block.code, 0).each do |i|
    printf(
      "\t0x%<address>x:\t%<instruction>s\t%<details>s\n",
      address: i.address,
      instruction: i.mnemonic,
      details: i.op_str
    )
  end
end

Here is the output:

$ ./ruby test.rb
== disasm: #<ISeq:[email protected]:1 (1,0)-(7,3)> (catch: FALSE)
local table (size: 1, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] x@0<Arg>
0000 getlocal_WC_0                          x@0                       (   2)[LiCa]
0002 putobject_INT2FIX_1_
0003 opt_lt                                 <calldata!mid:<, argc:1, ARGS_SIMPLE>
0005 branchunless                           10
0007 putstring                              "less than one"           (   3)[Li]
0009 leave                                                            (   7)[Re]
0010 putstring                              "something else"          (   5)[Li]
0012 leave                                                            (   7)[Re]
== ISEQ RANGE: 7 -> 7 ==========================================================
	0x0:	movabs	rax, 0x7fcd014cd518
	0xa:	mov	qword ptr [rdi], rax
	0xd:	mov	r8, rax
	0x10:	mov	r9, rax
	0x13:	mov	r11, r12
	0x16:	jmp	qword ptr [rax]
== ISEQ RANGE: 0 -> 7 ==========================================================
	0x0:	mov	rax, qword ptr [rdi + 0x20]
	0x4:	mov	rax, qword ptr [rax - 0x18]
	0x8:	mov	qword ptr [rdx], rax
	0xb:	mov	qword ptr [rdx + 8], 3
	0x13:	movabs	rax, 0x7fcd0180ac00
	0x1d:	test	byte ptr [rax + 0x3e6], 1
	0x24:	jne	0x3ffe0da
	0x2a:	test	byte ptr [rdx], 1
	0x2d:	je	0x3ffe0da
	0x33:	test	byte ptr [rdx + 8], 1
	0x37:	je	0x3ffe0da
	0x3d:	mov	rax, qword ptr [rdx]
	0x40:	cmp	rax, qword ptr [rdx + 8]
	0x44:	movabs	rax, 0
	0x4e:	movabs	rcx, 0x14
	0x58:	cmovl	rax, rcx
	0x5c:	mov	qword ptr [rdx], rax
	0x5f:	test	qword ptr [rdx], -9
	0x66:	je	0x3ffe111
	0x6c:	jmp	0xffffffffffffffa3

@XrXr
Copy link

XrXr commented Jan 20, 2021

I should fix the CI builds...

@maximecb
Copy link

maximecb commented Jan 21, 2021

I like what you've done. It seems like a good idea to have a ujit module with an API for this and various things. I think that will serve us well in the future when we want to write tests and add more introspection.

I have 3 small requests and a question:

  • Can we name the module lowercase ujit instead of UJIT ?
  • Can we name .insert just install_entry or attach_entry?
  • I'd rather have the code in ujit_ruby.c be in ujit_iface.c. The goal of the ujit_iface.c file was to hold things that relate to our interface to the outside.

Question: you wrote code to produce a dissassembly with crabstone. How can we package this code so that it's easy to use and people don't have to look for it? Is there any way we can add method written in Ruby to the ujit module that does the disasm of a method or iseq? This can be a separate pull request if you prefer.

Ideally if we could do something like ujit.disasm(method or iseq) that would be super convenient.

@maximecb maximecb added the enhancement New feature or request label Jan 21, 2021
@tenderlove
Copy link
Author

  • Can we name the module lowercase ujit instead of UJIT ?

We can name it Ujit. The first letter of a Ruby constant must be uppercase (which is why I just did UJIT). Given the constraints of constant names in Ruby, what would you prefer?

  • Can we name .insert just install_entry or attach_entry?

Yep, I'll change that

  • I'd rather have the code in ujit_ruby.c be in ujit_iface.c. The goal of the ujit_iface.c file was to hold things that relate to our interface to the outside.

Sure, that's fine. I can move it in that file.

Question: you wrote code to produce a dissassembly with crabstone. How can we package this code so that it's easy to use and people don't have to look for it? Is there any way we can add method written in Ruby to the ujit module that does the disasm of a method or iseq? This can be a separate pull request if you prefer.

IMO we should package that as an external gem (maybe a micro jit gem?). I don't think it's a good idea to add a dependency to a disassembler to Ruby. For now, I think we could just check in a script for us to use while debugging, but eventually I think we need a gem or library that exists outside Ruby.

@maximecb
Copy link

IMO we should package that as an external gem (maybe a micro jit gem?). I don't think it's a good idea to add a dependency to a disassembler to Ruby. For now, I think we could just check in a script for us to use while debugging, but eventually I think we need a gem or library that exists outside Ruby.

I agree that we shouldn't add a dependency to Ruby. Though we do need some kind of way to package the code in a convenient way.

In Python it's easy to lazily import a module, or let the import fail and catch the exception. We could have ujit.disasm() lazily import the disassembler lib right when the method is called, and just print an error message (please install this lib) and return nil if not found. That way we can kind of have the best of both worlds. People can just install that lib manually if they need disasm?

Alternatively, some custom script we can easily require would be OK too. I'll let you pick the solution you prefer.

This commit adds a module `UJIT`.  The module allows you to insert the
initial Micro JIT instruction in to an arbitrary iseq like this:

```ruby
def foo(x)
  if x < 1
    "less than one"
  else
    "something else"
  end
end

iseq = RubyVM::InstructionSequence.of(method(:foo))

UJIT.insert(iseq) # Add initial jump
```

After the initial jump is added, we can make Micro JIT do some work:

```ruby
100.times { foo(0) }
```

The `UJIT` module also exposes a method for finding all compiled blocks
for a given iseq, like this:

```ruby
blocks = UJIT.blocks_for(iseq)
```

We can sort the blocks by address and use the Crabstone gem (which is a
wrapper around `capstone`) to disassemble the generated code.

Here is the full code example:

```ruby
def foo(x)
  if x < 1
    "less than one"
  else
    "something else"
  end
end

iseq = RubyVM::InstructionSequence.of(method(:foo))

UJIT.insert(iseq) # Add initial jump

100.times { foo(0) }

blocks = UJIT.blocks_for(iseq)

 # brew install capstone
 # gem install crabstone
require "crabstone"

cs = Crabstone::Disassembler.new(Crabstone::ARCH_X86, Crabstone::MODE_64)

puts iseq.disasm

blocks.sort_by(&:address).reverse.each do |block|
  puts "== ISEQ RANGE: #{block.iseq_start_index} -> #{block.iseq_end_index} ".ljust(80, "=")
  cs.disasm(block.code, 0).each do |i|
    printf(
      "\t0x%<address>x:\t%<instruction>s\t%<details>s\n",
      address: i.address,
      instruction: i.mnemonic,
      details: i.op_str
    )
  end
end
```

Here is the output:

```
$ ./ruby test.rb
== disasm: #<ISeq:[email protected]:1 (1,0)-(7,3)> (catch: FALSE)
local table (size: 1, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] x@0<Arg>
0000 getlocal_WC_0                          x@0                       (   2)[LiCa]
0002 putobject_INT2FIX_1_
0003 opt_lt                                 <calldata!mid:<, argc:1, ARGS_SIMPLE>
0005 branchunless                           10
0007 putstring                              "less than one"           (   3)[Li]
0009 leave                                                            (   7)[Re]
0010 putstring                              "something else"          (   5)[Li]
0012 leave                                                            (   7)[Re]
== ISEQ RANGE: 7 -> 7 ==========================================================
	0x0:	movabs	rax, 0x7fcd014cd518
	0xa:	mov	qword ptr [rdi], rax
	0xd:	mov	r8, rax
	0x10:	mov	r9, rax
	0x13:	mov	r11, r12
	0x16:	jmp	qword ptr [rax]
== ISEQ RANGE: 0 -> 7 ==========================================================
	0x0:	mov	rax, qword ptr [rdi + 0x20]
	0x4:	mov	rax, qword ptr [rax - 0x18]
	0x8:	mov	qword ptr [rdx], rax
	0xb:	mov	qword ptr [rdx + 8], 3
	0x13:	movabs	rax, 0x7fcd0180ac00
	0x1d:	test	byte ptr [rax + 0x3e6], 1
	0x24:	jne	0x3ffe0da
	0x2a:	test	byte ptr [rdx], 1
	0x2d:	je	0x3ffe0da
	0x33:	test	byte ptr [rdx + 8], 1
	0x37:	je	0x3ffe0da
	0x3d:	mov	rax, qword ptr [rdx]
	0x40:	cmp	rax, qword ptr [rdx + 8]
	0x44:	movabs	rax, 0
	0x4e:	movabs	rcx, 0x14
	0x58:	cmovl	rax, rcx
	0x5c:	mov	qword ptr [rdx], rax
	0x5f:	test	qword ptr [rdx], -9
	0x66:	je	0x3ffe111
	0x6c:	jmp	0xffffffffffffffa3
```
@maximecb
Copy link

We can name it Ujit. The first letter of a Ruby constant must be uppercase (which is why I just did UJIT). Given the constraints of constant names in Ruby, what would you prefer?

Hawkward 🦅 Hmmm. I guess UJIT is OK.

@tenderlove
Copy link
Author

I added a script under misc as a helper for disassembly. Here is an example of how to use the script:

require "misc/ujit_disasm"

def foo(x)
  if x < 1
    "less than one"
  else
    "something else"
  end
end

iseq = RubyVM::InstructionSequence.of(method(:foo))

puts UJIT.disasm(iseq) # returns nil because nothing has been generated

100.times { foo 1 }

puts UJIT.disasm(iseq) # returns stuff because the JIT executed

Then in the shell:

$ ./ruby -I. test.rb

== disasm: #<ISeq:[email protected]:3 (3,0)-(9,3)> (catch: FALSE)
local table (size: 1, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] x@0<Arg>
0000 getlocal_WC_0                          x@0                       (   4)[LiCa]
0002 putobject_INT2FIX_1_
0003 opt_lt                                 <calldata!mid:<, argc:1, ARGS_SIMPLE>
0005 branchunless                           10
0007 putstring                              "less than one"           (   5)[Li]
0009 leave                                                            (   9)[Re]
0010 putstring                              "something else"          (   7)[Li]
0012 leave                                                            (   9)[Re]
== ISEQ RANGE: 10 -> 10 ========================================================
	0x0:	movabs	rax, 0x7fc38f5d17b0
	0xa:	mov	qword ptr [rdi], rax
	0xd:	mov	r8, rax
	0x10:	mov	r9, rax
	0x13:	mov	r11, r12
	0x16:	jmp	qword ptr [rax]
== ISEQ RANGE: 0 -> 7 ==========================================================
	0x0:	mov	rax, qword ptr [rdi + 0x20]
	0x4:	mov	rax, qword ptr [rax - 0x18]
	0x8:	mov	qword ptr [rdx], rax
	0xb:	mov	qword ptr [rdx + 8], 3
	0x13:	movabs	rax, 0x7fc390009400
	0x1d:	test	byte ptr [rax + 0x3e6], 1
	0x24:	jne	0x3ffca7a
	0x2a:	test	byte ptr [rdx], 1
	0x2d:	je	0x3ffca7a
	0x33:	test	byte ptr [rdx + 8], 1
	0x37:	je	0x3ffca7a
	0x3d:	mov	rax, qword ptr [rdx]
	0x40:	cmp	rax, qword ptr [rdx + 8]
	0x44:	movabs	rax, 0
	0x4e:	movabs	rcx, 0x14
	0x58:	cmovl	rax, rcx
	0x5c:	mov	qword ptr [rdx], rax
	0x5f:	test	qword ptr [rdx], -9
	0x66:	jne	0x3ffcad4

@maximecb
Copy link

Nice, very well done!

@maximecb maximecb merged commit 5c436fd into microjit Jan 21, 2021
@tenderlove tenderlove deleted the ujit-disasm branch January 21, 2021 22:44
@maximecb maximecb mentioned this pull request Jan 21, 2021
tenderlove added a commit that referenced this pull request Apr 20, 2021
This commit adds a check on the ep just like in the mark function.  The
env can contain null bytes if allocation tracing is enabled.

We're seeing errors during autocompaction like this:

```
(lldb) bt 40
* thread #1, name = 'ruby', stop reason = signal SIGABRT
    frame #0: 0x00007f7d64b6018b libc.so.6`raise + 203
    frame #1: 0x00007f7d64b3f859 libc.so.6`abort + 299
    frame #2: 0x000055af5f2fefc9 ruby`die at error.c:764:5
    frame #3: 0x000055af5f2ff1ac ruby`rb_bug_for_fatal_signal(default_sighandler=0x0000000000000000, sig=11, ctx=0x000055af60bc3340, fmt="") at error.c:804:5
    frame #4: 0x000055af5f4bd08f ruby`sigsegv(sig=11, info=0x000055af60bc3470, ctx=0x000055af60bc3340) at signal.c:960:5
    frame #5: 0x00007f7d64ebe3c0 libpthread.so.0`__restore_rt
    frame #6: 0x000055af5f339b0a ruby`gc_ref_update_imemo(objspace=0x000055af60b2b040, obj=0x00007f7d5b513fd0) at gc.c:9046:13
    frame #7: 0x000055af5f339172 ruby`gc_update_object_references(objspace=0x000055af60b2b040, obj=0x00007f7d5b513fd0) at gc.c:9307:9
    frame #8: 0x000055af5f338e79 ruby`gc_ref_update(vstart=0x00007f7d5b510010, vend=0x00007f7d5b513ff8, stride=40, objspace=0x000055af60b2b040, page=0x000055af62577aa0) at gc.c:9452:21
    frame #9: 0x000055af5f337846 ruby`gc_update_references(objspace=0x000055af60b2b040, heap=0x000055af60b2b068) at gc.c:9481:9
    frame #10: 0x000055af5f336569 ruby`gc_compact_finish(objspace=0x000055af60b2b040, heap=0x000055af60b2b068) at gc.c:4840:5
    frame #11: 0x000055af5f335efb ruby`gc_page_sweep(objspace=0x000055af60b2b040, heap=0x000055af60b2b068, sweep_page=0x000055af63a1eb30) at gc.c:5046:13
    frame #12: 0x000055af5f3355c5 ruby`gc_sweep_step(objspace=0x000055af60b2b040, heap=0x000055af60b2b068) at gc.c:5214:19
    frame #13: 0x000055af5f33daf6 ruby`gc_sweep_rest(objspace=0x000055af60b2b040) at gc.c:5271:2
    frame #14: 0x000055af5f33cacd ruby`gc_sweep(objspace=0x000055af60b2b040) at gc.c:5389:2
    frame #15: 0x000055af5f33c21d ruby`gc_marks_rest(objspace=0x000055af60b2b040) at gc.c:7555:5
    frame #16: 0x000055af5f324d41 ruby`gc_rest(objspace=0x000055af60b2b040) at gc.c:8457:13
    frame #17: 0x000055af5f3297d8 ruby`garbage_collect(objspace=0x000055af60b2b040, reason=45568) at gc.c:8318:9
    frame #18: 0x000055af5f344ece ruby`garbage_collect_with_gvl(objspace=0x000055af60b2b040, reason=45568) at gc.c:8632:9
    frame #19: 0x000055af5f344e61 ruby`objspace_malloc_gc_stress(objspace=0x000055af60b2b040) at gc.c:10592:9
    frame #20: 0x000055af5f32ced1 ruby`objspace_xmalloc0(objspace=0x000055af60b2b040, size=64) at gc.c:10767:5
    frame #21: 0x000055af5f32ce11 ruby`ruby_xmalloc0(size=64) at gc.c:10988:12
    frame #22: 0x000055af5f32cdac ruby`ruby_xmalloc_body(size=64) at gc.c:10997:12
    frame #23: 0x000055af5f329415 ruby`ruby_xmalloc(size=64) at gc.c:12942:12
    frame #24: 0x00007f7d611c4fe5 objspace.so`newobj_i(tpval=0x00007f7d5b553770, data=0x000055af639031a0) at object_tracing.c:101:35
    frame #25: 0x000055af5f5b283f ruby`tp_call_trace(tpval=0x00007f7d5b553770, trace_arg=0x00007fff1016d398) at vm_trace.c:1115:2
    frame #26: 0x000055af5f5b50ec ruby`exec_hooks_body(ec=0x000055af60b2b700, list=0x000055af60b2b920, trace_arg=0x00007fff1016d398) at vm_trace.c:304:3
    frame #27: 0x000055af5f5b0f24 ruby`exec_hooks_unprotected(ec=0x000055af60b2b700, list=0x000055af60b2b920, trace_arg=0x00007fff1016d398) at vm_trace.c:333:5
    frame #28: 0x000055af5f5b0da8 ruby`rb_exec_event_hooks(trace_arg=0x00007fff1016d398, hooks=0x000055af60b2b920, pop_p=0) at vm_trace.c:378:13
    frame #29: 0x000055af5f33f8e2 ruby`rb_exec_event_hook_orig(ec=0x000055af60b2b700, hooks=0x000055af60b2b920, flag=1048576, self=0x00007f7d5b5c08c0, id=0, called_id=0, klass=0x0000000000000000, data=0x00007f7d5b513fd0, pop_p=0) at vm_core.h:1989:5
    frame #30: 0x000055af5f334975 ruby`gc_event_hook_body(ec=0x000055af60b2b700, objspace=0x000055af60b2b040, event=1048576, data=0x00007f7d5b513fd0) at gc.c:2083:5
  * frame #31: 0x000055af5f3342df ruby`newobj_slowpath_wb_protected [inlined] newobj_slowpath(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, objspace=0x000055af60b2b040, cr=0x000055af60b2b910, wb_protected=1) at gc.c:2284:9
    frame #32: 0x000055af5f33410f ruby`newobj_slowpath_wb_protected(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, objspace=0x000055af60b2b040, cr=0x000055af60b2b910) at gc.c:2299
    frame #33: 0x000055af5f333de9 ruby`newobj_of0(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, wb_protected=1, cr=0x000055af60b2b910) at gc.c:2338:11
    frame #34: 0x000055af5f3227ae ruby`newobj_of(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, v1=0x000055af657d88a0, v2=0x000055af657d8890, v3=0x0000000000000000, wb_protected=1) at gc.c:2348:17
    frame #35: 0x000055af5f322c5b ruby`rb_imemo_new(type=imemo_env, v1=0x000055af657d88a0, v2=0x000055af657d8890, v3=0x0000000000000000, v0=0x00007f7d5b9d19c8) at gc.c:2434:12
    frame #36: 0x000055af5f5a3925 ruby`vm_env_new(env_ep=0x000055af657d88a0, env_body=0x000055af657d8890, env_size=4, iseq=0x00007f7d5b9d19c8) at vm_core.h:1363:33
    frame #37: 0x000055af5f5a3808 ruby`vm_make_env_each(ec=0x000055af60b2b700, cfp=0x00007f7d6482fc90) at vm.c:801:11
    frame #38: 0x000055af5f5a368d ruby`vm_make_env_each(ec=0x000055af60b2b700, cfp=0x00007f7d6482fc20) at vm.c:752:13
    frame #39: 0x000055af5f5a368d ruby`vm_make_env_each(ec=0x000055af60b2b700, cfp=0x00007f7d6482fbb0) at vm.c:752:13
(lldb) f 31
frame #31: 0x000055af5f3342df ruby`newobj_slowpath_wb_protected [inlined] newobj_slowpath(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, objspace=0x000055af60b2b040, cr=0x000055af60b2b910, wb_protected=1) at gc.c:2284:9
   2281	        }
   2282	        GC_ASSERT(obj != 0);
   2283	        newobj_init(klass, flags, wb_protected, objspace, obj);
-> 2284	        gc_event_hook_prep(objspace, RUBY_INTERNAL_EVENT_NEWOBJ, obj, newobj_fill(obj, 0, 0, 0));
   2285	    }
   2286	    RB_VM_LOCK_LEAVE_CR_LEV(cr, &lev);
   2287
(lldb) p obj
(VALUE) $3 = 0x00007f7d5b513fd0
(lldb) f 6
frame #6: 0x000055af5f339b0a ruby`gc_ref_update_imemo(objspace=0x000055af60b2b040, obj=0x00007f7d5b513fd0) at gc.c:9046:13
   9043	        {
   9044	            rb_env_t *env = (rb_env_t *)obj;
   9045	            TYPED_UPDATE_IF_MOVED(objspace, rb_iseq_t *, env->iseq);
-> 9046	            UPDATE_IF_MOVED(objspace, env->ep[VM_ENV_DATA_INDEX_ENV]);
   9047	            gc_update_values(objspace, (long)env->env_size, (VALUE *)env->env);
   9048	        }
   9049	        break;
(lldb) p obj
(VALUE) $4 = 0x00007f7d5b513fd0
(lldb)
```
peterzhu2118 pushed a commit that referenced this pull request Apr 21, 2021
This commit adds a check on the ep just like in the mark function.  The
env can contain null bytes if allocation tracing is enabled.

We're seeing errors during autocompaction like this:

```
(lldb) bt 40
* thread #1, name = 'ruby', stop reason = signal SIGABRT
    frame #0: 0x00007f7d64b6018b libc.so.6`raise + 203
    frame #1: 0x00007f7d64b3f859 libc.so.6`abort + 299
    frame #2: 0x000055af5f2fefc9 ruby`die at error.c:764:5
    frame #3: 0x000055af5f2ff1ac ruby`rb_bug_for_fatal_signal(default_sighandler=0x0000000000000000, sig=11, ctx=0x000055af60bc3340, fmt="") at error.c:804:5
    frame #4: 0x000055af5f4bd08f ruby`sigsegv(sig=11, info=0x000055af60bc3470, ctx=0x000055af60bc3340) at signal.c:960:5
    frame #5: 0x00007f7d64ebe3c0 libpthread.so.0`__restore_rt
    frame #6: 0x000055af5f339b0a ruby`gc_ref_update_imemo(objspace=0x000055af60b2b040, obj=0x00007f7d5b513fd0) at gc.c:9046:13
    frame #7: 0x000055af5f339172 ruby`gc_update_object_references(objspace=0x000055af60b2b040, obj=0x00007f7d5b513fd0) at gc.c:9307:9
    frame #8: 0x000055af5f338e79 ruby`gc_ref_update(vstart=0x00007f7d5b510010, vend=0x00007f7d5b513ff8, stride=40, objspace=0x000055af60b2b040, page=0x000055af62577aa0) at gc.c:9452:21
    frame #9: 0x000055af5f337846 ruby`gc_update_references(objspace=0x000055af60b2b040, heap=0x000055af60b2b068) at gc.c:9481:9
    frame #10: 0x000055af5f336569 ruby`gc_compact_finish(objspace=0x000055af60b2b040, heap=0x000055af60b2b068) at gc.c:4840:5
    frame #11: 0x000055af5f335efb ruby`gc_page_sweep(objspace=0x000055af60b2b040, heap=0x000055af60b2b068, sweep_page=0x000055af63a1eb30) at gc.c:5046:13
    frame #12: 0x000055af5f3355c5 ruby`gc_sweep_step(objspace=0x000055af60b2b040, heap=0x000055af60b2b068) at gc.c:5214:19
    frame #13: 0x000055af5f33daf6 ruby`gc_sweep_rest(objspace=0x000055af60b2b040) at gc.c:5271:2
    frame #14: 0x000055af5f33cacd ruby`gc_sweep(objspace=0x000055af60b2b040) at gc.c:5389:2
    frame #15: 0x000055af5f33c21d ruby`gc_marks_rest(objspace=0x000055af60b2b040) at gc.c:7555:5
    frame #16: 0x000055af5f324d41 ruby`gc_rest(objspace=0x000055af60b2b040) at gc.c:8457:13
    frame #17: 0x000055af5f3297d8 ruby`garbage_collect(objspace=0x000055af60b2b040, reason=45568) at gc.c:8318:9
    frame #18: 0x000055af5f344ece ruby`garbage_collect_with_gvl(objspace=0x000055af60b2b040, reason=45568) at gc.c:8632:9
    frame #19: 0x000055af5f344e61 ruby`objspace_malloc_gc_stress(objspace=0x000055af60b2b040) at gc.c:10592:9
    frame #20: 0x000055af5f32ced1 ruby`objspace_xmalloc0(objspace=0x000055af60b2b040, size=64) at gc.c:10767:5
    frame #21: 0x000055af5f32ce11 ruby`ruby_xmalloc0(size=64) at gc.c:10988:12
    frame #22: 0x000055af5f32cdac ruby`ruby_xmalloc_body(size=64) at gc.c:10997:12
    frame #23: 0x000055af5f329415 ruby`ruby_xmalloc(size=64) at gc.c:12942:12
    frame #24: 0x00007f7d611c4fe5 objspace.so`newobj_i(tpval=0x00007f7d5b553770, data=0x000055af639031a0) at object_tracing.c:101:35
    frame #25: 0x000055af5f5b283f ruby`tp_call_trace(tpval=0x00007f7d5b553770, trace_arg=0x00007fff1016d398) at vm_trace.c:1115:2
    frame #26: 0x000055af5f5b50ec ruby`exec_hooks_body(ec=0x000055af60b2b700, list=0x000055af60b2b920, trace_arg=0x00007fff1016d398) at vm_trace.c:304:3
    frame #27: 0x000055af5f5b0f24 ruby`exec_hooks_unprotected(ec=0x000055af60b2b700, list=0x000055af60b2b920, trace_arg=0x00007fff1016d398) at vm_trace.c:333:5
    frame #28: 0x000055af5f5b0da8 ruby`rb_exec_event_hooks(trace_arg=0x00007fff1016d398, hooks=0x000055af60b2b920, pop_p=0) at vm_trace.c:378:13
    frame #29: 0x000055af5f33f8e2 ruby`rb_exec_event_hook_orig(ec=0x000055af60b2b700, hooks=0x000055af60b2b920, flag=1048576, self=0x00007f7d5b5c08c0, id=0, called_id=0, klass=0x0000000000000000, data=0x00007f7d5b513fd0, pop_p=0) at vm_core.h:1989:5
    frame #30: 0x000055af5f334975 ruby`gc_event_hook_body(ec=0x000055af60b2b700, objspace=0x000055af60b2b040, event=1048576, data=0x00007f7d5b513fd0) at gc.c:2083:5
  * frame #31: 0x000055af5f3342df ruby`newobj_slowpath_wb_protected [inlined] newobj_slowpath(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, objspace=0x000055af60b2b040, cr=0x000055af60b2b910, wb_protected=1) at gc.c:2284:9
    frame #32: 0x000055af5f33410f ruby`newobj_slowpath_wb_protected(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, objspace=0x000055af60b2b040, cr=0x000055af60b2b910) at gc.c:2299
    frame #33: 0x000055af5f333de9 ruby`newobj_of0(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, wb_protected=1, cr=0x000055af60b2b910) at gc.c:2338:11
    frame #34: 0x000055af5f3227ae ruby`newobj_of(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, v1=0x000055af657d88a0, v2=0x000055af657d8890, v3=0x0000000000000000, wb_protected=1) at gc.c:2348:17
    frame #35: 0x000055af5f322c5b ruby`rb_imemo_new(type=imemo_env, v1=0x000055af657d88a0, v2=0x000055af657d8890, v3=0x0000000000000000, v0=0x00007f7d5b9d19c8) at gc.c:2434:12
    frame #36: 0x000055af5f5a3925 ruby`vm_env_new(env_ep=0x000055af657d88a0, env_body=0x000055af657d8890, env_size=4, iseq=0x00007f7d5b9d19c8) at vm_core.h:1363:33
    frame #37: 0x000055af5f5a3808 ruby`vm_make_env_each(ec=0x000055af60b2b700, cfp=0x00007f7d6482fc90) at vm.c:801:11
    frame #38: 0x000055af5f5a368d ruby`vm_make_env_each(ec=0x000055af60b2b700, cfp=0x00007f7d6482fc20) at vm.c:752:13
    frame #39: 0x000055af5f5a368d ruby`vm_make_env_each(ec=0x000055af60b2b700, cfp=0x00007f7d6482fbb0) at vm.c:752:13
(lldb) f 31
frame #31: 0x000055af5f3342df ruby`newobj_slowpath_wb_protected [inlined] newobj_slowpath(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, objspace=0x000055af60b2b040, cr=0x000055af60b2b910, wb_protected=1) at gc.c:2284:9
   2281	        }
   2282	        GC_ASSERT(obj != 0);
   2283	        newobj_init(klass, flags, wb_protected, objspace, obj);
-> 2284	        gc_event_hook_prep(objspace, RUBY_INTERNAL_EVENT_NEWOBJ, obj, newobj_fill(obj, 0, 0, 0));
   2285	    }
   2286	    RB_VM_LOCK_LEAVE_CR_LEV(cr, &lev);
   2287
(lldb) p obj
(VALUE) $3 = 0x00007f7d5b513fd0
(lldb) f 6
frame #6: 0x000055af5f339b0a ruby`gc_ref_update_imemo(objspace=0x000055af60b2b040, obj=0x00007f7d5b513fd0) at gc.c:9046:13
   9043	        {
   9044	            rb_env_t *env = (rb_env_t *)obj;
   9045	            TYPED_UPDATE_IF_MOVED(objspace, rb_iseq_t *, env->iseq);
-> 9046	            UPDATE_IF_MOVED(objspace, env->ep[VM_ENV_DATA_INDEX_ENV]);
   9047	            gc_update_values(objspace, (long)env->env_size, (VALUE *)env->env);
   9048	        }
   9049	        break;
(lldb) p obj
(VALUE) $4 = 0x00007f7d5b513fd0
(lldb)
```
casperisfine pushed a commit that referenced this pull request May 19, 2021
This commit adds a check on the ep just like in the mark function.  The
env can contain null bytes if allocation tracing is enabled.

We're seeing errors during autocompaction like this:

```
(lldb) bt 40
* thread #1, name = 'ruby', stop reason = signal SIGABRT
    frame #0: 0x00007f7d64b6018b libc.so.6`raise + 203
    frame #1: 0x00007f7d64b3f859 libc.so.6`abort + 299
    frame #2: 0x000055af5f2fefc9 ruby`die at error.c:764:5
    frame #3: 0x000055af5f2ff1ac ruby`rb_bug_for_fatal_signal(default_sighandler=0x0000000000000000, sig=11, ctx=0x000055af60bc3340, fmt="") at error.c:804:5
    frame #4: 0x000055af5f4bd08f ruby`sigsegv(sig=11, info=0x000055af60bc3470, ctx=0x000055af60bc3340) at signal.c:960:5
    frame #5: 0x00007f7d64ebe3c0 libpthread.so.0`__restore_rt
    frame #6: 0x000055af5f339b0a ruby`gc_ref_update_imemo(objspace=0x000055af60b2b040, obj=0x00007f7d5b513fd0) at gc.c:9046:13
    frame #7: 0x000055af5f339172 ruby`gc_update_object_references(objspace=0x000055af60b2b040, obj=0x00007f7d5b513fd0) at gc.c:9307:9
    frame #8: 0x000055af5f338e79 ruby`gc_ref_update(vstart=0x00007f7d5b510010, vend=0x00007f7d5b513ff8, stride=40, objspace=0x000055af60b2b040, page=0x000055af62577aa0) at gc.c:9452:21
    frame #9: 0x000055af5f337846 ruby`gc_update_references(objspace=0x000055af60b2b040, heap=0x000055af60b2b068) at gc.c:9481:9
    frame #10: 0x000055af5f336569 ruby`gc_compact_finish(objspace=0x000055af60b2b040, heap=0x000055af60b2b068) at gc.c:4840:5
    frame #11: 0x000055af5f335efb ruby`gc_page_sweep(objspace=0x000055af60b2b040, heap=0x000055af60b2b068, sweep_page=0x000055af63a1eb30) at gc.c:5046:13
    frame #12: 0x000055af5f3355c5 ruby`gc_sweep_step(objspace=0x000055af60b2b040, heap=0x000055af60b2b068) at gc.c:5214:19
    frame #13: 0x000055af5f33daf6 ruby`gc_sweep_rest(objspace=0x000055af60b2b040) at gc.c:5271:2
    frame #14: 0x000055af5f33cacd ruby`gc_sweep(objspace=0x000055af60b2b040) at gc.c:5389:2
    frame #15: 0x000055af5f33c21d ruby`gc_marks_rest(objspace=0x000055af60b2b040) at gc.c:7555:5
    frame #16: 0x000055af5f324d41 ruby`gc_rest(objspace=0x000055af60b2b040) at gc.c:8457:13
    frame #17: 0x000055af5f3297d8 ruby`garbage_collect(objspace=0x000055af60b2b040, reason=45568) at gc.c:8318:9
    frame #18: 0x000055af5f344ece ruby`garbage_collect_with_gvl(objspace=0x000055af60b2b040, reason=45568) at gc.c:8632:9
    frame #19: 0x000055af5f344e61 ruby`objspace_malloc_gc_stress(objspace=0x000055af60b2b040) at gc.c:10592:9
    frame #20: 0x000055af5f32ced1 ruby`objspace_xmalloc0(objspace=0x000055af60b2b040, size=64) at gc.c:10767:5
    frame #21: 0x000055af5f32ce11 ruby`ruby_xmalloc0(size=64) at gc.c:10988:12
    frame #22: 0x000055af5f32cdac ruby`ruby_xmalloc_body(size=64) at gc.c:10997:12
    frame #23: 0x000055af5f329415 ruby`ruby_xmalloc(size=64) at gc.c:12942:12
    frame #24: 0x00007f7d611c4fe5 objspace.so`newobj_i(tpval=0x00007f7d5b553770, data=0x000055af639031a0) at object_tracing.c:101:35
    frame #25: 0x000055af5f5b283f ruby`tp_call_trace(tpval=0x00007f7d5b553770, trace_arg=0x00007fff1016d398) at vm_trace.c:1115:2
    frame #26: 0x000055af5f5b50ec ruby`exec_hooks_body(ec=0x000055af60b2b700, list=0x000055af60b2b920, trace_arg=0x00007fff1016d398) at vm_trace.c:304:3
    frame #27: 0x000055af5f5b0f24 ruby`exec_hooks_unprotected(ec=0x000055af60b2b700, list=0x000055af60b2b920, trace_arg=0x00007fff1016d398) at vm_trace.c:333:5
    frame #28: 0x000055af5f5b0da8 ruby`rb_exec_event_hooks(trace_arg=0x00007fff1016d398, hooks=0x000055af60b2b920, pop_p=0) at vm_trace.c:378:13
    frame #29: 0x000055af5f33f8e2 ruby`rb_exec_event_hook_orig(ec=0x000055af60b2b700, hooks=0x000055af60b2b920, flag=1048576, self=0x00007f7d5b5c08c0, id=0, called_id=0, klass=0x0000000000000000, data=0x00007f7d5b513fd0, pop_p=0) at vm_core.h:1989:5
    frame #30: 0x000055af5f334975 ruby`gc_event_hook_body(ec=0x000055af60b2b700, objspace=0x000055af60b2b040, event=1048576, data=0x00007f7d5b513fd0) at gc.c:2083:5
  * frame #31: 0x000055af5f3342df ruby`newobj_slowpath_wb_protected [inlined] newobj_slowpath(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, objspace=0x000055af60b2b040, cr=0x000055af60b2b910, wb_protected=1) at gc.c:2284:9
    frame #32: 0x000055af5f33410f ruby`newobj_slowpath_wb_protected(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, objspace=0x000055af60b2b040, cr=0x000055af60b2b910) at gc.c:2299
    frame #33: 0x000055af5f333de9 ruby`newobj_of0(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, wb_protected=1, cr=0x000055af60b2b910) at gc.c:2338:11
    frame #34: 0x000055af5f3227ae ruby`newobj_of(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, v1=0x000055af657d88a0, v2=0x000055af657d8890, v3=0x0000000000000000, wb_protected=1) at gc.c:2348:17
    frame #35: 0x000055af5f322c5b ruby`rb_imemo_new(type=imemo_env, v1=0x000055af657d88a0, v2=0x000055af657d8890, v3=0x0000000000000000, v0=0x00007f7d5b9d19c8) at gc.c:2434:12
    frame #36: 0x000055af5f5a3925 ruby`vm_env_new(env_ep=0x000055af657d88a0, env_body=0x000055af657d8890, env_size=4, iseq=0x00007f7d5b9d19c8) at vm_core.h:1363:33
    frame #37: 0x000055af5f5a3808 ruby`vm_make_env_each(ec=0x000055af60b2b700, cfp=0x00007f7d6482fc90) at vm.c:801:11
    frame #38: 0x000055af5f5a368d ruby`vm_make_env_each(ec=0x000055af60b2b700, cfp=0x00007f7d6482fc20) at vm.c:752:13
    frame #39: 0x000055af5f5a368d ruby`vm_make_env_each(ec=0x000055af60b2b700, cfp=0x00007f7d6482fbb0) at vm.c:752:13
(lldb) f 31
frame #31: 0x000055af5f3342df ruby`newobj_slowpath_wb_protected [inlined] newobj_slowpath(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, objspace=0x000055af60b2b040, cr=0x000055af60b2b910, wb_protected=1) at gc.c:2284:9
   2281	        }
   2282	        GC_ASSERT(obj != 0);
   2283	        newobj_init(klass, flags, wb_protected, objspace, obj);
-> 2284	        gc_event_hook_prep(objspace, RUBY_INTERNAL_EVENT_NEWOBJ, obj, newobj_fill(obj, 0, 0, 0));
   2285	    }
   2286	    RB_VM_LOCK_LEAVE_CR_LEV(cr, &lev);
   2287
(lldb) p obj
(VALUE) $3 = 0x00007f7d5b513fd0
(lldb) f 6
frame #6: 0x000055af5f339b0a ruby`gc_ref_update_imemo(objspace=0x000055af60b2b040, obj=0x00007f7d5b513fd0) at gc.c:9046:13
   9043	        {
   9044	            rb_env_t *env = (rb_env_t *)obj;
   9045	            TYPED_UPDATE_IF_MOVED(objspace, rb_iseq_t *, env->iseq);
-> 9046	            UPDATE_IF_MOVED(objspace, env->ep[VM_ENV_DATA_INDEX_ENV]);
   9047	            gc_update_values(objspace, (long)env->env_size, (VALUE *)env->env);
   9048	        }
   9049	        break;
(lldb) p obj
(VALUE) $4 = 0x00007f7d5b513fd0
(lldb)
```
tenderlove pushed a commit that referenced this pull request Jun 2, 2021
In some cases, methods taking block parameters don't require extra
paramter setup. They are fairly popular in railsbench.
maximecb pushed a commit that referenced this pull request Oct 19, 2021
In some cases, methods taking block parameters don't require extra
paramter setup. They are fairly popular in railsbench.
noahgibbs pushed a commit that referenced this pull request Oct 20, 2021
In some cases, methods taking block parameters don't require extra
paramter setup. They are fairly popular in railsbench.
XrXr added a commit that referenced this pull request Oct 20, 2021
In some cases, methods taking block parameters don't require extra
paramter setup. They are fairly popular in railsbench.
XrXr added a commit that referenced this pull request Oct 20, 2021
In some cases, methods taking block parameters don't require extra
paramter setup. They are fairly popular in railsbench.
XrXr added a commit that referenced this pull request Oct 21, 2021
In some cases, methods taking block parameters don't require extra
paramter setup. They are fairly popular in railsbench.
maximecb pushed a commit that referenced this pull request Aug 2, 2023
[Bug #19793]

Dummy frames are created at the top level when requiring another file.
While requiring a file, it will try to convert using encodings. Some of
these encodings will not respond to to_str. If method_missing is
redefined on Object, then it will call method_missing and attempt raise
an error. However, the iseq is invalid as it's a dummy frame so it will
write an invalid iseq to the created NoMethodError.

The following script crashes:

```
GC.stress = true

class Object
  public :method_missing
end

File.write("/tmp/empty.rb", "")
require "/tmp/empty.rb"
```

With the following backtrace:

```
frame #0: 0x00000001000fa8b8 miniruby`RVALUE_MARKED(obj=4308637824) at gc.c:1638:12
frame #1: 0x00000001000fb440 miniruby`RVALUE_BLACK_P(obj=4308637824) at gc.c:1763:12
frame #2: 0x00000001000facdc miniruby`gc_writebarrier_incremental(a=4308637824, b=4308332208, objspace=0x000000010180b000) at gc.c:8822:9
frame #3: 0x00000001000faad8 miniruby`rb_gc_writebarrier(a=4308637824, b=4308332208) at gc.c:8864:17
frame #4: 0x000000010016aff0 miniruby`rb_obj_written(a=4308637824, oldv=36, b=4308332208, filename="../iseq.c", line=1279) at gc.h:804:9
frame #5: 0x0000000100162a60 miniruby`rb_obj_write(a=4308637824, slot=0x0000000100d09888, b=4308332208, filename="../iseq.c", line=1279) at gc.h:837:5
frame #6: 0x0000000100165b0c miniruby`iseqw_new(iseq=0x0000000100d09880) at iseq.c:1279:9
frame #7: 0x0000000100165a64 miniruby`rb_iseqw_new(iseq=0x0000000100d09880) at iseq.c:1289:12
frame #8: 0x00000001000d8324 miniruby`name_err_init_attr(exc=4309777920, recv=4304780496, method=827660) at error.c:1830:35
frame #9: 0x00000001000d1b80 miniruby`name_err_init(exc=4309777920, mesg=4308332496, recv=4304780496, method=827660) at error.c:1869:12
frame #10: 0x00000001000d1bd4 miniruby`rb_nomethod_err_new(mesg=4308332496, recv=4304780496, method=827660, args=4308332448, priv=0) at error.c:1957:5
frame #11: 0x000000010039049c miniruby`rb_make_no_method_exception(exc=4304914512, format=4308332496, obj=4304780496, argc=1, argv=0x000000016fdfab00, priv=0) at vm_eval.c:959:16
frame #12: 0x00000001003b3274 miniruby`raise_method_missing(ec=0x0000000100b06f40, argc=1, argv=0x000000016fdfab00, obj=4304780496, last_call_status=MISSING_NOENTRY) at vm_eval.c:999:15
frame #13: 0x00000001003945d4 miniruby`rb_method_missing(argc=1, argv=0x000000016fdfab00, obj=4304780496) at vm_eval.c:944:5
...
frame #23: 0x000000010038f5e4 miniruby`rb_vm_call_kw(ec=0x0000000100b06f40, recv=4304780496, id=2865, argc=1, argv=0x000000016fdfab00, me=0x0000000100cbfcf0, kw_splat=0) at vm_eval.c:326:12
frame #24: 0x00000001003c18e4 miniruby`call_method_entry(ec=0x0000000100b06f40, defined_class=4304927952, obj=4304780496, id=2865, cme=0x0000000100cbfcf0, argc=1, argv=0x000000016fdfab00, kw_splat=0) at vm_method.c:2720:20
frame #25: 0x00000001003c440c miniruby`check_funcall_exec(v=6171896792) at vm_eval.c:589:12
frame #26: 0x00000001000dec00 miniruby`rb_vrescue2(b_proc=(miniruby`check_funcall_exec at vm_eval.c:587), data1=6171896792, r_proc=(miniruby`check_funcall_failed at vm_eval.c:596), data2=6171896792, args="Pȗ") at eval.c:919:18
frame #27: 0x00000001000deab0 miniruby`rb_rescue2(b_proc=(miniruby`check_funcall_exec at vm_eval.c:587), data1=6171896792, r_proc=(miniruby`check_funcall_failed at vm_eval.c:596), data2=6171896792) at eval.c:900:17
frame #28: 0x000000010039008c miniruby`check_funcall_missing(ec=0x0000000100b06f40, klass=4304923536, recv=4304780496, mid=3233, argc=0, argv=0x0000000000000000, respond=-1, def=36, kw_splat=0) at vm_eval.c:666:15
frame #29: 0x000000010038fa60 miniruby`rb_check_funcall_default_kw(recv=4304780496, mid=3233, argc=0, argv=0x0000000000000000, def=36, kw_splat=0) at vm_eval.c:703:21
frame #30: 0x000000010038fb04 miniruby`rb_check_funcall(recv=4304780496, mid=3233, argc=0, argv=0x0000000000000000) at vm_eval.c:685:12
frame #31: 0x00000001001c469c miniruby`convert_type_with_id(val=4304780496, tname="String", method=3233, raise=0, index=-1) at object.c:3061:15
frame #32: 0x00000001001c4a4c miniruby`rb_check_convert_type_with_id(val=4304780496, type=5, tname="String", method=3233) at object.c:3153:9
frame #33: 0x00000001002d59f8 miniruby`rb_check_string_type(str=4304780496) at string.c:2571:11
frame #34: 0x000000010014b7b0 miniruby`io_encoding_set(fptr=0x0000000100d09ca0, v1=4304780496, v2=4, opt=4) at io.c:11655:19
frame #35: 0x0000000100139a58 miniruby`rb_io_set_encoding(argc=1, argv=0x000000016fdfb450, io=4308334032) at io.c:13497:5
frame #36: 0x00000001003c0004 miniruby`ractor_safe_call_cfunc_m1(recv=4308334032, argc=1, argv=0x000000016fdfb450, func=(miniruby`rb_io_set_encoding at io.c:13487)) at vm_insnhelper.c:3271:12
...
frame #43: 0x0000000100390b08 miniruby`rb_funcall(recv=4308334032, mid=16593, n=1) at vm_eval.c:1137:12
frame #44: 0x00000001002a43d8 miniruby`load_file_internal(argp_v=6171899936) at ruby.c:2500:5
...
```
casperisfine pushed a commit that referenced this pull request Sep 20, 2023
	Fix crash in NoMethodError for dummy frames
	MIME-Version: 1.0
	Content-Type: text/plain; charset=UTF-8
	Content-Transfer-Encoding: 8bit

	[Bug #19793]

	Dummy frames are created at the top level when requiring another file.
	While requiring a file, it will try to convert using encodings. Some of
	these encodings will not respond to to_str. If method_missing is
	redefined on Object, then it will call method_missing and attempt raise
	an error. However, the iseq is invalid as it's a dummy frame so it will
	write an invalid iseq to the created NoMethodError.

	The following script crashes:

	```
	GC.stress = true

	class Object
	  public :method_missing
	end

	File.write("/tmp/empty.rb", "")
	require "/tmp/empty.rb"
	```

	With the following backtrace:

	```
	frame #0: 0x00000001000fa8b8 miniruby`RVALUE_MARKED(obj=4308637824) at gc.c:1638:12
	frame #1: 0x00000001000fb440 miniruby`RVALUE_BLACK_P(obj=4308637824) at gc.c:1763:12
	frame #2: 0x00000001000facdc miniruby`gc_writebarrier_incremental(a=4308637824, b=4308332208, objspace=0x000000010180b000) at gc.c:8822:9
	frame #3: 0x00000001000faad8 miniruby`rb_gc_writebarrier(a=4308637824, b=4308332208) at gc.c:8864:17
	frame #4: 0x000000010016aff0 miniruby`rb_obj_written(a=4308637824, oldv=36, b=4308332208, filename="../iseq.c", line=1279) at gc.h:804:9
	frame #5: 0x0000000100162a60 miniruby`rb_obj_write(a=4308637824, slot=0x0000000100d09888, b=4308332208, filename="../iseq.c", line=1279) at gc.h:837:5
	frame #6: 0x0000000100165b0c miniruby`iseqw_new(iseq=0x0000000100d09880) at iseq.c:1279:9
	frame #7: 0x0000000100165a64 miniruby`rb_iseqw_new(iseq=0x0000000100d09880) at iseq.c:1289:12
	frame #8: 0x00000001000d8324 miniruby`name_err_init_attr(exc=4309777920, recv=4304780496, method=827660) at error.c:1830:35
	frame #9: 0x00000001000d1b80 miniruby`name_err_init(exc=4309777920, mesg=4308332496, recv=4304780496, method=827660) at error.c:1869:12
	frame #10: 0x00000001000d1bd4 miniruby`rb_nomethod_err_new(mesg=4308332496, recv=4304780496, method=827660, args=4308332448, priv=0) at error.c:1957:5
	frame #11: 0x000000010039049c miniruby`rb_make_no_method_exception(exc=4304914512, format=4308332496, obj=4304780496, argc=1, argv=0x000000016fdfab00, priv=0) at vm_eval.c:959:16
	frame #12: 0x00000001003b3274 miniruby`raise_method_missing(ec=0x0000000100b06f40, argc=1, argv=0x000000016fdfab00, obj=4304780496, last_call_status=MISSING_NOENTRY) at vm_eval.c:999:15
	frame #13: 0x00000001003945d4 miniruby`rb_method_missing(argc=1, argv=0x000000016fdfab00, obj=4304780496) at vm_eval.c:944:5
	...
	frame #23: 0x000000010038f5e4 miniruby`rb_vm_call_kw(ec=0x0000000100b06f40, recv=4304780496, id=2865, argc=1, argv=0x000000016fdfab00, me=0x0000000100cbfcf0, kw_splat=0) at vm_eval.c:326:12
	frame #24: 0x00000001003c18e4 miniruby`call_method_entry(ec=0x0000000100b06f40, defined_class=4304927952, obj=4304780496, id=2865, cme=0x0000000100cbfcf0, argc=1, argv=0x000000016fdfab00, kw_splat=0) at vm_method.c:2720:20
	frame #25: 0x00000001003c440c miniruby`check_funcall_exec(v=6171896792) at vm_eval.c:589:12
	frame #26: 0x00000001000dec00 miniruby`rb_vrescue2(b_proc=(miniruby`check_funcall_exec at vm_eval.c:587), data1=6171896792, r_proc=(miniruby`check_funcall_failed at vm_eval.c:596), data2=6171896792, args="Pȗ") at eval.c:919:18
	frame #27: 0x00000001000deab0 miniruby`rb_rescue2(b_proc=(miniruby`check_funcall_exec at vm_eval.c:587), data1=6171896792, r_proc=(miniruby`check_funcall_failed at vm_eval.c:596), data2=6171896792) at eval.c:900:17
	frame #28: 0x000000010039008c miniruby`check_funcall_missing(ec=0x0000000100b06f40, klass=4304923536, recv=4304780496, mid=3233, argc=0, argv=0x0000000000000000, respond=-1, def=36, kw_splat=0) at vm_eval.c:666:15
	frame #29: 0x000000010038fa60 miniruby`rb_check_funcall_default_kw(recv=4304780496, mid=3233, argc=0, argv=0x0000000000000000, def=36, kw_splat=0) at vm_eval.c:703:21
	frame #30: 0x000000010038fb04 miniruby`rb_check_funcall(recv=4304780496, mid=3233, argc=0, argv=0x0000000000000000) at vm_eval.c:685:12
	frame #31: 0x00000001001c469c miniruby`convert_type_with_id(val=4304780496, tname="String", method=3233, raise=0, index=-1) at object.c:3061:15
	frame #32: 0x00000001001c4a4c miniruby`rb_check_convert_type_with_id(val=4304780496, type=5, tname="String", method=3233) at object.c:3153:9
	frame #33: 0x00000001002d59f8 miniruby`rb_check_string_type(str=4304780496) at string.c:2571:11
	frame #34: 0x000000010014b7b0 miniruby`io_encoding_set(fptr=0x0000000100d09ca0, v1=4304780496, v2=4, opt=4) at io.c:11655:19
	frame #35: 0x0000000100139a58 miniruby`rb_io_set_encoding(argc=1, argv=0x000000016fdfb450, io=4308334032) at io.c:13497:5
	frame #36: 0x00000001003c0004 miniruby`ractor_safe_call_cfunc_m1(recv=4308334032, argc=1, argv=0x000000016fdfb450, func=(miniruby`rb_io_set_encoding at io.c:13487)) at vm_insnhelper.c:3271:12
	...
	frame #43: 0x0000000100390b08 miniruby`rb_funcall(recv=4308334032, mid=16593, n=1) at vm_eval.c:1137:12
	frame #44: 0x00000001002a43d8 miniruby`load_file_internal(argp_v=6171899936) at ruby.c:2500:5
	...
	```
	---
	 error.c                   |  4 +++-
	 test/ruby/test_require.rb | 15 +++++++++++++++
	 2 files changed, 18 insertions(+), 1 deletion(-)
st0012 pushed a commit that referenced this pull request Nov 30, 2024
[Bug #20921]

When we create a cache entry for a constant, the following sequence of
events could happen:

- vm_track_constant_cache is called to insert a constant cache.
- In vm_track_constant_cache, we first look up the ST table for the ID
  of the constant. Assume the ST table exists because another iseq also
  holds a cache entry for this ID.
- We then insert into this ST table with the iseq_inline_constant_cache.
- However, while inserting into this ST table, it allocates memory, which
  could trigger a GC. Assume that it does trigger a GC.
- The GC frees the one and only other iseq that holds a cache entry for
  this ID.
- In remove_from_constant_cache, it will appear that the ST table is now
  empty because there are no more iseq with cache entries for this ID, so
  we free the ST table.
- We complete GC and continue our st_insert. However, this ST table has
  been freed so we now have a use-after-free.

This issue is very hard to reproduce, because it requires that the GC runs
at a very specific time. However, we can make it show up by applying this
patch which runs GC right before the st_insert to mimic the st_insert
triggering a GC:

    diff --git a/vm_insnhelper.c b/vm_insnhelper.c
    index 3cb23f0..a93998136a 100644
    --- a/vm_insnhelper.c
    +++ b/vm_insnhelper.c
    @@ -6338,6 +6338,10 @@ vm_track_constant_cache(ID id, void *ic)
            rb_id_table_insert(const_cache, id, (VALUE)ics);
        }

    +    if (id == rb_intern("MyConstant")) rb_gc();
    +
        st_insert(ics, (st_data_t) ic, (st_data_t) Qtrue);
    }

And if we run this script:

    Object.const_set("MyConstant", "Hello!")

    my_proc = eval("-> { MyConstant }")
    my_proc.call

    my_proc = eval("-> { MyConstant }")
    my_proc.call

We can see that ASAN outputs a use-after-free error:

    ==36540==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000049528 at pc 0x000102f3ceac bp 0x00016d607a70 sp 0x00016d607a68
    READ of size 8 at 0x606000049528 thread T0
        #0 0x102f3cea8 in do_hash st.c:321
        #1 0x102f3ddd0 in rb_st_insert st.c:1132
        #2 0x103140700 in vm_track_constant_cache vm_insnhelper.c:6345
        #3 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #4 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #5 0x1030bc1e0 in vm_exec_core insns.def:263
        #6 0x1030b55fc in rb_vm_exec vm.c:2585
        #7 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #8 0x102a82588 in rb_ec_exec_node eval.c:281
        #9 0x102a81fe0 in ruby_run_node eval.c:319
        #10 0x1027f3db4 in rb_main main.c:43
        #11 0x1027f3bd4 in main main.c:68
        #12 0x183900270  (<unknown module>)

    0x606000049528 is located 8 bytes inside of 56-byte region [0x606000049520,0x606000049558)
    freed by thread T0 here:
        #0 0x104174d40 in free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54d40)
        #1 0x102ada89c in rb_gc_impl_free default.c:8183
        #2 0x102ada7dc in ruby_sized_xfree gc.c:4507
        #3 0x102ac4d34 in ruby_xfree gc.c:4518
        #4 0x102f3cb34 in rb_st_free_table st.c:663
        #5 0x102bd52d8 in remove_from_constant_cache iseq.c:119
        #6 0x102bbe2cc in iseq_clear_ic_references iseq.c:153
        #7 0x102bbd2a0 in rb_iseq_free iseq.c:166
        #8 0x102b32ed0 in rb_imemo_free imemo.c:564
        #9 0x102ac4b44 in rb_gc_obj_free gc.c:1407
        #10 0x102af4290 in gc_sweep_plane default.c:3546
        #11 0x102af3bdc in gc_sweep_page default.c:3634
        #12 0x102aeb140 in gc_sweep_step default.c:3906
        #13 0x102aeadf0 in gc_sweep_rest default.c:3978
        #14 0x102ae4714 in gc_sweep default.c:4155
        #15 0x102af8474 in gc_start default.c:6484
        #16 0x102afbe30 in garbage_collect default.c:6363
        #17 0x102ad37f0 in rb_gc_impl_start default.c:6816
        #18 0x102ad3634 in rb_gc gc.c:3624
        #19 0x1031406ec in vm_track_constant_cache vm_insnhelper.c:6342
        #20 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #21 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #22 0x1030bc1e0 in vm_exec_core insns.def:263
        #23 0x1030b55fc in rb_vm_exec vm.c:2585
        #24 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #25 0x102a82588 in rb_ec_exec_node eval.c:281
        #26 0x102a81fe0 in ruby_run_node eval.c:319
        #27 0x1027f3db4 in rb_main main.c:43
        #28 0x1027f3bd4 in main main.c:68
        #29 0x183900270  (<unknown module>)

    previously allocated by thread T0 here:
        #0 0x104174c04 in malloc+0x94 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54c04)
        #1 0x102ada0ec in rb_gc_impl_malloc default.c:8198
        #2 0x102acee44 in ruby_xmalloc gc.c:4438
        #3 0x102f3c85c in rb_st_init_table_with_size st.c:571
        #4 0x102f3c900 in rb_st_init_table st.c:600
        #5 0x102f3c920 in rb_st_init_numtable st.c:608
        #6 0x103140698 in vm_track_constant_cache vm_insnhelper.c:6337
        #7 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #8 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #9 0x1030bc1e0 in vm_exec_core insns.def:263
        #10 0x1030b55fc in rb_vm_exec vm.c:2585
        #11 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #12 0x102a82588 in rb_ec_exec_node eval.c:281
        #13 0x102a81fe0 in ruby_run_node eval.c:319
        #14 0x1027f3db4 in rb_main main.c:43
        #15 0x1027f3bd4 in main main.c:68
        #16 0x183900270  (<unknown module>)

This commit fixes this bug by adding a inserting_constant_cache_id field
to the VM, which stores the ID that is currently being inserted and, in
remove_from_constant_cache, we don't free the ST table for ID equal to
this one.

Co-Authored-By: Alan Wu <[email protected]>
peterzhu2118 added a commit that referenced this pull request Dec 3, 2024
[Bug #20921]

When we create a cache entry for a constant, the following sequence of
events could happen:

- vm_track_constant_cache is called to insert a constant cache.
- In vm_track_constant_cache, we first look up the ST table for the ID
  of the constant. Assume the ST table exists because another iseq also
  holds a cache entry for this ID.
- We then insert into this ST table with the iseq_inline_constant_cache.
- However, while inserting into this ST table, it allocates memory, which
  could trigger a GC. Assume that it does trigger a GC.
- The GC frees the one and only other iseq that holds a cache entry for
  this ID.
- In remove_from_constant_cache, it will appear that the ST table is now
  empty because there are no more iseq with cache entries for this ID, so
  we free the ST table.
- We complete GC and continue our st_insert. However, this ST table has
  been freed so we now have a use-after-free.

This issue is very hard to reproduce, because it requires that the GC runs
at a very specific time. However, we can make it show up by applying this
patch which runs GC right before the st_insert to mimic the st_insert
triggering a GC:

    diff --git a/vm_insnhelper.c b/vm_insnhelper.c
    index 3cb23f0..a93998136a 100644
    --- a/vm_insnhelper.c
    +++ b/vm_insnhelper.c
    @@ -6338,6 +6338,10 @@ vm_track_constant_cache(ID id, void *ic)
            rb_id_table_insert(const_cache, id, (VALUE)ics);
        }

    +    if (id == rb_intern("MyConstant")) rb_gc();
    +
        st_insert(ics, (st_data_t) ic, (st_data_t) Qtrue);
    }

And if we run this script:

    Object.const_set("MyConstant", "Hello!")

    my_proc = eval("-> { MyConstant }")
    my_proc.call

    my_proc = eval("-> { MyConstant }")
    my_proc.call

We can see that ASAN outputs a use-after-free error:

    ==36540==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000049528 at pc 0x000102f3ceac bp 0x00016d607a70 sp 0x00016d607a68
    READ of size 8 at 0x606000049528 thread T0
        #0 0x102f3cea8 in do_hash st.c:321
        #1 0x102f3ddd0 in rb_st_insert st.c:1132
        #2 0x103140700 in vm_track_constant_cache vm_insnhelper.c:6345
        #3 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #4 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #5 0x1030bc1e0 in vm_exec_core insns.def:263
        #6 0x1030b55fc in rb_vm_exec vm.c:2585
        #7 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #8 0x102a82588 in rb_ec_exec_node eval.c:281
        #9 0x102a81fe0 in ruby_run_node eval.c:319
        #10 0x1027f3db4 in rb_main main.c:43
        #11 0x1027f3bd4 in main main.c:68
        #12 0x183900270  (<unknown module>)

    0x606000049528 is located 8 bytes inside of 56-byte region [0x606000049520,0x606000049558)
    freed by thread T0 here:
        #0 0x104174d40 in free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54d40)
        #1 0x102ada89c in rb_gc_impl_free default.c:8183
        #2 0x102ada7dc in ruby_sized_xfree gc.c:4507
        #3 0x102ac4d34 in ruby_xfree gc.c:4518
        #4 0x102f3cb34 in rb_st_free_table st.c:663
        #5 0x102bd52d8 in remove_from_constant_cache iseq.c:119
        #6 0x102bbe2cc in iseq_clear_ic_references iseq.c:153
        #7 0x102bbd2a0 in rb_iseq_free iseq.c:166
        #8 0x102b32ed0 in rb_imemo_free imemo.c:564
        #9 0x102ac4b44 in rb_gc_obj_free gc.c:1407
        #10 0x102af4290 in gc_sweep_plane default.c:3546
        #11 0x102af3bdc in gc_sweep_page default.c:3634
        #12 0x102aeb140 in gc_sweep_step default.c:3906
        #13 0x102aeadf0 in gc_sweep_rest default.c:3978
        #14 0x102ae4714 in gc_sweep default.c:4155
        #15 0x102af8474 in gc_start default.c:6484
        #16 0x102afbe30 in garbage_collect default.c:6363
        #17 0x102ad37f0 in rb_gc_impl_start default.c:6816
        #18 0x102ad3634 in rb_gc gc.c:3624
        #19 0x1031406ec in vm_track_constant_cache vm_insnhelper.c:6342
        #20 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #21 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #22 0x1030bc1e0 in vm_exec_core insns.def:263
        #23 0x1030b55fc in rb_vm_exec vm.c:2585
        #24 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #25 0x102a82588 in rb_ec_exec_node eval.c:281
        #26 0x102a81fe0 in ruby_run_node eval.c:319
        #27 0x1027f3db4 in rb_main main.c:43
        #28 0x1027f3bd4 in main main.c:68
        #29 0x183900270  (<unknown module>)

    previously allocated by thread T0 here:
        #0 0x104174c04 in malloc+0x94 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54c04)
        #1 0x102ada0ec in rb_gc_impl_malloc default.c:8198
        #2 0x102acee44 in ruby_xmalloc gc.c:4438
        #3 0x102f3c85c in rb_st_init_table_with_size st.c:571
        #4 0x102f3c900 in rb_st_init_table st.c:600
        #5 0x102f3c920 in rb_st_init_numtable st.c:608
        #6 0x103140698 in vm_track_constant_cache vm_insnhelper.c:6337
        #7 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #8 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #9 0x1030bc1e0 in vm_exec_core insns.def:263
        #10 0x1030b55fc in rb_vm_exec vm.c:2585
        #11 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #12 0x102a82588 in rb_ec_exec_node eval.c:281
        #13 0x102a81fe0 in ruby_run_node eval.c:319
        #14 0x1027f3db4 in rb_main main.c:43
        #15 0x1027f3bd4 in main main.c:68
        #16 0x183900270  (<unknown module>)

This commit fixes this bug by adding a inserting_constant_cache_id field
to the VM, which stores the ID that is currently being inserted and, in
remove_from_constant_cache, we don't free the ST table for ID equal to
this one.

Co-Authored-By: Alan Wu <[email protected]>
XrXr pushed a commit that referenced this pull request May 2, 2025
* Profile instructions for fixnum arithmetic

* Drop PartialEq from Type

* Do not push PatchPoint onto the stack

* Avoid pushing the output of the guards

* Pop operands after guards

* Test HIR from profiled runs

* Implement Display for new instructions

* Drop unused FIXNUM_BITS

* Use a Rust function to split lines

* Use Display for GuardType operands

Co-authored-by: Max Bernstein <[email protected]>

* Fix tests with Display-ed values

---------

Co-authored-by: Max Bernstein <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants