Integrating multiple memories and 64-bit addresses with other extensions #1036

Open
AndrewScheidecker opened this issue Apr 8, 2017 · 12 comments

Comments

@AndrewScheidecker

Seeing the shared memory atomics proposal and the SIMD proposal, it struck me that it will be painful if they are not orthogonal to the planned feature to allow accesses to multiple memories. I think the multiple memories feature is relatively low priority, but it seems it's worth fleshing out how it will work to avoid pain integrating it with earlier extensions.

The most general form of accessing multiple memories would require allowing values that are references to memory objects, which has a dependency on GC references.

A high bang-for-the-buck form of the feature would just add a flag to memory_immediate indicating that an extra field follows, holding an index into the module's memory index space.

A way to implement the more general form of the feature without creating a dependency on it from any other extensions that add memory access operators would be to add an operator that modifies the following memory operator. However, it adds a new variety of state to the WASM virtual machine, so I'm not convinced it's the way to go.
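
To make the "new variety of state" concrete, here is a minimal sketch of a toy interpreter with such a prefix. The operator name `use_memory`, the `PrefixVM` class, and the rule that the prefix applies to exactly one following memory operator are all assumptions made for illustration; nothing in this thread specifies them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PrefixVM:
    """Toy interpreter; `use_memory` is a hypothetical prefix operator."""

    memories: List[bytearray]             # index 0 plays the role of the default memory
    stack: List[int] = field(default_factory=list)
    pending_memory: Optional[int] = None  # the extra VM state the prefix introduces

    def execute(self, op: str, immediate: int = 0) -> None:
        if op == "use_memory":
            # Record which memory the next memory operator should target.
            self.pending_memory = immediate
        elif op == "i32.const":
            self.stack.append(immediate)
        elif op == "i32.load8_u":
            index = 0 if self.pending_memory is None else self.pending_memory
            self.pending_memory = None    # assumed: the prefix covers one operator only
            address = self.stack.pop() + immediate  # immediate is the static offset
            if not 0 <= address < len(self.memories[index]):
                raise IndexError("out of bounds memory access")
            self.stack.append(self.memories[index][address])
        else:
            raise ValueError(f"unsupported operator: {op}")
```

The load operator itself is unchanged, which is the appeal, but a validator or compiler now has to track `pending_memory` between operators, which is the cost.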

This also applies to adding support for 64-bit addresses. Are we going to end up with (32-bit | 64-bit address) x (default memory | immediate memory index | memory operand) versions of all the memory operators?

@jfbastien
Member

I'd explore two possibilities for multiple memories:

  1. Specified as an immediate on a memory access. This is what LLVM has in its IR.
  2. Specified as a variable value which affects memory accesses. This is closer to a segment modifier.

This indeed needs to mesh well with atomics and SIMD, but the only interaction is that memory accesses need to also work with multiple memories. Presumably atomics and SIMD will work the same as current existing memory accesses, so I think there's little to no risk of failure.

On 64-bit: I think we'd discussed having wasm32 and wasm64 as separate binaries which can't interop (even through dynamic linking). We'd also discussed doubling all memory access opcodes to support 64-bit accesses. I don't think we'd settled on one approach.

@AndrewScheidecker
Author

> On 64-bit: I think we'd discussed having wasm32 and wasm64 as separate binaries which can't interop (even through dynamic linking). We'd also discussed doubling all memory access opcodes to support 64-bit accesses. I don't think we'd settled on one approach.

It doesn't seem possible to block interop between wasm32 and wasm64 as long as both interop with JavaScript.

@lukewagner
Member

Agreed on wanting both immediate linear memory index and, later, first-class Memory GC reference types. To avoid adding whole duplicate classes of opcodes, what I've been assuming is that the memory_immediate immediate of all the existing memory ops would indicate one of three cases: default (what we have now), immediate index (in which case the opcode has an additional memory-index varu32 immediate), or reference (in which case the operator's signature gains a Memory operand).
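
A minimal sketch of how a decoder might distinguish the three cases, assuming the existing memory_immediate layout (a varuint32 flags field holding log2 of the alignment in its low bits, then a varuint32 offset). The two flag bits `HAS_MEMORY_INDEX` and `HAS_MEMORY_REF`, the alignment mask, and the position of the memory index between flags and offset are hypothetical choices for illustration, not an agreed encoding.

```python
HAS_MEMORY_INDEX = 0x40  # hypothetical flag bit: a memory-index immediate follows
HAS_MEMORY_REF = 0x80    # hypothetical flag bit: the operator pops a Memory operand
ALIGN_MASK = 0x3F        # assumed: remaining low bits hold log2(alignment)


def read_varuint32(data, pos):
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def decode_memory_immediate(data, pos=0):
    """Return (fields, next position) for one memory_immediate."""
    flags, pos = read_varuint32(data, pos)
    if (flags & HAS_MEMORY_INDEX) and (flags & HAS_MEMORY_REF):
        raise ValueError("immediate-index and reference forms are mutually exclusive")
    form, memory_index = "default", 0
    if flags & HAS_MEMORY_INDEX:
        form = "immediate_index"
        memory_index, pos = read_varuint32(data, pos)
    elif flags & HAS_MEMORY_REF:
        form, memory_index = "reference", None  # memory comes from the operand stack
    offset, pos = read_varuint32(data, pos)
    fields = {
        "form": form,
        "align_log2": flags & ALIGN_MASK,
        "memory_index": memory_index,
        "offset": offset,
    }
    return fields, pos
```

Under these assumptions, existing binaries decode unchanged as the default form, and only accesses to a non-default memory pay for the extra index.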

@jfbastien
Member

> Agreed on wanting both immediate linear memory index and, later, first-class Memory GC reference types. To avoid adding whole duplicate classes of opcodes, what I've been assuming is that the memory_immediate immediate of all the existing memory ops would indicate one of three cases: default (what we have now), immediate index (in which case the opcode has an additional memory-index varu32 immediate), or reference (in which case the operator's signature gains a Memory operand).

I'd like to avoid prescribing the order in which these are done. If it turns out variable index is superior in every way then it would be undesirable to do immediate first.

@lukewagner
Member

Sure, the ordering wasn't the point of my comment.

@AndrewScheidecker
Author

> reference (in which case the operator's signature gains a Memory operand).

If we did this, that would be the first instance where you can't just look at the opcode to determine the operand signature of the operator. Maybe worthwhile, but it's a cost to consider.

> I'd like to avoid prescribing the order in which these are done. If it turns out variable index is superior in every way then it would be undesirable to do immediate first.

I don't think @lukewagner was referring to an operand-indexed form, just immediate-indexed or operand-referenced forms. An operand-indexed form would be somewhere between the operand-referenced form and the immediate-indexed form. Maybe that would be useful without requiring the full power of Memory refs on the operand stack, but it's painful to expose the module's memory index-space to some value that's possibly being passed around between different modules.

The immediate form has a significant advantage in both code size and the ease of generating good code from it, so the question is just whether that form is useful to any of the actual applications for multiple memories.

@AndrewScheidecker
Author

> If we did this, that would be the first instance where you can't just look at the opcode to determine the operand signature of the operator.

Actually, that's not true. The call and call_indirect operators' operand signatures depend on immediates.

@jfbastien
Member

> reference (in which case the operator's signature gains a Memory operand).

> If we did this, that would be the first instance where you can't just look at the opcode to determine the operand signature of the operator. Maybe worthwhile, but it's a cost to consider.

I don't understand. Can you clarify with pseudo-code what you mean? I saw your point about call / call_indirect, but what I'm proposing doesn't affect any signature.

> The immediate form has a significant advantage in both code size and the ease of generating good code from it, so the question is just whether that form is useful to any of the actual applications for multiple memories.

Agreed, but I want us to ask a wider question: is the operand-indexed form more useful?

@AndrewScheidecker
Author

> I don't understand. Can you clarify with pseudo-code what you mean? I saw your point about call / call_indirect, but what I'm proposing doesn't affect any signature.

@lukewagner listed three possible forms for memory accesses:

  • default form
  • immediate index form: memory_immediate would be extended to include the index of a memory in the module's memory index space.
  • reference form: memory_immediate would be extended to include a flag that, when set, causes the operator to pop an additional value from the operand stack with the type ref<Memory>.

My point was that the reference form would change the values popped from the operand stack depending on the operator's immediates. I mistakenly thought that would be the first case where that happens, but I was forgetting calls and branches.
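
As a rough illustration of that point, the sketch below computes the operand types a load would pop under each of the three forms, next to the same computation for call, whose operands already depend on its function-index immediate. The function names, the string type names, and the placement of the ref<Memory> operand below the address are assumptions for illustration only.

```python
def load_operand_types(form, address_type="i32"):
    """Operand types popped by a load, listed from deepest to top of stack."""
    if form in ("default", "immediate_index"):
        # The memory is known statically, so only the address is popped.
        return [address_type]
    if form == "reference":
        # Assumed order: the memory reference sits below the address.
        return ["ref<Memory>", address_type]
    raise ValueError(f"unknown form: {form}")


def call_operand_types(function_index, function_signatures):
    """Operand types popped by `call`, which depend on its immediate."""
    params, _results = function_signatures[function_index]
    return list(params)
```

In both cases a validator has to read the immediates before it knows what the operator pops, so the reference form follows existing precedent rather than setting it.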

@jfbastien
Member

> My point was that the reference form would change the values popped from the operand stack depending on the operator's immediates. I mistakenly thought that would be the first case where that happens, but I was forgetting calls and branches.

Ah gotcha, thanks. Even if it were the first time, we could make it a separate op instead to reduce your concern?

@rossberg
Member

rossberg commented Apr 11, 2017 via email

@AndrewScheidecker
Author

> Ah gotcha, thanks. Even if it were the first time, we could make it a separate op instead to reduce your concern?

Personally, I'd weigh the cost of this kind of immediate-dependent operator operands as being lower than the cost of having multiple variants of all the memory access operators.

I'd consider it an acceptable solution to say that adding various flags to the memory_immediate causes the memory operator in question to change to either the immediate index form or the reference form, and that the SIMD/atomic extension memory ops handle the same forms supported by the non-extension memory ops.

Same goes for the 64-bit memory ops: I'd prefer a 64-bit address flag on the memory type, or simply making the memory ops accept either 32-bit or 64-bit addresses, to adding 64-bit address variants of memory ops. I don't want to end up with i64.atomic.address64.memRef.rmw32_s.cmpxchg. :)
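
A minimal sketch of the first option, where the address width is a property of the memory type rather than of the opcode. The `Memory` class, its `is_64bit` flag, and the `effective_address` helper are hypothetical names; the sketch only shows that one load operator can serve both widths if validation consults the memory's type.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    data: bytearray
    is_64bit: bool = False  # hypothetical 64-bit address flag on the memory type

    def address_type(self) -> str:
        # Validation would require this operand type for every access to the memory.
        return "i64" if self.is_64bit else "i32"


def effective_address(memory: Memory, base: int, offset: int, access_size: int) -> int:
    """Bounds-check one access; the same code path serves both address widths."""
    address_bits = 64 if memory.is_64bit else 32
    if not 0 <= base < (1 << address_bits):
        raise ValueError(f"base does not fit in {memory.address_type()}")
    address = base + offset  # Python ints are unbounded, so this cannot wrap
    if address + access_size > len(memory.data):
        raise IndexError("out of bounds memory access")
    return address
```

With this shape, the opcode count stays constant and only the type checker's view of the address operand changes per memory.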

> An intermediate solution that could be possible without GC types would be memories as table elements. That is, introduce a new element type memory and allow tables to index (multiple) memories. You'd probably still need new instructions or some hack to the existing ones to access it.

I like that approach in general as a way to add operators for manipulating memories, tables, globals, modules, instances, etc without making those types first class.
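
For illustration, a minimal sketch of what "memories as table elements" could look like at runtime. The `MemoryTable` class and the `load8_u_indirect` function stand in for a hypothetical new instruction; neither is defined anywhere in this thread.

```python
class MemoryTable:
    """A table whose elements are memories rather than functions."""

    def __init__(self, memories):
        self.elements = list(memories)

    def get(self, index):
        if not 0 <= index < len(self.elements):
            raise IndexError("undefined table element")
        return self.elements[index]


def load8_u_indirect(table, table_index, address):
    """Hypothetical load that picks its memory via a runtime table index."""
    memory = table.get(table_index)
    if not 0 <= address < len(memory):
        raise IndexError("out of bounds memory access")
    return memory[address]
```

The index is an ordinary integer value, so no reference type has to appear on the operand stack, at the cost of an extra table bounds check per access.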

5 participants