
Are 64-bit addresses a mode or separate opcodes? #255


Closed
sunfishcode opened this issue Jul 7, 2015 · 13 comments

@sunfishcode
Member

In #245, @titzer asked whether 64-bit addressing (for >4GiB linear memory) should be a mode or just separate opcodes.

In practice, C/C++ and similar code is going to require one form or the other. Several people have looked at address-size abstractions, but ultimately decided to just accept it as a mode at the C/C++ ABI level, so we're currently expecting to have two ABIs anyway.

And, the utility of having a >4GiB heap with some code in the process that can only access the low 4 GiB of it seems marginal, and the potential for confusion and mistakes under such circumstances is significant.

Having two modes at the platform level seems indicated. Does anyone have other opinions?

@titzer

titzer commented Jul 7, 2015

I think we should have separate bytecodes, since loads/stores already have
tons of bells and whistles, and adding another mode to configure them for a
module or a function or a scope seems superfluous. In the
v8-native-prototype the load/store bytecodes break down as follows (forgive
the macro-fu, it's a cut-and-paste):

#define FOREACH_LOAD_MEM_EXPR_OPCODE(V) \
  V(Int32LoadMemL, 0x20, i_i)           \
  V(Int64LoadMemL, 0x21, l_i)           \
  V(Float32LoadMemL, 0x22, f_i)         \
  V(Float64LoadMemL, 0x23, d_i)         \
  V(Int32LoadMemH, 0x24, i_l)           \
  V(Int64LoadMemH, 0x25, l_l)           \
  V(Float32LoadMemH, 0x26, f_l)         \
  V(Float64LoadMemH, 0x27, d_l)

// Store memory expressions.

#define FOREACH_STORE_MEM_EXPR_OPCODE(V) \
  V(Int32StoreMemL, 0x30, i_ii)          \
  V(Int64StoreMemL, 0x31, l_il)          \
  V(Float32StoreMemL, 0x32, f_if)        \
  V(Float64StoreMemL, 0x33, d_id)        \
  V(Int32StoreMemH, 0x34, i_li)          \
  V(Int64StoreMemH, 0x35, l_ll)          \
  V(Float32StoreMemH, 0x36, f_lf)        \
  V(Float64StoreMemH, 0x37, d_ld)

Having separate bytecodes has the nice advantage that the "signatures" i.e.
the input/output types of the bytecodes are fully determined without taking
into account a mode. Above, the -H suffix indicates a 64-bit memory index.
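That validation benefit can be illustrated with a minimal sketch (hypothetical code, not from the v8-native-prototype; opcode values and signature strings follow the macros above, restricted to a small subset): each opcode maps to a fixed signature, so a validator never consults a mode.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Sketch: a per-opcode signature table. "i_l" means an i32 result taking an
// i64 index, etc. With separate -L/-H opcodes, the signature is fully
// determined by the opcode alone; no module/function/scope mode is needed.
static const std::unordered_map<uint8_t, std::string> kSignatures = {
    {0x20, "i_i"},   // Int32LoadMemL: i32 result, i32 index
    {0x24, "i_l"},   // Int32LoadMemH: i32 result, i64 index
    {0x30, "i_ii"},  // Int32StoreMemL: i32 index, i32 value
    {0x34, "i_li"},  // Int32StoreMemH: i64 index, i32 value
};

std::string SignatureOf(uint8_t opcode) {
  auto it = kSignatures.find(opcode);
  return it == kSignatures.end() ? "" : it->second;
}
```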


@titzer

titzer commented Jul 7, 2015

I should also mention the "bells and whistles" I was talking about for
loads/stores, which I enumerated thusly:

// Functionality related to encoding memory accesses.

struct MemoryAccess {
  // Atomicity annotations for access to the memory and globals.
  enum Atomicity {
    kNone = 0,        // non-atomic
    kSequential = 1,  // sequential consistency
    kAcquire = 2,     // acquire semantics
    kRelease = 3      // release semantics
  };

  // Alignment annotations for memory accesses.
  enum Alignment { kAligned = 0, kUnaligned = 1 };

  // Memory width for integer accesses.
  enum IntWidth { kInt8 = 0, kInt16 = 1, kInt32 = 2, kInt64 = 3 };

  // Bitfields for the various annotations for memory accesses.
  typedef BitField<IntWidth, 0, 2> IntWidthField;
  typedef BitField<bool, 2, 1> SignExtendField;
  typedef BitField<Alignment, 3, 1> AlignmentField;
  typedef BitField<Atomicity, 4, 2> AtomicityField;
};

That means that loads/stores are 2-byte opcodes, with the first byte
indicating the local types and the second byte indicating additional
attributes such as alignment, atomicity, and width (for integer
extension/truncation).

We don't have to use the exact bits above, just pasting for reference.
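As a hedged illustration of that second attribute byte (using the bit layout implied by the BitField typedefs above; the real prototype's packing may differ), decoding could look like:

```cpp
#include <cstdint>

// Illustrative decoder for the attribute byte sketched above.
// Layout assumed: bits 0-1 = IntWidth, bit 2 = sign-extend,
// bit 3 = Alignment (kUnaligned = 1), bits 4-5 = Atomicity.
struct MemAttrs {
  uint8_t int_width;  // kInt8 = 0 .. kInt64 = 3
  bool sign_extend;   // true if the value is sign-extended
  bool unaligned;     // true if the access may be unaligned
  uint8_t atomicity;  // kNone = 0 .. kRelease = 3
};

MemAttrs DecodeMemAttrs(uint8_t b) {
  return MemAttrs{
      static_cast<uint8_t>(b & 0x3),         // bits 0-1
      ((b >> 2) & 1) != 0,                   // bit 2
      ((b >> 3) & 1) != 0,                   // bit 3
      static_cast<uint8_t>((b >> 4) & 0x3),  // bits 4-5
  };
}
```

For example, the byte 0x2E (0b101110) would decode as a sign-extended, unaligned, acquire-semantics i32-width access under this layout.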


@sunfishcode
Member Author

Implementations are free to have separate opcodes internally. Is the utility to developers of being able to mix pointer sizes within a process worth the potential complexity to developers?

@titzer

titzer commented Jul 7, 2015

Outside of implementation convenience, I think there are two advantages
here. One, verification of the local types for bytecode is independent of a
"mode" on a module, function, or scope. Every operation has a signature in
terms of local types independent of context (which is still true if we do
the mode scopes for subnormals). Two, applications can mix "32-bit" and
"64-bit" code freely. E.g. I can easily imagine an app that keeps all its
popular data in the low 4 GiB and therefore uses 32-bit addresses in most
places but needs a huge memory for a few big data items or arrays and thus
only uses 64-bit addresses there.


@sunfishcode
Member Author

Since we expect the binary format to have an opcode name table, all this requires is that implementations check the mode when doing the initial opcode name lookups. For the rest of the decoding, there is no additional burden.

The question here is whether the benefits of allowing mixed 32-bit/64-bit applications outweigh the ecosystem complexity.

A hybrid option is also possible; we could have separate opcode names, but still prohibit them from being used within the same module. This would give us most of the ecosystem simplicity of having modes, while still leaving the door open for mixed-mode operation in the future.
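A minimal sketch of that mode check (hypothetical naming; the -L/-H suffix convention is borrowed from the macros earlier in this thread, and real names would come from the binary format's opcode name table): the per-module name lookup only admits one address-size family, so a mixed module fails at decode time with no extra decoding burden.

```cpp
#include <string>

// Sketch: when populating a module's opcode table, admit only the names
// for the module's declared memory mode. Names ending in 'L' use a 32-bit
// memory index; names ending in 'H' use a 64-bit one; anything else is not
// a memory access and is always allowed.
bool NameAllowedInModule(const std::string& name, bool memory64) {
  if (name.empty()) return false;
  char suffix = name.back();
  if (suffix != 'L' && suffix != 'H') return true;  // not a memory-access op
  return memory64 ? (suffix == 'H') : (suffix == 'L');
}
```

Under this sketch, a module with a 32-bit memory would accept `Int32LoadMemL` but reject `Int32LoadMemH`, and vice versa; loosening the restriction later only requires relaxing this one check.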

@jfbastien
Member

  1. I think we need to also consider the sandboxing burden of supporting mixed 32/64, versus having it be per-module.
  2. Do we have examples of language / codebase that mix 32/64 bit heaps?
    • It seems like using "ILP32" mode is an opt-in thing for static languages that are known not to use more than 4GiB of heap.
    • Would VM runtimes benefit from being able to move to 64-bit heaps if the program uses a large heap? That would require re-JITting and code invalidation, but I remember hearing about such VMs.

@MikeHolman
Member

I can easily imagine an app that keeps all its popular data in the low 4gb and therefore uses 32-bit addresses in most places but needs a huge memory for a few big data items or arrays and thus only uses 64-bit addresses there.

Is this something LLVM would/could do? If so, that would certainly give a nice performance motivation (since you could potentially eliminate all bounds checks on the 32-bit accesses). Last week I talked with @sunfishcode over IRC about this, and it didn't seem like this is something LLVM would take advantage of.

From the wasm VM perspective I don't see much difference in complexity whether we allow mixed access or not, but there might be some things that come up (e.g. with dynamic linking) that make mixed access more difficult.

Would VM runtimes benefit from being able to move to 64-bit heaps if the program uses a large heap? That would require re-JITting and code invalidation, but I remember hearing about such VMs.

This is something I would not want to see. It sounds like a lot of work and added complexity, with terrible performance implications, and it's something developers might overlook. I'd rather have a hard limit for 32-bit heaps. That way, if developers need more space, their app doesn't inadvertently cross that threshold (albeit likely with some console warning); instead, they have to compile for 64-bit.

@lukewagner
Member

I see the aesthetic argument for having the signature/semantics of ops be independent of any global mode, so I like the hybrid @sunfishcode proposed above: have separate <4GiB and >4GiB ops, with validation accepting only one kind. With module-local opcode tables, there shouldn't be any index space wasted, and the implementation will stay simple and not have to worry about mixing.

@jfbastien
Member

Let's make sure dynamic linking also works.

Sidenote: what's nice about separate ops is also that we can loosen the rules later if we want to (something that used to fail validation because it mixed accesses could now be made to pass).

@titzer

titzer commented Jul 7, 2015

I agree with JF in that separate bytecode leaves more options for the
future. Also, 32-bit "compressed pointers" is something that many language
VMs have pursued to save space (e.g. it's enabled by default in HotSpot),
so I can see that being useful in practice for non-LLVM wasm producers.


@lukewagner
Member

Ok, so you're saying that VMs may want to generate 32-bit code that only operates on data in the low 4GiB while the VM code uses the full 64-bit address space? I think I can see this use case. For MVP, I like the idea of starting with not allowing this mixing and, as @jfbastien said, considering loosening later on.

@sunfishcode
Member Author

Ok, then it sounds like we have consensus for the hybrid approach: separate opcode names, but prohibit them from being used at the same time for now. Allowing both at the same time can be a Future Feature. I'll make a PR.

And in the LLVM port we need to change the target triple again ;-)

@sunfishcode
Member Author

This is now answered.
