Looking for i64.bswap in future impl #1334

Closed
lygstate opened this issue Apr 14, 2020 · 34 comments

Comments

@lygstate
Contributor

No description provided.

@SamuraiCrow

I didn't even see i32 or i16 bswap supported. Is there a newer set of specs and proposed specs to look at?

@binji
Member

binji commented Aug 27, 2020

There's some discussion of bswap as a future feature, but there hasn't been anything proposed officially.

@SamuraiCrow

That's disappointing but not surprising. Thanks for the heads-up.

@binji
Member

binji commented Aug 31, 2020

Do you have an example where not having a bswap instruction has a large performance or code size impact? If so, that would be a compelling reason to look into adding it.

@lygstate
Contributor Author

For example, emulating PPC in WebAssembly, running WebAssembly on PPC, or the socket byte-order functions htons(), htonl(), ntohs(), and ntohl().
@binji

@SamuraiCrow

Implementing code on "retro" platforms like the Commodore Amiga and Atari ST models is one such case. Both of these use 68000 through 68060 or better CPUs which, rare as they are, are often incorporated as softcores in FPGAs due to the expense of developing dedicated ASIC designs. On a 68040, for example, the simplest little-endian to big-endian swap takes 3 opcodes:
ROR.W #8,D0
SWAP D0
ROR.W #8,D0
The cost of endian swapping to support little-endian code on this 40 MHz CPU is immense. This example doesn't take advantage of pipelining at all, and the cost penalty of non-pipelined rotate and word-swap operations is even higher on the superscalar 68060, thus requiring endian-swap sequences of 5 opcodes or more. While the 68080 softcore (not available as ASIC) has a move-with-swap opcode, the memory accesses of its addressing modes are still big-endian.
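
To make the three-opcode sequence concrete, here is a rough Rust model of it. It assumes each ROR.W rotates the low word of D0 by 8 bits; the function names (`ror_w_8`, `swap`, `bswap_68k`) are made up for illustration and are not part of any toolchain.

```rust
/// Models ROR.W by 8 on D0: rotate only the low 16 bits right by 8,
/// leaving the high 16 bits untouched (assumed shift count of 8).
fn ror_w_8(d0: u32) -> u32 {
    let low = (d0 as u16).rotate_right(8);
    (d0 & 0xFFFF_0000) | low as u32
}

/// Models SWAP D0: exchange the high and low 16-bit halves.
fn swap(d0: u32) -> u32 {
    d0.rotate_left(16)
}

/// The full rotate / swap / rotate sequence, which reverses all four bytes.
fn bswap_68k(d0: u32) -> u32 {
    ror_w_8(swap(ror_w_8(d0)))
}
```

Under that assumption the sequence gives the same result as a single 32-bit byte swap, which is what a bswap instruction would express directly.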

@taralx

taralx commented Aug 31, 2020

@SamuraiCrow I'm confused. If the underlying platform doesn't have a bswap instruction, how does adding one to webassembly help?

@lygstate
Contributor Author

lygstate commented Aug 31, 2020

@taralx lots of platforms have a bswap instruction
https://c9x.me/x86/html/file_module_x86_id_21.html

@SamuraiCrow

Using bswap as a prefix to a store and a suffix to a load produces an equivalent to big-endian load and store operations. Personally, a big endian load and store for 16, 32 and 64 bits would actually be preferred but having a way to do endian-swaps is necessary for many old platforms that are still supported.
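
A minimal sketch of that idea in Rust, assuming a little-endian linear memory modelled as a byte slice. The helper names `load_be_u32` and `store_be_u32` are hypothetical, and `swap_bytes` stands in for the proposed bswap.

```rust
/// Big-endian load: a little-endian load followed by a byte swap.
fn load_be_u32(mem: &[u8], addr: usize) -> u32 {
    let bytes = [mem[addr], mem[addr + 1], mem[addr + 2], mem[addr + 3]];
    // from_le_bytes models the ordinary little-endian load;
    // swap_bytes plays the role of bswap applied after it.
    u32::from_le_bytes(bytes).swap_bytes()
}

/// Big-endian store: a byte swap followed by a little-endian store.
fn store_be_u32(mem: &mut [u8], addr: usize, value: u32) {
    // swap_bytes plays the role of bswap applied before the store.
    let bytes = value.swap_bytes().to_le_bytes();
    mem[addr..addr + 4].copy_from_slice(&bytes);
}
```

Storing 0x11223344 this way leaves the bytes 11 22 33 44 in memory in ascending address order, which is the big-endian layout.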

@binji
Member

binji commented Aug 31, 2020

Sorry, I think my comment above was a bit confusing. Since bswap-like functionality can already be implemented in WebAssembly, we should approach adding a bswap instruction to Wasm the same way we approach adding a new Wasm SIMD instruction.

In particular, it would be useful to know how best to lower these instructions for various architectures (at least x64 and ARM), and what the performance difference would be in a real-world benchmark. See for example WebAssembly/simd#128. Many of these do not include a benchmark, but the case for SIMD is a bit different, since it could be shown that these instructions were being used in relevant applications, and not including them would require downshifting to scalar.
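
For reference, the "already implementable" version looks roughly like the sketch below: a byte swap built only from masks, shifts, and ors, all of which have direct counterparts among the existing Wasm integer instructions. The names `bswap32` and `bswap64` are illustrative, not an existing API.

```rust
/// 32-bit byte swap using only and / shift / or.
fn bswap32(x: u32) -> u32 {
    ((x & 0x0000_00FF) << 24)
        | ((x & 0x0000_FF00) << 8)
        | ((x & 0x00FF_0000) >> 8)
        | ((x & 0xFF00_0000) >> 24)
}

/// 64-bit byte swap: swap each 32-bit half, then exchange the halves.
fn bswap64(x: u64) -> u64 {
    let lo = bswap32(x as u32) as u64;
    let hi = bswap32((x >> 32) as u32) as u64;
    (lo << 32) | hi
}
```

Whether an engine recognizes a pattern like this and lowers it to a single native instruction is exactly the kind of thing a benchmark would need to show; this sketch makes no claim about that either way.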

@SamuraiCrow

For starters, https://github.com/michalsc/Emu68 runs in the AArch64eb instruction set (eb for big-endian mode) because it would take additional instructions to implement endian swaps with vector units in a register-tight environment. Of course it's a JIT to run 68020 code, but that shouldn't matter.

@SamuraiCrow

https://www.felixcloutier.com/x86/movbe is the x64 counterpart: MOVBE (Move Data After Swapping Bytes), which loads and stores data in big-endian byte order.

@sunfishcode
Member

> running webassembly on PPC

> https://github.com/michalsc/Emu68 runs in AArch64eb instruction set (eb for big endian mode)

Note that even if we add a bswap instruction, most WebAssembly code won't use it, and will still expect little-endian behavior, and these use cases won't be improved.

It would be theoretically possible to build eg. a C compiler that automatically inserts bswap before every store and after every load, to produce a kind of big-endian WebAssembly which runs more efficiently on big-endian hosts. However, this would effectively create a new C ABI, which, if properly supported, would bubble up through a lot of tools, libraries, and ecosystems, creating a lot of extra work for a lot of people who don't otherwise need this ability. I myself would be opposed to adding a bswap to WebAssembly if these use cases are part of the motivation for it.

@lygstate
Contributor Author

lygstate commented Sep 1, 2020

> > running webassembly on PPC
>
> > https://github.com/michalsc/Emu68 runs in AArch64eb instruction set (eb for big endian mode)
>
> Note that even if we add a bswap instruction, most WebAssembly code won't use it, and will still expect little-endian behavior, and these use cases won't be improved.
>
> It would be theoretically possible to build eg. a C compiler that automatically inserts bswap before every store and after every load, to produce a kind of big-endian WebAssembly which runs more efficiently on big-endian hosts. However, this would effectively create a new C ABI, which, if properly supported, would bubble up through a lot of tools, libraries, and ecosystems, creating a lot of extra work for a lot of people who don't otherwise need this ability. I myself would be opposed to adding a bswap to WebAssembly if these use cases are part of the motivation for it.

I am getting confused. There is no need for toolchain support; all that is needed is for WebAssembly to be able to lower bswap into a native CPU instruction. Think of WebAssembly as an IR.

@binji
Member

binji commented Sep 1, 2020

I'm a little confused here too. WebAssembly is little-endian, by design. I thought we were talking about adding bswap to make it faster to run an emulator for a big-endian machine. I think that's OK, and potentially a good reason to add the instruction.

If instead we're talking about making a new big-endian WebAssembly (with a new ABI), I'm also opposed to that idea.

@SamuraiCrow

I agree with lygstate and binji. If WebAssembly were only going to support 3 operating systems and 2 processor architectures, there wouldn't be any point in making it cross-platform. Emulation is a thing too.

If somebody wants to make their own OS or processor architecture, WebAssembly should allow it to happen. That's why it's a standard, not a product. If the native code of that OS is big-endian, of course a little extra custom-lowering will be necessary but that falls on the OS and browser developers to implement it in that case. That doesn't mean that the practice should be disallowed when using WebAssembly outside the browser either. All software will be predominately little-endian and adding bswap is not going to change that.

@SamuraiCrow

In addition, I've got a few more use cases for you. Old file formats and packet formats sometimes use "network endian" (aka big-endian) byte order. All the little-endian usage in the world is not going to make AIFF audio into a little-endian format. Of course you could use Wave files in their place, but batch conversion takes time too.
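
As a small illustration of the file-format case, here is a sketch that reads an IFF-style chunk header of the kind AIFF uses: a 4-byte ID followed by a 32-bit big-endian size. The `ChunkHeader` type and `parse_chunk_header` function are hypothetical, and the size is treated as unsigned for simplicity.

```rust
/// Hypothetical representation of an IFF-style chunk header.
struct ChunkHeader {
    id: [u8; 4], // e.g. b"FORM"
    size: u32,   // stored big-endian in the file
}

/// Parse the first 8 bytes of `data` as a chunk header.
/// Returns None if fewer than 8 bytes are available.
fn parse_chunk_header(data: &[u8]) -> Option<ChunkHeader> {
    if data.len() < 8 {
        return None;
    }
    let id = [data[0], data[1], data[2], data[3]];
    // from_be_bytes interprets the bytes in big-endian order; on a
    // little-endian target this implies a byte swap of the loaded value.
    let size = u32::from_be_bytes([data[4], data[5], data[6], data[7]]);
    Some(ChunkHeader { id, size })
}
```

Every multi-byte field in such a file needs the same treatment, so a decoder running on a little-endian machine ends up doing one swap per field.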

@sunfishcode
Member

In my post above, I quoted two use cases from earlier posts which seem to want wasm producer toolchain support and a new big-endian ABI. I don't want a new big-endian ABI for WebAssembly, and it's not clear to me so far that this isn't one of the goals here.

@lygstate
Contributor Author

lygstate commented Sep 1, 2020

> In my post above, I quoted two use cases from earlier posts which seem to want wasm producer toolchain support and a new big-endian ABI. I don't want a new big-endian ABI for WebAssembly, and it's not clear to me so far that this isn't one of the goals here.

I am sorry for confusing you. I am not talking about toolchain support; I was just giving an example to show that big-endian machines exist. I am not requesting toolchain support.

@SamuraiCrow

I'm talking about old file formats. Certainly not breaking compatibility with the current ABI. That would defeat the purpose of having a bytecode.

@sunfishcode
Member

Ok, cool. So to be sure, a wasm bswap instruction wouldn't help with running WebAssembly on a ppc or a 68000-series CPU, and wouldn't help porting code written with the assumption it's running on aarch64be.

@lygstate
Contributor Author

lygstate commented Sep 8, 2020

> Ok, cool. So to be sure, a wasm bswap instruction wouldn't help with running WebAssembly on a ppc or a 68000-series CPU, and wouldn't help porting code written with the assumption it's running on aarch64be.

Yes, you are right. bswap is something like SIMD: a way to improve performance.

@SamuraiCrow

After careful consideration, I've decided to make my own bytecode rather than using an off-the-shelf bytecode that claims to be cross-platform but isn't.

@SamuraiCrow

Issue #1212 would solve this.

@SoniEx2

SoniEx2 commented Sep 27, 2020

fwiw, we don't actually need big endian. or bswap.

@SamuraiCrow

Half of the computers I own use big endianness. I seldom use up-to-date machines so I'll not be using WebAssembly in its current form.

@SoniEx2

SoniEx2 commented Sep 27, 2020

That's fine. We'll make you use it. :)

@SamuraiCrow

Most of my computers don't have an up-to-date web browser. How will you "make" me use it?

@SoniEx2

SoniEx2 commented Sep 27, 2020

with a compiler :p

@SamuraiCrow

Not in its current form. I'll have to fix it up first. ;-)

@SoniEx2

SoniEx2 commented Sep 27, 2020

how?

@SamuraiCrow

Obi Wan voice:
Use the source. Let it guide your actions.

@SoniEx2

SoniEx2 commented Sep 27, 2020

well, regardless, we'll make it work.

@sunfishcode
Member

Closing this in favor of #1426, which also tracks adding a bswap and has more detail.


6 participants