Switch to using unicode when parsing the command line on windows #7241

Rageoholic · 2020-11-28T01:22:18Z

This should help out with issue #534. It switches to GetCommandLineW and adds some fixups so that it compiles, still returns UTF-8, and passes the tests.
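For readers who want to see the shape of that conversion, here is a minimal Python sketch. It is not the Zig code from this PR. It assumes the command line arrives as a buffer of little-endian UTF-16 bytes, as the wide Windows API returns it, and `utf16le_to_utf8` is a hypothetical helper name used only for illustration.

```python
def utf16le_to_utf8(raw: bytes) -> bytes:
    """Re-encode little-endian UTF-16 bytes as UTF-8 (hypothetical helper)."""
    # Strict decoding raises UnicodeDecodeError on unpaired surrogates.
    text = raw.decode("utf-16-le")
    return text.encode("utf-8")


# Stand-in for a wide command line; built by hand, not read from Windows.
wide = "prog.exe héllo 👍".encode("utf-16-le")
utf8 = utf16le_to_utf8(wide)
print(utf8.decode("utf-8"))
```

The strict decode is a design choice for the sketch only: real Windows command lines can contain unpaired surrogates, which a production parser has to decide how to handle.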

LemonBoy · 2020-11-28T09:46:10Z

The tests are failing because MIPS is a big-endian machine: the result from nextCodepoint needs to be byte-swapped, or the tests need to be disabled on everything but Windows.
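To make the failure mode concrete, here is a small Python sketch (an illustration only, not the Zig code under review). It assumes a single UTF-16 code unit stored little-endian and shows what a big-endian host sees when it loads those two bytes in its native order; `byte_swap16` is a hypothetical helper.

```python
import struct

# "A" (U+0041) stored as one little-endian UTF-16 code unit: bytes 41 00.
raw = "A".encode("utf-16-le")

# Explicit little-endian read: correct on any host.
(le_unit,) = struct.unpack("<H", raw)

# What a big-endian host sees if it loads the same two bytes natively.
(be_unit,) = struct.unpack(">H", raw)


def byte_swap16(unit: int) -> int:
    """Swap the two bytes of a 16-bit value (hypothetical helper)."""
    return ((unit & 0xFF) << 8) | (unit >> 8)


print(hex(le_unit))               # 0x41
print(hex(be_unit))               # 0x4100
print(hex(byte_swap16(be_unit)))  # 0x41
```

On a little-endian host the swap is unnecessary, which is why a conversion that is a no-op there and a swap on big-endian hosts fixes the tests without costing anything on Windows.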

lib/std/unicode.zig

Co-authored-by: LemonBoy <[email protected]>

Rageoholic · 2020-11-28T12:59:33Z

Changes are applied. Looks like the W functions should give you back little endian codepoints. MS has AFAIK never released a version of Windows that runs on big endian, so I don't really care, but doing the fixup should be really inexpensive or free, and really, who puts argument parsing on the fast path anyway? Also we can just run the tests on any given machine and you'll know if you broke it.

LemonBoy · 2020-11-28T16:07:25Z

> Changes are applied.

Tests are still failing because skip/next are not using littleToNative.

> Looks like the W functions should give you back little endian codepoints.

UTF-16 or UCS-2 ?

> Also we can just run the tests on any given machine and you'll know if you broke it.

That's a great argument, I like it 👍

> MIPs

The lowercase s is making me extra sad for no reason :(

Rageoholic · 2020-11-28T17:14:42Z

> The lowercase s is making me extra sad for no reason :(

I wrote that early in the morning and now it's making me sad.

> UTF-16 or UCS-2 ?

It's definitely UTF-16. I crawled through MSDN to check. Frankly though, they can't port to a big endian architecture without breaking either the people who assumed a little endian architecture (given I just did, I can't blame them) or the people who actually do the fixup on big endian architectures. Hopefully enough people were like "I'll just use a library" and the library did the right thing, so that MS can keep its word without too much pain.
https://docs.microsoft.com/en-us/windows/win32/intl/using-byte-order-marks

daurnimator · 2020-11-29T00:26:22Z

> It's definitely UTF-16. I crawled through MSDN to check

Where/how? MSDN usually fails to note that when they say UTF-16 they really mean UCS-2.
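The distinction matters for any code point outside the Basic Multilingual Plane: UTF-16 encodes it as a surrogate pair, while a UCS-2 reader treats each 16-bit unit as a complete character. A minimal Python sketch of the difference follows; the helper names `utf16_units` and `combine_surrogates` are hypothetical and not part of this PR.

```python
def utf16_units(text: str) -> list:
    """Return the UTF-16 code units of text as integers."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def combine_surrogates(high: int, low: int) -> int:
    """Combine a high/low surrogate pair into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


units = utf16_units("👍")  # U+1F44D, outside the Basic Multilingual Plane
print([hex(u) for u in units])          # ['0xd83d', '0xdc4d']
print(hex(combine_surrogates(*units)))  # 0x1f44d
```

A UCS-2 reader would report two separate characters here, and neither 0xD83D nor 0xDC4D is a valid character on its own.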

Rageoholic · 2020-11-29T00:27:51Z

It says UTF-16 in the link I posted.

andrewrk · 2020-11-30T18:47:07Z

Thanks!

Switch to using unicode when parsing the command line on windows

60c1553

LemonBoy reviewed Nov 28, 2020

Apply changes by LemonBoy and *hopefully* fix tests on MIPs

060ec44

Co-authored-by: LemonBoy <[email protected]>

Fix up next and skip

be04e1a

Move comment to more relevant place

490262e

andrewrk merged commit 0369b65 into ziglang:master Nov 30, 2020
