question: on uint vs. int decoding. #134
Yarg, I forgot to say "thank you" in the first place for this project. I love it, and that's why I'm trying to figure out how to morph it to work with legacy applications. It appears, my case Given the strong range checking in the subsequent int types, e.g.
func ReadInt16Bytes(b []byte) (int16, []byte, error) {
i, o, err := ReadInt64Bytes(b)
if i > math.MaxInt16 || i < math.MinInt16 {
it appears they would handle the situation well. Then again, you explicitly mentioned "sporadic failure" and how you are trying to avoid them. Any thoughts on perhaps adding a global flag to allow uints to be read into ints (off by default)? |
I don't understand the issue. Are you saying other msgpack implementations are varying the type they write depending on the value received? That sounds like a serious implementation problem that should stay isolated to their own little poor-practices universe. msgp should not cater to such poor habits. |
Hi glycerine, sorry if I wasn't clear. Many untyped languages' implementations of the MessagePack spec (and some typed ones) default to using unsigned if the value is non-negative. The spec doesn't really have any guidance here, so they aren't wrong per se (and they might not even have a concept of signed or unsigned, nor any way of indicating such a thing). However, this means that if the client is using one of these libraries, a msgp server cannot read its messages. This isn't a rare issue. In the space of a few minutes I found three implementations, in three languages, that do this. Doing another spot check, I found two more, below. I suspect many more do this, since it makes the msgpack encoding code a lot smaller. Now, I agree that msgp <--> msgp should keep doing exactly what it is doing, as it's nice and type safe and prevents all sorts of potential random failures. My goal is to enable clients that use common msgpack libraries emitting valid msgpack to be read by msgp (as mentioned, with an optional flag of some type, a code-gen option, and/or an "unsafe" version, while still having error checking for overflow). Does this help explain the issue? Thanks all! In JavaScript (first one I found):
// Integers
if (value >=0) {
// positive fixnum
if (value < 0x80) {
buffer[offset] = value;
return 1;
}
// uint 8
if (value < 0x100) {
buffer[offset] = 0xcc;
buffer[offset + 1] = value;
return 2;
}
And another one for JS, but using the V8 engine in C++: if positive, use unsigned.
|
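The "unsigned if non-negative, signed otherwise" pattern described above can be sketched in Go as follows. This is only an illustration of the wire bytes such encoders produce, based on the MessagePack format codes (positive fixint, `0xcc`..`0xcf` for uint8..uint64, negative fixint, `0xd0`..`0xd3` for int8..int64). The function name `encodeInt` is hypothetical and is not part of msgp or any other library.

```go
package main

import (
	"encoding/binary"
	"math"
)

// encodeInt is a hypothetical helper that mimics encoders which pick the
// unsigned family for every non-negative value and the signed family only
// for negative values. Multi-byte payloads are big-endian.
func encodeInt(v int64) []byte {
	if v >= 0 {
		u := uint64(v)
		switch {
		case u < 0x80: // positive fixint: the value is the byte itself
			return []byte{byte(u)}
		case u <= math.MaxUint8: // uint 8
			return []byte{0xcc, byte(u)}
		case u <= math.MaxUint16: // uint 16
			b := []byte{0xcd, 0, 0}
			binary.BigEndian.PutUint16(b[1:], uint16(u))
			return b
		case u <= math.MaxUint32: // uint 32
			b := make([]byte, 5)
			b[0] = 0xce
			binary.BigEndian.PutUint32(b[1:], uint32(u))
			return b
		default: // uint 64
			b := make([]byte, 9)
			b[0] = 0xcf
			binary.BigEndian.PutUint64(b[1:], u)
			return b
		}
	}
	switch {
	case v >= -32: // negative fixint: 0xe0..0xff
		return []byte{byte(v)}
	case v >= math.MinInt8: // int 8
		return []byte{0xd0, byte(v)}
	case v >= math.MinInt16: // int 16
		b := []byte{0xd1, 0, 0}
		binary.BigEndian.PutUint16(b[1:], uint16(v))
		return b
	case v >= math.MinInt32: // int 32
		b := make([]byte, 5)
		b[0] = 0xd2
		binary.BigEndian.PutUint32(b[1:], uint32(v))
		return b
	default: // int 64
		b := make([]byte, 9)
		b[0] = 0xd3
		binary.BigEndian.PutUint64(b[1:], uint64(v))
		return b
	}
}
```

With this scheme, 127 is a single fixint byte, while 128 becomes `0xcc 0x80`, a uint8 that a strictly sign-matching reader of a signed field would reject.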
The 'official' work-around for this is to use I'm happy to entertain other proposals, though, provided that the default behavior remain as-is. |
Oh, interesting, @philhofer. I'll take a look. (And note, I like the default behavior.) |
Wow, time flies... So the reason "everyone" does this somewhat odd "use uint unless negative, then use int" is that the C reference library does it this way. It's a bit hard to find the code, but here's a sample. I find it a bit odd for a C library not to respect the sign that is being passed in. Anyway, this is why so many "other" libraries don't play nice with msgp: they want to be byte-identical to the reference implementation. I'm still taking a look at solutions to allow the best of both worlds here in msgp. Thanks. https://github.com/msgpack/msgpack-c/blob/master/include/msgpack/pack.hpp
|
@client9 Fascinating. But painful still the same. Thanks for digging this up. |
OK, back to this old one! Given that automatic but sloppy conversions are not desired (inputs and outputs must match exactly in sign and numeric type), there are two options.
The current functions such as "Int()" and "Uint()" do a bit-wise cast to a value and check whether the type conversion is even valid (i.e., if asking for Int(), it returns true if the underlying value is an int, false otherwise... well, more or less). There is no range checking. This makes it difficult to handle the wild variety of "correct value" but "wrong type" situations that occur with clients using different MessagePack libraries. To make msgp.Number easier to deal with, I propose the following new methods to be added:
(and others such as Int8, Int16, Uint8, and Uint16; CastFloat32 and CastFloat64 might require more thinking but are conceptually the same). These would make a best attempt to convert the numeric value to the correct type, returning the value and true on success and in range, and false if out of range. This would allow users of the package to be "as sloppy as they want to be" without writing a lot of boilerplate code. To do this logic outside of the code is a little tricky, since the Number type field is not public. Correction: "To do this logic outside of the code is a little tricky since the Number type field is not public." Actually, there is a method Type(). |
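To make the "value and true if in range, false otherwise" idea concrete, here is a minimal sketch in Go. Everything in it is hypothetical: the `number` type is a stand-in for a decoded numeric value and is not msgp.Number, and the method names `castInt64`, `castUint64`, and `castInt16` are chosen for illustration only, not taken from the proposal or from msgp.

```go
package main

import "math"

// kind records which integer family the value was decoded from.
type kind int

const (
	kindInt kind = iota
	kindUint
)

// number is a hypothetical stand-in for a decoded numeric value.
// bits holds the raw value: an int64 is stored as uint64(i).
type number struct {
	kind kind
	bits uint64
}

func fromInt(i int64) number   { return number{kindInt, uint64(i)} }
func fromUint(u uint64) number { return number{kindUint, u} }

// castInt64 returns the value as an int64 and true if it fits.
func (n number) castInt64() (int64, bool) {
	switch n.kind {
	case kindInt:
		return int64(n.bits), true
	case kindUint:
		if n.bits > math.MaxInt64 {
			return 0, false // too large for a signed 64-bit integer
		}
		return int64(n.bits), true
	}
	return 0, false
}

// castUint64 returns the value as a uint64 and true if it is non-negative.
func (n number) castUint64() (uint64, bool) {
	switch n.kind {
	case kindUint:
		return n.bits, true
	case kindInt:
		i := int64(n.bits)
		if i < 0 {
			return 0, false // negative values have no unsigned form
		}
		return uint64(i), true
	}
	return 0, false
}

// castInt16 narrows via castInt64 and then range-checks, in the same
// spirit as the ReadInt16Bytes check quoted earlier in the thread.
func (n number) castInt16() (int16, bool) {
	i, ok := n.castInt64()
	if !ok || i > math.MaxInt16 || i < math.MinInt16 {
		return 0, false
	}
	return int16(i), true
}
```

The sign of the wire type stops mattering here; only whether the value fits in the requested Go type does. A caller that wants strictness can still inspect the kind before casting.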
@client9 Sounds reasonable; I'm happy to review a PR if you have time to put one together. |
"if you have time to put one together." Yeah, exactly the problem :-) As a side note, I took a look at the RFC for CBOR, which is maybe a msgpack 2.0. On numbers they say the following:
which to me says the |
+1 |
@dwlnetnl go for it! |
This wiki entry needs updating considering this issue. It's fascinating that the C library would do this and msgp would barf on the output. This led to quite a serious issue in our app, where everything worked fine until you had 128 or more of something, and then it stopped working. One thing I didn't see considered in this thread is the code generators. Will there be a way to specify that the field should be coerced into the struct type? |
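The 128 threshold follows from the wire format: values 0 through 127 fit in a positive fixint, which is a single byte with no sign information, while 128 is the first value these encoders emit with the uint8 marker `0xcc`. A tolerant reader for a signed field could accept both integer families and check only the range. The sketch below assumes nothing about msgp's API; `decodeInt64` and the error variables are hypothetical names used for illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"math"
)

var (
	errShort    = errors.New("short buffer")
	errNotInt   = errors.New("not an integer type")
	errOverflow = errors.New("uint64 value overflows int64")
)

// decodeInt64 is a hypothetical tolerant reader: it accepts any MessagePack
// integer encoding, signed or unsigned, and fails only if the value does
// not fit in an int64.
func decodeInt64(b []byte) (int64, error) {
	if len(b) == 0 {
		return 0, errShort
	}
	lead := b[0]
	switch {
	case lead <= 0x7f: // positive fixint
		return int64(lead), nil
	case lead >= 0xe0: // negative fixint
		return int64(int8(lead)), nil
	}
	var size int // payload size in bytes
	switch lead {
	case 0xcc, 0xd0:
		size = 1
	case 0xcd, 0xd1:
		size = 2
	case 0xce, 0xd2:
		size = 4
	case 0xcf, 0xd3:
		size = 8
	default:
		return 0, errNotInt
	}
	if len(b) < 1+size {
		return 0, errShort
	}
	p := b[1 : 1+size]
	switch lead {
	case 0xcc: // uint 8
		return int64(p[0]), nil
	case 0xcd: // uint 16
		return int64(binary.BigEndian.Uint16(p)), nil
	case 0xce: // uint 32
		return int64(binary.BigEndian.Uint32(p)), nil
	case 0xcf: // uint 64: the only unsigned case that can overflow int64
		u := binary.BigEndian.Uint64(p)
		if u > math.MaxInt64 {
			return 0, errOverflow
		}
		return int64(u), nil
	case 0xd0: // int 8
		return int64(int8(p[0])), nil
	case 0xd1: // int 16
		return int64(int16(binary.BigEndian.Uint16(p))), nil
	case 0xd2: // int 32
		return int64(int32(binary.BigEndian.Uint32(p))), nil
	default: // 0xd3, int 64
		return int64(binary.BigEndian.Uint64(p)), nil
	}
}
```

Narrower targets such as int16 would add one more range check on the returned int64, so overflow is still reported as an error rather than silently truncated.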
I would love to see this issue reconsidered. For better or worse, the msgpack ecosystem seems to have standardized on a different set of rules than msgp, and in this way, msgp does not seem to be a compatible implementation. Please see the discussion in msgpack/msgpack#164 and msgpack/msgpack-c#247 for more detail. The current setup means that decoding msgpack with "Rule 3: sign matters" currently gives the opposite understanding of how things work in the msgpack ecosystem than they do in reality, unfortunately. I'm particularly concerned right now about getting the code generator to work, since it's not even possible to put a The fix I would like to see is that I could just specify a |
Addresses tinylib#134 in a way that makes it more interoperable with other msgpack libraries. The actual code is borrowed from librato#1
@philhofer What is the status of this issue? I'm willing to make a pull request. Any thoughts on librato#1 / antoniomo#1 ? |
@aldencolerain and @philhofer and @client9 it's true that apparently all other MessagePack implementations by default encode non-negative integers as unsigned integers (uint8, uint16, uint32, or uint64). I think the best way to add interoperability with all those libraries is to extend |
I started getting some of these errors on decode:
The culprit was another msgpack implementation
I found a few other libraries doing this trick. Any suggestions here on how I can morph your library to read them?
Update:
I'll spare you reading Lua code, but
https://github.com/fperrad/lua-MessagePack/
also defaults to unsigned.
Update: one version of Python does the same. Basically, the pattern is "if non-negative, write as unsigned; if negative, write as signed".