V07 value impl #246

Merged
merged 39 commits into from
Jun 23, 2015
Member

@frsyuki frsyuki commented Jun 10, 2015

This pull request implements Value API.


@Override
public Value getOrNilValue(int index) {
if (array.length < index && index >= 0) {
Contributor

This condition seems to be backwards. I think it should be index < array.length

Member Author

ouch. fixed.
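
The corrected range check from this thread can be sketched as follows. This is a minimal illustration, not the code from this PR: `ArrayBoundsSketch`, its `Object[]` backing array, and the use of `null` in place of a nil Value are all hypothetical stand-ins.

```java
// Hypothetical stand-in for the array value class discussed in this thread.
final class ArrayBoundsSketch {
    private final Object[] array;

    ArrayBoundsSketch(Object... array) {
        this.array = array;
    }

    // Returns the element at index, or null (standing in for nil) when out of range.
    Object getOrNil(int index) {
        // In range means 0 <= index < array.length.
        if (index >= 0 && index < array.length) {
            return array[index];
        }
        return null;
    }

    public static void main(String[] args) {
        ArrayBoundsSketch a = new ArrayBoundsSketch("x", "y");
        System.out.println(a.getOrNil(1));   // index inside the array
        System.out.println(a.getOrNil(2));   // index past the end
        System.out.println(a.getOrNil(-1));  // negative index
    }
}
```

With the reversed comparison `array.length < index`, every valid index would fall through to the nil branch, which is why the reviewer flagged it.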

@xerial xerial self-assigned this Jun 10, 2015
@xerial xerial added this to the 0.7.0-M6 milestone Jun 10, 2015
return ValueFactory.newIntegerValue(unpackLong());
}
case FLOAT:
return ValueFactory.newFloatValue(unpackDouble());
Member

I remembered why FloatHolder was necessary. Reading a float value as a double exposes extra fractional digits. For example:

scala> 0.1341f.toDouble
res5: Double = 0.13410000503063202

If a user prints this float value as a String, 0.13410000503063202 is shown. This is unexpected behavior for the user.

Another example: 0.1f becomes the string 0.10000000149011612 if we use double as 0.1f's internal representation.

I'll fix this in another pull request.
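
The same effect can be reproduced in plain Java. This is a small sketch using only `java.lang` methods; the class name `FloatWideningSketch` is hypothetical.

```java
final class FloatWideningSketch {
    public static void main(String[] args) {
        float f = 0.1f;
        double widened = f; // widening is exact; the numeric value does not change

        // Float.toString prints the shortest decimal that identifies the float.
        System.out.println(Float.toString(f));
        // Double.toString needs more digits to identify the same value as a double.
        System.out.println(Double.toString(widened));
        // Narrowing back recovers the original float exactly.
        System.out.println((float) widened == f);
    }
}
```

The value itself is unchanged by widening; only the decimal rendering differs, because each `toString` prints just enough digits to identify the value within its own type.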

Member Author

I think it is OK to say that this is a limitation of the internal representation of a Value, because it is impossible anyway to preserve all information of floating point numbers in decimal string format.

One reason is this trade-off. First of all, Float.intBitsToFloat(0x3e09374c) is equal to Double.longBitsToDouble(0x3fc126e980000000L) in MessagePack, because the MessagePack type system does not distinguish single precision from double precision. So the FloatValue#equals method 1) should always use double, or 2) should always cast to single-precision float if the actual value fits in single precision, like the IntegerValue implementation. The same applies to FloatValue#hashCode(). If we take 1), floatValue.toString().equals(other.toString()) may return false even when floatValue.equals(other) returns true. If we take 2), deserialization of msgpack's float 64 format becomes slow.
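
The bit patterns quoted above can be checked directly. This is a minimal sketch; the class name `FloatEqualitySketch` is hypothetical, and the `long` literal carries an `L` suffix so that it compiles.

```java
final class FloatEqualitySketch {
    public static void main(String[] args) {
        float single = Float.intBitsToFloat(0x3e09374c);
        double dbl = Double.longBitsToDouble(0x3fc126e980000000L);

        // Widening the float yields exactly this double, so numeric equality holds.
        System.out.println((double) single == dbl);
        // The decimal strings differ, which is the equals/toString mismatch of choice 1).
        System.out.println(Float.toString(single).equals(Double.toString(dbl)));
    }
}
```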

If we implement this, the options are:

a) create ImmutableFloatValueImpl as you mention
b) add boolean floatIsSinglePrecision field. If this is true, toString() and writeTo(Packer) use single-precision.

I think b) is better for Variable. Otherwise Variable#equals method becomes more complicated. For ImmutableValue, a) or b).
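
Option b) could look roughly like the sketch below. It is an illustration under assumptions, not the class from this PR: `FloatValueSketch` and its field names are hypothetical, and `writeTo(Packer)` is left out because it depends on the packer API.

```java
// Hypothetical sketch of option b); not the implementation from this PR.
final class FloatValueSketch {
    private final double value;
    private final boolean singlePrecision; // true when the source format was float 32

    FloatValueSketch(float v) {
        this.value = v;
        this.singlePrecision = true;
    }

    FloatValueSketch(double v) {
        this.value = v;
        this.singlePrecision = false;
    }

    @Override
    public boolean equals(Object o) {
        // Equality ignores the flag and always compares as double (choice 1).
        return o instanceof FloatValueSketch
                && Double.compare(value, ((FloatValueSketch) o).value) == 0;
    }

    @Override
    public int hashCode() {
        return Double.hashCode(value);
    }

    @Override
    public String toString() {
        // The flag only affects how the value is rendered.
        return singlePrecision ? Float.toString((float) value) : Double.toString(value);
    }
}
```

In this sketch two instances built from `0.1f` and from `(double) 0.1f` are equal and share a hash code, yet print differently, so the flag restores the expected float rendering without removing the equals/toString mismatch.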

Member

Discussed this topic with @frsyuki. The MessagePack type system will not preserve the error range of float values, since some msgpack implementations cannot have float values. So application developers need to be careful not to write code that depends on float precision. Even if the data is written in FLOAT32, the Value interface will represent it in double precision.

Contributor

Going this route means that if a single-precision number is in the data stream and is read into a Value, Value.writeTo will write a double to the MessagePacker. Should writeTo reproduce the original msgpacked data or just something close to it?
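
The round-trip question can be made concrete at the byte level. The sketch below hand-encodes the two MessagePack float formats (0xca for float 32, 0xcb for float 64, both big-endian) with `java.nio.ByteBuffer`; the class and method names are hypothetical and it does not use the msgpack-java packer.

```java
final class FloatRoundTripSketch {
    // MessagePack float 32: header 0xca followed by 4 big-endian bytes.
    static byte[] packFloat32(float v) {
        return java.nio.ByteBuffer.allocate(5).put((byte) 0xca).putFloat(v).array();
    }

    // MessagePack float 64: header 0xcb followed by 8 big-endian bytes.
    static byte[] packFloat64(double v) {
        return java.nio.ByteBuffer.allocate(9).put((byte) 0xcb).putDouble(v).array();
    }

    // Decodes either format into a double, mirroring the double-only representation
    // described in this thread.
    static double unpack(byte[] bytes) {
        java.nio.ByteBuffer buf = java.nio.ByteBuffer.wrap(bytes);
        byte header = buf.get();
        if (header == (byte) 0xca) {
            return buf.getFloat();
        }
        if (header == (byte) 0xcb) {
            return buf.getDouble();
        }
        throw new IllegalArgumentException("not a float format");
    }

    public static void main(String[] args) {
        byte[] original = packFloat32(0.1f);
        byte[] rewritten = packFloat64(unpack(original));
        System.out.println(original.length + " bytes in, " + rewritten.length + " bytes out");
    }
}
```

The numeric value survives the round trip, but the bytes do not: the rewritten form uses the 9-byte float 64 encoding instead of the original 5-byte float 32 encoding.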

Member

This discussion has started on the MessagePack specification side:
msgpack/msgpack#200

If we can preserve original message-packed data in an efficient way, lossless writeTo implementation would be useful, but the message type system does not require it.

@xerial xerial mentioned this pull request Jun 19, 2015
Member

xerial commented Jun 23, 2015

@muga
I also feel getXXX and toXXX are confusing. In addition, toString produces a JSON string, so string values come out double-quoted even when the user just needs the string value. There are a lot of confusing points.

How about renaming these methods as follows?:

  • (round)toXXX (with truncation or rounding)
  • (cast)asXXX throws Exception (checked cast)
  • toJSON (produces json string. The same with the current implementation of toString)

Then remove getXXX methods.

Member Author

frsyuki commented Jun 23, 2015

How about simply this?

  • rename NumberValue#toXxx() methods to NumberValue#roundToXxx()

I think it's a possible idea to separate toJson from toString. However, because some values are not compatible with JSON (non-string keys of a map, ExtensionValue, floating point values, etc.), we would need another class like ValueToJsonPacker that converts Value to JSON so that the conversion behavior can be configurable using the constructor of the packer class.
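
As a rough illustration of conversion behavior configured through a constructor, the sketch below handles one incompatibility only: floating point values that JSON numbers cannot express (NaN and the infinities). The class `JsonFloatPolicySketch` and its option are hypothetical; no ValueToJsonPacker exists in this PR.

```java
// Hypothetical sketch of constructor-configured JSON conversion; not part of this PR.
final class JsonFloatPolicySketch {
    private final boolean nonFiniteAsNull;

    // The constructor argument selects how NaN and the infinities are handled.
    JsonFloatPolicySketch(boolean nonFiniteAsNull) {
        this.nonFiniteAsNull = nonFiniteAsNull;
    }

    String toJson(double v) {
        if (Double.isNaN(v) || Double.isInfinite(v)) {
            if (nonFiniteAsNull) {
                return "null"; // lenient policy: degrade to JSON null
            }
            // strict policy: refuse values JSON cannot represent
            throw new IllegalArgumentException("JSON cannot represent " + v);
        }
        return Double.toString(v);
    }
}
```

Other incompatibilities named above, such as non-string map keys or ExtensionValue, would need their own policies along the same lines.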

Member

xerial commented Jun 23, 2015

@frsyuki
If we recommend using Value.getXXX to extract values, simply renaming toXXX -> roundToXXX is better.

My concern with StringValue.toString is that it always returns a double-quoted value, even when generating string messages from a StringValue.

Member

xerial commented Jun 23, 2015

Using double quoted string looks good for distinguishing StringValue "1" and FloatValue 1 when debugging.

OK. I'll go with roundToXXX, getXXX and toString. toJSON should be implemented in another class as @frsyuki said.

Member

xerial commented Jun 23, 2015

One problem is that getting a float value has no good name along the lines of roundToXXX.

How about using asXXX for truncation and rounding (e.g., asInt, asFloat, etc., rather than roundToInt, roundToFloat)?

If we use asXXX, we can simply say that it represents this value as the XXX type.

mmm. But asXXXValue is already used for converting Value types. So castAsXXX might be better.

xerial added a commit that referenced this pull request Jun 23, 2015
@xerial xerial merged commit 21816fb into v07-develop Jun 23, 2015
Member

xerial commented Jun 23, 2015

Merged. Will add tests and further internal improvements in another PR.

Member Author

frsyuki commented Jun 25, 2015

Why don't you use roundToXxx? I think castAsXxx assumes that users understand the actual implementation of the methods. If we assume that users don't have to know the actual implementation, I think castAsXxx is confusing because I can't tell whether it throws exceptions or not. For example, Object v = 1.1; Integer iv = (Integer) v; throws a ClassCastException in Java. So "cast" doesn't always imply rounding.

In my opinion, roundToFloat / roundToDouble are the correct names because these methods can actually round.
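
The ambiguity of "cast" in Java can be seen in a few lines. This is a small sketch with a hypothetical class name; a reference cast from a boxed Double to Integer only compiles when it goes through a common supertype such as Object, and it then fails at run time.

```java
final class CastVersusRoundSketch {
    public static void main(String[] args) {
        // A primitive narrowing cast truncates toward zero and never throws.
        System.out.println((int) 1.9);
        // Math.round rounds to the nearest long.
        System.out.println(Math.round(1.9));
        // A reference cast between unrelated boxed types fails at run time.
        Object boxed = Double.valueOf(1.1);
        try {
            Integer iv = (Integer) boxed;
            System.out.println(iv);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException");
        }
    }
}
```

So the word "cast" covers both a silent, lossy conversion and a conversion that throws, which is the confusion raised in the comment above.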

Member Author

frsyuki commented Jun 25, 2015

@xerial How about this radical idea?:

  • Concept:
    • toXxx and asXxx are the type casting methods.
    • toXxx methods may lose some information but never throw exceptions.
    • asXxx methods don't lose information but might throw exceptions. Perhaps it's a good idea to make them throw checked exceptions so that the difference between toXxx and asXxx becomes very clear.
  • Implementation:
    • a) change IntegerValue#getXxx() to asXxx()
    • b) change NumberValue#castAsXxx() to toXxx() including toFloat() and toDouble()
    • c) change RawValue#getString() to asString(), RawValue#getByteBuffer() to asByteBuffer(), RawValue#getByteArray() to asByteArray()
    • d) change ListValue#list() to asList(), MapValue#map() to asMap()
    • e) change BooleanValue#getBoolean() to asBoolean()
    • f) remove StringValue#stringValue()
    • g) change StringValue#toString() so that it doesn't add double quotes
    • h) add String Value#toJson(). This implementation is the same as the current #toString(), so it's easy to implement.
    • i) add void Value#toJson(JsonPacker) in the future so that we can customize behavior and improve performance if necessary. For example, "how to convert BinaryValue to JSON?", "how to convert ExtensionValue to JSON?", "how to convert FloatValue to JSON?", "how to handle non-string keys in a MapValue?"

With this idea, you can achieve short names and a consistent concept at the same time.
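
The toXxx / asXxx concept could be sketched as below. This is only an illustration of the naming proposal: `IntegerValueSketch` and `RangeException` are hypothetical names, not classes from this PR.

```java
// Hypothetical sketch of the proposed naming concept; not code from this PR.
final class IntegerValueSketch {
    // Checked exception, so callers of asInt() must handle the failure case.
    static final class RangeException extends Exception {
        RangeException(String message) {
            super(message);
        }
    }

    private final long value;

    IntegerValueSketch(long value) {
        this.value = value;
    }

    // toXxx: may lose information, never throws.
    int toInt() {
        return (int) value; // keeps only the low 32 bits
    }

    // asXxx: never loses information, throws when the value does not fit.
    int asInt() throws RangeException {
        if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
            throw new RangeException(value + " does not fit in int");
        }
        return (int) value;
    }
}
```

For a value such as `1L << 32`, `toInt()` silently returns 0 while `asInt()` throws, and the checked exception makes that difference visible at every call site.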

Member

xerial commented Jun 25, 2015

@frsyuki
If we do h) (toString -> toJson), I agree with your idea:

  • toXXX (may lose information with casting, but throws no exception)
  • asXXX (may throw exception)

Member

xerial commented Jun 25, 2015

@frsyuki
But NumberValue.asXXX is already used for Value.asXXXValue. That was why I chose castAsXXX to distinguish methods that produce native values from those that produce msgpack Values.

Member Author

frsyuki commented Jun 25, 2015

@xerial I think that the Value.asAbcValue methods are also a kind of asXxx method because they don't lose information, may throw exceptions, and Xxx is the name of the returned class.
(I edited the comment above. toJson is not f but h. Please check on GitHub.)

Member

xerial commented Jun 26, 2015

If having asInt and asIntegerValue is OK, there would be no problem.

5 participants