Skip to content

Attempt to fix error messages types #3046

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

Ivanidzo4ka
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka commented Mar 21, 2019

fixes #3045

#3037

Error messages are painful - I4, R4 etc. etc. - most people will not know what these are.

Let's see is it so easy as I think, or I need to change all baseline

@@ -341,7 +341,7 @@ public static string TestGetLabelGetter(DataViewType type, bool allowKeys)
if (allowKeys && type is KeyType)
return null;

return allowKeys ? "Expected R4, R8, Bool or Key type" : "Expected R4, R8 or Bool type";
return allowKeys ? "Expected Single, Double, Boolean or Key type" : "Expected Single, RDouble8 or Boolean type";
Copy link
Contributor Author

@Ivanidzo4ka Ivanidzo4ka Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double #Closed

@@ -89,13 +89,13 @@ private protected override void CheckScoreAndLabelTypes(RoleMappedSchema schema)
if (t != NumberDataViewType.Single && !(t is KeyType))
{
throw Host.ExceptSchemaMismatch(nameof(RankingMamlEvaluator.Arguments.LabelColumn),
"label", schema.Label.Value.Name, "R4 or a key", t.ToString());
"label", schema.Label.Value.Name, "Single or a key", t.ToString());
Copy link
Contributor Author

@Ivanidzo4ka Ivanidzo4ka Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalize #Closed

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
L0: Text
Year: Text
Month: Text
L0: String
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move them to Common folder after successful build run.
Otherwise it's gonna be messy PR.

Input: R4
Output: Vec<R4>
Input: Single
Output: Vec<Single>
Copy link
Contributor Author

@Ivanidzo4ka Ivanidzo4ka Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vec [](start = 10, length = 3)

Should I name you Vector? #Closed

Ivan Matantsev added 2 commits March 21, 2019 11:57

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@@ -400,7 +400,7 @@ public static AffineColumnFunction Create(ModelLoadContext ctx, IHost host, Data
if (vectorType.ItemType == NumberDataViewType.Double)
return Dbl.ImplVec.Create(ctx, host, vectorType);
}
throw host.ExceptUserArg(nameof(AffineArgumentsBase.Columns), "Wrong column type. Expected: R4, R8, Vec<R4, n> or Vec<R8, n>. Got: {0}.", typeSrc.ToString());
throw host.ExceptUserArg(nameof(AffineArgumentsBase.Columns), "Wrong column type. Expected: Single, Double, Vector<Single, n> or Vector<Double, n>. Got: {0}.", typeSrc.ToString());
Copy link

@yaeldekel yaeldekel Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vector [](start = 124, length = 6)

In other places we write something like "known-size vector of ...". Might be a bit less confusing than having this mysterious "n" here. #Pending

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't have to be known size, does this one look ok?


In reply to: 267931607 [](ancestors = 267931607)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you're right, the code above doesn't require the vector to have a known size :-).

But, isn't this a bug? How can a normalizer be defined on a column of unknown size?


In reply to: 267945509 [](ancestors = 267945509,267931607)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I just noticed that this is the method that loads from ModelLoadContext, shouldn't it be ExceptDecode?


In reply to: 267972968 [](ancestors = 267972968,267945509,267931607)

Copy link
Contributor

@TomFinley TomFinley Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I just noticed that this is the method that loads from ModelLoadContext, shouldn't it be ExceptDecode?

In reply to: 267972968 [](ancestors = 267972968,267945509,267931607)

Not really. Except decode is for faults of format, which this is not. What this is describing is a properly deserialzed and formatted model being applied to user data, and that user data is not compatible with the transform. So, it would be inappropriate to throw an error explicitly describing that the format is at fault. The format was fine, its application by the user to do this data is not fine. Just because there's a model load context somewhere in scope doesn't mean every conceivable exception we could throw must become this format error.

@@ -128,7 +128,7 @@ public static NumberDataViewType Byte
get
{
return _instByte ??
Interlocked.CompareExchange(ref _instByte, new NumberDataViewType(typeof(byte), "U1"), null) ??
Interlocked.CompareExchange(ref _instByte, new NumberDataViewType(typeof(byte), "ushort"), null) ??
Copy link
Contributor Author

@Ivanidzo4ka Ivanidzo4ka Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ushort [](start = 101, length = 6)

byte and sbyte #Closed

@@ -11,7 +11,7 @@ public static void Example()
// Create a new ML context, for ML.NET operations. It can be used for exception tracking and logging,
// as well as the source of randomness.
var ml = new MLContext();

Copy link
Contributor Author

@Ivanidzo4ka Ivanidzo4ka Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

random change is random #Closed

@forki
Copy link
Contributor

forki commented Mar 21, 2019

Wow that's a big PR to improve a error message

@codecov
Copy link

codecov bot commented Mar 21, 2019

Codecov Report

Merging #3046 into master will increase coverage by 0.01%.
The diff coverage is 34.73%.

@@            Coverage Diff             @@
##           master    #3046      +/-   ##
==========================================
+ Coverage   72.53%   72.55%   +0.01%     
==========================================
  Files         806      806              
  Lines      144282   144280       -2     
  Branches    16183    16183              
==========================================
+ Hits       104661   104676      +15     
+ Misses      35217    35195      -22     
- Partials     4404     4409       +5
Flag Coverage Δ
#Debug 72.55% <34.73%> (+0.01%) ⬆️
#production 68.17% <34.73%> (+0.01%) ⬆️
#test 88.76% <ø> (ø) ⬆️
Impacted Files Coverage Δ
...osoft.ML.Transforms/Text/TokenizingByCharacters.cs 95.32% <ø> (ø) ⬆️
...ML.Transforms/Text/StopWordsRemovingTransformer.cs 86.1% <ø> (-0.16%) ⬇️
...soft.ML.Transforms/Text/NgramHashingTransformer.cs 88.77% <ø> (ø) ⬆️
src/Microsoft.ML.Transforms/Text/WordTokenizing.cs 79.15% <ø> (ø) ⬆️
src/Microsoft.ML.Transforms/Text/NgramTransform.cs 87.42% <ø> (ø) ⬆️
...rc/Microsoft.ML.Transforms/Text/TextNormalizing.cs 97.11% <ø> (ø) ⬆️
src/Microsoft.ML.Transforms/GcnTransform.cs 77.57% <ø> (ø) ⬆️
src/Microsoft.ML.TimeSeries/IidSpikeDetector.cs 72.54% <0%> (ø) ⬆️
src/Microsoft.ML.Transforms/Text/LdaTransform.cs 89.26% <0%> (-0.63%) ⬇️
....ML.Data/Evaluators/QuantileRegressionEvaluator.cs 68.66% <0%> (ø) ⬆️
... and 36 more

@@ -94,7 +94,7 @@ public override bool Equals(DataViewType other)
return false;
}

public override string ToString() => "Text";
public override string ToString() => "String";
Copy link
Contributor

@TomFinley TomFinley Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String [](start = 46, length = 6)

This one is a little tricky, since of course the type as we see a few lines above is not String at all, but rather ReadOnlyMemory<char>. #Pending

Copy link
Contributor Author

@Ivanidzo4ka Ivanidzo4ka Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it would be internally ReadOnlyMemory, but for user who construct class user should be fine with String user who works with TextLoaderwe haveDataKind.String` as well.


In reply to: 268027832 [](ancestors = 268027832)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine, just so long as you've thought about it. Though, incidentally, this is not an "internal" detail. We provide convenience conversions to and from string for the sake of our reflection based API, but IDataView itself works exclusively over ReadOnlyMemory<char>, never string.


In reply to: 268261843 [](ancestors = 268261843,268027832)

@@ -218,7 +218,7 @@ public Mapper(FeatureContributionCalculatingTransformer parent, DataViewSchema s
throw Host.ExceptSchemaMismatch(nameof(schema), "input", _parent.ColumnPairs[0].inputColumnName);
_featureColumnType = schema[_featureColumnIndex].Type as VectorType;
if (_featureColumnType == null || _featureColumnType.ItemType != NumberDataViewType.Single)
throw Host.ExceptSchemaMismatch(nameof(schema), "feature", _parent.ColumnPairs[0].inputColumnName, "vector of float.", _featureColumnType.ItemType.ToString());
throw Host.ExceptSchemaMismatch(nameof(schema), "feature", _parent.ColumnPairs[0].inputColumnName, "Vector of Single", _featureColumnType.ItemType.ToString());
Copy link
Contributor

@TomFinley TomFinley Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vector [](start = 120, length = 6)

While Single is appropriate, I think we generally call these vector in the context of a larger sentence.

So I would say "I want a vector of Single," I would not say "I want a Vector of Single." Unless you were planning on changing this in other places, which I notice you haven't. #Closed

Copy link
Contributor Author

@Ivanidzo4ka Ivanidzo4ka Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to figure out which one looks better, and come to conclusion what lower case vector is more appealing for me, but I guess i miss capital case in some places.


In reply to: 268028098 [](ancestors = 268028098)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, thank you @Ivanidzo4ka


In reply to: 268257594 [](ancestors = 268257594,268028098)

@@ -932,7 +932,7 @@ public ImplVec(IHost host, TFloat[] mean, TFloat[] stddev, bool useLog)

public static ImplVec Create(ModelLoadContext ctx, IHost host, VectorType typeSrc)
{
host.Check(typeSrc.ItemType.RawType == typeof(TFloat), "The column type must be vector of R8.");
host.Check(typeSrc.ItemType.RawType == typeof(TFloat), "The column type must be vector of Double.");
Copy link
Contributor

@TomFinley TomFinley Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vector [](start = 104, length = 6)

Here's an example of a place where we've left it lowercase vector. Not sure what approach you want to take here. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lowercase


In reply to: 268028513 [](ancestors = 268028513)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, I agree, thank you.


In reply to: 268267564 [](ancestors = 268267564,268028513)

@@ -153,7 +153,7 @@ private RangeFilter(IHost host, ModelLoadContext ctx, IDataView input)

_type = schema[_index].Type;
if (_type != NumberDataViewType.Single && _type != NumberDataViewType.Double && _type.GetKeyCount() == 0)
throw Host.ExceptSchemaMismatch(nameof(schema), "source", column, "float, double or KeyType", _type.ToString());
throw Host.ExceptSchemaMismatch(nameof(schema), "source", column, "Single, Double or KeyType", _type.ToString());
Copy link
Contributor

@TomFinley TomFinley Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KeyType [](start = 101, length = 7)

So, this is an interesting one... the type is of key, but we are calling it KeyType... yet, we say Single not SingleType. Probably not worth resolving now, but something to think about in future, no doubt. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while Single and Double is .net types, Key is something we introduce. So I fill need to distinguish them somehow.
In same time Eric has PR to rename KeyType to KeyDataViewType


In reply to: 268028603 [](ancestors = 268028603)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's name it Key.


In reply to: 268253632 [](ancestors = 268253632,268028603)

@@ -117,7 +117,7 @@ public static NumberDataViewType SByte
get
{
return _instSByte ??
Interlocked.CompareExchange(ref _instSByte, new NumberDataViewType(typeof(sbyte), "I1"), null) ??
Interlocked.CompareExchange(ref _instSByte, new NumberDataViewType(typeof(sbyte), "sbyte"), null) ??
Copy link
Contributor

@TomFinley TomFinley Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sbyte [](start = 103, length = 5)

We are elsewhere using the .NET type names for these things, are we not? So should this be SByte, not sbyte? #Closed

@@ -128,7 +128,7 @@ public static NumberDataViewType Byte
get
{
return _instByte ??
Interlocked.CompareExchange(ref _instByte, new NumberDataViewType(typeof(byte), "U1"), null) ??
Interlocked.CompareExchange(ref _instByte, new NumberDataViewType(typeof(byte), "byte"), null) ??
Copy link
Contributor

@TomFinley TomFinley Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

byte [](start = 101, length = 4)

So, by a similar token, ought this to be Byte? #Closed

@@ -150,7 +150,7 @@ public static NumberDataViewType UInt16
get
{
return _instUInt16 ??
Interlocked.CompareExchange(ref _instUInt16, new NumberDataViewType(typeof(ushort), "U2"), null) ??
Interlocked.CompareExchange(ref _instUInt16, new NumberDataViewType(typeof(ushort), "UIUnt16"), null) ??
Copy link
Contributor

@TomFinley TomFinley Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UIUnt16 [](start = 105, length = 7)

Typo. #Closed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that if you had, say, done nameof(UInt16) you wouldn't have had this problem. I think in fact this would probably be a good idea to adopt...


In reply to: 268029020 [](ancestors = 268029020)

Copy link
Member

@eerhardt eerhardt Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just get rid of this string and use RawType.Name instead. #Resolved

@@ -341,7 +341,7 @@ public static string TestGetLabelGetter(DataViewType type, bool allowKeys)
if (allowKeys && type is KeyType)
return null;

return allowKeys ? "Expected R4, R8, Bool or Key type" : "Expected R4, R8 or Bool type";
return allowKeys ? "Expected Single, Double, Boolean or Key type" : "Expected Single, Double or Boolean type";
Copy link
Member

@sfilipi sfilipi Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Single [](start = 90, length = 6)

i feel like "float" "double" or "bool" are even more user friendly.

we don't usually go :
Boolean k = true; #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from C# developer point, yes.
But as soon as you work with TextLoader or you inside Visual Basic you no longer have bool our Datakind is Boolean CLS wise i think it's also Boolean


In reply to: 268223385 [](ancestors = 268223385)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For our C# code, we prefer to use the language specific keywords, like float, bool, short, etc.

However, for public APIs, strings, etc. where these words/names are visible to other languages we shouldn't be using the C# language specific keywords. Instead we should be using the .NET Type names like Boolean, Single, Int16, etc. We made those changes in our public API, like the public DataKind enum. We should use the same names in these strings.

@@ -1098,18 +1098,18 @@ private void CheckInputColumnTypes(DataViewSchema schema)

var t = schema[(int)LabelIndex].Type;
if (t != NumberDataViewType.Single && t != NumberDataViewType.Double && t != BooleanDataViewType.Instance && t.GetKeyCount() != 2)
throw Host.ExceptSchemaMismatch(nameof(schema), "label", LabelCol, "float, double, bool or a KeyType with cardinality 2", t.ToString());
Copy link
Member

@sfilipi sfilipi Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"float, double, bool [](start = 83, length = 21)

i'd live them #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.microsoft.com/en-us/dotnet/standard/language-independence-and-language-independent-components?view=netframework-4.7.2#types-and-type-member-signatures
If you scroll down it has list of CLS compliant types.
float and double are C# specific, while Single and Double I believe should work on vbasic, F# and other languages.


In reply to: 268223604 [](ancestors = 268223604)


VectorType scoreType = schema[ScoreIndex].Type as VectorType;
if (scoreType == null || scoreType.Size == 0 || (scoreType.ItemType != NumberDataViewType.Single && scoreType.ItemType != NumberDataViewType.Double))
{
throw Host.ExceptSchemaMismatch(nameof(schema), "score", ScoreCol, "known-size vector of float or double", t.ToString());
throw Host.ExceptSchemaMismatch(nameof(schema), "score", ScoreCol, "known-size vector of Single or Double", t.ToString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vector [](start = 95, length = 6)

would array be more correct? vector still needs some mapping to a C# type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My problem with array, is what we also can work with VBuffer in our code.
Would like to hear other people opinion as well, tho.


In reply to: 268224361 [](ancestors = 268224361)

update baselines
@Ivanidzo4ka Ivanidzo4ka changed the title [WIP] Attempt to fix error messages types Attempt to fix error messages types Mar 22, 2019
Copy link
Member

@sfilipi sfilipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Copy link
Member

@sfilipi sfilipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Copy link
Contributor

@TomFinley TomFinley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Ivanidzo4ka !

@Ivanidzo4ka Ivanidzo4ka merged commit beb67a1 into dotnet:master Mar 25, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DataViewType ToString should not use old DataKind names
6 participants