Enabling FFM tests #1206

Merged
merged 14 commits into from
Oct 13, 2018

Conversation

Member

@sfilipi sfilipi commented Oct 9, 2018

Resolves part of #404

@sfilipi sfilipi added the test related to tests label Oct 9, 2018
@sfilipi sfilipi self-assigned this Oct 9, 2018
// see https://github.com/dotnet/machinelearning/issues/404
// in Linux, the clang sqrt() results vary highly from the ones in mac and Windows.
if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
    RunAllTests(binaryPredictors, binaryClassificationDatasets, digitsOfPrecision:4);
Member

@wschin wschin Oct 9, 2018

Does it mean that we only check "9487" in "9487.05"? The issue you opened said that the difference happens at the 17th decimal, so probably we can increase it to 7? #Pending

Member Author

In the small subset that I explored, there were differences starting on the 4th decimal digit, for Linux. Should update the issue.


In reply to: 223888003 [](ancestors = 223888003)

Member Author

giving 5 another try after updating the comparison code.


In reply to: 224267254 [](ancestors = 224267254,223888003)

/// <paramref name="toCompare"/> objects are used for comparison only.
/// </summary>
/// <returns>Whether this test succeeded.</returns>
protected bool TestCore(RunContextBase ctx, string cmdName, string args, int digitsOfPrecision, params PathArgument[] toCompare)
Member

@eerhardt eerhardt Oct 10, 2018

(nit) A single space was added to the beginning of the method, messing up the alignment.
Can you align the /// comments as well? #Closed

Member

The /// comments are still not aligned.


In reply to: 224100533 [](ancestors = 224100533)

@@ -355,7 +360,14 @@ protected bool CheckTestOutputMatchesTrainTest(string trainTestOutPath, string t
/// </summary>
public abstract partial class TestDmCommandBase : TestCommandBase
{
-private bool TestCoreCore(RunContextBase ctx, string cmdName, string dataPath, PathArgument.Usage situation, OutputPath inModelPath, OutputPath outModelPath, string loaderArgs, string extraArgs, params PathArgument[] toCompare)
+private bool TestCoreCore(RunContextBase ctx, string cmdName, string dataPath, PathArgument.Usage situation,
+OutputPath inModelPath, OutputPath outModelPath, string loaderArgs, string extraArgs, params PathArgument[] toCompare){
Member

@eerhardt eerhardt Oct 10, 2018

(nit) opening curly brace should go on a new line. #Resolved

@@ -407,6 +419,11 @@ protected bool TestCore(RunContextBase ctx, string cmdName, string dataPath, str
return TestCoreCore(ctx, cmdName, dataPath, PathArgument.Usage.DataModel, null, ctx.ModelPath(), loaderArgs, extraArgs, toCompare);
}

protected bool TestCore(RunContextBase ctx, string cmdName, string dataPath, string loaderArgs, string extraArgs, int digitsOfPrecision, params PathArgument[] toCompare)
Member

@eerhardt eerhardt Oct 10, 2018

digitsOfPrecision [](start = 126, length = 17)

I don't see digitsOfPrecision being used in the method. Am I missing something? #Resolved

Member

@eerhardt eerhardt left a comment

:shipit:

@@ -1944,8 +1944,10 @@ public void BinaryClassifierFieldAwareFactorizationMachineTest()

// see https://github.com/dotnet/machinelearning/issues/404
// in Linux, the clang sqrt() results vary highly from the ones in mac and Windows.
// goign for 3 digits of precision, because the range of search is (-0.0001 - 0.0001)
Member

@wschin wschin Oct 10, 2018

goign? #Resolved

@@ -1944,8 +1944,10 @@ public void BinaryClassifierFieldAwareFactorizationMachineTest()

// see https://github.com/dotnet/machinelearning/issues/404
// in Linux, the clang sqrt() results vary highly from the ones in mac and Windows.
// goign for 3 digits of precision, because the range of search is (-0.0001 - 0.0001)
// for one of the values, and the actual value is 0.00099999999999989
Member

@wschin wschin Oct 10, 2018

0.00099999999999989 is close enough to 0.001; it looks like the difference is smaller than 10^-7. Why do we need a larger tolerance? #Resolved

Member Author

Yeah, but it is expecting a number in the -0.0001 to 0.0001 range. Still looking into it, as the test artifacts are not saved.


In reply to: 224266472 [](ancestors = 224266472)

…as treating floats with scientific notation as strings; amending the regex to pick those up.
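
For illustration only, here is a minimal sketch of a number-matching pattern that also picks up scientific notation. The actual regex used by the test baseline code is not shown in this conversation, so the pattern, the class name and the method name below are all assumptions.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class NumberMatchSketch
{
    // Hypothetical pattern: optional sign, digits, optional fraction, optional exponent.
    // Without the exponent group, "1.5E-05" would be split into "1.5" and "-05".
    public static readonly Regex Number =
        new Regex(@"[-+]?\d+(\.\d+)?([eE][-+]?\d+)?");

    // Returns every number found in the line, parsed as a double.
    public static double[] ExtractNumbers(string line)
    {
        MatchCollection matches = Number.Matches(line);
        var values = new double[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            // double.Parse accepts an exponent by default; the invariant culture
            // keeps '.' as the decimal separator on every machine.
            values[i] = double.Parse(matches[i].Value, CultureInfo.InvariantCulture);
        }
        return values;
    }

    public static void Main()
    {
        foreach (double v in ExtractNumbers("loss 1.5E-05 auc 0.97"))
            Console.WriteLine(v.ToString("R", CultureInfo.InvariantCulture));
    }
}
```

With a pattern like this, a value written as 1.5E-05 in one baseline and 0.000015 in another can be compared numerically instead of as text.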

Our custom Round method was rounding one digit short of digitsOfPrecision. Math.Round seems to do a reasonable job for the subset of tests I ran.

The delta calculated in BaseTestBaseline was, in some cases, exceeding allowedVariance by a small fraction, in digits beyond the ones we care to compare.
Rounding the delta to truncate those digits before passing it to the range comparison.
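
A minimal sketch of the comparison these commit messages describe, assuming the tolerance is 10^-digitsOfPrecision as in MatchNumberWithTolerance. The class and method names are hypothetical, and Math.Round stands in here for the test library's own Round helper, whose body is not shown in this conversation.

```csharp
using System;

public static class ToleranceSketch
{
    // Hypothetical helper: true when a and b agree to within 10^-digitsOfPrecision.
    public static bool WithinTolerance(double a, double b, int digitsOfPrecision)
    {
        double allowedVariance = Math.Pow(10, -digitsOfPrecision);

        // Round the difference itself, so that noise in digits beyond the
        // ones being compared cannot push the delta past the tolerance.
        double delta = Math.Round(a - b, digitsOfPrecision);

        return Math.Abs(delta) <= allowedVariance;
    }

    public static void Main()
    {
        Console.WriteLine(WithinTolerance(0.12345, 0.12349, 4)); // True
        Console.WriteLine(WithinTolerance(0.1234, 0.1244, 4));   // False
    }
}
```

Rounding the delta rather than each operand avoids the case where two nearly equal values fall on opposite sides of a rounding boundary.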

Removing digitsOfPrecision from some of the FFM tests for a test drive on the CI, as they seem to do OK without it in my local Windows/Linux runs.
@sfilipi sfilipi requested a review from tannergooding October 11, 2018 22:51
@@ -521,26 +521,13 @@ private static void MatchNumberWithTolerance(MatchCollection firstCollection, Ma
double f2 = double.Parse(secondCollection[i].ToString());

double allowedVariance = Math.Pow(10, -digitsOfPrecision);
-double delta = Round(f1, digitsOfPrecision) - Round(f2, digitsOfPrecision);
+double delta = Math.Round(f1, digitsOfPrecision) - Math.Round(f2, digitsOfPrecision);
Member

@tannergooding tannergooding Oct 11, 2018

You should not be using Math.Round, it does not correctly handle significant digits that land on the left side of the decimal point (the integer part of the number). It only deals with the fractional half. #Resolved

Member Author

re-introduced Round, thanks for the feedback. I moved to using it only on the difference, as rounding was not helping on a few cases.


In reply to: 224629061 [](ancestors = 224629061)

Member

This would be a good comment in the code - why we have a Round method.


In reply to: 224671986 [](ancestors = 224671986,224629061)
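
To illustrate the point made in this thread: Math.Round(value, digits) rounds to a number of fractional digits, so digits to the left of the decimal point are never affected. A helper that rounds to significant digits instead might look like the sketch below. It is a hypothetical illustration, not the Round method from this PR, which is not shown here.

```csharp
using System;

public static class SignificantDigitsSketch
{
    // Hypothetical helper: rounds value to the given number of significant digits.
    public static double RoundToSignificant(double value, int digits)
    {
        if (value == 0 || double.IsNaN(value) || double.IsInfinity(value))
            return value;

        // Position of the leading digit: 3 for 9487.05, -3 for 0.00123.
        int magnitude = (int)Math.Floor(Math.Log10(Math.Abs(value)));

        // Scale so that exactly `digits` digits sit left of the decimal point,
        // round to an integer, then scale back.
        double scale = Math.Pow(10, digits - 1 - magnitude);
        return Math.Round(value * scale) / scale;
    }

    public static void Main()
    {
        // Math.Round only touches the fractional part of the number.
        Console.WriteLine(Math.Round(9487.05, 2));
        // Significant-digit rounding also affects the integer part.
        Console.WriteLine(RoundToSignificant(9487.05, 2));
    }
}
```

Under these assumptions, 9487.05 stays at 9487.05 with Math.Round(value, 2), while rounding to two significant digits gives approximately 9500.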

Contributor

@Zruty0 Zruty0 left a comment

:shipit:

@sfilipi sfilipi changed the title Enabling FFM tests WIP: Enabling FFM tests Oct 12, 2018
@sfilipi sfilipi changed the title WIP: Enabling FFM tests Enabling FFM tests Oct 12, 2018
@@ -354,7 +354,7 @@ public string GetLoaderTransformSettings(TestDataset dataset)
string[] extraSettings = null, string extraTag = "", bool summary = false, int digitsOfPrecision = DigitsOfPrecision)
{
Contracts.Assert(IsActive);
-Run_TrainTest(predictor, dataset, extraSettings, extraTag, summary: summary, digitsOfPrecision: digitsOfPrecision);
+// Run_TrainTest(predictor, dataset, extraSettings, extraTag, summary: summary, digitsOfPrecision: digitsOfPrecision);
Member

@eerhardt eerhardt Oct 12, 2018

Was this a mistake? #Resolved

Member Author

thanks for this, commented out while debugging.


In reply to: 224863064 [](ancestors = 224863064)

}
return all;
}
}
Member

@eerhardt eerhardt Oct 12, 2018

(nit) you removed a space here. #Closed

@sfilipi sfilipi merged commit 2983312 into dotnet:master Oct 13, 2018
@sfilipi sfilipi deleted the ffmTests branch October 13, 2018 05:22
Member Author

sfilipi commented Oct 13, 2018

This might resolve the fluctuations in precision for MulticlassTreefeaturizedLR, so it may not need to be disabled as in #1185.

@ErcinDedeoglu

@sfilipi @eerhardt Could you give an example of how I can train on decimal data and predict a decimal value, please?

@eerhardt
Member

@ErcinDedeoglu - you mean the C# ‘decimal’ type? ML.NET doesn’t support that type. You would need to convert to a ‘float’, run it through ML.NET, and then convert it back to a decimal.

@ErcinDedeoglu

Dear @eerhardt,
When I cast my decimal value to float, I lose precision:
decimal x = 2031630.73022778M;
float y = (float)x; // 2031630.75
But the digits after the decimal point are also very important.

What's your suggestion?
Thanks.
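
The conversion discussed in this exchange can be sketched as follows. This only illustrates the C# casts and a way to measure the precision lost; the class and method names are hypothetical and no ML.NET API is involved.

```csharp
using System;

public static class DecimalRoundTripSketch
{
    // decimal -> float needs an explicit cast because it can lose precision.
    public static float ToFloat(decimal value) => (float)value;

    // float -> decimal by way of double: widening float to double is exact,
    // whereas a direct float -> decimal conversion rounds to 7 significant digits.
    public static decimal ToDecimal(float value) => (decimal)(double)value;

    // The amount by which the float representation differs from the original.
    public static decimal ConversionError(decimal original)
    {
        return ToDecimal(ToFloat(original)) - original;
    }

    public static void Main()
    {
        decimal x = 2031630.73022778M;
        Console.WriteLine(ToFloat(x) == 2031630.75f); // True
        Console.WriteLine(ConversionError(x));
    }
}
```

A float carries roughly seven significant decimal digits, so for a value of this magnitude only about one digit after the decimal point survives.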

@eerhardt
Member

Yes, that is the downside of using floating-point numbers instead of a number type that has a precise representation.

In the machine learning world, usually the precision is not that important. I don't know of anything in the industry that uses a precise number representation. Everything I've seen uses floating point numbers.

@ErcinDedeoglu

@eerhardt The stock market does :)

@ghost ghost locked as resolved and limited conversation to collaborators Mar 28, 2022
Labels
test related to tests
6 participants