Skip to content

Splitting OLS to a separate package called AdditionalLearners #611

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Aug 1, 2018

Conversation

sfilipi
Copy link
Member

@sfilipi sfilipi commented Jul 30, 2018

@eerhardt pointed out on PR #594 that OLS would not work on the ML.Net package, because the main ML.Net package doesn't reference the MlNetMklDeps package.

Taking his suggestion and splitting OLS into a separate nuget package.

@@ -8839,7 +8839,7 @@ public OnlineGradientDescentRegressorPipelineStep(Output output)
namespace Trainers
{

/// <include file='../Microsoft.ML.StandardLearners/Standard/doc.xml' path='doc/members/member[@name="OLS"]/*' />
/// <include file='../Microsoft.ML.AdditionalLearners/Standard/doc.xml' path='doc/members/member[@name="OLS"]/*' />
public sealed partial class OrdinaryLeastSquaresRegressor : Microsoft.ML.Runtime.EntryPoints.CommonInputs.ITrainerInputWithWeight, Microsoft.ML.Runtime.EntryPoints.CommonInputs.ITrainerInputWithLabel, Microsoft.ML.Runtime.EntryPoints.CommonInputs.ITrainerInput, Microsoft.ML.ILearningPipelineItem
Copy link
Member Author

@sfilipi sfilipi Jul 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OrdinaryLeastSquaresRegressor [](start = 36, length = 29)

Hmmm this should not be here either. Nor should LightGBM since it lives in a diff package. Opening an issue about it. #Pending

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logged issue #618 to document this.


In reply to: 206351985 [](ancestors = 206351985)

<PropertyGroup>
<TargetFramework>netstandard2.0</TargetFramework>
<IncludeInPackage>Microsoft.ML.AdditionalLearners</IncludeInPackage>
<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jul 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[](start = 0, length = 1)

HERESY! #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you spot them in codeflow !!


In reply to: 206356521 [](ancestors = 206356521)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on top right corner next to (show both) (on right side) is "a.b" button. click on it :)


In reply to: 206356725 [](ancestors = 206356725,206356521)

@@ -0,0 +1,5 @@
<Project DefaultTargets="Pack">

<Import Project="Microsoft.ML.AdditionalLearners.nupkgproj" />
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jul 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AdditionalLearners [](start = 32, length = 18)

MklDependentLearners maybe? it just separation of standard/additional learner is unclear from the names #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was thinking that we could use it for other learners, that we don;t want on the core package because of other dependencies. We could bundle them all in one package..


In reply to: 206356744 [](ancestors = 206356744)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we continue the discussion in the other comment (Learners.nupkgproj) and close this one


In reply to: 206357569 [](ancestors = 206357569,206356744)

@@ -94,252 +92,7 @@ protected override void CheckLabel(RoleMappedData data)

protected override void ComputeTrainingStatistics(IChannel ch, FloatLabelCursor.Factory cursorFactory, Float loss, int numParams)
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jul 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ComputeTrainingStatistics [](start = 32, length = 25)

Poisson regression just do nothing, maybe it's better to just do nothing than throw exception. #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, MKL needed only for stdErrors, so maybe we should omit them and return other statistics?

And in general it would be shame to just delete code, someone probably put lot of effort in writing it, and it gonna get lost after you check-in


In reply to: 206357557 [](ancestors = 206357557)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did create this to track introducing a Utility that can reference MKL and produce full stats:

#612

But you are right, partial stats is better than no stats. Thanks for the suggestion.


In reply to: 206358011 [](ancestors = 206358011,206357557)


<PropertyGroup>
<TargetFramework>netstandard2.0</TargetFramework>
<PackageDescription>ML.NET additional learners. They depend on other packages, like MlNetMklDeps. </PackageDescription>
Copy link
Contributor

@TomFinley TomFinley Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

additional [](start = 31, length = 10)

The name "additional learners" is something we might want to reconsider, since in future as we publish more wrappers we will surely want descriptive names.

I think naming it OrdinaryLeastSquares is probably a fine choice since that's literally all it is? #Pending

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that maybe we can have SymSGD here too, and other learners that need to drag dependencies?

@Ivanidzo4ka didn't like it either and suggested MklDepsLearners.


In reply to: 206582236 [](ancestors = 206582236)

Copy link
Contributor

@Zruty0 Zruty0 Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MklDeps also doesn't feel good, but it's better. Can we have a name that conveys 'these learners use MKL's optimized linear algebra', without actually saying 'MKL' in the name?

I don't think this should be 'all learners with big dependencies', because it doesn't actually improve the experience too much (if you need one of these learners, you have to grab all). I'd much rather have 'learners depending on MKL' and 'learners depending on OtherHeaveDependency' separate


In reply to: 206584367 [](ancestors = 206584367,206582236)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renaming it to HalLearners. If that doesn's sound good, please suggest a name :)


In reply to: 206634151 [](ancestors = 206634151,206584367,206582236)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like HAL learners.


In reply to: 206661889 [](ancestors = 206661889,206634151,206584367,206582236)

@sfilipi sfilipi changed the title WIP Splitting OLS to a separate package called AdditionalLearners (will remove WIP shortly. Need to test the created package) Splitting OLS to a separate package called AdditionalLearners Jul 31, 2018
@sfilipi sfilipi requested review from shauheen and Zruty0 July 31, 2018 17:47
@@ -282,64 +281,7 @@ protected override void ComputeTrainingStatistics(IChannel ch, FloatLabelCursor.
}
}

// Apply Cholesky Decomposition to find the inverse of the Hessian.
Copy link
Contributor

@Zruty0 Zruty0 Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apply Cholesky Decomposition to find the inverse of the Hessian. [](start = 15, length = 64)

Hmm, it's too bad we have to completely drop good code.
Is it possible we could somehow move it to the MKL-dependent assembly? At least have a REVIEW for this. #Closed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe have a virtual 'UpdateStdErrors' method that does nothing here, and have a better version of LR in MKL-0dependent assembly that overrides this?


In reply to: 206635936 [](ancestors = 206635936)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're not dropping it. logged #612 to do exactly that, move the code to the mkl dependent package.


In reply to: 206635936 [](ancestors = 206635936)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it


In reply to: 206637501 [](ancestors = 206637501,206635936)

Copy link
Contributor

@Zruty0 Zruty0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

@@ -888,7 +888,7 @@ public void RegressorOlsTest()
/// <summary>
/// A test for ordinary least squares regression.
/// </summary>
[Fact(Skip = "Need CoreTLC specific baseline update")]
[Fact]
Copy link
Member

@eerhardt eerhardt Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we enable the test above this too? It is an Ols test... #Pending

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that tests fails the baseline comparison for 1/4896 predictions, in just Mac debug builds...

There are quite a bit of tests that are in this state.
Just finished logging a proposal about how to deal with most of them, here: #619


In reply to: 206638252 [](ancestors = 206638252)

@@ -466,7 +467,7 @@ public static void Pptri(Layout layout, UpLo uplo, int n, Double[] ap)
Desc = "Train an OLS regression model.",
UserName = UserNameValue,
ShortName = ShortName,
XmlInclude = new[] { @"<include file='../Microsoft.ML.StandardLearners/Standard/doc.xml' path='doc/members/member[@name=""OLS""]/*' />" })]
XmlInclude = new[] { @"<include file='../Microsoft.ML.AdditionalLearners/Standard/doc.xml' path='doc/members/member[@name=""OLS""]/*' />" })]
Copy link
Member

@eerhardt eerhardt Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the right path?

  1. Why do we need to go up a directory, and then back into the same directory we are already in?
  2. I don't see a Standard folder that doc.xml is in. It looks like it is just directly under this current directory. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reference is relative to the CSharpApi.cs file, in Microsoft.ML. This line of xml comment will appear atop the OrdinaryLeastSqaresregressor in the C#api, and from there make its way in the XML.


In reply to: 206638688 [](ancestors = 206638688)

<PropertyGroup>
<TargetFramework>netstandard2.0</TargetFramework>
<IncludeInPackage>Microsoft.ML.AdditionalLearners</IncludeInPackage>
<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
Copy link
Member

@eerhardt eerhardt Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think AllowUnsafeBlocks is necessary. I couldn't find unsafe in the OlsLinearRegression.cs file. #Resolved

<doc>
<members>

<member name="OLS">
Copy link
Member

@eerhardt eerhardt Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -597,7 +597,7 @@ private static Float DotProduct(Float[] a, int aOffset, Float[] b, int[] indices

private static class Mkl
{
private const string DllName = "Microsoft.ML.MklImports.dll";
private const string DllName = "libMklImports";
Copy link
Member

@eerhardt eerhardt Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to have the same issue as the OLS learner right? Here is a transform that ships in our Microsoft.ML nuget package, that has a dependency on MKL.

However, since this is currently broken (and has been since before v0.1), I am fine with leaving this one broken and logging a separate issue for it. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is exposed at all at this point. there are more components that are not exposed too.


In reply to: 206640653 [](ancestors = 206640653)

Copy link
Member

@eerhardt eerhardt Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird, really? I didn't know we had transforms that weren't exposed. Either way, I'm fine with this. #Resolved

…earners

removign unsafe from the Hal csproj.
removing the orphaned member section from the doc.xml

<PropertyGroup>
<TargetFramework>netstandard2.0</TargetFramework>
<IncludeInPackage>Microsoft.ML.HalLearners</IncludeInPackage>
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HalLearners [](start = 35, length = 11)

Can't wait to train my HAL9000! https://pbs.twimg.com/profile_images/1788506913/HAL-MC2_400x400.png #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition, you could also sign off on the PR :)


In reply to: 206715372 [](ancestors = 206715372)


<PropertyGroup>
<TargetFramework>netstandard2.0</TargetFramework>
<PackageDescription>ML.NET additional learners making use of hardware acceleration. They depend on like MlNetMklDeps. </PackageDescription>
Copy link
Contributor

@Zruty0 Zruty0 Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They depend on like MlNetMklDeps [](start = 88, length = 32)

hmm? Either this is a typo, or I need to, like, change my expectations about package descriptions :) #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for spotting it.


In reply to: 206716498 [](ancestors = 206716498)

Copy link
Contributor

@Zruty0 Zruty0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:


<PropertyGroup>
<TargetFramework>netstandard2.0</TargetFramework>
<PackageDescription>ML.NET additional learners making use of hardware acceleration. They depend on the MlNetMklDeps NuGetpackage. </PackageDescription>
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NuGetpackage. [](start = 120, length = 14)

wouldbenicetousespace #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one shall not space.


In reply to: 206717086 [](ancestors = 206717086)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why it's still NuGetpackage?


In reply to: 206717405 [](ancestors = 206717405,206717086)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, i thought you were pointing out the space after the period.
Fixed.


In reply to: 206718135 [](ancestors = 206718135,206717405,206717086)

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me. Thanks for all the hard work, Senja.

Thanks for all the help and the guidance, Eric.

@@ -343,7 +343,7 @@ private static bool ShouldSkipPath(string path)
case "libvw.dll":
case "matrixinterf.dll":
case "Microsoft.ML.neuralnetworks.gpucuda.dll":
case "Microsoft.ML.mklimports.dll":
case "MklImports.dll":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obviously doesn’t need to be changed on this PR, but this doesn’t handle OS X and Linux, right? It only works for Windows.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'd have to do it for all those dlls.

@sfilipi sfilipi merged commit 697ea3d into dotnet:master Aug 1, 2018
@sfilipi sfilipi deleted the olsSeparate branch August 1, 2018 01:30
codemzs pushed a commit to codemzs/machinelearning that referenced this pull request Aug 1, 2018
…#611)

* moving Ols to a separate project

* Revert "moving Ols to a separate project"

This reverts commit 9b7eab3.

* separating OLS in its own project

* adding nupkgproj files to create a package for AdditionalLearners
adding AdditionalLearners to the core.tests project
fixing the core_ep and core_manifest

* CSharpApi should not get generated every time.

* Addressing Ivan's comments

* referencing package version 0.0.0.4 of MlNetMklDeps that contains new names for the mkl library.

* Correcting the error message.

* renaming AdditionalLearners to HalLearners < - Hardware Accelerated Learners
removign unsafe from the Hal csproj.
removing the orphaned member section from the doc.xml

* referencing package 0.0.0.5 and updating the name.

* regenerating the CsharpApi and the eplist post merge

* regenerate the CSharpApi and the eplists post merge. Fix the namespace post merge.

* typo

* one shall not space

* spacing
@ghost ghost locked as resolved and limited conversation to collaborators Mar 29, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants