Skip to content

Public API for remaining learners #1901

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 22, 2018
Merged

Conversation

najeeb-kazmi
Copy link
Member

Fixes #1703

Predictors covered in this PR:

  • EnsemblePredictorBase
  • EnsembleDistributionPredictor
  • EnsemblePredictor
  • EnsembleMultiClassPredictor
  • GamPredictorBase
  • BinaryClassificationGamPredictor
  • RegressionGamPredictor
  • PcaPredictor
  • FieldAwareFactorizationmachinePredictor
  • MultiClassLogisticRegressionPredictor`
  • MultiClassNaiveBayesPredictor
  • OvaPredictor
  • PkpdPredictor
  • RandomPredictor
  • PriorPredictor

@Ivanidzo4ka
Copy link
Contributor

Ivanidzo4ka commented Dec 18, 2018

/// The ctor takes an alignment value, which must be a power of two at least sizeof(Float).

would be nice if you update comments as well. #Resolved


Refers to: src/Microsoft.ML.CpuMath/AlignedArray.cs:16 in d3ea4df. [](commit_id = d3ea4df, deletion_comment = False)

@Ivanidzo4ka
Copy link
Contributor

Ivanidzo4ka commented Dec 18, 2018

    internal abstract void CheckLabel(RoleMappedData data);

Curios, why it has to be internal? Did someone actually call it outside of class? #Resolved


Refers to: src/Microsoft.ML.FastTree/GamTrainer.cs:245 in d3ea4df. [](commit_id = d3ea4df, deletion_comment = False)

_latentWeightsAligned.CopyFrom(latentWeightsAligned.AsSpan());
}

internal FieldAwareFactorizationMachineModelParameters(IHostEnvironment env, bool norm, int fieldCount, int featureCount, int latentDim,
float[] linearWeights, AlignedArray latentWeightsAligned) : base(env, LoaderSignature)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

float[] linearWeights, AlignedArray latentWeightsAligned [](start = 12, length = 56)

@wschin, are this two similar array with only difference is one of them is array and second is align array?
If yes, I don't understand why I need to pass both of them, and not just get first and construct align version.
If not, why they called this way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

linearWeights is linear coefficient where SSE is not very helpful to compute related values. In contrast, latentWeights is the latent vector of each feature and heavily uses SEE.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all nice from internal point of view.
My question are user specific.
Does user need to provide this TWO SAME ARRAYS or not?
If not, can you formulate rule according to which we can construct align array?


In reply to: 242650269 [](ancestors = 242650269)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same array? LinearWeights and latentWeights are two independent things.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I shouldn't review code at 11 pm, you are right.
Still I don't think we should expect user to pass align array and construct it internally.


In reply to: 242666580 [](ancestors = 242666580)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the new iteration, I have added methods to return latent weights and linear weights to the user. The latent dimension is something specified by the user at training. The latent weights are converted to the aligned weights by adjusting the latent dimension. This is now hidden from the user. The expectation is that user will call FieldAwareFactorizationMachines.GetLatentWeights() which will have the unadjusted latent dimension, then they can adjust the weights as they want, and pass it to the public constructor. The latent dimension will be adjusted during construction.


In reply to: 242693412 [](ancestors = 242693412,242666580)

public override PredictionKind PredictionKind => PredictionKind.BinaryClassification;

public BinaryClassificationGamPredictor(IHostEnvironment env, int inputLength, Dataset trainset,
public GamBinaryClassificationModelParameters(IHostEnvironment env, int inputLength, Dataset trainset,
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Dec 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GamBinaryClassificationModelParameters [](start = 15, length = 38)

Do you plan to add comments to all public constructors in separate PR? #Resolved

Copy link
Member

@wschin wschin Dec 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate PR sometimes means forever. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm making the GAM constructors internal because they use Dataset object, which is in Microsoft.ML.FastTree.Internal that I don't think end user of API should be expected to know. I have filed issue #1948 to document it.


In reply to: 242421554 [](ancestors = 242421554)

Copy link
Member

@sfilipi sfilipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

@sfilipi sfilipi dismissed their stale review December 19, 2018 17:51

revoking review

@najeeb-kazmi
Copy link
Member Author

    internal abstract void CheckLabel(RoleMappedData data);

Yeah, it's overridden in GamRegression and GamVlassification, but those are derived classes so I'll make it private protected.


In reply to: 448111459 [](ancestors = 448111459)


Refers to: src/Microsoft.ML.FastTree/GamTrainer.cs:245 in d3ea4df. [](commit_id = d3ea4df, deletion_comment = False)

Copy link
Contributor

@TomFinley TomFinley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooof, quite a lot of work I see @najeeb-kazmi ! Thanks for fixing the names in so many places, and making so much more internal.

@wschin
Copy link
Member

wschin commented Dec 21, 2018

Overall looks good. Two comments:
(1) Do you have at least a test for each public function?
(2) For vector and matrix arguments, do you specify their shapes and the meaning of the i-th (or (i,j)-element for matrix where i is row index and j is column index) element?

Items.AsSpan().CopyTo(dst);
}

public void CopyTo(Span<float> dst, int index, int count)
{
Contracts.Assert(0 <= count && count <= _size);
Contracts.Assert(dst != null);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

double-checking that this is intentional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must be. I did not touch this code, except Float -> float!


In reply to: 243714825 [](ancestors = 243714825)

@@ -107,44 +107,44 @@ protected override SchemaShape.Column[] GetOutputColumnsCore(SchemaShape inputSc
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated to your PR, but since i stumped upon this, is this used at all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is part of TrainerEstimatorBase and has to be implemented


In reply to: 243714960 [](ancestors = 243714960)

_latentWeightsAligned[indexAligned + k] = 0;
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tagging @wschin to check on this.

Copy link
Member

@sfilipi sfilipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@najeeb-kazmi najeeb-kazmi merged commit 00577c0 into dotnet:master Dec 22, 2018
@najeeb-kazmi najeeb-kazmi deleted the 1703 branch January 30, 2020 01:13
@ghost ghost locked as resolved and limited conversation to collaborators Mar 26, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants