Skip to content

Update MacroUtils to map trainer kinds to the correct suffix of trainer entry point names #113

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
May 14, 2018

Conversation

yaeldekel
Copy link

Closes #112 .

Entry points for binary classifiers end with "BinaryClassifier", for regressors end with "Regressor", etc. Update the mapping in MacroUtils to enable finding the entry points of trainers for a specific task.

@yaeldekel yaeldekel requested review from codemzs and Ivanidzo4ka May 10, 2018 18:33
if (type.Name == "BinaryLogisticRegressor")
return trainerKind == TrainerKinds.SignatureBinaryClassifierTrainer;
if (type.Name == "LogisticRegressor")
return trainerKind == TrainerKinds.SignatureMultiClassClassifierTrainer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

God, this is so ugly.
Is there a way to cast EntryPoint type to real type? Maybe you can use EntryPointCatalog to find mapping, and after you get real type you can maybe get prediction type, or get kind through reflection?
This is unmaintainable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better solution might be harder to imagine though. We might let this slide for now since this change will fix things that are broken now, but we ought definitely to file a new issue so we can imagine something a bit less fragile than this system.


In reply to: 187421786 [](ancestors = 187421786)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be addressed when we change the mechanism to use the EntryPoint attribute?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entry-point attribute may indeed be the best place to put a category like this -- something akin to the old-style PredictionKind enum, but not that please. :D :D :D


In reply to: 187457279 [](ancestors = 187457279)


public static bool IsTrainerOfKind(Type type, TrainerKinds trainerKind)
{
if (type.Name == "BinaryLogisticRegressor")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type.Name == "BinaryLogisticRegressor" [](start = 16, length = 38)

All of these string operations make me nervous.

Is it at all reasonable to change this (and other) type operations to just be type == typeof(BinaryLogisticRegressor) or at least rely on some other reference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there also any way we can write a test for this, so we can detect when something like this happens again?


In reply to: 187422829 [](ancestors = 187422829)

@@ -119,7 +119,7 @@ private sealed class TaskInformationBundle
{
TrainerKinds.SignatureMultiOutputRegressorTrainer,
new TaskInformationBundle {
TrainerFunctionName = "TrainMultiRegression",
TrainerFunctionName = "MultiOutputRegressor",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MultiOutputRegressor [](start = 47, length = 20)

You said something about a name suffix... is it a concern that the suffixes are not unique? So Regressor above would be a suffix of any MultiOutputRegressor. We previously worked around this problem I guess by having every suffix start with Train, but that solution has gone out the window I suppose.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar notes for Classifier vs. BinaryClassifier.


In reply to: 187424111 [](ancestors = 187424111)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK now I know you've special cased this. :) Hmmm I'd still rather have a different solution.


In reply to: 187424194 [](ancestors = 187424194,187424111)

Copy link
Member

@codemzs codemzs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some comments but they are not blockers. Thanks for making this change.

Copy link
Contributor

@TomFinley TomFinley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @yaeldekel for fixing the problem... let's think more perhaps in an issue about how best to address the architectural deficiencies of this overall scheme.

@yaeldekel yaeldekel merged commit 33c8364 into dotnet:master May 14, 2018
@yaeldekel yaeldekel deleted the macroutils branch May 14, 2018 14:39
eerhardt pushed a commit to eerhardt/machinelearning that referenced this pull request Jul 27, 2018
…er entry point names (dotnet#113)

* Update suffix of trainer entry point names by trainer kind.

* Address PR comments.

* Add unit test.

* Update C# API

* Move unit test to TestAutoInference and fix EntryPointCatalog test.

* Trigger build.

* Add reference to the test project to make the sweeper entry point visible to EntryPointCatalog test.
@ghost ghost locked as resolved and limited conversation to collaborators Mar 31, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants