Skip to content

MLContext.Transform should be further organized #2361

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
artidoro opened this issue Feb 1, 2019 · 3 comments
Closed

MLContext.Transform should be further organized #2361

artidoro opened this issue Feb 1, 2019 · 3 comments
Labels
API Issues pertaining the friendly API

Comments

@artidoro
Copy link
Contributor

artidoro commented Feb 1, 2019

It seems that MLContext.Transforms can be further organized.
In MLContext.Transforms it is possible to access many transforms directly:

  • IndicateMissingValues
  • ReplaceMissingValues
  • ApplyOnnxModel
  • Concatenate
  • LoadImage
  • Normalize
  • Resize
  • ScoreTensorFlowModel
  • SelectColumns
  • TemsorFlow

There are also subgroups:

  • Transforms.Text
  • Transforms.Projection
  • Transforms.Categorical
  • Transforms.Conversion
  • Transforms.FeatureSelection

Suggestions:

  1. It seems that more groupings can be made:
    • Transforms.Image
    • Transforms.MissingValues
    • Transforms.TensorFlow
  2. And maybe Normalize can be moved to the Transforms.Projections.

Question:
Does it even make sense to have some transforms in a subgroup, while others directly accessible?

/cc: @rogancarr, @sfilipi, @TomFinley

@artidoro artidoro added the API Issues pertaining the friendly API label Feb 1, 2019
@rogancarr
Copy link
Contributor

rogancarr commented Feb 1, 2019

I'd like to move schema operations into something like Schema. e.g. mlContext.Transforms.Schema.DropColumns().

@TomFinley
Copy link
Contributor

TomFinley commented Feb 1, 2019

Transforms.MissingValues might be OK, but I'm not sure I see the point in having something if there are only two entries. Honestly I am not too excited about that.

I do not think Transforms.Image or Transforms.TensorFlow are sensible to have. Bear in mind that these extensions are not seen unless someone explicitly chooses to include the relevant nuget packages. So if we were to have another "subcategory," everyone would have to see those whether they had included the relevant nuget or not. (Since there is no such thing as an extension property.) So it would just have to be there, and certainly it would be bad to have these "categories" which are by default empty. And if someone has chosen to explicitly rely on those nugets, that's a pretty solid indication that they want to use that code. So why hide it?

So I don't necessarily agree with the points in this issue? Is the fact that there are some transforms at the root of the hierarchy really so very bad?

@artidoro
Copy link
Contributor Author

artidoro commented Jul 3, 2019

This is not applicable any longer since we have now release with the categories currently present in MLContext. When we will consider breaking changes, we might think about reorganizing MLContext.

@artidoro artidoro closed this as completed Jul 3, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 25, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
API Issues pertaining the friendly API
Projects
None yet
Development

No branches or pull requests

3 participants