Skip to content

Value-tuple stragglers in the public API #2881

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
TomFinley opened this issue Mar 7, 2019 · 3 comments · Fixed by #2950
Closed

Value-tuple stragglers in the public API #2881

TomFinley opened this issue Mar 7, 2019 · 3 comments · Fixed by #2950
Assignees
Labels
API Issues pertaining the friendly API
Milestone

Comments

@TomFinley
Copy link
Contributor

TomFinley commented Mar 7, 2019

There was a prior PR #2581 and issue #2501 related to value-tuples and why they should not be part of our public surface. I have noticed that there are some "stragglers" still remaining in the public API. So, the work is perhaps not yet complete.

The following list is I believe complete for Core/Data/Transforms/FastTree/ImageAnalytics/KMeansClustering/LightGBM/PCA/Tensorflow/StandardLearners/Data.DataView assemblies.

There are three distinct categories where this flaw has remained. (Though the last "category" other has only one item.)

Properties on transformers

Some transformers are exposing information about themselves via this mechanism.

  • KeyToBinaryVectorMappingTransformer.Columns
  • MissingValueDroppingTransformer.Columns
  • MissingValueIndicatorTransformer.Columns
  • CustomStopWordsRemovingTransformer.Columns
  • TextNormalizingTransformer.Columns
  • TokenizingByCharactersTransformer.Columns
  • WordEmbeddingsExtractingTransformer.Columns
  • ImageGrayscalingTransformer.Columns
  • ImageLoadingTransformer.Columns
  • LatentDirichletAllocationTransformer.ItemScoresPerTopic and WordScoresPerTopic

MLContext estimator creation extension methods

There are some overloads of MLContext extension methods on various catalogs that are stil using it. I view this as a lesser sin since this is at least something that could conceivably be fixed using an overload if we decide it is necessary, but I'd still prefer to be consistent.

  • ProduceHashedNgrams extension method
  • ProduceHashedWordBags extension method
  • ProduceNgrams extension method
  • ProduceWordBags extension method
  • RemoveDefaultStopWords extension method
  • TokenizeWords extension method

Others

Lastly, I see a Microsoft.ML.ColumnOptions global class with an implicit operator from value-tuples. This one is probably harmless, since that specific class is for representing a simple case.

/cc @yaeldekel @Ivanidzo4ka

@TomFinley TomFinley added the API Issues pertaining the friendly API label Mar 7, 2019
@Ivanidzo4ka
Copy link
Contributor

#2827 number 5 should take care of Properties on transformers
I see PRs which make that property internal.

@TomFinley
Copy link
Contributor Author

#2827 number 5 should take care of Properties on transformers
I see PRs which make that property internal.

All right, cool, thank you @Ivanidzo4ka ... all of them you think?

@shauheen shauheen added this to the 0319 milestone Mar 7, 2019
@Ivanidzo4ka
Copy link
Contributor

Well eventually, yes. If you want to go and sweep them in one PR, I wouldn't stop you. You will make life of scrubbing people easier.
But ask yourself a question, do you really want to make someone else life easier? That's not the Tom I used to knew! :D

@Ivanidzo4ka Ivanidzo4ka self-assigned this Mar 13, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
API Issues pertaining the friendly API
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants