Full scope of API review #583

Closed
@Zruty0

Let us list all issues that we want to handle during the work on 'final user API for ML.NET'.

This way we can scope it to a finite project, a point that @shauheen brought up.

So, we spoke with @TomFinley today and arrived at this list. Once everything below is done, we can safely call it the 1.0 API.

  • (Three major concepts: Estimators, Transformers and Data #581) Estimators and Transformers
    • The most challenging part is how to expose the most complicated components, like validating trainers, ensembles, model inspectability etc. through this. It might not even be possible.
    • This work also includes more convenience constructors, like the work in Proposal for Major Change in API #371
  • (Direct API: Static Typing of Data Pipelines #632) Type-checked pipelining. This is @TomFinley's idea for compile-time schema propagation.
  • (TBD) Saving and loading transformer models. Make sure that we can save and load models with the same expressive power as before, using the new API.
  • (Replace SubComponent with IComponentFactory #585) Replace SubComponent with IComponentFactory. The only place where dependency injection will take place in the API is during model loading. We are going to remove the remnants of the old string-based system (SubComponents) in favor of the new one.
    • Completely removing SubComponents will probably break the 'maml.exe commandline language', so we'll have to make a decision here.
  • (TBD) Move our type system closer to the C# one.
    • Replace DvXXX with native C# types wherever possible. This means turning DvInts into integers, DvTimeSpan into TimeSpan, etc.
    • Potentially use Nullable<> to provide missing values to types that don't have them.
    • (What to do with VBuffer? #608) Consider splitting VBuffer into two types, SparseVector and DenseVector.
    • If the above is done, the only difference between our type systems will be our vectors (sparse or dense, fixed or variable), and key types.
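
To make the type-system item a bit more concrete, here is a rough sketch of what "closer to C#" could look like. It is only an illustration under assumptions: the names SparseVector and DenseVector come from the list above, but their members, the MissingValues helper, and the Program class are hypothetical and are not existing or proposed ML.NET APIs. The sketch shows Nullable<int> standing in for a missing-capable integer, and a sparse/dense pair replacing a single combined vector type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only: none of these types are ML.NET APIs.

// Missing values expressed with Nullable<int> instead of a DvInt-style wrapper.
public static class MissingValues
{
    // Sums the present values and counts the missing ones.
    public static (int Sum, int Missing) Summarize(IEnumerable<int?> values)
    {
        int sum = 0, missing = 0;
        foreach (int? v in values)
        {
            if (v.HasValue) sum += v.Value;
            else missing++;
        }
        return (sum, missing);
    }
}

// Dense vector: every slot is stored explicitly.
public sealed class DenseVector
{
    private readonly float[] _values;

    public DenseVector(float[] values) =>
        _values = values ?? throw new ArgumentNullException(nameof(values));

    public int Length => _values.Length;
    public float this[int index] => _values[index];
}

// Sparse vector: only the listed slots are stored; all other slots read as zero.
public sealed class SparseVector
{
    private readonly int[] _indices;  // assumed sorted ascending and unique
    private readonly float[] _values; // _values[i] belongs to slot _indices[i]

    public SparseVector(int length, int[] indices, float[] values)
    {
        if (indices == null) throw new ArgumentNullException(nameof(indices));
        if (values == null) throw new ArgumentNullException(nameof(values));
        if (indices.Length != values.Length)
            throw new ArgumentException("indices and values must have the same length.");
        Length = length;
        _indices = indices;
        _values = values;
    }

    public int Length { get; }

    public float this[int index]
    {
        get
        {
            if (index < 0 || index >= Length)
                throw new ArgumentOutOfRangeException(nameof(index));
            // Binary search relies on the sorted-indices assumption above.
            int pos = Array.BinarySearch(_indices, index);
            return pos >= 0 ? _values[pos] : 0f;
        }
    }

    // Materializes the implicit zeros into a dense copy.
    public DenseVector ToDense()
    {
        var dense = new float[Length];
        for (int i = 0; i < _indices.Length; i++)
            dense[_indices[i]] = _values[i];
        return new DenseVector(dense);
    }
}

public static class Program
{
    public static void Main()
    {
        var summary = MissingValues.Summarize(new int?[] { 1, null, 3 });
        Console.WriteLine(summary.Sum);     // prints 4
        Console.WriteLine(summary.Missing); // prints 1

        var sparse = new SparseVector(5, new[] { 1, 3 }, new[] { 2f, 4f });
        Console.WriteLine(sparse[3]);              // prints 4
        Console.WriteLine(sparse[0]);              // prints 0
        Console.WriteLine(sparse.ToDense().Length); // prints 5
    }
}
```

With two separate types, a caller states up front whether it works with sparse or dense data instead of checking a flag on one combined buffer. Whether that trade-off is worth it is what #608 is meant to decide.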

Again, if we do all of the above, we can safely call it API v1.

@TomFinley @shauheen @ericstj @eerhardt


Labels: API (Issues pertaining to the friendly API)
