Let us list all issues that we want to handle during the work on 'final user API for ML.NET'.
This way we can scope it to something of a finite project, a point that @shauheen brought up.
So, we spoke with @TomFinley today and arrived at this list. After all the below is done, we can safely call it the 1.0 API.
- (Three major concepts: Estimators, Transformers and Data #581) Estimators and Transformers
  - The most challenging part is how to expose the most complicated components, like validating trainers, ensembles, model inspectability etc. through this. It might not even be possible.
  - This work also includes more convenience constructors, like the work in Proposal for Major Change in API #371
- (Direct API: Static Typing of Data Pipelines #632) Type-checked pipelining. This is @TomFinley's idea for compile-time schema propagation.
- (TBD) Saving and loading transformer models. Make sure that we can save and load models with the same expressive power as before, using the new API.
- (Replace SubComponent with IComponentFactory #585) Replace SubComponent with IComponentFactory. The only place where dependency injection will take place in the API is during model loading. We are going to remove the remnants of the old string-based system (SubComponents) in favor of the new one.
  - Completely removing SubComponents will probably break the 'maml.exe commandline language', so we'll have to make a decision here.
- (TBD) Move our type system closer to the C# one.
  - Replace `DvXXX` with native C# types wherever possible. This means `DvInt`s into integers, `DvTimeSpan` into `TimeSpan` etc.
  - Potentially use `Nullable<>` to provide missing values to types that don't have them.
  - (What to do with VBuffer? #608) Consider splitting `VBuffer` into two types, `SparseVector` and `DenseVector`.
  - If the above is done, the only difference between our type systems will be our vectors (sparse or dense, fixed or variable), and key types.
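
To make the type-system item above more concrete, here is a rough sketch of what "missing values via `Nullable<>`" and a sparse/dense vector split could look like. Everything in it (`MissingValueSketch`, and the shapes of `DenseVector` and `SparseVector`) is a hypothetical illustration, not an existing or planned ML.NET API; only `Nullable<T>` itself is standard C#.

```csharp
using System;

// Hypothetical sketch: these types are illustrative and are not ML.NET APIs.

public static class MissingValueSketch
{
    // C# int has no built-in "missing" value, so int? (Nullable<int>)
    // represents a missing entry as null.
    public static int CountMissing(int?[] column)
    {
        int missing = 0;
        foreach (int? value in column)
        {
            if (!value.HasValue)
                missing++;
        }
        return missing;
    }
}

// A dense vector stores every element explicitly.
public sealed class DenseVector
{
    public float[] Values { get; }
    public int Length => Values.Length;

    public DenseVector(float[] values)
    {
        Values = values ?? throw new ArgumentNullException(nameof(values));
    }
}

// A sparse vector stores only selected elements; all others are treated as zero.
public sealed class SparseVector
{
    public int Length { get; }
    public int[] Indices { get; }   // positions of the stored elements
    public float[] Values { get; }  // Values[k] is the element at Indices[k]

    public SparseVector(int length, int[] indices, float[] values)
    {
        if (indices.Length != values.Length)
            throw new ArgumentException("indices and values must have the same length.");
        Length = length;
        Indices = indices;
        Values = values;
    }

    // Expands the sparse representation into a dense one.
    public DenseVector ToDense()
    {
        var dense = new float[Length];
        for (int k = 0; k < Indices.Length; k++)
            dense[Indices[k]] = Values[k];
        return new DenseVector(dense);
    }
}
```

With two separate types, a method signature can state which representation it expects, at the cost of explicit conversions such as `ToDense()`.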
Again, if we do all of the above, we can safely call it API v1.
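
For readers who have not followed #581, the Estimator/Transformer split in the first item can be sketched roughly as follows: an estimator is fitted to data and produces a transformer, and the transformer is then applied to data. The interfaces and the mean-centering example below are deliberately simplified and hypothetical (plain arrays instead of a data view, made-up names); they are not the API proposed in that issue.

```csharp
using System;
using System.Linq;

// Hypothetical, simplified interfaces; not the ML.NET API.
public interface ITransformer<TIn, TOut>
{
    TOut[] Transform(TIn[] data);
}

public interface IEstimator<TIn, TOut>
{
    // Learns from the data and returns a fitted transformer.
    ITransformer<TIn, TOut> Fit(TIn[] data);
}

// A fitted transformer: holds the learned mean and subtracts it from every value.
public sealed class MeanCenterTransformer : ITransformer<double, double>
{
    public double Mean { get; }

    public MeanCenterTransformer(double mean)
    {
        Mean = mean;
    }

    public double[] Transform(double[] data) =>
        data.Select(x => x - Mean).ToArray();
}

// The estimator: computes the mean of the training data during Fit.
public sealed class MeanCenterEstimator : IEstimator<double, double>
{
    public ITransformer<double, double> Fit(double[] data)
    {
        if (data == null || data.Length == 0)
            throw new ArgumentException("Cannot fit on empty data.", nameof(data));
        return new MeanCenterTransformer(data.Average());
    }
}
```

In this sketch, all learned state lives in the transformer, so saving and loading a model only needs to round-trip the transformer, not the estimator.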