
# V1 Scenarios need to be covered by tests #2498

**Status:** Open · **Opened by:** @rogancarr

In issue #584, we laid out a set of scenarios that we'd like to cover for V1.0 of ML.NET. We need high-level functional tests to make sure that these work well in the 1.0 library.

Here is a list of tests that cover the scenarios. Let's use this issue as the top-level issue to track coverage of the APIs. Illustrative code sketches for several of the scenario groups follow the table.

| Category | Scenario | Link to Test | Completed PR | Blocked by Issue |
| --- | --- | --- | --- | --- |
| Data I/O | I can use objects already in memory (as `IEnumerable`) as input to my ML pipeline/experiment. | Link | #2518 | |
| Data I/O | I can use locally stored delimited files (.csv, .tsv, etc.) as input to my ML pipeline/experiment. | Link | #2518 | |
| Data I/O | I can use locally stored binary files (.idv) as input to my ML pipeline/experiment. | Link | #2518 | |
| Data I/O | I can run any arbitrary data transformation / model training and save the output to disk as a delimited file (.csv, .tsv, etc.). | Link | #2518 | |
| Data I/O | I can run any arbitrary data transformation / model training and save the output to disk as a binary file (.idv). | Link | #2518 | |
| Data I/O | I can run any arbitrary data transformation / model training and convert the output to an `IEnumerable`. | Link | #2518 | |
| Data I/O | I can read data from a SQL database into memory or onto disk using an existing SQL reader, then use it as input to my ML pipeline/experiment. (May be a sample.) | | | |
| Data Transformation, Feature Engineering | I can take an existing ONNX model and get predictions from it (as both final output and as input to downstream pipelines). | | | |
| Data Transformation, Feature Engineering | Extensible transformation: it should be possible to write simple row-mapping transforms. Example: "I can add custom steps to my pipeline, such as creating a new column that is the sum of two other columns, or easily add cosine similarity, without having to create my own build of ML.NET." | | #2803 | |
| Data Transformation, Feature Engineering | I can modify settings in the TextFeaturizer to update the number of word-grams and char-grams used, along with things like normalization. | | #2803 | #2802 |
| Data Transformation, Feature Engineering | I can apply normalization to the columns of my data. | | #2803 | |
| Data Transformation, Feature Engineering | I can take an existing TensorFlow model and get predictions from it, or from any layer in the model. | WIP (Rogan) | | |
| Data Transformation, Feature Engineering | P1: I can take an existing TensorFlow model and use ML.NET APIs to identify its input and output nodes. | WIP (Rogan) | | |
| Debugging | I can see how my data was read in, to verify that I specified the schema correctly. | | #2937 | |
| Debugging | I can see the output at the end of my pipeline to see which columns are available (score, probability, predicted label). | | #2937 | |
| Debugging | I can look at intermediate steps of the pipeline to debug my model. Example: given the text "Help I'm a bug!", I should be able to see it normalized to "help i'm a bug", then tokenized into ["help", "i'm", "a", "bug"], then mapped to term numbers [203, 25, 3, 511], then projected into the sparse float vector {3:1, 25:1, 203:1, 511:1}, and so on. | | #2937 | |
| Debugging | P1: I can access the information needed to understand the progress of my training (e.g., the number of trees trained so far out of the total). | | #2937 | |
| Evaluation | I can evaluate a model trained for any of my tasks on test data. The evaluation outputs metrics relevant to the task (e.g., AUC, accuracy, P/R, and F1 for binary classification). | | #2646 | |
| Evaluation | P1: I can get the data that will allow me to plot PR curves. | | #2646 | #2645 |
| Explainability & Interpretability | I can get near-free (local) feature importance for scored examples (feature contributions). | | #2584 | |
| Explainability & Interpretability | I can view how much each feature contributed to each prediction for tree and linear models (feature contributions). | | #2584 | |
| Explainability & Interpretability | I can view the overall importance of each feature (Permutation Feature Importance, GetFeatureWeights). | | #2584 | |
| Explainability & Interpretability | I can train interpretable models (linear models, GAMs). | | | |
| Introspective training | I can take an existing model file and inspect which transformers were included in the pipeline. | | #2859 | |
| Introspective training | I can inspect the coefficients (weights and bias) of a linear model without much work; easy to find via auto-complete. | | #2859 | |
| Introspective training | I can inspect the normalization coefficients of a normalizer in my pipeline without much work; easy to find via auto-complete. | | #2859 | |
| Introspective training | I can inspect the trees of a boosted decision tree model without much work; easy to find via auto-complete. | | #2859 | |
| Introspective training | I can inspect the topics after training an LDA transform; easy to find via auto-complete. | | #2859 | |
| Introspective training | I can inspect a categorical transform and see which feature values map to which key values; easy to find via auto-complete. | | #2859 | |
| Introspective training | I can access the GAM feature histograms through APIs. | | #2859 | |
| Model files | I can train a model and save it as a file. The model includes the learner as well as the transforms (decomposability). | | | |
| Model files | I can use a model file in a completely different process to make predictions (decomposability). | | | |
| Model files | I can use newer versions of ML.NET with ML.NET model files from previous versions (for v1.x). | | Test in v1.1 | |
| Model files | I can easily figure out which NuGet packages (and versions) I need to score an ML.NET model. | | | |
| Model files | P2: I can move data between NimbusML and ML.NET (using IDV): prepare with NimbusML, load with ML.NET. | | v1.1 | |
| Model files | P2: I can use model files interchangeably between compatible versions of ML.NET and NimbusML. | | v1.1 | |
| Model files | P1: I can export ML.NET models to ONNX (limited to the existing internal functionality). | | | |
| Model files | I can save a model to text. | | v1.1 | |
| Prediction | I can get predictions (scores, probabilities, predicted labels) for every row in a test dataset. | | | |
| Prediction | I can reconfigure the threshold of my binary classification model based on analysis of the PR curves or other metric scores. | Link | | #2465 |
| Prediction | (Might not work?) I can map the score/probability for each class back to the original class labels I provided in the pipeline (multiclass and binary classification). | | | |
| Tasks | I can train a model to do classification (binary and multiclass). | | #2646 | |
| Tasks | I can train a model to do regression. | | #2646 | |
| Tasks | I can train a model to do anomaly detection. | | #2646 | |
| Tasks | I can train a model to do recommendations. | | #2646 | |
| Tasks | I can train a model to do ranking. | | #2646 | |
| Tasks | I can train a model to do clustering. | | #2646 | |
| Training | I can provide multiple learners and easily compare evaluation metrics between them. | | #2921 | |
| Training | I can use an initial predictor to update/train the model for some trainers (e.g., linear learners like averaged perceptron); specifically, start the model's weights from the existing weights. | | #2921 | |
| Training | Metacomponents smartly restrict their use to compatible components. Example: "When specifying what trainer OVA should use, a user will be able to specify any binary classifier. If they specify a regression or multiclass classifier, that should ideally be a compile error." | | #2921 | |
| Training | I can train TensorFlow models when I bring a TensorFlow model topology. | WIP (Rogan) | | |
| Training | I can use OVA and easily add any binary classifier to it. | | #2921 | |
| Use in web environments | I can use ML.NET models to make predictions in multi-threaded environments like ASP.NET. (This doesn't have to be inherent in the prediction engine, but it should be easy to do.) | | | |
| Validation | Cross-validation: I can take a pipeline and easily do cross-validation on it without having to know how CV works. | Link | #2470 | |
| Validation | I can use a validation set in a pipeline for learners that support one (e.g., FastTree, GAM). | Link | #2503 | |
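
To make the scenarios concrete, the sketches below show the kind of test each group implies, written against the ML.NET 1.x `MLContext` APIs. They are sketches, not the tests themselves; class, column, and file names (`HouseData`, `houses.csv`, etc.) are made up for illustration. First, the Data I/O group: in-memory input, delimited and binary files, and conversion back to an `IEnumerable`.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.ML;
using Microsoft.ML.Data;

public class HouseData
{
    [LoadColumn(0)] public float Size { get; set; }
    [LoadColumn(1)] public float Price { get; set; }
}

public static class DataIoScenarios
{
    public static void Run()
    {
        var mlContext = new MLContext();

        // Objects already in memory (IEnumerable) as pipeline input.
        var rows = new List<HouseData>
        {
            new HouseData { Size = 1.1f, Price = 98f },
            new HouseData { Size = 2.7f, Price = 211f },
        };
        IDataView fromMemory = mlContext.Data.LoadFromEnumerable(rows);

        // Locally stored delimited file as input.
        IDataView fromText = mlContext.Data.LoadFromTextFile<HouseData>(
            "houses.csv", separatorChar: ',', hasHeader: true);

        // Save any IDataView to disk as a delimited file or a binary .idv file.
        using (var tsv = File.Create("out.tsv"))
            mlContext.Data.SaveAsText(fromMemory, tsv, separatorChar: '\t');
        using (var idv = File.Create("out.idv"))
            mlContext.Data.SaveAsBinary(fromMemory, idv);

        // Binary .idv file back in as input.
        IDataView fromBinary = mlContext.Data.LoadFromBinary("out.idv");

        // Convert an IDataView back into an IEnumerable.
        IEnumerable<HouseData> back =
            mlContext.Data.CreateEnumerable<HouseData>(fromBinary, reuseRowObject: false);
    }
}
```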
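
For the extensible-transformation and normalization rows, a minimal sketch using `CustomMapping` plus a normalizer. The `ABInput`/`SumOutput` contracts are illustrative, and the exact normalizer names (`NormalizeMinMax` here) vary slightly across 1.x releases.

```csharp
using Microsoft.ML;

// Contracts for the custom mapping; the names are illustrative.
public class ABInput   { public float A; public float B; }
public class SumOutput { public float Sum; }

public static class CustomTransformDemo
{
    public static IEstimator<ITransformer> Build(MLContext mlContext)
    {
        // A simple row-mapping transform: a new column that is the sum of
        // two existing columns, with no custom build of ML.NET required.
        var addColumns = mlContext.Transforms.CustomMapping<ABInput, SumOutput>(
            (input, output) => output.Sum = input.A + input.B,
            contractName: null);

        // Chain a normalizer onto the new column.
        return addColumns.Append(mlContext.Transforms.NormalizeMinMax("Sum"));
    }
}
```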
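
For the Debugging rows, `Preview` and `GetColumn` cover most of the surface. Here `data` and `pipeline` are assumed to come from the earlier sketches, and the "Tokens" column is hypothetical (e.g., the output of a text tokenizer).

```csharp
// Peek at the first few rows to check that the schema was read correctly.
var preview = data.Preview(maxRows: 5);
foreach (var column in preview.Schema)
    Console.WriteLine($"{column.Name}: {column.Type}");

// Inspect an intermediate step: fit, transform, and pull out one column.
ITransformer model = pipeline.Fit(data);
IDataView transformed = model.Transform(data);
foreach (string[] tokens in transformed.GetColumn<string[]>("Tokens"))
    Console.WriteLine(string.Join(" / ", tokens));
```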
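
The Tasks and Evaluation rows pair naturally: each task catalog (`BinaryClassification`, `MulticlassClassification`, `Regression`, `Ranking`, `Clustering`, `AnomalyDetection`) exposes trainers and a matching `Evaluate`. A binary classification sketch, assuming `data` is an `IDataView` with a `Label` column and hypothetical feature columns `F1` and `F2`:

```csharp
var mlContext = new MLContext(seed: 1);
var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

// Binary classification shown here; the other tasks follow the same shape
// on their respective catalogs.
var pipeline = mlContext.Transforms.Concatenate("Features", "F1", "F2")
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression());

var model = pipeline.Fit(split.TrainSet);
var scored = model.Transform(split.TestSet);

// Task-appropriate metrics: AUC, accuracy, precision/recall, F1.
var metrics = mlContext.BinaryClassification.Evaluate(scored);
Console.WriteLine($"AUC={metrics.AreaUnderRocCurve:F3} " +
                  $"Acc={metrics.Accuracy:F3} F1={metrics.F1Score:F3}");
```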
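
For the Explainability rows, a Permutation Feature Importance sketch over a linear regression pipeline. The `PermutationFeatureImportance` overload shown is the 1.x regression-catalog one as I understand it; it needs the already-transformed data and the typed prediction transformer.

```csharp
// Train a linear regression model, keeping a typed handle to the predictor.
var pipeline = mlContext.Transforms.Concatenate("Features", "F1", "F2")
    .Append(mlContext.Regression.Trainers.Sdca());
var model = pipeline.Fit(data);
var transformed = model.Transform(data);
var predictor = model.LastTransformer;

// Permutation Feature Importance: how much does shuffling each feature
// hurt the metric?
var pfi = mlContext.Regression.PermutationFeatureImportance(
    predictor, transformed, permutationCount: 3);
for (int i = 0; i < pfi.Length; i++)
    Console.WriteLine($"Feature {i}: mean change in R^2 = {pfi[i].RSquared.Mean:F4}");
```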
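
Introspective training mostly falls out of strongly typed transformers: keeping the typed handle returned by `Fit` makes the learned parameters discoverable via auto-complete. Continuing from the regression sketch above:

```csharp
// The last transformer in the chain is the typed prediction transformer,
// so the linear parameters are one property away.
var linear = model.LastTransformer.Model;   // LinearRegressionModelParameters
Console.WriteLine($"Bias: {linear.Bias}");
Console.WriteLine("Weights: " + string.Join(", ", linear.Weights));

// The chain itself is enumerable, so each transformer stage in the
// pipeline can be listed and inspected.
foreach (var stage in model)
    Console.WriteLine(stage.GetType().Name);
```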
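
Model files: a single `.zip` carries both the transforms and the learner, and can be reloaded in a completely different process. A sketch:

```csharp
// Save: the one .zip file carries the whole pipeline (transforms + learner).
mlContext.Model.Save(model, data.Schema, "model.zip");

// Load, possibly in a completely different process, recovering the input
// schema the model was trained against.
var otherContext = new MLContext();
ITransformer reloaded = otherContext.Model.Load("model.zip", out DataViewSchema inputSchema);
```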
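
Prediction for every row of a test set, plus single-row scoring; `ModelInput`/`ModelOutput` are illustrative POCOs that mirror the pipeline's columns. The thread-safety point matters for the ASP.NET scenario: `PredictionEngine` is not thread-safe, so web apps should pool engines or create one per request rather than share a single instance.

```csharp
public class ModelInput
{
    public float F1 { get; set; }
    public float F2 { get; set; }
}

public class ModelOutput
{
    public bool PredictedLabel { get; set; }
    public float Score { get; set; }
    public float Probability { get; set; }
}

// Batch: score every row in the test set, then read the rows back out.
IDataView scored = model.Transform(testData);
IEnumerable<ModelOutput> predictions =
    mlContext.Data.CreateEnumerable<ModelOutput>(scored, reuseRowObject: false);

// Single row. PredictionEngine is NOT thread-safe: in ASP.NET, pool these
// rather than sharing one instance across requests.
var engine = mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(model);
ModelOutput one = engine.Predict(new ModelInput { F1 = 0.5f, F2 = 1.5f });
```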
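
For the metacomponent row under Training: OVA's signature only accepts binary classification trainers, so incompatible components are rejected at compile time, which is roughly the restriction the scenario asks for. A sketch:

```csharp
// OneVersusAll takes any binary classification trainer as its inner learner.
var ova = mlContext.MulticlassClassification.Trainers.OneVersusAll(
    mlContext.BinaryClassification.Trainers.AveragedPerceptron());

// By contrast, passing e.g. a regression trainer here would not type-check:
// the compile-time restriction the scenario describes.
```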
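
Finally, cross-validation over a whole pipeline is a single call on the task catalog; the fold bookkeeping is handled internally, so the user does not need to know how CV works.

```csharp
// Five-fold CV: splits the data, trains one model per fold, and returns
// per-fold metrics, models, and scored hold-out sets.
var folds = mlContext.Regression.CrossValidate(data, pipeline, numberOfFolds: 5);
foreach (var fold in folds)
    Console.WriteLine($"Fold {fold.Fold}: R^2 = {fold.Metrics.RSquared:F3}");
```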

Labels: `API` (issues pertaining the friendly API) · `P2` (needs to be fixed at some point) · `onnx` (exporting or loading ONNX models) · `test` (related to tests)
