State of Development on feat/getml_mlflow:
- Logging of input parameters in the Parameters tab. Includes all parameters of the objects in the pipeline: predictors, feature learners, preprocessors, etc.
- Logging of model performance metrics (currently done during fitting, but this should be moved to evaluation/scoring)
- Logging of live engine performance metrics (memory usage, CPU usage)
- Logging of system metrics (a sketch of these logging calls follows this list)
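A minimal sketch of what these logging calls look like; the parameter keys and metric values are illustrative placeholders, not real output, and system-metrics logging requires `psutil` and a recent MLflow (2.9+):

```python
import mlflow

# System metrics (CPU, memory, ...) are sampled in the background by
# MLflow; requires the `psutil` package.
mlflow.enable_system_metrics_logging()

with mlflow.start_run(run_name="getml_pipeline_fit"):
    # Flattened pipeline parameters. The real integration walks the
    # pipeline's preprocessors, feature learners, and predictors to
    # build this dictionary; the keys below are illustrative.
    mlflow.log_params({
        "feature_learner.0.type": "FastProp",
        "feature_learner.0.num_features": 200,
        "predictor.0.type": "XGBoostClassifier",
        "predictor.0.max_depth": 7,
    })

    # Model performance metrics, currently emitted during fitting
    # (illustrative values).
    mlflow.log_metric("auc", 0.91)
    mlflow.log_metric("accuracy", 0.87)
```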
On the log-dataset-info branch:
- Logging of dataset metadata: we can control the displayed names of data frames and their contexts, but this relies on PandasDataset under the hood
- Scoring using mlflow.evaluate(), though only to a limited extent: it produces only prediction-based metrics and plots; probability-based plots require providing the model itself. See the sketch after this list.
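A sketch of both points, assuming the branch's behavior: `mlflow.data.from_pandas` lets us set the displayed name and, via `mlflow.log_input`, the context, while the static form of `mlflow.evaluate()` works from a precomputed prediction column without the model (the toy data frame is illustrative):

```python
import mlflow
import mlflow.data
import pandas as pd

# Illustrative evaluation table: true labels plus precomputed predictions.
eval_df = pd.DataFrame({
    "feature_1": [0.2, 0.7, 0.5],
    "label": [0, 1, 1],
    "prediction": [0, 1, 0],
})

with mlflow.start_run():
    # Governs the name shown in the MLflow dataset view; under the hood
    # this builds a PandasDataset.
    dataset = mlflow.data.from_pandas(eval_df, name="population_train", targets="label")
    mlflow.log_input(dataset, context="training")

    # "Static" evaluation: no model is provided, only a predictions
    # column, so MLflow can compute prediction-based metrics and plots
    # but nothing that needs class probabilities.
    mlflow.evaluate(
        data=eval_df,
        predictions="prediction",
        targets="label",
        model_type="classifier",
    )
```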
Under development locally:
- getMLDataSet: avoids the reliance on PandasDataset and enables control over the "profile" (the dictionary shown in the dataset view). So far we have been unable to add roles to the schema (see the sketch after this item)
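The likely obstacle, for the record: an MLflow column schema entry (`mlflow.types.ColSpec`) carries only a type and a name, so getML roles have no natural slot there; a custom profile dictionary can carry them instead. A small illustration with made-up column names:

```python
from mlflow.types import ColSpec, Schema

# A ColSpec has only a type and a name; there is no field for a getML
# role such as join_key or time_stamp.
schema = Schema([
    ColSpec("string", "customer_id"),   # getML role: join_key  (not representable)
    ColSpec("datetime", "order_date"),  # getML role: time_stamp (not representable)
    ColSpec("double", "price"),         # getML role: numerical
])
print(schema.to_json())

# A custom getMLDataSet could surface the roles through the profile
# dictionary instead, since the profile is free-form:
profile = {
    "num_rows": 3,
    "roles": {
        "join_key": ["customer_id"],
        "time_stamp": ["order_date"],
        "numerical": ["price"],
    },
}
```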
Under development:
- Enhanced communication between the engine and the API: detailed logging of feature learner and predictor progress. Soeren has already developed the communication protocol on the engine side. A sketch of the API-side forwarding follows below.
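Once the API side can decode those messages, they map naturally onto step-indexed MLflow metrics. A minimal sketch, where the progress stream and its (step, percent) shape are assumptions about the protocol, not its actual format:

```python
import mlflow

def log_training_progress(progress_events, metric_name):
    """Forward engine progress to MLflow as a step-indexed metric.

    `progress_events` is assumed to yield (step, percent_complete)
    tuples decoded from the engine's progress protocol; that decoder is
    the part still under development on the API side.
    """
    for step, percent in progress_events:
        mlflow.log_metric(metric_name, percent, step=step)

with mlflow.start_run():
    # Illustrative stand-in for the real engine stream.
    fake_stream = [(0, 0.0), (1, 25.0), (2, 50.0), (3, 100.0)]
    log_training_progress(fake_stream, "feature_learner_progress")
```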
Future developments:
- Integration of the improved engine communication on the API side
- Input example and signature (challenge: getML uses containers rather than plain data frames)
- Enable mlflow.evaluate() to use the model for on-the-spot predictions (see the sketch after this list)
- Sensible naming and grouping of tags
- A one-pager or a more complex interactive interface to report the logged information
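For the mlflow.evaluate() item, one route is to hand MLflow a callable that bridges from the pandas DataFrame MLflow passes in to the container getML expects. A sketch under stated assumptions: `make_container` and the `full` subset are hypothetical stand-ins for whatever conversion the integration settles on, and passing a callable as `model` requires a recent MLflow (2.8+):

```python
import mlflow

def make_predict_fn(pipeline, make_container):
    """Wrap a fitted getML pipeline for use as an mlflow.evaluate() model.

    `make_container` is a hypothetical helper that rebuilds a getML data
    container from the pandas DataFrame MLflow hands to the callable;
    this conversion is exactly the "getML uses containers" challenge
    noted above.
    """
    def predict_fn(model_input):
        container = make_container(model_input)
        return pipeline.predict(container.full)  # subset name is an assumption
    return predict_fn

# Usage: with the model available, mlflow.evaluate() can produce
# probability-based plots (ROC, calibration) on the spot.
# mlflow.evaluate(
#     model=make_predict_fn(pipeline, make_container),
#     data=eval_df,
#     targets="label",
#     model_type="classifier",
# )
```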
Stale, obsolete, or less relevant branches and PRs:
- #1 (project-folder-identifier): identifies the correct getML installation folder. Might be less relevant due to the correct getML 1.5 release
- parameter-revision: merged
- log-metrics: merged
- save-load-getml: obsolete