Scoring model as a good practice #444

ArturoAmorQ · 2021-08-25T12:55:18Z

Though in almost all of the notebooks we score the presented model somehow, there are some exceptions:

logistic_regression.py (not really important as we have an intuition by eye but would be a nice-to-have, or least we can add a Warning message, I think)
linear_models_ex_05.py and its solution
trees_hyperparameters.py

On the same line of thought, ensemble_hyperparameters.py computes cv scores for parameter tuning but it doesn't pass the best parameters to a final test-set scoring.

Plotting a validation curve using the train set but no further scoring on the test-set (as in Solution for Exercise M6.03 and Solution for Exercise M6.04) is discussed in this forum post. In this case we can either use the whole data-set or add a Warning message.

We also don't score most of the notebooks that use the full data-set for modeling (see #441), but that is perfectly fine.

## August 31th, 2021 ### Gael * TODO: Jeremy's renewal, Chiara's replacement, Mathis's consulting gig ### Olivier - input feature names: main PR [#18010](scikit-learn/scikit-learn#18010) that links into sub PRs - remaining (need review): [#20853](scikit-learn/scikit-learn#20853) (found a bug in `OvOClassifier.n_features_in_`) - reviewing `get_feature_names_out`: [#18444](scikit-learn/scikit-learn#18444) - next: give feedback to Chiara on ARM wheel building [#20711](scikit-learn/scikit-learn#20711) (needed for the release) - next: assist Adrin for the release process - next: investigate regression in loky that blocks the cloudpickle release [#432](cloudpipe/cloudpickle#432) - next: come back to intel to write a technical roadmap for a possible collaboration ### Julien - Was on holidays - Planned week @ Nexedi, Lille, from September 13th to 17th - Reviewed PRs - [`#20567`](scikit-learn/scikit-learn#20567) Common Private Loss module - [`#18310`](scikit-learn/scikit-learn#18310) ENH Add option to centered ICE plots (cICE) - Others PRs prior to holidays - [`#20254`](scikit-learn/scikit-learn#20254) - Adapted benchmarks on `pdist_aggregation` to test #20254 against sklearnex - Adapting PR for `fast_euclidean` and `fast_sqeuclidean` on user-facing APIs - Next: comparing against scipy's - Next: Having feedback on [#20254](scikit-learn/scikit-learn#20254) would also help - Next: I need to block time to study Cython code. ### Mathis - `sklearn_benchmarks` - Adapting benchmark script to run on Margaret - Fix issue with profiling files too big to be deployed on Github Pages - Ensure deterministic benchmark results - Working on declarative pipeline specification - Next: run long HPO benchmarks on Margaret ### Arturo - Finished MOOC! - Finished filling [Loïc's notes](https://notes.inria.fr/rgSzYtubR6uSOQIfY9Fpvw#) to find questions with score under 60% (Issue [#432](INRIA/scikit-learn-mooc#432)) - started addressing easy-to-fix questions, resulting in gitlab MRs [#21](https://gitlab.inria.fr/learninglab/mooc-scikit-learn/mooc-scikit-learn-coordination/-/merge_requests/21) and [#22](https://gitlab.inria.fr/learninglab/mooc-scikit-learn/mooc-scikit-learn-coordination/-/merge_requests/22) - currently working on expanding the notes up to 70% - Continued cross-linking forum posts with issues in GitHub, resulting in [#444](INRIA/scikit-learn-mooc#444), [#445](INRIA/scikit-learn-mooc#445), [#446](INRIA/scikit-learn-mooc#446), [#447](INRIA/scikit-learn-mooc#447) and [#448](INRIA/scikit-learn-mooc#448) ### Jérémie - back from holidays, catching up - Mathis' benchmarks - trying to find what's going on with ASV benchmarks (asv should display the versions of all build and runtime depndencies for each run) ### Guillaume - back from holidays - Next: - release with Adrin - check the PR and issue trackers ### TODO / Next - Expand Loïc’s notes up to 70% (Arturo) - Create presentation to discuss my experience doing the MOOC (Arturo) - Help with the scikit-learn release (Olivier, Guillaume) - HR: Jeremy's renewal, Chiara's replacement (Gael) - Mathis's consulting gig (Olivier, Gael, Mathis)

ArturoAmorQ mentioned this issue Sep 24, 2021

Scoring model as a good practice #464

Merged

glemaitre closed this as completed in #464 Jan 6, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Scoring model as a good practice #444

Scoring model as a good practice #444

ArturoAmorQ commented Aug 25, 2021 •

edited

Loading

Scoring model as a good practice #444

Scoring model as a good practice #444

Comments

ArturoAmorQ commented Aug 25, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

ArturoAmorQ commented Aug 25, 2021 •

edited

Loading