
Feature Request: AutoSklearnOutlierDetector #578


Open
Y-oHr-N opened this issue Nov 7, 2018 · 5 comments
Labels
enhancement A new improvement or feature

Comments


Y-oHr-N commented Nov 7, 2018

Hello,

scikit-learn 0.20 provides a more consistent outlier detection API.
https://speakerdeck.com/albertcthomas/anomaly-detection-in-scikit-learn-ongoing-work-and-future-developments

  • covariance.EllipticEnvelope
  • svm.OneClassSVM
  • ensemble.IsolationForest
  • neighbors.LocalOutlierFactor

So I want an estimator that fits all outlier detection models like AutoSklearnClassifier.
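As a minimal sketch of the shared API mentioned above, the four estimators can be fitted and queried in one loop. The toy data and parameter choices here are assumptions made for illustration; `novelty=True` is needed so that `LocalOutlierFactor` exposes `predict` on unseen data.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

# Toy data (an assumption for this sketch): Gaussian inliers for training,
# plus a few far-away points mixed into the test set.
rng = np.random.RandomState(0)
X_train = rng.normal(size=(200, 2))
X_test = np.vstack([rng.normal(size=(20, 2)), rng.uniform(6, 8, size=(5, 2))])

detectors = {
    "EllipticEnvelope": EllipticEnvelope(random_state=0),
    "OneClassSVM": OneClassSVM(gamma="scale"),
    "IsolationForest": IsolationForest(random_state=0),
    "LocalOutlierFactor": LocalOutlierFactor(novelty=True),
}

predictions = {}
for name, detector in detectors.items():
    detector.fit(X_train)                      # unsupervised: no labels passed
    predictions[name] = detector.predict(X_test)  # +1 = inlier, -1 = outlier
    scores = detector.decision_function(X_test)   # negative values = outliers
    n_flagged = int(np.sum(predictions[name] == -1))
    print(f"{name}: flagged {n_flagged} of {len(X_test)} test points")
```

Because all four share `fit`, `predict`, and `decision_function`, a single wrapper could in principle search over them, which is what the request amounts to.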

Thank you.

Contributor

mfeurer commented Nov 19, 2018

Just for clarification, do you think that these should be part of the pipeline tuned by Auto-sklearn or that there should be a standalone mode AutoSklearnOutlierDetector?

According to the title you want the second thing. From my understanding, this is an unsupervised learning problem. The central assumption in Auto-sklearn is that there is a loss function which can be used to tune the hyperparameters. What would such a loss function look like for outlier detection?

Author

Y-oHr-N commented Nov 21, 2018

Thank you for your reply.
As far as I know, there are two metrics for outlier detection.

One is the square of the geometric mean of precision and recall.

outliers - Metrics for one-class classification - Cross Validated
https://stats.stackexchange.com/questions/192530/metrics-for-one-class-classification
Lee, W. S., and Liu, B., "Learning with positive and unlabeled examples using weighted Logistic Regression," In Proceedings of ICML, pp. 448-455, 2003.
https://www.aaai.org/Papers/ICML/2003/ICML03-060.pdf
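For illustration, here is a small sketch of how the first metric could be computed without true outlier labels. It assumes the criterion from Lee and Liu is r^2 / Pr[f(X) = 1], where r is the recall measured on known positive examples and Pr[f(X) = 1] is the fraction of all validation samples predicted positive. Since precision p equals r * Pr[Y = 1] / Pr[f(X) = 1], this quantity is proportional to p * r, the square of the geometric mean of precision and recall. The function name and the label encoding are hypothetical, not taken from the linked implementation.

```python
import numpy as np

def lee_liu_score(y_pu, y_pred):
    """Sketch of the r^2 / Pr[f(X) = 1] criterion (hypothetical helper).

    y_pu   : 1 for known positive samples, 0 for unlabeled samples
    y_pred : 1 if the model predicts "positive", 0 otherwise
    """
    y_pu = np.asarray(y_pu)
    y_pred = np.asarray(y_pred)

    known_positive = y_pu == 1
    if not known_positive.any():
        raise ValueError("at least one known positive sample is required")

    # Recall estimated on the known positives only.
    recall = np.mean(y_pred[known_positive] == 1)
    # Fraction of all samples (positive and unlabeled) predicted positive.
    frac_predicted_positive = np.mean(y_pred == 1)

    if frac_predicted_positive == 0:
        return 0.0  # nothing predicted positive, so recall is 0 as well
    return recall ** 2 / frac_predicted_positive

# Four known positives followed by four unlabeled samples.
y_pu = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
example_score = lee_liu_score(y_pu, y_pred)  # recall 0.75, fraction 0.5
```

In an outlier-detection setting the "positive" class would typically be the normal data, which is an assumption about usage rather than something stated in this thread.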

The other is the area under the Mass-Volume curve.

Goix, N., "How to evaluate the quality of unsupervised anomaly detection algorithms?" In ICML Anomaly Detection Workshop, 2016.
https://arxiv.org/pdf/1607.01152.pdf
Thomas, A., Clémençon, S., Feuillard, V., and Gramfort, A., "Learning hyperparameters for unsupervised anomaly detection," In ICML Anomaly Detection Workshop, 2016.
https://github.com/albertcthomas/anomaly_tuning
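A rough sketch of the second metric, under stated assumptions: for each mass level alpha, take the score threshold that keeps a fraction alpha of the data, then estimate the volume of the corresponding level set by Monte Carlo sampling in the bounding box of the data. The area under the resulting curve is smaller for better scoring functions. The function name, the alpha range, and the bounding-box sampling are choices made for this example and are not copied from the linked repositories.

```python
import numpy as np

def mass_volume_auc(score_fn, X, alphas=None, n_uniform=10000, random_state=0):
    """Approximate area under the Mass-Volume curve (hypothetical helper).

    score_fn : callable mapping an (n, d) array to n scores,
               where higher scores mean "more normal"
    X        : data used to estimate the mass of each level set
    """
    rng = np.random.RandomState(random_state)
    X = np.asarray(X, dtype=float)
    if alphas is None:
        alphas = np.linspace(0.9, 0.99, 20)  # arbitrary range for this sketch

    # Uniform samples in the bounding box of the data, used to estimate volume.
    lower, upper = X.min(axis=0), X.max(axis=0)
    box_volume = np.prod(upper - lower)
    U = rng.uniform(lower, upper, size=(n_uniform, X.shape[1]))

    data_scores = score_fn(X)
    uniform_scores = score_fn(U)

    # Threshold t_alpha such that about a fraction alpha of the data scores >= t_alpha.
    thresholds = np.quantile(data_scores, 1.0 - alphas)
    volumes = np.array(
        [box_volume * np.mean(uniform_scores >= t) for t in thresholds]
    )

    # Trapezoidal rule over the mass levels.
    return float(np.sum((volumes[1:] + volumes[:-1]) / 2.0 * np.diff(alphas)))

# Usage on toy Gaussian data: a sensible score versus a reversed one.
rng = np.random.RandomState(42)
X = rng.normal(size=(500, 2))
good_auc = mass_volume_auc(lambda Z: -np.linalg.norm(Z, axis=1), X)
bad_auc = mass_volume_auc(lambda Z: np.linalg.norm(Z, axis=1), X)
print(good_auc < bad_auc)
```

Since no labels are involved, a quantity like this could serve as the loss function asked about earlier, though Monte Carlo volume estimation becomes unreliable as the dimension grows.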

I implemented two scikit-learn compatible metrics.
https://github.com/HazureChi/kenchi/blob/master/kenchi/metrics.py

Contributor

mfeurer commented Nov 30, 2018

I'm afraid that I won't have the time to implement something here. Also, I think this is somewhat out of scope for Auto-sklearn if the metrics are not in scikit-learn yet.

franchuterivera added the enhancement label on Feb 17, 2021
Contributor

github-actions bot commented May 5, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs for the next 7 days. Thank you for your contributions.

github-actions bot added the stale label on May 5, 2021
mfeurer removed the stale label on May 6, 2021

jmren168 commented Feb 24, 2023

Hi @mfeurer,

Is it possible to create a customized one-class SVM as a two-class SVM, and then put it into AutoSklearnClassifier?
What I'm trying to do is

  1. add a customized classifier (input: a one-class SVM, and X_train and pseudo_y_train)
  2. make a customized score
    if pseudo_y_train are all 0 (only one class), then the score is 1e-5;
    otherwise, give a higher score if it classifies outliers correctly
  3. put the customized classifier and the customized score into AutoSklearnClassifier
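Steps 1 and 2 above could be sketched with plain scikit-learn as follows. The class and function names are hypothetical, the choice of F1 on the outlier class as the "higher score" is an assumption, and registering either piece with AutoSklearnClassifier (step 3) would go through auto-sklearn's own extension mechanisms, which are not shown here.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.metrics import f1_score
from sklearn.svm import OneClassSVM

class OneClassSVMClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical wrapper exposing a one-class SVM as a binary classifier.

    Label convention assumed here: 0 = inlier, 1 = outlier.
    """

    def __init__(self, nu=0.1, gamma="scale"):
        self.nu = nu
        self.gamma = gamma

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        if y is not None:
            # Fit only on samples pseudo-labeled as inliers, if there are any.
            inliers = X[np.asarray(y) == 0]
            if len(inliers) > 0:
                X = inliers
        self.model_ = OneClassSVM(nu=self.nu, gamma=self.gamma).fit(X)
        self.classes_ = np.array([0, 1])
        return self

    def predict(self, X):
        # OneClassSVM returns +1 for inliers and -1 for outliers.
        return (self.model_.predict(np.asarray(X, dtype=float)) == -1).astype(int)

def pseudo_label_score(y_true, y_pred):
    """Return 1e-5 when only one class is present, else F1 on the outlier class."""
    y_true = np.asarray(y_true)
    if np.all(y_true == 0):
        return 1e-5
    return f1_score(y_true, y_pred, pos_label=1)

# Usage with toy data: all pseudo-labels are 0, as in the one-class setting.
rng = np.random.RandomState(0)
X_train = rng.normal(size=(200, 2))
pseudo_y_train = np.zeros(len(X_train), dtype=int)
clf = OneClassSVMClassifier().fit(X_train, pseudo_y_train)
far_point_label = clf.predict([[10.0, 10.0]])[0]
```

One caveat with this design: when every validation fold contains only pseudo-label 0, the score is the constant 1e-5 for every configuration, so the optimizer has no signal to distinguish hyperparameters.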

Does it sound reasonable and workable?

Any comments are highly appreciated.

JM


4 participants