[Balancing strategy] How can I disable the balancing strategy for an AutoSklearnClassifier? #1431

Closed
johnantonn opened this issue Mar 25, 2022 · 3 comments
@johnantonn

How can I disable the balancing strategy parameter for an AutoSklearnClassifier?

I am experimenting with CASH optimization using AutoSklearn. I use an AutoSklearnClassifier instance and have disabled the feature preprocessing and data preprocessing components, following the provided examples, like so:

include = {
    'classifier': classifiers,
    'feature_preprocessor': ['no_preprocessing'],
    'data_preprocessor': ['NoPreprocessing'],
}

However, I noticed in cv_results that there is an additional balancing strategy hyperparameter that results in double the evaluations without affecting the final metrics. More specifically, it takes two values:

  • 'balancing:strategy': 'none'
  • 'balancing:strategy': 'weighted'
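
A minimal sketch of how the doubling described above could be checked. It assumes the evaluated configurations are available as a list of plain dicts keyed by hyperparameter name (for example something like cv_results_['params']; that shape is an assumption here, and the sample data below is made up for illustration). The helper names count_by_balancing and drop_balancing_duplicates are hypothetical.

```python
from collections import Counter


def count_by_balancing(params_list):
    """Count evaluated configurations per 'balancing:strategy' value."""
    return Counter(p.get("balancing:strategy", "<absent>") for p in params_list)


def drop_balancing_duplicates(params_list):
    """Keep the first configuration of each group that differs only in
    'balancing:strategy'."""
    seen = set()
    unique = []
    for p in params_list:
        # Build a hashable key from every hyperparameter except the balancing one.
        key = tuple(
            sorted((k, repr(v)) for k, v in p.items() if k != "balancing:strategy")
        )
        if key not in seen:
            seen.add(key)
            unique.append(p)
    return unique


# Made-up stand-in for the list of evaluated configurations (assumption).
example_params = [
    {"balancing:strategy": "none", "classifier:__choice__": "random_forest"},
    {"balancing:strategy": "weighted", "classifier:__choice__": "random_forest"},
    {"balancing:strategy": "none", "classifier:__choice__": "libsvm_svc"},
]

counts = count_by_balancing(example_params)
unique_configs = drop_balancing_duplicates(example_params)
print(counts)
print(len(unique_configs))
```

This only inspects results after the fact; it does not stop the optimizer from spending evaluations on both strategies.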

The parameter is referenced in the AutoSklearnClassifier API (screenshot provided below), but I am not quite sure how to disable it. Any ideas?

[image: screenshot of the AutoSklearnClassifier API reference]

@eddiebergman
Contributor

Hi @johnantonn,

So unfortunately I do not have a good solution for you. My only suggestion if you really need to disable it is to delete this line: https://github.com/automl/auto-sklearn/blob/master/autosklearn/pipeline/classification.py#L311. I tested it and it seems to work.

As for the reason why: around that line there are three other steps (DataPreprocessorChoice, etc.). These all have a get_available_components method, which processes the include and exclude settings, while the Balancing step does not. I have no idea how else to disable this, and we will have to wait until the main developer is back (mid-April).
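
To make the mechanism above concrete, here is a toy model. It is not auto-sklearn's code: ChoiceStep, FixedStep, search_values and search_space_size are hypothetical names invented for this illustration, and only the method name get_available_components is taken from the discussion. It shows how a step that filters on include shrinks to one option, while a step without such a hook keeps all of its values in the search space.

```python
class ChoiceStep:
    """Toy stand-in for a pipeline step that honours an include list."""

    def __init__(self, name, components):
        self.name = name
        self.components = list(components)

    def get_available_components(self, include=None):
        # No include list means every component stays available.
        if include is None:
            return list(self.components)
        unknown = set(include) - set(self.components)
        if unknown:
            raise ValueError(f"Unknown components for {self.name}: {sorted(unknown)}")
        return [c for c in self.components if c in include]


class FixedStep:
    """Toy stand-in for a step with no include/exclude hook."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)

    def search_values(self):
        # Nothing the user passes can narrow these values.
        return list(self.values)


def search_space_size(choice_steps, fixed_steps, include):
    """Number of step combinations left after applying the include mapping."""
    size = 1
    for step in choice_steps:
        size *= len(step.get_available_components(include.get(step.name)))
    for step in fixed_steps:
        size *= len(step.search_values())
    return size


feature_step = ChoiceStep("feature_preprocessor", ["no_preprocessing", "pca"])
balancing_step = FixedStep("balancing", ["none", "weighted"])

size = search_space_size(
    [feature_step],
    [balancing_step],
    include={"feature_preprocessor": ["no_preprocessing"]},
)
print(size)
```

In this toy setup the include mapping reduces the choice step to a single option, yet two combinations remain because the fixed step still contributes both of its values.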

I apologize for the poor answer. If you find a better solution, please do let me know so I can document it.

Best,
Eddie

@johnantonn
Author

Hi @eddiebergman,

Thanks for the prompt response, much appreciated.

Indeed, not including the Balancing instance inside classification.py, as you suggested, worked. I also tried passing the argument strategy='none' while keeping Balancing in the pipeline steps list, but the results showed that both 'none' and 'weighted' balancing were used anyway.

I guess the code is missing a handler at this point, as you already noticed. I will update the question if I find more on this.

Best,
Giannis

@mfeurer
Contributor

mfeurer commented Apr 19, 2022

Indeed, the solution by @eddiebergman is the only way to disable balancing right now, and a handler to do this programmatically is missing. We'll fix this together with #379 when we get there.
