How to weight a given class? [class balancing] #1596

UnixJunkie · 2022-10-10T23:44:11Z

Is it possible to give a list of weights that should be tried for a given class?
I have some data where very heavy reweighting of the under-represented class is necessary to get any good classifier.

I don't know in advance which weight to use; apparently it depends on the ML method being used.
So this is another hyperparameter that needs to be optimized.

aron-bram · 2022-10-20T16:53:32Z

Hi,

Unfortunately we do not provide a way to give a list of class weights to be tried out during the optimization process.

That said, by default auto-sklearn should handle the imbalance in the dataset by including estimators in the search that use sample/class weights, setting their weights to the inverse of each class's frequency (refer to Balancing for the implementation).
This is similar to how sklearn's "balanced" value for the class_weight parameter works with some estimators.
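For reference, here is a minimal sketch of what inverse-frequency weighting gives on a toy label vector, using sklearn's `compute_class_weight`. The 90/10 split is made-up data, and this only illustrates sklearn's "balanced" heuristic, not auto-sklearn's internal Balancing code.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy labels (made-up data): 90 samples of class 0, 10 of class 1.
y = np.array([0] * 90 + [1] * 10)

classes = np.unique(y)
# "balanced" computes n_samples / (n_classes * count_per_class).
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)
class_weight = dict(zip(classes.tolist(), weights.tolist()))
print(class_weight)
```

The rarer class gets the larger weight, in proportion to how under-represented it is.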

May I ask what performance you reached on this dataset using auto-sklearn, and how it compared to other methods?

In general, an alternative would be to oversample the under-represented class or to undersample the over-represented one. Not sure if this is a good enough option for you, though.
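As a rough sketch of the oversampling option, the snippet below duplicates minority rows with `sklearn.utils.resample` until both classes have the same count. The data is random and the 90/10 split is an assumption made for illustration.

```python
import numpy as np
from sklearn.utils import resample

# Made-up data: 100 rows, 3 features, 90/10 class split.
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 3))
y = np.array([0] * 90 + [1] * 10)

X_min, y_min = X[y == 1], y[y == 1]
X_maj, y_maj = X[y == 0], y[y == 0]

# Draw minority rows with replacement until they match the majority count.
X_min_up, y_min_up = resample(
    X_min, y_min, replace=True, n_samples=len(y_maj), random_state=0
)

X_bal = np.vstack([X_maj, X_min_up])
y_bal = np.concatenate([y_maj, y_min_up])
print(np.bincount(y_bal))
```

Resampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.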

Or you could define your custom metric in auto-sklearn that somehow takes the imbalance of the classes into account.
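One possible shape for such a metric is sketched below: an accuracy in which mistakes on the minority class count more. The function name, `minority_label`, and `minority_weight` are hypothetical and chosen only for illustration. The commented-out wrapping assumes auto-sklearn is installed and relies on `autosklearn.metrics.make_scorer`; it is not executed here.

```python
import numpy as np
from sklearn.metrics import accuracy_score


def minority_weighted_accuracy(y_true, y_pred, minority_label=1, minority_weight=5.0):
    """Accuracy where each minority-class sample counts `minority_weight` times."""
    y_true = np.asarray(y_true)
    # Hypothetical weighting scheme: a fixed multiplier for the minority class.
    sample_weight = np.where(y_true == minority_label, minority_weight, 1.0)
    return accuracy_score(y_true, y_pred, sample_weight=sample_weight)


# Wrapping for auto-sklearn (assumption: auto-sklearn is installed; not run here):
# import autosklearn.metrics
# scorer = autosklearn.metrics.make_scorer(
#     name="minority_weighted_accuracy",
#     score_func=minority_weighted_accuracy,
# )

# A classifier that always predicts the majority class is penalized heavily.
print(minority_weighted_accuracy([0, 0, 0, 1], [0, 0, 0, 0]))
```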

You may also be interested in defining your own balancing component (see the "Extending Auto-Sklearn with Classification Component" example).

I hope this helps; please feel free to follow up.

Let me know @eddiebergman if I forgot about something.

UnixJunkie · 2022-10-21T00:58:18Z

Class weight is just another hyperparameter that needs to be optimized on some datasets, with some ML methods (like SVM).
Using the inverse of the class frequency is just an initial guess, sometimes very far from what optimization would give you.
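To make the idea concrete outside of auto-sklearn, here is a minimal sketch that treats the minority-class weight as a tunable parameter of a liblinear-backed linear SVM in plain sklearn. The synthetic dataset, the candidate weights, and the use of grid search are all assumptions for illustration; any hyperparameter optimizer could take the place of `GridSearchCV`.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import LinearSVC

# Synthetic imbalanced data (assumption: roughly 95% / 5% class split).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.95, 0.05], random_state=0
)

# Candidate weights for the minority class; 19.0 is about the inverse-frequency ratio.
candidate_weights = (1.0, 5.0, 19.0, 50.0, 200.0)
param_grid = {"class_weight": [{0: 1.0, 1: w} for w in candidate_weights]}

search = GridSearchCV(
    LinearSVC(max_iter=10000),
    param_grid,
    scoring="roc_auc",  # AUC is computed from the SVM's decision_function
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X, y)

best_weight = search.best_params_["class_weight"][1]
print(best_weight, search.best_score_)
```

The selected weight need not coincide with the inverse-frequency value, which is the point being made above.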

auto-sklearn failed miserably on this dataset, while by hand I could optimize a model using liblinear (with very strong class weighting for the under-represented class). auto-sklearn's AUC was 0.5; mine was 0.58 (yes, it is a hard binary classification dataset).

Trying to resample the classes doesn't help on this dataset. I tried bagging for class balancing.

There are already metrics in there that take class imbalance into account (e.g. AUC is fine if you output probabilities).

FYI, caret lets users pass a list of class weights to try for all methods that support class weights.
Although caret doesn't do it right: the weight should be optimized like all other hyperparameters, not scanned by the user.

aron-bram · 2022-10-21T10:30:12Z

We do realize that handling it as a hyperparameter would improve results on such extremely unbalanced datasets. It just hasn't been a priority for us given the lack of such requests.
But thank you for your suggestion, it indeed has the potential to improve the library.

We will consider adding this as a floating-point hyperparameter, which could be used by the Balancing class. However, unfortunately I cannot yet give you an exact date by which this feature will be included.
Is this an urgent issue for you?

If so, then you could implement your own balancing class as indicated at the bottom of my previous answer. This is far from being the optimal solution, but it should work. I can try to give you a hint on how to achieve this with a dummy implementation soon.

Thank you for your patience.

UnixJunkie · 2022-10-24T00:56:36Z

This is not urgent; auto-sklearn fails on this dataset, so I don't use it.

eddiebergman added the enhancement (A new improvement or feature) and question labels · Oct 19, 2022

eddiebergman mentioned this issue in "What's in store for Auto-Sklearn? -- From the Developers" #1677 (Open) · Jul 21, 2023
