Getting only TIMEOUT for PredefinedSplit #1274

Closed
mereldawu opened this issue Oct 28, 2021 · 7 comments · Fixed by #1341

@mereldawu

Describe the bug

When passing PredefinedSplit as a resampling strategy, every trial results in TIMEOUT, even on a small dataset. With the default configuration, auto-sklearn produces successful trials within a couple of seconds.

To Reproduce

This is the minimal code I can come up with, based on the example here.

import pandas as pd
import numpy as np
import autosklearn.classification
import autosklearn.metrics
from sklearn.model_selection import PredefinedSplit, train_test_split

# Using credit card public dataset to demonstrate the problem
df = pd.read_csv("https://github.com/raw/irenebenedetto/default-of-credit-card-clients/master/dataset/credit_cards_dataset.csv")
X_train, X_test = train_test_split(
    df, test_size=0.2, random_state=42
)
y_train = X_train.pop(X_train.columns[-1])

# Use a random column to create the validation set; it's meaningless, but demonstrates the point
resampling_strategy = PredefinedSplit(
    test_fold=np.where(X_train.to_numpy()[:, 4] < np.mean(X_train.to_numpy()[:, 4]))[0]
)

autosk = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=600,
    per_run_time_limit=200,
    tmp_folder="./tmp/autosklearn",
    disable_evaluator_output=False,
    resampling_strategy=resampling_strategy,
    metric=autosklearn.metrics.accuracy,
    delete_tmp_folder_after_terminate=False,
    seed=42
)

autosk.fit(X_train, y_train)

By commenting out the resampling_strategy line, the trials run successfully.

I've also tried increasing time_left_for_this_task and per_run_time_limit to 6000 each, but still only got TIMEOUT.

I also tried running the example code, and it ran and generated successful trials.

I'm not sure whether the issue is the dataset, the way I'm using PredefinedSplit, or something else.

Expected behavior

Generate multiple successful trials.

Actual behavior, stacktrace or logfile

Result from sprint statistics:
auto-sklearn results:
Dataset name: 1e6334d4-3831-11ec-9a9c-0255ac100090
Metric: accuracy
Number of target algorithm runs: 12
Number of successful target algorithm runs: 0
Number of crashed target algorithm runs: 0
Number of target algorithms that exceeded the time limit: 12
Number of target algorithms that exceeded the memory limit: 0

Logfile uploaded.

Environment and installation:

Please give details about your installation:

  • OS: Ubuntu 20.04.2 LTS (Focal Fossa) - a pod inside Kubeflow cluster
  • Is your installation in a virtual environment or conda environment: Normal python in a Kubeflow notebook
  • Python version: 3.7.1
  • Auto-sklearn version: 0.13.0
@eddiebergman
Contributor

Hi @mereldawu,

Sorry to read about this issue; I meant to reply earlier, apologies that I did not. I'm not sure what would cause it, since a misconfiguration should produce a crash rather than a TIMEOUT. Thank you for the full reproducible code example; we will look at this soon.

@eddiebergman
Contributor

eddiebergman commented Dec 8, 2021

Hi @mereldawu,

Sorry that I'm only getting to this now. I'm looking into it and getting the same time out issue.

Two things:

  • The ID column could be dropped.
  • You are using PredefinedSplit incorrectly. Its test_fold argument should be a numeric array with one entry per sample, containing values such as 1, 0, and -1, not a list of indices.
selected = (X_train.to_numpy()[:, 4] < np.mean(X_train.to_numpy()[:, 4])).astype(int)
assert len(X_train) == len(selected)  # an array with 1 for selected and 0 for not
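To illustrate the test_fold semantics, here is a minimal sklearn-only sketch (not from this thread): each entry assigns its sample to a numbered test fold, and -1 means the sample is never used for testing, so a 0/-1 array yields a single train/validation split.

```python
import numpy as np
from sklearn.model_selection import PredefinedSplit

# test_fold has one entry per sample: -1 keeps a sample in training for
# every split, while a fold number puts it in that fold's test set.
test_fold = np.array([-1, -1, -1, 0, 0, 0])
splitter = PredefinedSplit(test_fold=test_fold)

X = np.arange(12).reshape(6, 2)
y = np.zeros(6)

# split() yields (train_indices, test_indices) per fold; only fold 0 exists here
train_idx, test_idx = next(splitter.split(X, y))
print(train_idx)  # [0 1 2]
print(test_idx)   # [3 4 5]
```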

However, even doing these things, I get a timeout error so I will keep looking for the reason for that.

EDIT: I'm surprised that example runs correctly given this fact...

EDIT_2: There's actually nothing in that example that says it works, given that the final accuracy presented is so low.

@eddiebergman
Contributor

So with this sample code, where I fixed how PredefinedSplit is used, things work for me. We get some timeouts but also some completions. I'll try running for 10 minutes and see the results.

import os
import pickle

import pandas as pd
import numpy as np

from sklearn.model_selection import PredefinedSplit, train_test_split
from sklearn.datasets import load_iris

from autosklearn.classification import AutoSklearnClassifier


# Using credit card public dataset to demonstrate the problem
def user_data():
    df = pd.read_csv("https://github.com/raw/irenebenedetto/default-of-credit-card-clients/master/dataset/credit_cards_dataset.csv")
    df.drop(columns="ID", inplace=True)

    X_train, X_test = train_test_split(df, test_size=0.2, random_state=42)
    y_train = X_train.pop(X_train.columns[-1])

    selected = (X_train.to_numpy()[:, 4] < np.mean(X_train.to_numpy()[:, 4])).astype(int)

    strategy = PredefinedSplit(test_fold=selected)
    model_name = "user"
    return X_train, y_train, strategy, model_name, 240

"""
def sample_data():
    df = load_iris(as_frame=True)["frame"]

    X_train, X_test = train_test_split(df, test_size=0.2, random_state=42)
    y_train = X_train.pop(X_train.columns[-1])

    selected = (X_train.to_numpy()[:, 3] < np.mean(X_train.to_numpy()[:, 3])).astype(int)

    strategy = PredefinedSplit(test_fold=selected)
    model_name = "sample"
    return X_train, y_train, strategy, model_name, 30
"""

X_train, y_train, strategy, model_name, time = user_data()

model = None
if os.path.exists(model_name):
    with open(model_name, "rb") as f:
        model = pickle.load(f)
else:

    model = AutoSklearnClassifier(
        time_left_for_this_task=time,
        resampling_strategy=strategy,
    )
    model.fit(X_train, y_train)

    with open(model_name, "wb") as f:
        pickle.dump(model, f)

print(model.sprint_statistics())
print(model.leaderboard(detailed=True, ensemble_only=False))
auto-sklearn results:
  Dataset name: b0e76f59-5846-11ec-9387-f47b09df72c1
  Metric: accuracy
  Best validation score: 0.820417
  Number of target algorithm runs: 13
  Number of successful target algorithm runs: 4
  Number of crashed target algorithm runs: 0
  Number of target algorithms that exceeded the time limit: 9
  Number of target algorithms that exceeded the memory limit: 0

@eddiebergman
Contributor

An update with the 10-minute version: it appears our example was simply incorrect, and I'm not sure why sklearn does not complain given that array.

auto-sklearn results:
  Dataset name: b083b909-5848-11ec-95c5-f47b09df72c1
  Metric: accuracy
  Best validation score: 0.820417
  Number of target algorithm runs: 16
  Number of successful target algorithm runs: 7
  Number of crashed target algorithm runs: 0
  Number of target algorithms that exceeded the time limit: 9
  Number of target algorithms that exceeded the memory limit: 0

@eddiebergman
Contributor

eddiebergman commented Dec 8, 2021

Please let me know if this fixes your issue. In the meantime, I will update the example and do a small investigation into why we had silent errors and whether we can catch them.

You may also want to perform a refit as specified in that example.

@eddiebergman
Contributor

Some further looking shows that there's not much we can do to detect a bad PredefinedSplit: it still returns two arrays, albeit with some odd indices included in the test set and some indices missing.

The only check I could think of would require that the lengths of the two splits add up to the original length before splitting. While this is generally the case, I don't think we should enforce it.

import numpy as np
from sklearn.model_selection import PredefinedSplit

x = np.ones((100, 9))  # 100 rows, 9 features
y = np.ones((100,))    # 100 targets

splitter_good = PredefinedSplit(test_fold=[1]*50 + [0]*50)
splitter_bad = PredefinedSplit(test_fold=list(range(0, 51)))

# Correctly creates a 50/50 split
print(next(splitter_good.split(x, y)))
(array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
        17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
        34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]),
 array([50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
        67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83,
        84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]))

# Test split only has element 0 for some reason
print(next(splitter_bad.split(x, y)))
(array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
        35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]),
 array([0]))

# The check needed to enforce this (which I think is a bad idea);
# note that split() yields (train_indices, test_indices)
train_idxs, test_idxs = next(splitter_bad.split(x, y))
assert len(train_idxs) + len(test_idxs) == len(x)  # fails for splitter_bad
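For users who want to guard against this themselves before handing a splitter to auto-sklearn, a check along those lines could be sketched as follows (the splits_cover_data helper is hypothetical, not part of auto-sklearn or sklearn):

```python
import numpy as np
from sklearn.model_selection import PredefinedSplit

def splits_cover_data(splitter, X, y):
    """Hypothetical helper: verify every row lands in exactly one of
    train/test for each split the splitter produces."""
    for train_idx, test_idx in splitter.split(X, y):
        combined = np.concatenate([train_idx, test_idx])
        if len(combined) != len(X) or len(np.unique(combined)) != len(X):
            return False
    return True

X = np.ones((100, 9))
y = np.ones(100)

good = PredefinedSplit(test_fold=[1]*50 + [0]*50)      # one entry per sample
bad = PredefinedSplit(test_fold=list(range(0, 51)))    # a list of indices

print(splits_cover_data(good, X, y))  # True
print(splits_cover_data(bad, X, y))   # False
```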

@eddiebergman
Contributor

Hi @mereldawu, this was resolved by PR #1340, which updates our example on how to use PredefinedSplit (the example that led to your errors). There is nothing we can do to automatically detect bad splits returned by a custom splitter.
