Commit ab5c016

Change HP Name & Include Text example (#1410)
* rename "ngram_range" to "ngram_upper_bound"; this includes renaming it in all *csv and *json files for metalearning
* handle the following issue #1373 (comment); this commit fixes the first 3 bullet points on the to-do list.
  1. rename hyperparameter "ngram_range" --> "ngram_upper_bound"; this includes changing all *csv and *json files
  2. create a new text preprocessing example, example_text_preprocessing.py; this new example features the 20Newsgroups dataset. The import in example_text_preprocessing.py is too long, but I cannot come up with a good solution
* include feedback from 02.24.
* limit 20NG to 5 labels. automl.leaderboard has problems if the ensemble contains only one model. Therefore we reduced the problem complexity
* limit 20NG to 2 labels. automl.leaderboard has problems if the ensemble contains only one model. Therefore we reduced the problem complexity
1 parent 00b8e6e commit ab5c016
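
The rename touches stored configurations as well as code, so any configuration saved under the old name needs its key updated. The following is a minimal sketch of such a migration, not the script used for this commit. It assumes keys are colon-separated paths whose last segment is the hyperparameter name; the helper `rename_hyperparameter` and the example entry are hypothetical.

```python
import json

OLD_NAME = "ngram_range"
NEW_NAME = "ngram_upper_bound"


def rename_hyperparameter(config: dict) -> dict:
    """Return a copy of `config` with the old hyperparameter name replaced.

    Assumption: keys are colon-separated paths and the hyperparameter
    name is the last segment. Values are left untouched.
    """
    renamed = {}
    for key, value in config.items():
        prefix, _, last = key.rpartition(":")
        if last == OLD_NAME:
            key = f"{prefix}:{NEW_NAME}" if prefix else NEW_NAME
        renamed[key] = value
    return renamed


# Hypothetical configuration entry, not copied from the repository.
example = {
    "text_encoding:bag_of_word_encoding:ngram_range": 2,
    "text_encoding:bag_of_word_encoding:min_df_choice": "min_df_absolute",
}
migrated = rename_hyperparameter(example)
print(json.dumps(migrated, indent=2))
```

Matching only the last path segment keeps the sketch from rewriting unrelated keys that merely contain the old name as a substring.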

File tree

122 files changed

+951
-944
lines changed


autosklearn/experimental/balanced_accuracy/askl2_portfolios/RF_None_10CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/balanced_accuracy/askl2_portfolios/RF_None_3CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/balanced_accuracy/askl2_portfolios/RF_None_5CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/balanced_accuracy/askl2_portfolios/RF_None_holdout_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/balanced_accuracy/askl2_portfolios/RF_SH-eta4-i_10CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/balanced_accuracy/askl2_portfolios/RF_SH-eta4-i_3CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/balanced_accuracy/askl2_portfolios/RF_SH-eta4-i_5CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/balanced_accuracy/askl2_portfolios/RF_SH-eta4-i_holdout_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/log_loss/askl2_portfolios/RF_None_10CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/log_loss/askl2_portfolios/RF_None_3CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/log_loss/askl2_portfolios/RF_None_5CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/log_loss/askl2_portfolios/RF_None_holdout_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/log_loss/askl2_portfolios/RF_SH-eta4-i_10CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/log_loss/askl2_portfolios/RF_SH-eta4-i_3CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/log_loss/askl2_portfolios/RF_SH-eta4-i_5CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/log_loss/askl2_portfolios/RF_SH-eta4-i_holdout_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/roc_auc/askl2_portfolios/RF_None_10CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/roc_auc/askl2_portfolios/RF_None_3CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/roc_auc/askl2_portfolios/RF_None_5CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/roc_auc/askl2_portfolios/RF_None_holdout_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/roc_auc/askl2_portfolios/RF_SH-eta4-i_10CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/roc_auc/askl2_portfolios/RF_SH-eta4-i_3CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/roc_auc/askl2_portfolios/RF_SH-eta4-i_5CV_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/experimental/roc_auc/askl2_portfolios/RF_SH-eta4-i_holdout_iterative_es_if.json

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

autosklearn/metalearning/files/accuracy_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/accuracy_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/accuracy_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/accuracy_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/average_precision_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/average_precision_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/average_precision_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/average_precision_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/balanced_accuracy_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/balanced_accuracy_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/balanced_accuracy_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/balanced_accuracy_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_macro_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_macro_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_macro_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_macro_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_micro_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_micro_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_micro_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_micro_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_samples_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_samples_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_samples_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_samples_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_weighted_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_weighted_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_weighted_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/f1_weighted_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/log_loss_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/log_loss_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/log_loss_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/log_loss_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/mean_absolute_error_regression_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/mean_absolute_error_regression_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/mean_squared_error_regression_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/mean_squared_error_regression_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/mean_squared_log_error_regression_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/mean_squared_log_error_regression_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/median_absolute_error_regression_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/median_absolute_error_regression_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_macro_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_macro_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_macro_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_macro_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_micro_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_micro_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_micro_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_micro_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_samples_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_samples_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_samples_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_samples_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_weighted_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_weighted_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_weighted_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/precision_weighted_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/r2_regression_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/r2_regression_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_macro_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_macro_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_macro_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_macro_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_micro_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_micro_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_micro_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_micro_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_samples_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_samples_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_samples_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_samples_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_weighted_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_weighted_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_weighted_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/recall_weighted_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/roc_auc_binary.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/roc_auc_binary.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/roc_auc_multiclass.classification_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/roc_auc_multiclass.classification_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/root_mean_squared_error_regression_dense/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/metalearning/files/root_mean_squared_error_regression_sparse/configurations.csv

100755100644
Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

autosklearn/pipeline/components/data_preprocessing/text_encoding/bag_of_word_encoding.py

Lines changed: 7 additions & 7 deletions
@@ -17,13 +17,13 @@
 class BagOfWordEncoder(AutoSklearnPreprocessingAlgorithm):
     def __init__(
         self,
-        ngram_range: int = 1,
+        ngram_upper_bound: int = 1,
         min_df_choice: str = "min_df_absolute",
         min_df_absolute: int = 0,
         min_df_relative: float = 0.01,
         random_state: Optional[Union[int, np.random.RandomState]] = None,
     ) -> None:
-        self.ngram_range = ngram_range
+        self.ngram_upper_bound = ngram_upper_bound
         self.random_state = random_state
         self.min_df_choice = min_df_choice
         self.min_df_absolute = min_df_absolute
@@ -46,13 +46,13 @@ def fit(
             if self.min_df_choice == "min_df_absolute":
                 self.preprocessor = CountVectorizer(
                     min_df=self.min_df_absolute,
-                    ngram_range=(1, self.ngram_range),
+                    ngram_range=(1, self.ngram_upper_bound),
                 )

             elif self.min_df_choice == "min_df_relative":
                 self.preprocessor = CountVectorizer(
                     min_df=self.min_df_relative,
-                    ngram_range=(1, self.ngram_range),
+                    ngram_range=(1, self.ngram_upper_bound),
                 )

             else:
@@ -98,8 +98,8 @@ def get_hyperparameter_search_space(
         dataset_properties: Optional[DATASET_PROPERTIES_TYPE] = None,
     ) -> ConfigurationSpace:
         cs = ConfigurationSpace()
-        hp_ngram_range = CSH.UniformIntegerHyperparameter(
-            name="ngram_range", lower=1, upper=3, default_value=1
+        hp_ngram_upper_bound = CSH.UniformIntegerHyperparameter(
+            name="ngram_upper_bound", lower=1, upper=3, default_value=1
         )
         hp_min_df_choice_bow = CSH.CategoricalHyperparameter(
             "min_df_choice", choices=["min_df_absolute", "min_df_relative"]
@@ -112,7 +112,7 @@ def get_hyperparameter_search_space(
         )
         cs.add_hyperparameters(
             [
-                hp_ngram_range,
+                hp_ngram_upper_bound,
                 hp_min_df_choice_bow,
                 hp_min_df_absolute_bow,
                 hp_min_df_relative_bow,
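
The new name reflects how the value is used: it is passed as the upper end of `ngram_range=(1, n)`, so it is a single integer bound rather than a range. The sketch below illustrates what that bound means using plain whitespace tokenization. The helper `word_ngrams` is hypothetical, and the tokenization is a simplification that differs from what `CountVectorizer` does.

```python
from typing import List


def word_ngrams(text: str, ngram_upper_bound: int = 1) -> List[str]:
    """Return all word n-grams of length 1..ngram_upper_bound, shortest first.

    Simplified illustration: tokens are lowercased and split on whitespace.
    """
    if ngram_upper_bound < 1:
        raise ValueError("ngram_upper_bound must be at least 1")
    tokens = text.lower().split()
    grams = []
    for n in range(1, ngram_upper_bound + 1):
        # Slide a window of n tokens across the text.
        for start in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[start:start + n]))
    return grams


print(word_ngrams("the quick brown fox", ngram_upper_bound=2))
# ['the', 'quick', 'brown', 'fox', 'the quick', 'quick brown', 'brown fox']
```

With the search space shown in the diff (lower=1, upper=3), the largest setting would add trigrams on top of the unigrams and bigrams above.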

autosklearn/pipeline/components/data_preprocessing/text_encoding/bag_of_word_encoding_distinct.py

Lines changed: 9 additions & 7 deletions
@@ -16,13 +16,13 @@
 class BagOfWordEncoder(AutoSklearnPreprocessingAlgorithm):
     def __init__(
         self,
-        ngram_range: int = 1,
+        ngram_upper_bound: int = 1,
         min_df_choice: str = "min_df_absolute",
         min_df_absolute: int = 0,
         min_df_relative: float = 0.01,
         random_state: Optional[Union[int, np.random.RandomState]] = None,
     ) -> None:
-        self.ngram_range = ngram_range
+        self.ngram_upper_bound = ngram_upper_bound
         self.random_state = random_state
         self.min_df_choice = min_df_choice
         self.min_df_absolute = min_df_absolute
@@ -40,7 +40,8 @@ def fit(

                 for feature in X.columns:
                     vectorizer = CountVectorizer(
-                        min_df=self.min_df_absolute, ngram_range=(1, self.ngram_range)
+                        min_df=self.min_df_absolute,
+                        ngram_range=(1, self.ngram_upper_bound),
                     ).fit(X[feature])
                     self.preprocessor[feature] = vectorizer

@@ -50,7 +51,8 @@ def fit(

                 for feature in X.columns:
                     vectorizer = CountVectorizer(
-                        min_df=self.min_df_relative, ngram_range=(1, self.ngram_range)
+                        min_df=self.min_df_relative,
+                        ngram_range=(1, self.ngram_upper_bound),
                     ).fit(X[feature])
                     self.preprocessor[feature] = vectorizer
             else:
@@ -102,8 +104,8 @@ def get_hyperparameter_search_space(
         dataset_properties: Optional[DATASET_PROPERTIES_TYPE] = None,
     ) -> ConfigurationSpace:
         cs = ConfigurationSpace()
-        hp_ngram_range = CSH.UniformIntegerHyperparameter(
-            name="ngram_range", lower=1, upper=3, default_value=1
+        hp_ngram_upper_bound = CSH.UniformIntegerHyperparameter(
+            name="ngram_upper_bound", lower=1, upper=3, default_value=1
         )
         hp_min_df_choice_bow = CSH.CategoricalHyperparameter(
             "min_df_choice", choices=["min_df_absolute", "min_df_relative"]
@@ -116,7 +118,7 @@ def get_hyperparameter_search_space(
         )
         cs.add_hyperparameters(
             [
-                hp_ngram_range,
+                hp_ngram_upper_bound,
                 hp_min_df_choice_bow,
                 hp_min_df_absolute_bow,
                 hp_min_df_relative_bow,

autosklearn/pipeline/components/data_preprocessing/text_encoding/tfidf_encoding.py

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -17,14 +17,14 @@
1717
class TfidfEncoder(AutoSklearnPreprocessingAlgorithm):
1818
def __init__(
1919
self,
20-
ngram_range: int = 1,
20+
ngram_upper_bound: int = 1,
2121
use_idf: bool = True,
2222
min_df_choice: str = "min_df_absolute",
2323
min_df_absolute: int = 0,
2424
min_df_relative: float = 0.01,
2525
random_state: Optional[Union[int, np.random.RandomState]] = None,
2626
) -> None:
27-
self.ngram_range = ngram_range
27+
self.ngram_upper_bound = ngram_upper_bound
2828
self.random_state = random_state
2929
self.use_idf = use_idf
3030
self.min_df_choice = min_df_choice
@@ -50,14 +50,14 @@ def fit(
5050
self.preprocessor = TfidfVectorizer(
5151
min_df=self.min_df_absolute,
5252
use_idf=self.use_idf,
53-
ngram_range=(1, self.ngram_range),
53+
ngram_range=(1, self.ngram_upper_bound),
5454
)
5555

5656
elif self.min_df_choice == "min_df_relative":
5757
self.preprocessor = TfidfVectorizer(
5858
min_df=self.min_df_relative,
5959
use_idf=self.use_idf,
60-
ngram_range=(1, self.ngram_range),
60+
ngram_range=(1, self.ngram_upper_bound),
6161
)
6262

6363
else:
@@ -103,8 +103,8 @@ def get_hyperparameter_search_space(
103103
dataset_properties: Optional[DATASET_PROPERTIES_TYPE] = None,
104104
) -> ConfigurationSpace:
105105
cs = ConfigurationSpace()
106-
hp_ngram_range = CSH.UniformIntegerHyperparameter(
107-
name="ngram_range", lower=1, upper=3, default_value=1
106+
hp_ngram_upper_bound = CSH.UniformIntegerHyperparameter(
107+
name="ngram_upper_bound", lower=1, upper=3, default_value=1
108108
)
109109
hp_use_idf = CSH.CategoricalHyperparameter("use_idf", choices=[False, True])
110110
hp_min_df_choice = CSH.CategoricalHyperparameter(
@@ -118,7 +118,7 @@ def get_hyperparameter_search_space(
         )
         cs.add_hyperparameters(
             [
-                hp_ngram_range,
+                hp_ngram_upper_bound,
                 hp_use_idf,
                 hp_min_df_choice,
                 hp_min_df_absolute,
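
Note on the rename: the diff above passes the hyperparameter to `TfidfVectorizer` as `ngram_range=(1, self.ngram_upper_bound)`, so the value only ever sets the upper end of the n-gram range. The sketch below illustrates what such a range means for word n-grams. It is a minimal pure-Python illustration, not scikit-learn's implementation; the helper name `word_ngrams` and the whitespace tokenisation are assumptions made for this example.

```python
from typing import List


def word_ngrams(tokens: List[str], upper_bound: int) -> List[str]:
    """Return all word n-grams for n = 1..upper_bound (hypothetical helper).

    This mirrors the idea behind ngram_range=(1, upper_bound): the lower
    bound is fixed at 1 and only the upper bound varies.
    """
    if upper_bound < 1:
        raise ValueError("upper_bound must be at least 1")
    ngrams = []
    for n in range(1, upper_bound + 1):
        # Slide a window of n tokens over the sequence.
        for start in range(len(tokens) - n + 1):
            ngrams.append(" ".join(tokens[start:start + n]))
    return ngrams


# Assumption: plain whitespace splitting stands in for a real tokenizer.
tokens = "the quick brown fox".split()

unigrams = word_ngrams(tokens, 1)       # upper bound 1: single words only
up_to_bigrams = word_ngrams(tokens, 2)  # upper bound 2: words and word pairs

print(unigrams)
print(up_to_bigrams)
```

With the search space shown in the diff (`lower=1, upper=3`), the candidate ranges would be (1, 1), (1, 2) and (1, 3), which is why a name describing only the upper bound fits better than `ngram_range`.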
Lines changed: 59 additions & 54 deletions
@@ -1,79 +1,84 @@
 # -*- encoding: utf-8 -*-
 """
 ==================
-Text Preprocessing
+Text preprocessing
 ==================
-This example shows, how to use text features in *auto-sklearn*. *auto-sklearn* can automatically
-encode text features if they are provided as string type in a pandas dataframe.
 
-For processing text features you need a pandas dataframe and set the desired
-text columns to string and the categorical columns to category.
+The following example shows how to fit a simple NLP problem with
+*auto-sklearn*.
 
-*auto-sklearn* text embedding creates a bag of words count.
+For an introduction to text preprocessing you can follow these links:
+1. https://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html
+2. https://machinelearningmastery.com/clean-text-machine-learning-python/
 """
+from pprint import pprint
+
+import pandas as pd
 import sklearn.metrics
-import sklearn.datasets
+from sklearn.datasets import fetch_20newsgroups
+
 import autosklearn.classification
 
 ############################################################################
 # Data Loading
 # ============
+cats = ["comp.sys.ibm.pc.hardware", "rec.sport.baseball"]
+X_train, y_train = fetch_20newsgroups(
+    subset="train",  # select train set
+    shuffle=True,  # shuffle the data set for unbiased validation results
+    random_state=42,  # set a random seed for reproducibility
+    categories=cats,  # select only 2 out of 20 labels
+    return_X_y=True,  # 20NG dataset consists of 2 columns X: the text data, y: the label
+)  # load these two columns separately as numpy array
+
+X_test, y_test = fetch_20newsgroups(
+    subset="test",  # select test set for unbiased evaluation
+    categories=cats,  # select only 2 out of 20 labels
+    return_X_y=True,  # 20NG dataset consists of 2 columns X: the text data, y: the label
+)  # load these two columns separately as numpy array
 
-X, y = sklearn.datasets.fetch_openml(data_id=40945, return_X_y=True)
-
-# by default, the columns which should be strings are not formatted as such
-print(f"{X.info()}\n")
-
-# manually convert these to string columns
-X = X.astype(
-    {
-        "name": "string",
-        "ticket": "string",
-        "cabin": "string",
-        "boat": "string",
-        "home.dest": "string",
-    }
-)
+############################################################################
+# Creating a pandas dataframe
+# ===========================
+# Both categorical and text features are often strings. Python Pandas stores python strings
+# in the generic `object` type. Please ensure that the correct
+# `dtype <https://pandas.pydata.org/docs/user_guide/basics.html#dtypes>`_ is applied to the correct
+# column.
 
-# now *auto-sklearn* handles the string columns with its text feature preprocessing pipeline
+# create a pandas dataframe for training labeling the "Text" column as string
+X_train = pd.DataFrame({"Text": pd.Series(X_train, dtype="string")})
 
-X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
-    X, y, random_state=1
-)
+# create a pandas dataframe for testing labeling the "Text" column as string
+X_test = pd.DataFrame({"Text": pd.Series(X_test, dtype="string")})
 
-cls = autosklearn.classification.AutoSklearnClassifier(
-    time_left_for_this_task=30,
-    # Bellow two flags are provided to speed up calculations
-    # Not recommended for a real implementation
-    initial_configurations_via_metalearning=0,
-    smac_scenario_args={"runcount_limit": 1},
+############################################################################
+# Build and fit a classifier
+# ==========================
+
+# create an autosklearn Classifier or Regressor depending on your task at hand.
+automl = autosklearn.classification.AutoSklearnClassifier(
+    time_left_for_this_task=60,
+    per_run_time_limit=30,
+    tmp_folder="/tmp/autosklearn_text_example_tmp",
 )
 
-cls.fit(X_train, y_train, X_test, y_test)
-
-predictions = cls.predict(X_test)
-print("Accuracy score", sklearn.metrics.accuracy_score(y_test, predictions))
+automl.fit(X_train, y_train, dataset_name="20_Newsgroups")  # fit the automl model
 
+############################################################################
+# View the models found by auto-sklearn
+# =====================================
 
-X, y = sklearn.datasets.fetch_openml(data_id=40945, return_X_y=True, as_frame=True)
-X = X.select_dtypes(exclude=["object"])
+print(automl.leaderboard())
 
-X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
-    X, y, random_state=1
-)
+############################################################################
+# Print the final ensemble constructed by auto-sklearn
+# ====================================================
 
-cls = autosklearn.classification.AutoSklearnClassifier(
-    time_left_for_this_task=30,
-    # Bellow two flags are provided to speed up calculations
-    # Not recommended for a real implementation
-    initial_configurations_via_metalearning=0,
-    smac_scenario_args={"runcount_limit": 1},
-)
+pprint(automl.show_models(), indent=4)
 
-cls.fit(X_train, y_train, X_test, y_test)
+###########################################################################
+# Get the Score of the final ensemble
+# ===================================
 
-predictions = cls.predict(X_test)
-print(
-    "Accuracy score without text preprocessing",
-    sklearn.metrics.accuracy_score(y_test, predictions),
-)
+predictions = automl.predict(X_test)
+print("Accuracy score:", sklearn.metrics.accuracy_score(y_test, predictions))
