I am new to onnx in general so apologies if the issue is misplaced or I am missing something fundamental.
I'm coming from the tool autosklearn and planning to introduce some basic onnx support by exporting the models it finds after optimizing over possible pipelines. These models will mostly be an ensemble (VotingClassifier) whose members are themselves Pipelines with disjoint imputation strategies, feature preprocessing and estimators.
Based on the error below, it seems that using a VotingClassifier would require all features to be numeric (or at least of the same TensorType) to be viable? Is this correct? Is there something fundamental which would prevent the SklearnVotingClassifier operator from working with more than 1 input?
I am linking to this issue here in case anyone using autosklearn would like to enable onnx support and would be able to contribute! I've included a reproducible example and the traceback.
Reproducible Example
Apologies for using openml; the sklearn toy datasets do not have such varied column types.
The converter expects a single tensor as input. You can use a ColumnTransformer to concatenate all columns into one. I put the encoder first because onnx only supports numerical values for Imputer. This is the pipeline validated in PR #1030.
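from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# X is the feature matrix from the reproducible example.
model = Pipeline(
    steps=[
        (
            # Pass every column through unchanged so that the voting
            # classifier receives a single concatenated input.
            "concat",
            ColumnTransformer(
                [("concat", "passthrough", list(range(X.shape[1])))],
                sparse_threshold=0,
            ),
        ),
        (
            "voting",
            VotingClassifier(
                flatten_transform=False,
                estimators=[
                    (
                        "est",
                        Pipeline(
                            steps=[
                                # This encoder is placed before SimpleImputer because
                                # onnx does not support text for Imputer
                                ("encoder", OrdinalEncoder()),
                                (
                                    "imputer",
                                    SimpleImputer(strategy="most_frequent"),
                                ),
                                (
                                    "rf",
                                    RandomForestClassifier(
                                        n_estimators=4,
                                        max_depth=4,
                                        random_state=0,
                                    ),
                                ),
                            ],
                        ),
                    ),
                ],
            ),
        ),
    ]
)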