Get Selected Features by Preprocessing Steps #524
Hi, I'm not sure if I can follow the steps you did. Do you want to find the selected features after configuration? If yes, you can achieve it with this piece of code:
If that's not the case, please provide a brief code example of what you would like to do, where I can fill in the missing pieces.
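The snippet referenced above was not preserved in the thread. As a hedged sketch of the idea the next comment confirms (inspecting the fitted pipeline's `steps` attribute), here is the equivalent with a plain scikit-learn `Pipeline`, which exposes the same `steps` interface that auto-sklearn's pipeline builds on:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=2)),
    ("clf", DecisionTreeClassifier(random_state=0)),
])
pipe.fit(X, y)

# Walk the fitted steps; feature-selection steps expose get_support(),
# a boolean mask over the input columns marking which were kept.
for name, step in pipe.steps:
    if hasattr(step, "get_support"):
        print(name, step.get_support())
```

For an auto-sklearn model you would first pull a fitted pipeline out of the ensemble and then walk its `steps` the same way.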
Yeah, that's pretty much it! The steps attribute was the key. Thanks!
Hi, the above solution gives the feature importance scores but does not give the features those scores correspond to. I ask because I am passing categorical features in. I would appreciate any guidance on getting the features and the feature importance scores used by the final model. Thanks.
In scikit-learn this could be done via the get_feature_names method. However, Auto-sklearn does not implement this. I would be very happy about a contribution of this functionality, though.
I would love to contribute. How can I help? Can you point me in the right direction?
All relevant pipeline code lives in the pipeline subpackage: https://github.com/automl/auto-sklearn/tree/development/autosklearn/pipeline I must admit that I don't know how get_feature_names works exactly, and scikit-learn's pipeline class itself doesn't implement it. Maybe figuring out how this is supposed to be used in a scikit-learn pipeline would be a good first step?
@aimanakheel can you share the code for getting feature importance? Thanks!
@kevinsay My new input:

```python
pipeline = list(automl.automl.models.values())[0]
XgClass = pipeline.final_estimator.choice  # .estimator.coef
XGEstmtr = XgClass.estimator
feature_importance = XGEstmtr.coef_
```

My output:
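The `coef_` array above gives scores without names. One way to pair them up is to apply the selector's support mask to the list of input column names, since the final estimator only sees the kept columns. A sketch with plain scikit-learn and hypothetical names (not auto-sklearn's actual API):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)
# Hypothetical column names for illustration.
names = np.array(["sepal_len", "sepal_wid", "petal_len", "petal_wid"])

pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=2)),
    ("clf", LogisticRegression(max_iter=1000)),
]).fit(X, y)

# The selector's boolean mask restricts the name array to exactly the
# columns the final estimator saw, so coef_ columns line up with them.
mask = pipe.named_steps["select"].get_support()
for name, coefs in zip(names[mask], pipe.named_steps["clf"].coef_.T):
    print(name, coefs)
```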
Hi, I'm interested in using the results of the models found in a more independent way. Is there any way to get the selected features when the preprocessing step is a feature selection algorithm?
So far my approach is:
This is similar to what you did in the function test_weighting_effect (test file test_balancing.py). The thing is, when I call the fit_transformer method, the transformed data that is returned is a numpy array and not a dataframe (it has no column headers), so I can't tell which features were kept and which were removed.
Is there any way I can accomplish this? Perhaps an easier way than this approach?
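When a transform step returns a bare array, the headers can usually be recovered by applying the fitted selector's `get_support()` mask to the original column names. A minimal sketch with a toy selector (`VarianceThreshold`) standing in for whatever feature-selection step auto-sklearn chose:

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Toy data: the middle column is constant and should be dropped.
columns = ["age", "const", "income"]
X = np.array([[25.0, 1.0, 40.0],
              [32.0, 1.0, 55.0],
              [47.0, 1.0, 48.0]])

selector = VarianceThreshold()       # drops zero-variance columns
X_trans = selector.fit_transform(X)  # plain ndarray, headers lost

# Map the surviving columns back to their original names.
kept = [c for c, keep in zip(columns, selector.get_support()) if keep]
print(kept)            # ['age', 'income']
print(X_trans.shape)   # (3, 2)
```

The same indexing works on a pandas `Index`, so a dataframe's column labels can be filtered the same way after `fit_transformer` returns an array.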