
FastTree: Instantiate feature map for disk transpose and make Generalized Additive Models predictor resilient when feature map is not available. #123


Closed
codemzs opened this issue May 11, 2018 · 3 comments
Labels: bug (Something isn't working)

@codemzs
Member

codemzs commented May 11, 2018

During FastTree (gradient-boosted decision tree) training, we drop features that offer little to no value, such as features with a zero instance count or features without enough instances per unique feature value. As a result, the feature count in the training set can be less than or equal to the feature count in the user's input feature vector, so we use a feature map internally to map training-set features back to the input features.
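
To make the mapping concrete, here is a minimal sketch (all names and counts are illustrative, not the actual FastTree internals): the map records, for each feature kept in training, the index of the corresponding feature in the user's input vector.

```csharp
using System;
using System.Collections.Generic;

class FeatureMapSketch
{
    static void Main()
    {
        // Per-input-feature instance counts observed during data preparation
        // (hypothetical numbers; features 0 and 3 have no instances and get dropped).
        int[] instanceCounts = { 0, 5, 120, 0, 42 };

        // trainingFeatureIndex -> inputFeatureIndex
        var featureMap = new List<int>();
        for (int inputIndex = 0; inputIndex < instanceCounts.Length; inputIndex++)
        {
            if (instanceCounts[inputIndex] > 0) // keep only features with data
                featureMap.Add(inputIndex);
        }

        // Prints "1, 2, 4": training feature 0 is input feature 1, and so on.
        Console.WriteLine(string.Join(", ", featureMap));
    }
}
```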

Issue #1:
If no features are dropped or filtered during training, no feature map is created. FastTree handles a null feature map, but the Generalized Additive Model (GAM) predictor does not.
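
A rough sketch of what the resilience could look like, assuming a null map simply means the identity mapping (the helper and class names are hypothetical, not the actual GAM code):

```csharp
using System;

class NullMapResilienceSketch
{
    // Hypothetical helper: translate a training-set feature index back to the
    // user's input feature index; a null map means nothing was dropped, so the
    // mapping is the identity.
    static int ToInputFeatureIndex(int trainingFeatureIndex, int[] featureMap)
        => featureMap == null ? trainingFeatureIndex : featureMap[trainingFeatureIndex];

    static void Main()
    {
        int[] map = { 1, 2, 4 };                          // some features were dropped
        Console.WriteLine(ToInputFeatureIndex(0, map));   // 1
        Console.WriteLine(ToInputFeatureIndex(0, null));  // 0: identity, no crash
    }
}
```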

Issue #1.1:
Before training starts, FastTree runs a data preparation step that transposes the dataset and eliminates examples with missing feature values. The transpose can be done in memory or on disk (recommended for larger datasets). In the disk transpose, the code was neither filtering out features that should have been excluded from training nor creating a feature map when one should have been created. As a result, a null feature map was passed to the GAM predictor, which was not resilient to it.
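
Sketched below is the general shape of the fix, under the assumption that the disk path should mirror the in-memory path: apply the same feature filter and instantiate the feature map, so downstream consumers such as the GAM predictor never see an unexpectedly null map from this path. All names and the threshold parameter are illustrative:

```csharp
using System;
using System.Collections.Generic;

class DiskTransposeSketch
{
    // Illustrative only: the disk-transpose path applies the same filtering
    // rule as the in-memory path and instantiates the map, instead of passing
    // every column through with a null map.
    static int[] BuildFeatureMap(int[] instanceCounts, int minDocsPerFeature)
    {
        var kept = new List<int>();
        for (int col = 0; col < instanceCounts.Length; col++)
        {
            if (instanceCounts[col] >= minDocsPerFeature)
                kept.Add(col); // keep: enough instances to be useful in training
        }
        return kept.ToArray(); // trainingFeatureIndex -> inputFeatureIndex
    }

    static void Main()
    {
        int[] counts = { 0, 5, 120, 0, 42 };
        Console.WriteLine(string.Join(", ", BuildFeatureMap(counts, 1))); // 1, 2, 4
    }
}
```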

@markusweimer
Member

Can you explain more? The title makes it sound like two separate issues to me.

@codemzs
Member Author

codemzs commented May 11, 2018

@markusweimer: During FastTree (gradient-boosted decision tree) training, we drop features that offer little to no value, such as features with a zero instance count or features without enough instances per unique feature value. As a result, the feature count in the training set can be less than or equal to the feature count in the user's input feature vector, so we use a feature map internally to map training-set features back to the input features.

Issue #1:
If no features are dropped or filtered during training, no feature map is created. FastTree handles a null feature map, but the Generalized Additive Model (GAM) predictor does not.

Issue #1.1:
Before training starts, FastTree runs a data preparation step that transposes the dataset and eliminates examples with missing feature values. The transpose can be done in memory or on disk (recommended for larger datasets). In the disk transpose, the code was neither filtering out features that should have been excluded from training nor creating a feature map when one should have been created. As a result, a null feature map was passed to the GAM predictor, which was not resilient to it.

You are right, they are two issues, but they are also related.

@shauheen
Contributor

Closed by #122.

@ghost locked as resolved and limited conversation to collaborators on Mar 30, 2022