FastTree: Instantiate feature map for disk transpose and make Generalized Additive Models predictor resilient when feature map is not available. #122

Merged 10 commits on May 12, 2018
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
maml.exe TrainTest test=%Data% tr=BinaryClassificationGamTrainer dout=%Output% data=%Data% out=%Output% seed=1
Not adding a normalizer.
Making per-feature arrays
Changing data from row-wise to column-wise
Warning: Skipped 16 instances with missing features during training
Processed 683 instances
Binning and forming Feature objects
Starting to train ...
Training calibrator.
TEST POSITIVE RATIO: 0.3448 (241.0/(241.0+458.0))
Confusion table
||======================
PREDICTED || positive | negative | Recall
TRUTH ||======================
positive || 227 | 14 | 0.9419
negative || 13 | 445 | 0.9716
||======================
Precision || 0.9458 | 0.9695 |
OVERALL 0/1 ACCURACY: 0.961373
LOG LOSS/instance: 0.145652
Test-set entropy (prior Log-Loss/instance): 0.929318
LOG-LOSS REDUCTION (RIG): 84.326961
AUC: 0.991198

OVERALL RESULTS
---------------------------------------
AUC: 0.991198 (0.0000)
Accuracy: 0.961373 (0.0000)
Positive precision: 0.945833 (0.0000)
Positive recall: 0.941909 (0.0000)
Negative precision: 0.969499 (0.0000)
Negative recall: 0.971616 (0.0000)
Log-loss: 0.145652 (0.0000)
Log-loss reduction: 84.326961 (0.0000)
F1 Score: 0.943867 (0.0000)
AUPRC: 0.971819 (0.0000)

---------------------------------------
Physical memory usage(MB): %Number%
Virtual memory usage(MB): %Number%
%DateTime% Time elapsed(s): %Number%

--- Progress log ---
[1] 'FastTree data preparation' started.
[1] 'FastTree data preparation' finished in %Time%.
[2] 'FastTree in-memory bins initialization' started.
[2] 'FastTree in-memory bins initialization' finished in %Time%.
[3] 'FastTree feature conversion' started.
[3] 'FastTree feature conversion' finished in %Time%.
[4] 'GAM training' started.
[4] 'GAM training' finished in %Time%.
[5] 'Saving model' started.
[5] 'Saving model' finished in %Time%.
@@ -0,0 +1,4 @@
BinaryClassificationGamTrainer
AUC Accuracy Positive precision Positive recall Negative precision Negative recall Log-loss Log-loss reduction F1 Score AUPRC Learner Name Train Dataset Test Dataset Results File Run Time Physical Memory Virtual Memory Command Line Settings
0.991198 0.961373 0.945833 0.941909 0.969499 0.971616 0.145652 84.32696 0.943867 0.971819 BinaryClassificationGamTrainer %Data% %Data% %Output% 99 0 0 maml.exe TrainTest test=%Data% tr=BinaryClassificationGamTrainer dout=%Output% data=%Data% out=%Output% seed=1
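The summary metrics in this baseline follow directly from the four counts in the confusion table (227, 14, 13, 445). The sketch below recomputes them as a sanity check on the expected output; the function and variable names are illustrative and are not taken from the code under review.

```python
# Counts read from the confusion table in the baseline log.
# Rows are TRUTH, columns are PREDICTED.
tp, fn = 227, 14   # truth positive: predicted positive, predicted negative
fp, tn = 13, 445   # truth negative: predicted positive, predicted negative


def confusion_metrics(tp, fn, fp, tn):
    """Return the per-class and overall metrics for a 2x2 confusion table."""
    total = tp + fn + fp + tn
    pos_precision = tp / (tp + fp)
    pos_recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "positive_precision": pos_precision,
        "positive_recall": pos_recall,
        "negative_precision": tn / (tn + fn),
        "negative_recall": tn / (tn + fp),
        # F1 is the harmonic mean of positive precision and positive recall.
        "f1": 2 * pos_precision * pos_recall / (pos_precision + pos_recall),
    }


metrics = confusion_metrics(tp, fn, fp, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.6f}")
```

Rounded to six decimals, these values agree with the Accuracy, precision, recall, and F1 Score lines of the baseline.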

@@ -0,0 +1,49 @@
maml.exe TrainTest test=%Data% tr=BinaryClassificationGamTrainer{dt+} dout=%Output% data=%Data% out=%Output% seed=1
Not adding a normalizer.
Making per-feature arrays
Changing data from row-wise to column-wise on disk
Warning: 16 of 699 examples will be skipped due to missing feature values
Processed 683 instances
Binning and forming Feature objects
Starting to train ...
Training calibrator.
TEST POSITIVE RATIO: 0.3448 (241.0/(241.0+458.0))
Confusion table
||======================
PREDICTED || positive | negative | Recall
TRUTH ||======================
positive || 227 | 14 | 0.9419
negative || 13 | 445 | 0.9716
||======================
Precision || 0.9458 | 0.9695 |
OVERALL 0/1 ACCURACY: 0.961373
LOG LOSS/instance: 0.145652
Test-set entropy (prior Log-Loss/instance): 0.929318
LOG-LOSS REDUCTION (RIG): 84.326961
AUC: 0.991198

OVERALL RESULTS
---------------------------------------
AUC: 0.991198 (0.0000)
Accuracy: 0.961373 (0.0000)
Positive precision: 0.945833 (0.0000)
Positive recall: 0.941909 (0.0000)
Negative precision: 0.969499 (0.0000)
Negative recall: 0.971616 (0.0000)
Log-loss: 0.145652 (0.0000)
Log-loss reduction: 84.326961 (0.0000)
F1 Score: 0.943867 (0.0000)
AUPRC: 0.971819 (0.0000)

---------------------------------------
Physical memory usage(MB): %Number%
Virtual memory usage(MB): %Number%
%DateTime% Time elapsed(s): %Number%

--- Progress log ---
[1] 'FastTree disk-based bins initialization' started.
[1] 'FastTree disk-based bins initialization' finished in %Time%.
[2] 'GAM training' started.
[2] 'GAM training' finished in %Time%.
[3] 'Saving model' started.
[3] 'Saving model' finished in %Time%.
@@ -0,0 +1,4 @@
BinaryClassificationGamTrainer
AUC Accuracy Positive precision Positive recall Negative precision Negative recall Log-loss Log-loss reduction F1 Score AUPRC /dt Learner Name Train Dataset Test Dataset Results File Run Time Physical Memory Virtual Memory Command Line Settings
0.991198 0.961373 0.945833 0.941909 0.969499 0.971616 0.145652 84.32696 0.943867 0.971819 + BinaryClassificationGamTrainer %Data% %Data% %Output% 99 0 0 maml.exe TrainTest test=%Data% tr=BinaryClassificationGamTrainer{dt+} dout=%Output% data=%Data% out=%Output% seed=1 /dt:+
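The log-loss lines in both baselines can be cross-checked in the same spirit. The sketch below assumes that "Test-set entropy" is the base-2 entropy of the test positive ratio (241 of 699) and that the reduction is the relative drop from that prior to the model's log-loss, expressed as a percentage. That reading reproduces the logged numbers, but it is inferred from the output rather than from the source code.

```python
import math

# Values read from the baseline log.
positives, negatives = 241, 458
log_loss = 0.145652  # "LOG LOSS/instance"

# Assumption: the prior entropy is the binary entropy (in bits) of the
# positive ratio in the test set.
p = positives / (positives + negatives)
prior_entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Assumption: the reduction is reported as a percentage of the prior entropy.
reduction_pct = 100 * (prior_entropy - log_loss) / prior_entropy

print(f"positive ratio: {p:.4f}")
print(f"prior entropy:  {prior_entropy:.6f}")
print(f"reduction (%):  {reduction_pct:.4f}")
```

Because `log_loss` is taken from the log already rounded to six decimals, the recomputed reduction can differ from the logged 84.326961 in its last digits.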
