Description
Please note that this problem may just be a bug in my code, but I'd rather report it anyway in case someone can confirm whether the problem is in Encog or not.
The PROBEN1 report (page 32) states that, for this dataset (gene1), training reaches an MSE of 0.027 in 101 epochs on average, with a classification error around 17%.
I can't get anywhere near that with my code based on Encog, so I've implemented the same experiment in both MATLAB and Encog to compare the behaviours objectively. Training stops when the MSE drops below 0.05 or when the number of iterations reaches 300, and I repeat the training process over 100 randomly initialised networks.
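The stopping criteria above can be sketched as a small driver loop. Note that `init_net` and `train_step` are hypothetical placeholders here (random weight initialisation and one epoch of training returning the current MSE), not actual Encog or MATLAB calls:

```python
def run_trials(train_step, init_net, n_trials=100, max_epochs=300, target_mse=0.05):
    """Repeat training over randomly initialised networks; each trial stops
    when the MSE drops below target_mse or after max_epochs epochs."""
    results = []
    for _ in range(n_trials):
        net = init_net()                  # fresh random initialisation
        mse, epoch = float("inf"), 0
        while mse > target_mse and epoch < max_epochs:
            mse = train_step(net)         # one full epoch; returns current MSE
            epoch += 1
        results.append((epoch, mse, mse <= target_mse))
    successes = sum(1 for _, _, ok in results if ok)
    return results, successes
```

A trial counts as a success only if it hits the MSE target before the epoch limit; that is how the "Successes: N / 10" lines in the results are counted.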
In MATLAB, training stops before the 300-epoch limit in 88 out of 100 runs, having reached the MSE target of 0.05; on average it does so in 95 epochs. The classification accuracies are in line with those described in PROBEN1.
With my code based on Encog, using the same dataset, the same network architecture and, I believe, the same RProp heuristics, the MSE is 0.14 on average, and it drops below 0.11 only once in the 100 training runs.
In practice this difference means that the trained network should be able to correctly classify 83% of the testing samples, but instead it is stuck at 65% even when I let the training run for up to 3000 epochs. As you can see below, the difference in classification error over the training set is even worse.
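By "the same RProp heuristics" I mean the standard sign-based rule with η+ = 1.2, η− = 0.5 and per-weight step sizes clamped to [1e-6, 50], which I believe are also Encog's defaults. A minimal single-weight sketch of my understanding of the update rule (my own reimplementation, not Encog's actual code):

```python
# Standard RProp hyperparameters; assumed here to match Encog's defaults.
ETA_PLUS, ETA_MINUS = 1.2, 0.5
DELTA_MAX, DELTA_MIN = 50.0, 1e-6

def rprop_step(w, grad, prev_grad, delta):
    """One RProp- update for a single weight w, given the current gradient,
    the previous gradient, and the current per-weight step size delta."""
    s = grad * prev_grad
    if s > 0:                             # same sign: accelerate
        delta = min(delta * ETA_PLUS, DELTA_MAX)
    elif s < 0:                           # sign flip: we overshot, back off
        delta = max(delta * ETA_MINUS, DELTA_MIN)
        return w, 0.0, delta              # no weight change after a sign flip
    if grad > 0:                          # step against the gradient sign,
        w -= delta                        # by delta, regardless of magnitude
    elif grad < 0:
        w += delta
    return w, grad, delta
```

If Encog deviates from any of these defaults (or applies the iRProp+ variant with weight backtracking instead), that alone could explain part of the gap, but not a 3x difference in final MSE.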
Here are the results for 10 trials:
MATLAB:
experiment(gene1, 300, [4,2,3], 10, 0.05);
Trial: 1, Epochs: 97, MSE: 0.0489183, CE(trn): 6.4%, CE(tst): 16.4%
Trial: 2, Epochs: 28, MSE: 0.0482678, CE(trn): 7.6%, CE(tst): 18.0%
Trial: 3, Epochs: 50, MSE: 0.0497159, CE(trn): 8.0%, CE(tst): 22.2%
Trial: 4, Epochs: 56, MSE: 0.0497302, CE(trn): 6.5%, CE(tst): 19.2%
Trial: 5, Epochs: 40, MSE: 0.0493543, CE(trn): 6.4%, CE(tst): 18.9%
Trial: 6, Epochs: 54, MSE: 0.0499842, CE(trn): 6.9%, CE(tst): 16.3%
Trial: 7, Epochs: 94, MSE: 0.0497391, CE(trn): 6.2%, CE(tst): 20.9%
Trial: 8, Epochs: 28, MSE: 0.0496143, CE(trn): 6.9%, CE(tst): 15.6%
Trial: 9, Epochs: 44, MSE: 0.0483782, CE(trn): 7.7%, CE(tst): 17.9%
Trial: 10, Epochs: 64, MSE: 0.0497216, CE(trn): 7.7%, CE(tst): 19.4%
Min MSE: 0.0482678, Max MSE: 0.0499842, Avg MSE: 0.0493424
Min epoch: 28, Max epoch: 97, Avg epoch: 55.5
Successes: 10 / 10
Architecture: 4 2 3
My code using Encog:
Trial: 1, Epochs: 300, MSE: 0.130949, CE(trn): 28.5%, CE(tst): 35.2%
Trial: 2, Epochs: 300, MSE: 0.155891, CE(trn): 41.8%, CE(tst): 41.2%
Trial: 3, Epochs: 300, MSE: 0.167867, CE(trn): 46.3%, CE(tst): 46.0%
Trial: 4, Epochs: 300, MSE: 0.149423, CE(trn): 34.5%, CE(tst): 38.5%
Trial: 5, Epochs: 300, MSE: 0.129148, CE(trn): 28.0%, CE(tst): 34.3%
Trial: 6, Epochs: 300, MSE: 0.167622, CE(trn): 46.0%, CE(tst): 46.5%
Trial: 7, Epochs: 300, MSE: 0.153942, CE(trn): 36.5%, CE(tst): 40.6%
Trial: 8, Epochs: 300, MSE: 0.136173, CE(trn): 29.3%, CE(tst): 32.7%
Trial: 9, Epochs: 300, MSE: 0.170760, CE(trn): 47.5%, CE(tst): 47.2%
Trial: 10, Epochs: 300, MSE: 0.125813, CE(trn): 26.8%, CE(tst): 34.9%
Min MSE: 0.125813, Max MSE: 0.170760, Avg MSE: 0.148759
Min epoch: 300, Max epoch: 300, Avg epoch: 300.000
Successes: 0 / 10
Architecture: ?:B->SIGMOID->4:B->SIGMOID->2:B->LINEAR->?
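For clarity, the CE(trn)/CE(tst) figures above are winner-takes-all classification error on the one-of-N encoded targets, i.e. the fraction of samples whose largest network output does not pick out the target class. This is my own scoring code, which I assume matches what PROBEN1 uses; a sketch:

```python
def classification_error(outputs, targets):
    """Winner-takes-all classification error: a sample is wrong when the
    index of its largest output differs from the index of its largest
    (one-of-N encoded) target value."""
    wrong = sum(
        1
        for out, tgt in zip(outputs, targets)
        if max(range(len(out)), key=out.__getitem__)
        != max(range(len(tgt)), key=tgt.__getitem__)
    )
    return wrong / len(outputs)
```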