Speed of Random Forest predictions #179

Closed
mjmckp opened this issue May 17, 2018 · 14 comments

@mjmckp

mjmckp commented May 17, 2018

I trained a FastForestBinaryClassifier on a toy model with two features and zero input transformations, and found that calling the Predict method on the resulting model takes about 5 ms for a single input on a high-spec machine. This seems quite slow; shouldn't it take less than 1 ms?

I notice that the FastTreePredictionWrapper type is able to write itself out as code:

public void SaveAsCode(TextWriter writer, RoleMappedSchema schema)

Would it be possible to write out the calibrated model as C# code, which should presumably be faster to run? This would have the additional benefit that the model could be included as static code and deployed without any ML.NET dependencies...

@shauheen shauheen added the question Further information is requested label May 17, 2018
@glebuk
Contributor

glebuk commented May 17, 2018

@mjmckp,
The speed of inference is a bit subtle. Every call to Predict() creates a scoring pipeline, which is relatively expensive. The actual scoring of each item is quite cheap by comparison (10-1000x faster, depending on model complexity). You can see this by comparing the time to Predict a single data point vs. the time to Predict an enumeration of 10 items; I'd expect those times to be nearly the same.
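
For illustration, a minimal timing sketch of that comparison (it assumes a trained PredictionModel<Data, Prediction> like the one built in the repro later in this thread; the CompareTimings helper itself is illustrative, not part of ML.NET):

    // Rough timing sketch; requires System, System.Collections.Generic and System.Linq.
    static void CompareTimings(PredictionModel<Data, Prediction> model, Random rand)
    {
        Func<Data> makeData = () => new Data
        {
            Features = new[] { (float)rand.NextDouble(), (float)rand.NextDouble() }
        };

        var sw = System.Diagnostics.Stopwatch.StartNew();
        var one = model.Predict(makeData());               // pays the pipeline setup cost for one score
        sw.Stop();
        Console.WriteLine("1 item:   {0:F3} ms", sw.Elapsed.TotalMilliseconds);

        sw.Restart();
        var ten = new List<Prediction>(                    // setup cost paid once, then 10 cheap scores
            model.Predict(Enumerable.Range(0, 10).Select(_ => makeData())));
        sw.Stop();
        Console.WriteLine("10 items: {0:F3} ms", sw.Elapsed.TotalMilliseconds);
    }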
Please note the discussion around PR #62. In that case the scoring time went from 1ms to 15ns (roughly 60,000x faster).
There are known patterns for creating super-fast single-item scoring that reuses the scoring pipeline.
Do you have a batch or a single-item scoring scenario?

@glebuk
Contributor

glebuk commented May 17, 2018

With regard to saving as code, exposing this functionality is on our roadmap. You can use SaveAsCode() for now at your own risk, but the API might change. Be aware that the method might produce incorrect results, as it does not capture the full pipeline: any transforms will not be included, which also excludes potential feature normalization and raw-score calibration.

@mjmckp
Author

mjmckp commented May 17, 2018

@glebuk I'm interested in using this in a single-item scoring scenario, with prediction times ideally less than 100 microseconds. Are there any examples of how to:

  1. Invoke the prediction model directly, without creating a scoring pipeline (assuming there are no feature transformation steps), and/or
  2. Call SaveAsCode on a calibrated random forest.

For 2), I can't see how to get a Microsoft.ML.Runtime.FastTree.FastTreePredictionWrapper (on which to call SaveAsCode) from the Microsoft.ML.Trainers.FastForestBinaryClassifier used when setting up the training pipeline, nor how to obtain the Microsoft.ML.Runtime.Data.RoleMappedSchema argument that SaveAsCode requires.

@glebuk
Contributor

glebuk commented May 18, 2018

Instead of converting to code, I'd suggest trying a different approach for now: one that reuses the pipeline initialization between calls.

We have plans to add support for fast single-item scoring. For now, however, you can work around the issue in the following way. It's not the most elegant, but it works.

First, can you please check how much longer a Predict() for an array of 2 inputs takes vs. a Predict() for a single input? The difference is the time of an individual prediction with the pipeline setup overhead factored out. My guess is that it would be within your time budget. The advantage of the method below is that you can use any learner and the full featurization pipeline.

Here's the workaround:

  1. Create a class that holds your trained model.
  2. In its constructor, create an instance of the trained model, feed the Predict method a FacadeEnumerable as input, and save the resulting output enumerator into a class field. Note that you need to save the enumerator, not the actual value it returns.
  3. Add a Predict method. It takes the input, calls Retarget() on the input enumerator, calls MoveNext() on the saved output enumerator, and returns Current (see the sketch after the FacadeEnumerable code below).
    public sealed class FacadeEnumerator<T> : IEnumerator<T>
    {
        private IEnumerator<T> _target;

        public FacadeEnumerator()
        {
        }

        public void Retarget(IEnumerator<T> target)
        {
            _target = target;
        }

        public T Current
        {
            get
            {
                if (_target == null)
                    return default(T);
                return _target.Current;
            }
        }

        object IEnumerator.Current
        {
            get
            {
                return Current;
            }
        }

        public void Dispose()
        {
        }

        public bool MoveNext()
        {
            var hasNext = _target.MoveNext();
            Contracts.Check(hasNext, "Moved past the end!");
            return hasNext;
        }

        public void Reset()
        {
            throw Contracts.ExceptNotImpl("Reset is not expected to be used when scoring.");
        }
    }

    public sealed class FacadeEnumerable<T> : IEnumerable<T>
    {
        private readonly FacadeEnumerator<T> _enumerator;

        public FacadeEnumerable()
        {
            _enumerator = new FacadeEnumerator<T>();
        }

        public FacadeEnumerator<T> GetEnumerator()
        {
            return _enumerator;
        }

        IEnumerator<T> IEnumerable<T>.GetEnumerator()
        {
            return _enumerator;
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return _enumerator;
        }
    }
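
For concreteness, here is a sketch of the wrapper class described in steps 1-3 above (the SingleItemPredictor name and the class itself are illustrative, not part of ML.NET):

    public sealed class SingleItemPredictor<TInput, TOutput>
        where TInput : class
        where TOutput : class, new()
    {
        private readonly TInput[] _buffer = new TInput[1];
        private readonly FacadeEnumerator<TInput> _inputEnumerator;
        private readonly IEnumerator<TOutput> _outputEnumerator;

        public SingleItemPredictor(PredictionModel<TInput, TOutput> model)
        {
            var facade = new FacadeEnumerable<TInput>();
            _inputEnumerator = facade.GetEnumerator();
            // The scoring pipeline is created once here and reused on every call.
            _outputEnumerator = model.Predict(facade).GetEnumerator();
        }

        public TOutput Predict(TInput input)
        {
            // Point the facade at a one-element sequence containing the new input...
            _buffer[0] = input;
            _inputEnumerator.Retarget(((IEnumerable<TInput>)_buffer).GetEnumerator());

            // ...and pull exactly one result out of the reused output enumerator.
            if (!_outputEnumerator.MoveNext())
                throw new InvalidOperationException("Scoring pipeline produced no output.");
            return _outputEnumerator.Current;
        }
    }

As the rest of this thread shows, this pattern only works once the prediction engine uses a single-threaded cursor; a parallel cursor will try to read ahead past the single retargeted item.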

@mjmckp
Author

mjmckp commented May 18, 2018

Thanks @glebuk, I gave that a go, and the first prediction works fine, but then the pipeline attempts to fetch and evaluate the next input value in the sequence (before I have called Retarget for the next singleton input). How do I get it to wait until the next input is ready?

@shauheen shauheen added this to the 0518 milestone May 18, 2018
@glebuk
Contributor

glebuk commented May 18, 2018

The trick is not to try to enumerate the output directly, but to get the enumerator from the result of the Predict(IEnumerable<TInput> inputs) overload and then enumerate it manually via MoveNext().

I'll try to fix this in the code in the next few days.

@mjmckp
Author

mjmckp commented May 20, 2018

Thanks @glebuk, I think that is what I am doing; see below. The output of the program when run is:

Not adding a normalizer.
Making per-feature arrays
Changing data from row-wise to column-wise
Processed 100 instances
Binning and forming Feature objects
Reserved memory for tree learner: 31824 bytes
Starting to train ...
Warning: 5 of the boosting iterations failed to grow a tree. This is commonly because the minimum documents in leaf hyperparameter was set too high for this dataset.
Training calibrator.
Calling inputEnumerator.Retarget
FacadeEnumerator: Retarget
Calling outputEnumerator.MoveNext
FacadeEnumerator: GetEnumerator
FacadeEnumerator: MoveNext
FacadeEnumerator: Current
FacadeEnumerator: MoveNext
FacadeEnumerator: Moved past the end!

Self-contained repro:

using System;
using System.Collections;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Runtime.Api;
using Microsoft.ML.Trainers;

namespace Axon.Research.ML.Net.Cs
{
    class Program
    {
        public sealed class FacadeEnumerator<T> : IEnumerator<T>
        {
            private IEnumerator<T> _target;

            public FacadeEnumerator()
            {
            }

            public void Retarget(IEnumerator<T> target)
            {
                Console.WriteLine("FacadeEnumerator: Retarget");
                _target = target;
            }

            public T Current
            {
                get
                {
                    Console.WriteLine("FacadeEnumerator: Current");
                    if (_target == null)
                        return default(T);
                    return _target.Current;
                }
            }

            object IEnumerator.Current
            {
                get
                {
                    return Current;
                }
            }

            public void Dispose()
            {
            }

            public bool MoveNext()
            {
                Console.WriteLine("FacadeEnumerator: MoveNext");
                var hasNext = _target.MoveNext();
                if (!hasNext)
                {
                    Console.WriteLine("FacadeEnumerator: Moved past the end!");
                    throw (new Exception("Moved past the end!"));
                }
                return hasNext;
            }

            public void Reset()
            {
                throw (new Exception("Reset is not expected to be used when scoring."));
            }
        }

        public sealed class FacadeEnumerable<T> : IEnumerable<T>
        {
            private readonly FacadeEnumerator<T> _enumerator;

            public FacadeEnumerable()
            {
                _enumerator = new FacadeEnumerator<T>();
            }

            public FacadeEnumerator<T> GetEnumerator()
            {
                return _enumerator;
            }

            IEnumerator<T> IEnumerable<T>.GetEnumerator()
            {
                Console.WriteLine("FacadeEnumerator: GetEnumerator");
                return _enumerator;
            }

            IEnumerator IEnumerable.GetEnumerator()
            {
                return _enumerator;
            }
        }
        public class Data
        {
            [ColumnName("Features")]
            [VectorType(2)]
            public float[] Features;

            [ColumnName("Label")]
            public float Label;
        }

        public class Prediction
        {
            [ColumnName("PredictedLabel")]
            public bool PredictedLabel;
        }

        static PredictionModel<Data,Prediction> Train(IEnumerable<Data> data)
        {
            var pipeline = new LearningPipeline();
            pipeline.Add(CollectionDataSource.Create(data));
            pipeline.Add(new FastForestBinaryClassifier());
            var model = pipeline.Train<Data, Prediction>();
            return model;
        }

        static void Main(string[] args)
        {
            var data = new Data[100];
            var rand = new Random();
            for (var i = 0; i < data.Length; i++)
            {
                data[i] = new Data();
                data[i].Features = new float[] { (float)rand.NextDouble(), (float)rand.NextDouble() };
                data[i].Label = rand.Next(2) == 0 ? 0.0f : 1.0f;
            }

            var model = Train(data);

            var facade = new FacadeEnumerable<Data>();
            var inputEnumerator = facade.GetEnumerator();
            var outputEnumerator = model.Predict(facade).GetEnumerator();

            // simulate evaluation of single input
            var input = new List<Data>(1);
            input.Add(new Data());

            for (var i=0; i<10; i++)
            {
                input[0].Features = new float[] { (float)rand.NextDouble(), (float)rand.NextDouble() };
                Console.WriteLine("Calling inputEnumerator.Retarget");
                inputEnumerator.Retarget(input.GetEnumerator());
                Console.WriteLine("Calling outputEnumerator.MoveNext");
                if (!outputEnumerator.MoveNext())
                    throw (new Exception("Failed to MoveNext on output enumerator"));
                Console.WriteLine("{0} {1}", i, outputEnumerator.Current.PredictedLabel);
            }
        }
    }
}

@glebuk glebuk assigned Ivanidzo4ka and unassigned glebuk May 21, 2018
@glebuk
Contributor

glebuk commented May 21, 2018

Ivan,
Please add the solution for this using the Retargetable Enumerable approach (aka FacadeEnumerator), and add a "bool reuseObjects" option on Predict to handle both the output reuse and the enumeration reuse within the method.

@Ivanidzo4ka
Contributor

@glebuk It's a bit harder than you might imagine.
We (in the internal codebase, not on GitHub) use FacadeEnumerable as the underlying layer for StreamingDataView, which is an IDataView. On top of that IDataView we use BatchPredictionEngine, which creates an engine over it, iterates, creates the output items, and fills them in. All of that is based on a cursor, which can't be reused: as soon as it is done going through the collection, it's done.

@glebuk
Contributor

glebuk commented May 21, 2018

@Ivanidzo4ka,
The issue (in the ML.NET codebase) is that the prediction engine tries to optimize by setting up parallel pipeline cursors (cursorSplitter/Validator). That requires it to load more examples than are available.
If you can update the predictor code to just use single-threaded prediction in that situation, I'm sure it will work.
@mjmckp,
Looks like the solution would require just a bit of work on the PredictionModel class to ensure that it does not use parallel scoring.

@Ivanidzo4ka
Contributor

@glebuk Fair point. From what I see, if I disable the parallel cursor, the whole Facade approach becomes unnecessary (same speed with the facade or without).

@Ivanidzo4ka
Contributor

I have a solution which, for 10,000 examples, reduces the time from 27 sec to 613 ms, of which 200 ms is model loading.
It looks like this:

var loadedModel = await PredictionModel.ReadAsync<Data, Prediction>(@"d:\tlc\model.zip", 1);

instead of

var loadedModel = await PredictionModel.ReadAsync<Data, Prediction>(@"d:\tlc\model.zip");

where the second parameter is a concurrency option that specifies how many threads we should use to run the prediction engine. But it feels counterintuitive. Let me check whether I can use the collection size to decide which cursor to create, parallel or single-threaded.
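
For reference, a sketch of how that plugs into the repro above (the WriteAsync/ReadAsync round trip and the file path are assumptions for illustration; Train, Data and Prediction are from the repro, and the two-argument ReadAsync overload is the one described here):

static async Task PredictSingleAsync(Data[] trainingData)   // requires using System.Threading.Tasks;
{
    var model = Train(trainingData);
    await model.WriteAsync("model.zip");

    // Second argument: prediction-engine concurrency. Passing 1 forces the
    // single-threaded cursor.
    var loadedModel = await PredictionModel.ReadAsync<Data, Prediction>("model.zip", 1);

    var sw = System.Diagnostics.Stopwatch.StartNew();
    var prediction = loadedModel.Predict(new Data { Features = new[] { 0.3f, 0.7f } });
    sw.Stop();
    Console.WriteLine("Single prediction: {0:F3} ms", sw.Elapsed.TotalMilliseconds);
}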

@glebuk
Contributor

glebuk commented May 25, 2018

@mjmckp
The PR went in; can you please re-benchmark to see if things got better? AFAIK you don't need to change anything.

@mjmckp
Author

mjmckp commented May 27, 2018

All good now, thanks @glebuk @Ivanidzo4ka - for the example above, predictions on a single input now take about 50 microseconds.

@mjmckp mjmckp closed this as completed May 27, 2018
@ghost ghost locked as resolved and limited conversation to collaborators Mar 30, 2022