Speed of Random Forest predictions #179
@mjmckp, with regard to save-as-code: exposing this functionality is on our roadmap. You can use SaveAsCode() for now at your own risk, but the API might change. Be aware that the method might produce incorrect results, since it does not capture the full pipeline: any transforms will not be included, which also excludes potential feature normalization and raw-score calibration.
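For intuition only, here is a hedged sketch of the kind of C# a tree-model code export could produce (this is not the actual SaveAsCode output; all indices, thresholds, and leaf values are invented):

```csharp
// Hypothetical illustration of a code-exported tree ensemble (NOT the
// actual SaveAsCode output): each tree compiles down to a chain of
// threshold comparisons, and the raw, uncalibrated score is the sum of
// the leaf values. Feature indices, thresholds, and leaves are made up.
public static class ExportedModel
{
    static double Tree0(float[] f) =>
        f[0] <= 0.5f
            ? (f[1] <= 1.3f ? -0.21 : 0.08)
            : (f[1] <= 2.7f ? 0.15 : 0.34);

    static double Tree1(float[] f) =>
        f[1] <= 0.9f ? -0.11 : 0.27;

    public static double RawScore(float[] f) => Tree0(f) + Tree1(f);
}
```

A handful of comparisons and additions per call is why a code export can be much faster than going through the full pipeline, but note the caveat above: any transforms, normalization, and calibration are lost.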
@glebuk I'm interested in using this in a single-item scoring scenario, with prediction times ideally less than 100 microseconds. Are there any examples of how to:
For 2), I can't see how to get a
I'd suggest, instead of converting to code, trying a different approach for now that reuses the pipeline initialization between calls. We have plans to add support for fast single scoring; for now, you can work around the issue in the following way. It's not the most elegant, but it works. First, could you please check how much longer a Predict() over an array of 2 inputs takes versus a Predict() for a single input? The difference is the cost of one individual prediction without the pipeline setup overhead; my guess is that it would be within your time budget. The advantage of the method below is that you can use any learner and the full featurization pipeline. Here's the workaround:
```csharp
using System.Collections;
using System.Collections.Generic;
using Microsoft.ML.Runtime; // for Contracts, ML.NET's internal assertion helper

// An enumerator whose underlying source can be swapped out ("retargeted")
// between pulls, so a lazily evaluated pipeline built over it can be fed
// one item at a time without being torn down and rebuilt.
public sealed class FacadeEnumerator<T> : IEnumerator<T>
{
    private IEnumerator<T> _target;

    public FacadeEnumerator()
    {
    }

    // Point this enumerator at a new underlying sequence. Call this with a
    // fresh single-item enumerator before each pull through the pipeline.
    public void Retarget(IEnumerator<T> target)
    {
        _target = target;
    }

    public T Current
    {
        get
        {
            if (_target == null)
                return default(T);
            return _target.Current;
        }
    }

    object IEnumerator.Current => Current;

    public void Dispose()
    {
    }

    public bool MoveNext()
    {
        // Fails loudly if the caller pulls past the end of the current
        // target; retarget to the next item before pulling again.
        var hasNext = _target.MoveNext();
        Contracts.Check(hasNext, "Moved past the end!");
        return hasNext;
    }

    public void Reset()
    {
        throw Contracts.ExceptNotImpl("Reset is not expected to be used when scoring.");
    }
}

// An enumerable that always hands out the same FacadeEnumerator, so the
// pipeline's single pass over it can be redirected via Retarget().
public sealed class FacadeEnumerable<T> : IEnumerable<T>
{
    private readonly FacadeEnumerator<T> _enumerator;

    public FacadeEnumerable()
    {
        _enumerator = new FacadeEnumerator<T>();
    }

    public FacadeEnumerator<T> GetEnumerator() => _enumerator;

    IEnumerator<T> IEnumerable<T>.GetEnumerator() => _enumerator;

    IEnumerator IEnumerable.GetEnumerator() => _enumerator;
}
```
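To make the intended wiring concrete, here is a minimal usage sketch (not from the thread) that reuses the FacadeEnumerable above. LazyScorer is a hypothetical stand-in for the real pipeline's Predict call, assumed to consume its input enumerable lazily and yield one score per input row:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a pipeline that lazily maps inputs to scores.
static IEnumerable<float> LazyScorer(IEnumerable<float[]> inputs)
{
    foreach (var x in inputs)
        yield return x[0] + x[1]; // placeholder for real model scoring
}

var inputFacade = new FacadeEnumerable<float[]>();

// Build the pipeline once and keep hold of the OUTPUT enumerator;
// a foreach here would tear the pipeline down after a single pass.
var output = LazyScorer(inputFacade).GetEnumerator();

float PredictOne(float[] features)
{
    // Point the facade at a fresh single-item sequence, then pull exactly
    // one scored row through the still-live pipeline.
    inputFacade.GetEnumerator().Retarget(
        new List<float[]> { features }.GetEnumerator());
    output.MoveNext();
    return output.Current;
}

Console.WriteLine(PredictOne(new[] { 1f, 2f })); // 3
Console.WriteLine(PredictOne(new[] { 5f, 7f })); // 12
```

The Contracts.Check in MoveNext then acts as a guard: if the pipeline ever pulls a second item before Retarget is called again, it fails loudly instead of silently reading a stale input.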
Thanks @glebuk, I gave that a go, and the first prediction works fine, but then the pipeline attempts to get the next input value in the sequence and evaluate it (before I have called
The trick is to not try to enumerate (foreach) over the output, but to actually get the enumerator from the function and pull items through it manually. I'll try to fix this in the code in the next few days.
Thanks @glebuk, I think that is what I am doing, see below. The output of the program when run is:
Self-contained repro:
Ivan,
@glebuk It's a bit harder than you might imagine.
@Ivanidzo4ka,
@glebuk Fair point. From what I see, if I disable the parallel cursor, the whole Facade approach becomes redundant (same speed with the facade or without it).
I have a solution which, for 10,000 examples, reduces the time from 27 sec to 613 ms, of which 200 ms is model loading.
instead of
where the second parameter is a concurrency option specifying how many threads we should use to run the prediction engine. But it feels counterintuitive. Let me check whether I can use the collection size to decide which cursor to create, parallel or single-threaded.
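For concreteness, the shape of that change might look like the sketch below. Caveat: this is an assumption based on the early ML.NET 0.x API, where the environment constructor took a concurrency argument; the exact type (TlcEnvironment here) and parameter name (conc) are not verified against a specific release:

```csharp
// Assumed early-ML.NET 0.x API: an environment whose constructor takes a
// concurrency argument. With conc: 1 the scorer uses a single-threaded
// cursor, avoiding the parallel-cursor setup cost that dominates when
// predicting one row at a time. Names here are assumptions, not verified.
using (var env = new TlcEnvironment(conc: 1))
{
    // ... load the model and run predictions inside this environment ...
}
```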
@mjmckp
All good now, thanks @glebuk @Ivanidzo4ka. For the example above, predictions on a single input now take about 50 microseconds.
I trained a FastForestBinaryClassifier on a toy model with two features and zero input transformations, and found that calling the Predict method on the resulting model takes about 5 ms for a single input on a high-spec machine. This seems quite slow; shouldn't it take less than 1 ms?
I notice that the FastTreePredictionWrapper type is able to write itself out as code:
machinelearning/src/Microsoft.ML.FastTree/FastTree.cs, line 2942 in ae1ecef
Would it be possible to write out the calibrated model as C# code, which should presumably be faster to run? This would have the additional benefit that it could be included as a static model, deployable without any ML.Net dependencies...