Update to use OnnxRuntime library instead of Sonoma #1717
Conversation
{
    _srcgetter(ref _vBuffer);
    _vBuffer.CopyToDense(ref _vBufferDense);
-   return OnnxUtils.CreateTensor(_vBufferDense.GetValues(), _tensorShape);
+   return OnnxUtils.CreateNamedOnnxValue(_colName, _vBufferDense.GetValues(), _tensorShape);
It's unfortunate how many copies we are making here.
1. _srcgetter will copy the data into _vBuffer.
2. _vBuffer will copy the dense data into _vBufferDense.
3. OnnxUtils.CreateNamedOnnxValue calls ToArray() on the Span passed into it, which will make a third copy.
I think in a "perfect" world, ML.NET would have Tensor as a "well-known" type, just like VBuffer. Then we could talk in terms of Tensor here instead of VBuffer, and not need to do so many copies.
NOTE: my suggestion is a much larger work item that would be best done separately, not as part of this PR. #ByDesign
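For readers following the thread, here is a hypothetical sketch of the helper being discussed, reconstructed only from the behavior described above (the ToArray() call on the incoming span). It is not the actual OnnxUtils implementation, and the shape handling is simplified to a plain int[].

```csharp
// Hypothetical sketch of OnnxUtils.CreateNamedOnnxValue, based on the review
// comment above; NOT the actual implementation from this PR.
using Microsoft.ML.OnnxRuntime;
using System.Numerics.Tensors; // DenseTensor<T>; newer OnnxRuntime versions expose Microsoft.ML.OnnxRuntime.Tensors

internal static class OnnxUtilsSketch
{
    public static NamedOnnxValue CreateNamedOnnxValue<T>(
        string name, ReadOnlySpan<T> data, int[] dimensions)
    {
        // ToArray() on the span is the third copy called out in the comment above.
        var tensor = new DenseTensor<T>(data.ToArray(), dimensions);
        return NamedOnnxValue.CreateFromTensor(name, tensor);
    }
}
```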
Point noted. Currently the APIs require this level of copying, and I don't think there's an easy way to avoid it in the framework. If ML.NET had a tensor-typed column, we could pass it directly into OnnxRuntime and avoid the extra copies.
I agree we should track this separately. There are multiple data structures in play (VBuffer, Span, arrays, and tensors), and data needs to be converted from one representation to another. There's also another subtle dimension: CPU memory (managed/native heap or stack) versus GPU memory, in the case of GPU execution.
In reply to: 236341262
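As an aside on the copy discussion: a minimal sketch of how the third copy could be avoided if the dense buffer were already exposed as managed memory, by wrapping it in a DenseTensor instead of calling ToArray(). This is an illustration under that assumption, not a change proposed in this PR, and it does not address the two VBuffer copies.

```csharp
// Sketch only: DenseTensor can be constructed over existing Memory<T>, so wrapping
// an existing managed buffer avoids the ToArray() copy. Whether the dense VBuffer
// storage can be exposed this way is the open design question discussed above.
using Microsoft.ML.OnnxRuntime;
using System.Numerics.Tensors;

internal static class NoCopySketch
{
    public static NamedOnnxValue Wrap<T>(string name, T[] buffer, int[] dimensions)
    {
        // No new array is allocated here; the tensor views the caller's buffer.
        var tensor = new DenseTensor<T>(buffer.AsMemory(), dimensions);
        return NamedOnnxValue.CreateFromTensor(name, tensor);
    }
}
```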
We also need to update machinelearning/pkg/Microsoft.ML.OnnxTransform/Microsoft.ML.OnnxTransform.nupkgproj (Line 10 in ce44870) to point to the OnnxRuntime package instead of ML.Scoring. #Resolved
Thanks for the catch. Updated. In reply to: 441720143
Fixes #1272.
Fixes #1228.
Fixes #1514.
Replaces the Microsoft.ML.Scoring library with the new Microsoft.ML.OnnxRuntime library.
Upgrades the runtime to ONNX 1.3, which includes the IsNaN operator.
Adds descriptive error messages in place of SEH exceptions.
Adds XML documentation for important classes and methods.
Adds a full end-to-end example for users to start with.
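Since the description mentions an end-to-end example, here is a minimal sketch of scoring an ONNX model through the OnnxTransform using today's MLContext-based API. It is not the sample shipped with this PR; the model path, column names, and vector size are placeholders, and the exact API surface at the time of this PR may have differed.

```csharp
// Minimal sketch: load a small dataset, apply an ONNX model, and read back the scores.
using Microsoft.ML;
using Microsoft.ML.Data;

public class ModelInput
{
    // "input"/size 3 are placeholders; match them to your model's input tensor.
    [VectorType(3)]
    [ColumnName("input")]
    public float[] Input { get; set; }
}

public class ModelOutput
{
    // "output" is a placeholder; match it to your model's output tensor name.
    [ColumnName("output")]
    public float[] Output { get; set; }
}

public static class OnnxScoringExample
{
    public static void Run()
    {
        var mlContext = new MLContext();

        // Build a pipeline that runs the ONNX model via OnnxRuntime.
        var pipeline = mlContext.Transforms.ApplyOnnxModel(
            outputColumnName: "output",
            inputColumnName: "input",
            modelFile: "model.onnx"); // placeholder path

        var data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new ModelInput { Input = new float[] { 1f, 2f, 3f } }
        });

        var transformer = pipeline.Fit(data);
        var scored = transformer.Transform(data);

        // Materialize the scored rows back into strongly typed objects.
        var results = mlContext.Data.CreateEnumerable<ModelOutput>(scored, reuseRowObject: false);
    }
}
```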