Update to use OnnxRuntime library instead of Sonoma #1717
Conversation
{
    _srcgetter(ref _vBuffer);
    _vBuffer.CopyToDense(ref _vBufferDense);
-   return OnnxUtils.CreateTensor(_vBufferDense.GetValues(), _tensorShape);
+   return OnnxUtils.CreateNamedOnnxValue(_colName, _vBufferDense.GetValues(), _tensorShape);
It's unfortunate how many copies we are making here.
1. _srcgetter will copy the data into _vBuffer.
2. _vBuffer will copy the dense data into _vBufferDense.
3. OnnxUtils.CreateNamedOnnxValue calls ToArray() on the Span passed into it, which will make a third copy.
I think in a "perfect" world, ML.NET would have Tensor as a "well-known" type, just like VBuffer. Then we could talk in terms of Tensor here instead of VBuffer, and not need to do so many copies.
NOTE: my suggestion is a much larger work item that would be best done separately, not as part of this PR. #ByDesign
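For readers following the thread, here is a hypothetical sketch of the helper being discussed, reconstructed only from the behavior described above (the ToArray() call on the incoming span). It is not the actual OnnxUtils implementation, and the shape handling is simplified to a plain int[].

```csharp
// Hypothetical sketch of OnnxUtils.CreateNamedOnnxValue, based on the review
// comment above; NOT the actual implementation from this PR.
using Microsoft.ML.OnnxRuntime;
using System.Numerics.Tensors; // DenseTensor<T>; newer OnnxRuntime versions expose Microsoft.ML.OnnxRuntime.Tensors

internal static class OnnxUtilsSketch
{
    public static NamedOnnxValue CreateNamedOnnxValue<T>(
        string name, ReadOnlySpan<T> data, int[] dimensions)
    {
        // ToArray() on the span is the third copy called out in the comment above.
        var tensor = new DenseTensor<T>(data.ToArray(), dimensions);
        return NamedOnnxValue.CreateFromTensor(name, tensor);
    }
}
```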
Point noted. Currently the APIs require this level of copying, and I don't think there's an easy way to avoid it in the framework. If ML.NET had a tensor-typed column, we could pass it directly into OnnxRuntime and avoid the extra copies.
I agree we should track this separately. There are multiple data structures in play (VBuffer, Span, arrays, and tensors), and data needs to be converted from one representation to another. There's also another subtle dimension: CPU memory (managed/native heap or stack) versus GPU memory, in the case of GPU execution.
In reply to: 236341262
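As an aside on the copy discussion: a minimal sketch of how the third copy could be avoided if the dense buffer were already exposed as managed memory, by wrapping it in a DenseTensor instead of calling ToArray(). This is an illustration under that assumption, not a change proposed in this PR, and it does not address the two VBuffer copies.

```csharp
// Sketch only: DenseTensor can be constructed over existing Memory<T>, so wrapping
// an existing managed buffer avoids the ToArray() copy. Whether the dense VBuffer
// storage can be exposed this way is the open design question discussed above.
using Microsoft.ML.OnnxRuntime;
using System.Numerics.Tensors;

internal static class NoCopySketch
{
    public static NamedOnnxValue Wrap<T>(string name, T[] buffer, int[] dimensions)
    {
        // No new array is allocated here; the tensor views the caller's buffer.
        var tensor = new DenseTensor<T>(buffer.AsMemory(), dimensions);
        return NamedOnnxValue.CreateFromTensor(name, tensor);
    }
}
```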
We also need to update machinelearning/pkg/Microsoft.ML.OnnxTransform/Microsoft.ML.OnnxTransform.nupkgproj (Line 10 in ce44870) to point to the OnnxRuntime package instead of ML.Scoring. #Resolved
Thanks for the catch. Updated. In reply to: 441720143
Fixes #1272.
Fixes #1228.
Fixes #1514.
Replaces the Microsoft.ML.Scoring library with the new Microsoft.ML.OnnxRuntime library.
Upgrades the runtime to ONNX 1.3, which includes the IsNaN operator.
Adds descriptive error messages in place of SEH exceptions.
Adds XML documentation for important classes and methods.
Adds a full end-to-end example for users to start with.
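Since the description mentions an end-to-end example, here is a minimal sketch of scoring an ONNX model through the OnnxTransform using today's MLContext-based API. It is not the sample shipped with this PR; the model path, column names, and vector size are placeholders, and the exact API surface at the time of this PR may have differed.

```csharp
// Minimal sketch: load a small dataset, apply an ONNX model, and read back the scores.
using Microsoft.ML;
using Microsoft.ML.Data;

public class ModelInput
{
    // "input"/size 3 are placeholders; match them to your model's input tensor.
    [VectorType(3)]
    [ColumnName("input")]
    public float[] Input { get; set; }
}

public class ModelOutput
{
    // "output" is a placeholder; match it to your model's output tensor name.
    [ColumnName("output")]
    public float[] Output { get; set; }
}

public static class OnnxScoringExample
{
    public static void Run()
    {
        var mlContext = new MLContext();

        // Build a pipeline that runs the ONNX model via OnnxRuntime.
        var pipeline = mlContext.Transforms.ApplyOnnxModel(
            outputColumnName: "output",
            inputColumnName: "input",
            modelFile: "model.onnx"); // placeholder path

        var data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new ModelInput { Input = new float[] { 1f, 2f, 3f } }
        });

        var transformer = pipeline.Fit(data);
        var scored = transformer.Transform(data);

        // Materialize the scored rows back into strongly typed objects.
        var results = mlContext.Data.CreateEnumerable<ModelOutput>(scored, reuseRowObject: false);
    }
}
```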