Add onnxruntime as wasi-nn backend #4485


Open · wants to merge 9 commits into base: main

Conversation

@dongsheng28849455 (Contributor)

Steps to verify:

  • Install the ONNX Runtime (official release), assumed in /opt/onnxruntime
  • Build iwasm with WAMR_BUILD_WASI_NN_ONNX enabled
  • Use an onnx model of ssd-mobilenetv1 from the ONNX Model Zoo
  • Generate the data file input_tensor.bin from the original picture for wasi-nn (with shape [1, 383, 640, 3])
  • Use nn-cli for test, e.g.
--load-graph=file=./ssd_mobilenet_v1.onnx,id=graph1,encoding=1 \
--init-execution-context=graph-id=graph1,id=exec0 \
--set-input=file=./input_tensor.bin,context-id=exec0,dim=1,dim=383,dim=640,dim=3,type=3 \
--compute=context-id=exec0 \
--get-output=context-id=exec0,file=output.bin

Generate output.bin with shape [1, 100, 4] and f32 type, whose contents match the sample's output.

@lum1n0us lum1n0us added the new feature Determine if this Issue request a new feature or this PR introduces a new feature. label Jul 14, 2025
@yamt (Collaborator) left a comment


what's the relationship with #4304?

@dongsheng28849455 (Contributor, Author)

> what's the relationship with #4304?

1. Adapted to the latest wasi-nn architecture and added support for WAMR_BUILD_WASI_EPHEMERAL_NN
2. Tested with models and nn-cli

@yamt yamt added the wasi-nn label Jul 14, 2025
@dongsheng28849455 dongsheng28849455 requested a review from yamt July 17, 2025 06:49
@dongsheng28849455 dongsheng28849455 force-pushed the feature/support_onnx_for_wasi-nn branch from aa88085 to 1fb25ad Compare July 29, 2025 08:09
@dongsheng28849455 dongsheng28849455 requested a review from yamt July 29, 2025 08:13
@dongsheng28849455 dongsheng28849455 force-pushed the feature/support_onnx_for_wasi-nn branch 2 times, most recently from 1e60909 to 29e4dd5 Compare July 29, 2025 08:17
@dongsheng28849455 dongsheng28849455 requested a review from yamt July 29, 2025 09:06
@yamt (Collaborator) commented Aug 4, 2025

1. The type converter between wasi-nn and ONNX Runtime returns a bool instead of the type.
2. out_buffer_size does not hold the expected size.
3. ONNX Runtime does not need to calculate the input tensor size.
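One possible reading of the first point is to have the converter return a wasi_nn_error and report the mapped type through an out-parameter. This is a minimal sketch with locally defined stand-in enums; the real ONNXTensorElementDataType and wasi-nn tensor type definitions live in onnxruntime_c_api.h and the wasi-nn headers:

```c
#include <stddef.h>

/* Stand-in enums for illustration only; the real values come from the
 * ONNX Runtime and wasi-nn headers and are not reproduced here. */
typedef enum { ORT_ELEM_FLOAT, ORT_ELEM_UINT8, ORT_ELEM_INT32 } ort_elem_t;
typedef enum { WASI_NN_FP32, WASI_NN_U8, WASI_NN_I32 } nn_type_t;
typedef enum { success = 0, invalid_argument } wasi_nn_error;

/* Return an error code instead of a bool, and write the converted
 * type through an out-parameter, one way to address the review point. */
static wasi_nn_error
convert_ort_type_to_wasi_nn_type(ort_elem_t ort_type, nn_type_t *out)
{
    switch (ort_type) {
        case ORT_ELEM_FLOAT: *out = WASI_NN_FP32; return success;
        case ORT_ELEM_UINT8: *out = WASI_NN_U8;   return success;
        case ORT_ELEM_INT32: *out = WASI_NN_I32;  return success;
        default: return invalid_argument; /* unmapped element type */
    }
}
```

The caller can then propagate the wasi_nn_error directly instead of translating a bool.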
@dongsheng28849455 dongsheng28849455 force-pushed the feature/support_onnx_for_wasi-nn branch from 0dcdcab to cd3cb6c Compare August 5, 2025 01:44
@dongsheng28849455 dongsheng28849455 requested a review from yamt August 5, 2025 01:46
@yamt (Collaborator) commented Aug 5, 2025

> * Using an onnx model of [ssd-mobilenetv1](https://github.com/onnx/models/tree/main/validated/vision/object_detection_segmentation/ssd-mobilenetv1) from [ONNX Model Zoo](https://github.com/onnx/models/blob/main/README.md#onnx-model-zoo)
> * Generate the data file input_tensor.bin from the original picture for wasi-nn (with shape [1, 383, 640, 3])
> * Use [nn-cli](https://github.com/bytecodealliance/wasm-micro-runtime/pull/4373) for test, e.g.
>   --load-graph=file=./ssd_mobilenet_v1.onnx,id=graph1,encoding=1 \
>   --init-execution-context=graph-id=graph1,id=exec0 \
>   --set-input=file=./input_tensor.bin,context-id=exec0,dim=1,dim=383,dim=640,dim=3,type=3 \
>   --compute=context-id=exec0 \
>   --get-output=context-id=exec0,file=output.bin
>
> Generate output.bin with shape [1, 100, 4] and f32 type, whose contents match the sample's output

Using this model, I had to use a non-zero index for get_output, so I had to fix an nn-cli bug.
Which model have you used?

@dongsheng28849455 (Contributor, Author)

> using this model, i had to use non-zero index for get_output. thus i had to fix nn-cli bug. which model have you used?

I'm using this one: https://github.com/onnx/models/blob/main/validated/vision/object_detection_segmentation/ssd-mobilenetv1/model/ssd_mobilenet_v1_10.onnx

@yamt (Collaborator) commented Aug 5, 2025

> I'm using this one: https://github.com/onnx/models/blob/main/validated/vision/object_detection_segmentation/ssd-mobilenetv1/model/ssd_mobilenet_v1_10.onnx

Thank you. However, this model looks the same in that regard (it has 4 outputs).

@yamt (Collaborator) commented Aug 5, 2025

> I'm using this one: https://github.com/onnx/models/blob/main/validated/vision/object_detection_segmentation/ssd-mobilenetv1/model/ssd_mobilenet_v1_10.onnx
>
> thank you. however, this model looks the same in that regard (it has 4 outputs).

maybe you somehow interpreted only the first (idx=0) output, which contains bounding boxes?

@dongsheng28849455 (Contributor, Author) commented Aug 5, 2025

> maybe you somehow interpreted only the first (idx=0) output, which contains bounding boxes?

output shape is [1, 100, 4]:
hexdump output.bin:
0000000 c05e 3ef8 5d0f 3e2f 06ad 3f09 693f 3e5a
0000010 a9d3 3f03 52ae 3d37 4e7d 3f18 1dd4 3e02
0000020 5914 3dc3 41a7 3efa 2644 3f3a 0000 3f80
0000030 73c5 3ef8 af0e 3e65 02ea 3f0b 69e3 3e84
0000040 50f7 3ef7 e514 3e04 d723 3f04 bb62 3e1f
0000050 5a2c 3ef3 75e2 3e84 4f2e 3f0e 329c 3e9e
0000060 c04a 3ef2 ca20 3e99 df0d 3f10 8744 3eb9
0000070 f5c0 3ec3 61ce 3e0f 066a 3ed6 aada 3e1c
0000080 0000 0000 0000 0000 0000 0000 0000 0000

For example, one bounding box:
c05e 3ef8 (0.48584265) is ymin
5d0f 3e2f (0.17125343) is xmin
06ad 3f09 (0.5352581) is ymax
693f 3e5a (0.2132921) is xmax

It looks good for my test picture (http://images.cocodataset.org/val2017/000000088462.jpg)

@yamt (Collaborator) commented Aug 6, 2025

> it looks good for my test picture (http://images.cocodataset.org/val2017/000000088462.jpg)

OK, I understood.

Actually, the model has 4 outputs as documented, and (with this nn-cli fix) you can get them as follows:

--load-graph=file=ssd_mobilenet_v1_10.onnx,encoding=1 \
--init-execution-context \
--set-input=file=input.bin,dim=1,dim=383,dim=640,dim=3,type=3 \
--compute \
--get-output=idx=0,file=output0.bin \
--get-output=idx=1,file=output1.bin \
--get-output=idx=2,file=output2.bin \
--get-output=idx=3,file=output3.bin

As wasi-nn doesn't have get-output-by-name, you need to use integer indexes.
The output tensors of this model are (detection_boxes, detection_classes, detection_scores, num_detections), in this order.

You were only looking at detection_boxes.
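Since the output order is fixed for this model, the index-to-name mapping can be captured in a small lookup table (names and order taken from the documented outputs of ssd_mobilenet_v1_10):

```c
#include <stddef.h>

/* wasi-nn's get-output takes an integer index, not a tensor name.
 * For ssd_mobilenet_v1_10 the documented output order is fixed, so a
 * table maps the index passed to --get-output back to the tensor name. */
static const char *
ssd_mobilenet_output_name(size_t idx)
{
    static const char *const names[] = {
        "detection_boxes",   /* idx=0, shape [1, 100, 4] */
        "detection_classes", /* idx=1 */
        "detection_scores",  /* idx=2 */
        "num_detections",    /* idx=3 */
    };
    return idx < sizeof(names) / sizeof(names[0]) ? names[idx] : NULL;
}
```

So output0.bin through output3.bin in the command above correspond to these four tensors in order.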

@dongsheng28849455 (Contributor, Author)

> you were only looking at detection_boxes.

OK, Thank you!

@dongsheng28849455 dongsheng28849455 requested a review from yamt August 8, 2025 02:17
@yamt (Collaborator) left a comment


lgtm

@dongsheng28849455 (Contributor, Author)

@lum1n0us could you help review the code?


/* Helper functions */
static void
check_status_and_log(const OnnxRuntimeContext *ctx, OrtStatus *status)

not used?

}

static bool
convert_ort_type_to_wasi_nn_type(ONNXTensorElementDataType ort_type,

not used?

err = convert_ort_error_to_wasi_nn_error(ctx, status);
NN_ERR_PRINTF("Failed to create ONNX Runtime environment: %s",
error_message);
ctx->ort_api->ReleaseStatus(status);

It seems convert_ort_error_to_wasi_nn_error() will ReleaseStatus(status); therefore, L194 might not be necessary.

wasi_nn_error err = convert_ort_error_to_wasi_nn_error(ctx, status);
NN_ERR_PRINTF("Failed to create ONNX Runtime session: %s",
error_message);
ctx->ort_api->ReleaseStatus(status);

It seems convert_ort_error_to_wasi_nn_error() will ReleaseStatus(status); therefore, L352 might not be necessary.
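The double-release concern is the usual "helper consumes the handle" ownership rule: once the converter releases the status, the caller must not release it again. A toy sketch with a stand-in status type (not the real OrtStatus/ReleaseStatus API):

```c
#include <stdlib.h>

/* Stand-in types for illustration; the real code deals with OrtStatus
 * and ctx->ort_api->ReleaseStatus(status). */
typedef struct { int code; } toy_status;
typedef enum { success = 0, runtime_error } wasi_nn_error;

/* The helper takes ownership of status and releases it itself. */
static wasi_nn_error
convert_status_to_error(toy_status *status)
{
    wasi_nn_error err = status->code ? runtime_error : success;
    free(status); /* released here: the caller must NOT release again */
    return err;
}

static wasi_nn_error
caller(void)
{
    toy_status *status = malloc(sizeof *status);
    status->code = 1; /* simulate a failed ORT call */
    wasi_nn_error err = convert_status_to_error(status);
    /* No free(status) here: a second release would be a double free,
     * which is what the review comment about the extra ReleaseStatus
     * call is pointing at. */
    return err;
}
```

Documenting the ownership transfer in the converter's comment would make the redundant ReleaseStatus calls easy to spot.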

NN_ERR_PRINTF("Maximum number of graphs reached");
return runtime_error;
}


Just my doubt: should we add a guard on name, e.g. for when name is empty or NULL?

__attribute__((visibility("default"))) wasi_nn_error
load(void *onnx_ctx, graph_builder_array *builder, graph_encoding encoding,
execution_target target, graph *g)
{

Add a guard on onnx_ctx like the others?

   if (!onnx_ctx) {
        return runtime_error;
    }

__attribute__((visibility("default"))) wasi_nn_error
load_by_name(void *onnx_ctx, const char *name, uint32_t filename_len, graph *g)
{
OnnxRuntimeContext *ctx = (OnnxRuntimeContext *)onnx_ctx;

Add a guard on onnx_ctx like the others?

   if (!onnx_ctx) {
        return runtime_error;
    }

ctx->ort_api->ReleaseValue(output.second);
}
ctx->ort_api->ReleaseMemoryInfo(ctx->exec_ctxs[i].memory_info);
ctx->exec_ctxs[i].is_initialized = false;

Should input_names and output_names also be released here?

Labels: new feature, wasi-nn
3 participants