
Need to clarify constraints on 'beginningPadding' and 'endingPadding' for the pad operation in "reflection" and "symmetric" modes #377


Closed
BruceDai opened this issue Apr 11, 2023 · 5 comments · Fixed by #843


@BruceDai
Contributor

When I implemented the pad operation for WebNN-Baseline (webmachinelearning/webnn-baseline#47), I found that constraints on 'beginningPadding' and 'endingPadding' are missing for the "reflection" and "symmetric" modes. I propose adding the constraints in the table below. @huningxin @wchao1115 @fdwr PTAL, thanks.

padding \ mode   | reflection                                      | symmetric
beginningPadding | beginningPadding[d] < input size of dimension d | beginningPadding[d] <= input size of dimension d
endingPadding    | endingPadding[d] < input size of dimension d    | endingPadding[d] <= input size of dimension d

Reference:

  1. DML_PADDING_OPERATOR_DESC
  2. tf.mirrorPad
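
A minimal validation sketch of the constraints proposed in the table above (a hypothetical Python helper, not spec text or any existing implementation):

def validate_mirror_padding(input_shape, beginning_padding, ending_padding, mode):
    # Hypothetical check mirroring the proposed table; names are illustrative only.
    for d, size in enumerate(input_shape):
        begin, end = beginning_padding[d], ending_padding[d]
        if mode == 'reflection':
            # Reflection excludes the edge sample, so a single mirror produces
            # at most size - 1 values: require padding < size.
            if begin >= size or end >= size:
                raise ValueError(f'reflection padding must be < {size} at dimension {d}')
        elif mode == 'symmetric':
            # Symmetric includes the edge sample, so a single mirror produces
            # up to size values: require padding <= size.
            if begin > size or end > size:
                raise ValueError(f'symmetric padding must be <= {size} at dimension {d}')

validate_mirror_padding([4], [3], [3], 'reflection')  # OK under the proposal
validate_mirror_padding([4], [4], [0], 'symmetric')   # OK under the proposal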
@fdwr
Collaborator

fdwr commented Apr 14, 2023

Padding in principle isn't and shouldn't be limited to the input size - it just repeats as many times as needed to fill the gap:

[image]

Now extended with extra repeated padding:

[image]

For example with DML_PADDING_OPERATOR_DESC, if you start with a 4D input tensor of 4 elements and pad with 8 trailing elements using DML_PADDING_MODE_REFLECTION, the reflection occurs more than once to yield 12 elements (notice the reflection pattern repeats within the 8 padded elements).

dxdispatch.exe models\dml_padding.json
Running on 'NVIDIA Quadro P620 '
Dispatch 'clip': 1 iteration, 3.1778 ms (CPU), 0.0082 ms (GPU)
Resource 'input': 0, 1, 2, 3
Resource 'output': 0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1
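
That repeated-mirror behaviour amounts to a simple index mapping; the following sketch (plain Python, not DirectML code) reproduces the output above:

def mirror_index(pos, n, mode):
    # Map an output position to a source index under repeated mirroring.
    # 'reflection' excludes the edge sample (period 2n - 2, assumes n > 1);
    # 'symmetric' repeats it (period 2n).
    if mode == 'reflection':
        period = 2 * n - 2
        m = pos % period
        return period - m if m >= n else m
    else:  # 'symmetric'
        period = 2 * n
        m = pos % period
        return period - 1 - m if m >= n else m

a = [0, 1, 2, 3]
print([a[mirror_index(i, len(a), 'reflection')] for i in range(len(a) + 8)])
# [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1] -- matches the DML reflection output above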

It's generally the case for imaging APIs and texture wrapping in OpenGL/D3D that reflection/mirroring repeats.

Evidently though, TF appears limited here:

If mode is "reflect" then both paddings[D, 0] and paddings[D, 1] must be no greater than x.shape[D] - 1

And PyTorch doesn't like it either:

import torch
import torch.nn

m = torch.nn.ReflectionPad2d((0, 8, 0, 0))
input = torch.arange(4, dtype=torch.float).reshape(1, 1, 1, 4)
print(m(input))

RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (0, 8) at dimension 3 of input 4

So the questions are:

  • whether such extended padding will reach the WebNN API from these frameworks (seems possibly not)
  • whether the backends implementing WebNN are capable of supporting it. I know DML can, but I couldn't readily deduce XNNPack's or CoreML's capability. 🤔 Do you know?

If the answers to both of these are no, then limiting the bounds seems reasonable (even if from a pure theory POV, it should just work as expected).

@huningxin
Contributor

@fdwr

whether the backends implementing WebNN are capable of supporting it. I know DML can, but I couldn't readily deduce XNNPack's or CoreML's capability. 🤔 Do you know?

@philloooo already shared the CoreML's constraints in #739 that

If mode is “reflect” then beginning and ending paddings can be at most input size-1 (same as tensorflow).
If mode is “replicate” (aka edge) then beginning and ending paddings can be at most input size.

For TFLite, although it doesn't report an error, it gives different results than DirectML, e.g.

a = [0, 1, 2, 3]
pad(a, [0], [8], {mode: 'symmetric'})
// DML gives [0, 1, 2, 3, 3, 2, 1, 0, 0, 1, 2, 3]
// TFL gives [0, 1, 2, 3, 3, 2, 1, 0, 0, 0, 0, 0]

pad(a, [0], [8], {mode: 'reflection'})
// DML gives [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1]
// TFL gives [0, 1, 2, 3, 2, 1, 0, 0, 0, 0, 0, 0]

If the answers to both of these are no, then limiting the bounds seems reasonable

Limiting the padding size would give consistent results across backends. The proposal makes sense to me.

@fdwr
Collaborator

fdwr commented Mar 27, 2025

For TFLite, although it doesn't report an error, it gives different results

😳 Indeed, I'd rather give an error than silently return dubious results.

Limiting the padding size would give consistent results across backends.

Seems safer for now. If a model arises in the future that warrants supporting extended wrapping (mirrored tiling, like WebGPU and WebGL do), we can relax this restriction and emulate it where necessary...

@fdwr
Collaborator

fdwr commented Apr 28, 2025

Decision from the 2025-04-24 W3C meeting was that we would:

  • delete symmetric
  • restrict padding < dimensions for reflection

Q: Should the padding size restriction also apply to constant fills? (It seems unnecessary, but I don't know the limitations of TFLite and CoreML.)
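
A minimal sketch of the check implied by that decision (a hypothetical helper, assuming the restriction applies only to reflection):

def validate_pad(input_shape, beginning_padding, ending_padding, mode):
    # Per the 2025-04-24 decision: 'symmetric' is removed, and reflection
    # padding must be strictly less than the corresponding dimension size.
    # Other modes (e.g. constant, edge) are left unrestricted here; whether
    # they need limits is the open question above.
    if mode != 'reflection':
        return
    for d, size in enumerate(input_shape):
        if beginning_padding[d] >= size or ending_padding[d] >= size:
            raise ValueError(f'reflection padding must be < {size} at dimension {d}')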

@huningxin
Contributor

Q: Should the padding size restriction also apply to constant fills? (It seems unnecessary, but I don't know the limitations of TFLite and CoreML.)

According to my tests, the WebNN TFLite and CoreML backends accept large padding sizes for constant mode.

For example,

a = [1, 2, 3]
pad(a, [100], [100], {mode: 'constant'}) // works on both TFLite and CoreML

BTW, the CoreML backend has a padding size constraint (<= input size) for edge mode.

a = [1, 2, 3]
pad(a, [3], [0], {mode: 'edge'}) // works
b = [1, 2, 3]
pad(b, [4], [0], {mode: 'edge'}) // fails with model compilation error
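
For comparison, constant and edge padding have no inherent size limit mathematically; e.g. NumPy accepts arbitrarily large pad widths for both, so the CoreML failure above looks like a backend restriction rather than a fundamental one:

import numpy as np

a = np.array([1, 2, 3])
print(np.pad(a, (100, 100), mode='constant').shape)  # (203,), zero-filled, no size limit
print(np.pad(a, (4, 0), mode='edge'))                # [1 1 1 1 1 2 3], edge value repeats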
