Verify the sizes of the sub-sections within the name section #375

sbc100 · 2017-03-29T21:49:41Z

No description provided.

KarlSchimpf

lgtm

pipcet · 2017-03-29T23:04:27Z

src/binary-reader.cc

@@ -1649,10 +1652,14 @@ static void read_custom_section(Context* ctx, uint32_t section_size) {
        ++i;
        break;
      default:
-        /* unknown subsection, skip rest of section */
-        ctx->offset = ctx->read_end;
+        /* unknown subsection, skip it */


This change will make us accept "name" sections in which the "function" subsection isn't the first subsection, which violates the spec. Not a bad thing, but a change in behavior.

Oh I see. They need to be in order. I'll add a check for that too

Actually doesn't this existing code already allow this? There doesn't seem to be any check for functions coming before locals. This change just meas that subsections we don't know about can appear anywhere, and will have their sizes validated, and will not terminate the parsing of the rest of the section.

Another thing we might want to do is to check that subsection data is actually contained in the subsection (by setting ctx->read_end to the subsection end temporarily, for example); just like read_function_body verifies a function body does not extend past its indicated length.

Good point.

The existing code allows the consumer to check the sections are in the right order (it doesn't verify that the function/local indices are specified in order either, but again the consumer can check that).

Now that I think about it, I feel that stricter is better, and we shouldn't go out of our way to be lenient.

I must confess I've resorted a few times to cutting out the name section of wasm modules in a text editor, so a switch to ignore (some) custom sections would make sense.

pipcet · 2017-03-29T23:13:32Z

src/binary-reader.cc

@@ -1605,6 +1605,9 @@ static void read_custom_section(Context* ctx, uint32_t section_size) {
      uint32_t subsection_size;
      in_u32_leb128(ctx, &name_type, "name type");
      in_u32_leb128(ctx, &subsection_size, "subsection size");
+      size_t subsection_end = ctx->offset + subsection_size;
+      if (subsection_end > ctx->read_end)
+        RAISE_ERROR("invalid sub-section size: extends past end");


Is RAISE_ERROR the right thing to do? Invalid custom sections do not cause the entire module to become invalid. That's why I ignored optional information entirely, though I agree now that checking the subsection sizes is much cleaner.

Maybe a separate callback for recoverable errors would be better?

Its a good question, for custom sections that the binary-reader knows about, how should it handle errors. My feeling is that if we are going to parse a section we should validate its contents. Perhaps a flag to disable all-parsing of custom sections if people don't what this? But perhaps we should add this if and when somebody needs it.

Well, most wasm modules out there have invalid name sections according to the current spec, I think. I think switching to C++ exceptions could help (we're currently using longjmp!), particularly if such exceptions clearly encode "you can continue reading the next section at offset x and the next subsection at offset y".

Yeah, it's kind of a weird case. I think sometimes we'll want to ignore invalid custom sections and sometimes we'll want to fail if they're invalid. In either case we'll want to bail on parsing the section if any part fails. I guess it should be up to the BinaryReader implementation what happens, like @pipcet says.

For the old "name" sections in wild, are we not blowing up already? It seems like we should be.

I think probably flag such as --no-custom or --no-debug-names or --skip-section=name would be for this kind of situation were you have binaries out there that use a certain custom section name but its not compliant with our current tooling convention. Ideally we would rename the section in this case (weren't we supposed to rename this one to utf8_names?) to avoid this kind of problem.

WebAssembly/design#1016 seems to have good support and makes all strings utf8. If that's the case then perhaps calling it utf8-name is redundant.

But strange that BinaryEncoding.md has the new content but not the new section name. Seems like that was a mistake.

If we really want to support old binaries with the previous name section we would need to build a parser that supports both, but we don't want to do that do we?

Do you happen to know what the current behavior is if this parser sees an old name section today? I do seem to remember that it somehow skips over it. Perhaps I could make that explicit in the code somehow.

Actually I just tested the current "master" and it already blows up on the old style name section:

$ ./out/wasmdump -x ./debug-names-1.wasm *ERROR*: @0x0000003d: unable to read string: name

So this change doesn't actually make it worse.

Yeah, the current behavior is to bail out on any error, even in custom sections. I think we should do something fancier (as discussed above), but that doesn't have to be in this PR. Opened #378 for further discussion.

Okay, it's probably of theoretical interest only, but I'm pretty sure I've proved that no "name" section is ambiguous: it's either a valid old-style name section, or a valid new-style one, or neither, but never both. I'd still prefer the approach of forgetting the old-style name sections ever existed, but we also have the option of first verifying that a name section splits into valid subsections; if it doesn't, it's an old-style section (and the converse holds, too), so we can still parse it.

pipcet · 2017-03-29T23:39:16Z

src/binary-reader.cc

@@ -1600,6 +1600,7 @@ static void read_custom_section(Context* ctx, uint32_t section_size) {
              section_name.length) == 0) {
    CALLBACK_SECTION(begin_names_section, section_size);
    uint32_t i = 0;
+    uint32_t previous_read_end = ctx->read_end;


size_t, I think.

Yep, good catch.

binji · 2017-03-30T00:42:55Z

test/run-tests.py

@@ -702,6 +705,9 @@ def main(args):
  parser.add_argument('--no-roundtrip',
                      help='don\'t run roundtrip.py on all tests',
                      action='store_false', default=True, dest='roundtrip')
+  parser.add_argument('-p', '--print-cmd',


these changes snuck in from the other PR

sbc100 · 2017-03-30T17:12:11Z

If its OK with you all we can add a flag in a separate PR when we address #378

pipcet · 2017-03-30T17:26:04Z

My preference is to do that: merge this first, then #376, and then #378 should be possible to fix by putting a filter class between the BinaryReader and the context.

Verify the sizes of the sub-sections within the name section

519ac52

sbc100 requested a review from binji March 29, 2017 21:49

add test

ddd2966

KarlSchimpf approved these changes Mar 29, 2017

View reviewed changes

binji approved these changes Mar 29, 2017

View reviewed changes

pipcet reviewed Mar 29, 2017

View reviewed changes

don't read past end of sub-section

67a29c8

pipcet reviewed Mar 29, 2017

View reviewed changes

fix type

3509f20

binji mentioned this pull request Mar 30, 2017

Handling errors in custom sections #378

Closed

Add more checks

9070b55

binji reviewed Mar 30, 2017

View reviewed changes

sbc100 added 2 commits March 30, 2017 10:10

revert

1141d94

Merge remote-tracking branch 'origin/master' into check_subsections

8d7f855

sbc100 merged commit 2e9e6ce into master Mar 30, 2017

sbc100 deleted the check_subsections branch March 30, 2017 21:20

Verify the sizes of the sub-sections within the name section #375

Verify the sizes of the sub-sections within the name section #375

Uh oh!

Conversation

sbc100 commented Mar 29, 2017

Uh oh!

KarlSchimpf left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

binji Mar 30, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sbc100 commented Mar 30, 2017

Uh oh!

pipcet commented Mar 30, 2017

Uh oh!

Uh oh!

binji Mar 30, 2017 •

edited

Loading