Explicitly recognise asm_float_zero #5046

aidanhs · 2017-03-17T19:42:49Z

With the bitcode I have, llvm (in -O3) generates (before any js opts) a return as follows:

return +(+0); with -s WASM=0
return Math_fround((Math_fround(+0))); with -s WASM=1 (because PRECISE_F32 is always enabled with wasm)

Emscripten then performs JS opts (presumably to prepare for the emterpreter) in three steps:

asmPreciseF32 asm eliminate simplifyExpressions emitJSON (native)
asmPreciseF32 asm receiveJSON localCSE safeLabelSetting emitJSON (non-native)
asmPreciseF32 asm receiveJSON optimizeFrounds registerizeHarder asmLastOpts (native)

+(+0) simplifies to +0, but, courtesy of optimizeFrounds, Math_fround((Math_fround(+0))); simplifies to the variable f0. Now the emterpretify step is unable to figure out what the return type is (when detecting return type, the asmData isn't ready yet so isn't passed to detectType and the variable type can't be detected). This PR explicitly matches against the global float zero.

So the report in #5031 is more accurately expressed as "-s EMTERPRETIFY=1 does not work with PRECISE_F32=1", it's just that wasm made it obvious. I'm somewhat surprised that no code you've tested has exhibited this issue, but maybe code that returns a Math_fround call is relatively rare and so is emterpreter testing with binaryen?

kripken · 2017-03-17T22:49:59Z

Thanks! Looks correct. We should just add a test for this. I suggest adding it to test_emterpretify in test_core.py (i.e. make a new test cpp file instead of the hello world we use, and add a return of a float - mark it no-inline to be sure).

I'm somewhat surprised that no code you've tested has exhibited this issue

I'm surprised fuzzing didn't catch it - perhaps the fuzzer doesn't return floats? Otherwise, PRECISE_F32 + emterpreter don't have much overlap in the suite, and we do have some wasm + emterpreter tests, which as you say, would catch this if we returned a float, but they happen to not have that...

aidanhs · 2017-04-06T00:56:04Z

It turns out my explanation for why this is happening is wrong. After I was unable to come up with a test, I looked again - optimizeFrounds deliberately does not optimize Math_fround(0) to f0 in returns (and even if it did, the optimization would not happen as I describe above because there is a leading + in front of the 0 which would inhibit it).

Instead, it turns out that this is an optimization performed by registerizeHarder! Here's a simplified piece of asm.js:

function _combine_soft_light_c($0,$1,$2,$3) {
 $0 = Math_fround($0);
 $1 = Math_fround($1);
 $2 = Math_fround($2);
 $3 = Math_fround($3);
 var $4 = 0, $5 = 0, $6 = 0, $7 = 0, $8 = 0, $9 = 0, $or$cond = 0, label = 0, sp = 0;
 var $10 = 0, $11 = 0, $12 = 0, $13 = 0, $20 = Math_fround(0), $21 = Math_fround(0), $22 = Math_fround(0), $23 = Math_fround(0), $24 = Math_fround(0);
 sp = STACKTOP;
 STACKTOP = STACKTOP + 48|0;
 $4 = sp + 36|0;
 $5 = sp + 32|0;
 $6 = sp + 28|0;
 $7 = sp + 24|0;
 $8 = sp + 20|0;
 $9 = sp + 16|0;
 $10 = sp + 12|0;
 $11 = sp + 8|0;
 $12 = sp + 4|0;
 $13 = sp;
 $20 = Math_fround(HEAPF32[$13>>2]);
 $21 = Math_fround(HEAPF32[$9>>2]);
 $22 = Math_fround(HEAPF32[$8>>2]);
 $23 = Math_fround(HEAPF32[$7>>2]);
 $24 = $23 <= $22;
 if ($24) {
  $90 = Math_fround($23);
  STACKTOP = sp;return (Math_fround($90));
 } else {
  $90 = Math_fround($22);
  STACKTOP = sp;return (Math_fround($90));
 }
 return Math_fround((Math_fround(+0)));
}

You can optimize this directly with ./optimizer x.js asmPreciseF32 asm eliminate simplifyExpressions optimizeFrounds registerizeHarder to get

function _combine_soft_light_c(f1, f2, f4, f5) {
 f1 = Math_fround(f1);
 f2 = Math_fround(f2);
 f4 = Math_fround(f4);
 f5 = Math_fround(f5);
 var i3 = 0;
 i3 = STACKTOP;
 STACKTOP = STACKTOP + 48 | 0;
 f1 = Math_fround(HEAPF32[i3 + 20 >> 2]);
 f2 = Math_fround(HEAPF32[i3 + 24 >> 2]);
 if (f2 <= f1) {
  $90 = Math_fround(f2);
  STACKTOP = i3;
  return Math_fround($90);
 } else {
  $90 = Math_fround(f1);
  STACKTOP = i3;
  return Math_fround($90);
 }
 return f0;
}

I guess registerizeHarder figures that the final return is dead, so it can replace it with whatever (possibly it shouldn't, given that optimizeFrounds tries so hard not to). So in fact, the combination of conditions required to see this issue is "have a float-returning function with an if/else at the end of the function with each branch returning something and being sufficiently complex that llvm does not attempt a simplification to just use the single dead return statement at the end of the function to return the value from the two branches (i.e. avoiding if (x) { $ret = fround(1) } else { $ret = fround(2) } return fround($ret);").

As you might imagine with the description above, it's actually pretty hard to come up with a test case here because llvm on O3 takes great pleasure in rearranging if conditions. Looking at the actual asm.js that tickles the bug, there's no reason I can see that LLVM hasn't done exactly that - I guess it's just exceeded some code size threshold.

Click to see unreduced asm.js for function

function _combine_soft_light_c($0,$1,$2,$3) {
 $0 = Math_fround($0);
 $1 = Math_fround($1);
 $2 = Math_fround($2);
 $3 = Math_fround($3);
 var $10 = 0, $11 = 0, $12 = 0, $13 = 0, $14 = Math_fround(0), $15 = Math_fround(0), $16 = Math_fround(0), $17 = Math_fround(0), $18 = Math_fround(0), $19 = Math_fround(0), $20 = Math_fround(0), $21 = Math_fround(0), $22 = Math_fround(0), $23 = Math_fround(0), $24 = Math_fround(0), $25 = Math_fround(0), $26 = Math_fround(0), $27 = Math_fround(0), $28 = Math_fround(0), $29 = Math_fround(0);
 var $30 = Math_fround(0), $31 = 0, $32 = Math_fround(0), $33 = 0, $34 = Math_fround(0), $35 = 0, $36 = Math_fround(0), $37 = Math_fround(0), $38 = Math_fround(0), $39 = Math_fround(0), $4 = 0, $40 = Math_fround(0), $41 = Math_fround(0), $42 = Math_fround(0), $43 = Math_fround(0), $44 = Math_fround(0), $45 = Math_fround(0), $46 = Math_fround(0), $47 = Math_fround(0), $48 = Math_fround(0);
 var $49 = Math_fround(0), $5 = 0, $50 = Math_fround(0), $51 = Math_fround(0), $52 = Math_fround(0), $53 = Math_fround(0), $54 = Math_fround(0), $55 = 0, $56 = Math_fround(0), $57 = Math_fround(0), $58 = Math_fround(0), $59 = Math_fround(0), $6 = 0, $60 = Math_fround(0), $61 = Math_fround(0), $62 = Math_fround(0), $63 = Math_fround(0), $64 = Math_fround(0), $65 = Math_fround(0), $66 = Math_fround(0);
 var $67 = Math_fround(0), $68 = Math_fround(0), $69 = Math_fround(0), $7 = 0, $70 = Math_fround(0), $71 = Math_fround(0), $72 = Math_fround(0), $73 = Math_fround(0), $74 = Math_fround(0), $75 = Math_fround(0), $76 = Math_fround(0), $77 = Math_fround(0), $78 = Math_fround(0), $79 = Math_fround(0), $8 = 0, $80 = Math_fround(0), $81 = Math_fround(0), $82 = Math_fround(0), $83 = Math_fround(0), $84 = Math_fround(0);
 var $85 = Math_fround(0), $86 = Math_fround(0), $87 = Math_fround(0), $88 = Math_fround(0), $89 = Math_fround(0), $9 = 0, $90 = Math_fround(0), $or$cond = 0, label = 0, sp = 0;
 sp = STACKTOP;
 STACKTOP = STACKTOP + 48|0;
 $4 = sp + 36|0;
 $5 = sp + 32|0;
 $6 = sp + 28|0;
 $7 = sp + 24|0;
 $8 = sp + 20|0;
 $9 = sp + 16|0;
 $10 = sp + 12|0;
 $11 = sp + 8|0;
 $12 = sp + 4|0;
 $13 = sp;
 HEAPF32[$9>>2] = $0;
 HEAPF32[$10>>2] = $1;
 HEAPF32[$11>>2] = $2;
 HEAPF32[$12>>2] = $3;
 $14 = Math_fround(HEAPF32[$9>>2]);
 $15 = Math_fround(Math_fround(+1) - $14);
 $16 = Math_fround(HEAPF32[$12>>2]);
 $17 = Math_fround($15 * $16);
 $18 = Math_fround(HEAPF32[$11>>2]);
 $19 = Math_fround(Math_fround(+1) - $18);
 $20 = Math_fround(HEAPF32[$10>>2]);
 $21 = Math_fround($19 * $20);
 $22 = Math_fround($17 + $21);
 HEAPF32[$13>>2] = $22;
 $23 = Math_fround(HEAPF32[$13>>2]);
 $24 = Math_fround(HEAPF32[$9>>2]);
 $25 = Math_fround(HEAPF32[$10>>2]);
 $26 = Math_fround(HEAPF32[$11>>2]);
 $27 = Math_fround(HEAPF32[$12>>2]);
 HEAPF32[$5>>2] = $24;
 HEAPF32[$6>>2] = $25;
 HEAPF32[$7>>2] = $26;
 HEAPF32[$8>>2] = $27;
 $28 = Math_fround(HEAPF32[$6>>2]);
 $29 = Math_fround(Math_fround(+2) * $28);
 $30 = Math_fround(HEAPF32[$5>>2]);
 $31 = $29 < $30;
 $32 = Math_fround(HEAPF32[$7>>2]);
 $33 = Math_fround(+-1.17549435E-38) < $32;
 $34 = Math_fround(HEAPF32[$7>>2]);
 $35 = $34 < Math_fround(+1.17549435E-38);
 $or$cond = $33 & $35;
 if ($31) {
  $36 = Math_fround(HEAPF32[$8>>2]);
  $37 = Math_fround(HEAPF32[$5>>2]);
  $38 = Math_fround($36 * $37);
  if ($or$cond) {
   HEAPF32[$4>>2] = $38;
   $89 = Math_fround(HEAPF32[$4>>2]);
   $90 = Math_fround($23 + $89);
   STACKTOP = sp;return (Math_fround($90));
  } else {
   $39 = Math_fround(HEAPF32[$8>>2]);
   $40 = Math_fround(HEAPF32[$7>>2]);
   $41 = Math_fround(HEAPF32[$8>>2]);
   $42 = Math_fround($40 - $41);
   $43 = Math_fround($39 * $42);
   $44 = Math_fround(HEAPF32[$5>>2]);
   $45 = Math_fround(HEAPF32[$6>>2]);
   $46 = Math_fround(Math_fround(+2) * $45);
   $47 = Math_fround($44 - $46);
   $48 = Math_fround($43 * $47);
   $49 = Math_fround(HEAPF32[$7>>2]);
   $50 = Math_fround($48 / $49);
   $51 = Math_fround($38 - $50);
   HEAPF32[$4>>2] = $51;
   $89 = Math_fround(HEAPF32[$4>>2]);
   $90 = Math_fround($23 + $89);
   STACKTOP = sp;return (Math_fround($90));
  }
 }
 if ($or$cond) {
  HEAPF32[$4>>2] = Math_fround(+0);
  $89 = Math_fround(HEAPF32[$4>>2]);
  $90 = Math_fround($23 + $89);
  STACKTOP = sp;return (Math_fround($90));
 }
 $52 = Math_fround(HEAPF32[$8>>2]);
 $53 = Math_fround(Math_fround(+4) * $52);
 $54 = Math_fround(HEAPF32[$7>>2]);
 $55 = $53 <= $54;
 $56 = Math_fround(HEAPF32[$8>>2]);
 $57 = Math_fround(HEAPF32[$5>>2]);
 $58 = Math_fround($56 * $57);
 if ($55) {
  $59 = Math_fround(HEAPF32[$6>>2]);
  $60 = Math_fround(Math_fround(+2) * $59);
  $61 = Math_fround(HEAPF32[$5>>2]);
  $62 = Math_fround($60 - $61);
  $63 = Math_fround(HEAPF32[$8>>2]);
  $64 = Math_fround($62 * $63);
  $65 = Math_fround(HEAPF32[$8>>2]);
  $66 = Math_fround(Math_fround(+16) * $65);
  $67 = Math_fround(HEAPF32[$7>>2]);
  $68 = Math_fround($66 / $67);
  $69 = Math_fround($68 - Math_fround(+12));
  $70 = Math_fround(HEAPF32[$8>>2]);
  $71 = Math_fround($69 * $70);
  $72 = Math_fround(HEAPF32[$7>>2]);
  $73 = Math_fround($71 / $72);
  $74 = Math_fround($73 + Math_fround(+3));
  $75 = Math_fround($64 * $74);
  $76 = Math_fround($58 + $75);
  HEAPF32[$4>>2] = $76;
  $89 = Math_fround(HEAPF32[$4>>2]);
  $90 = Math_fround($23 + $89);
  STACKTOP = sp;return (Math_fround($90));
 } else {
  $77 = Math_fround(HEAPF32[$8>>2]);
  $78 = Math_fround(HEAPF32[$7>>2]);
  $79 = Math_fround($77 * $78);
  $80 = (Math_fround(Math_sqrt((Math_fround($79)))));
  $81 = Math_fround(HEAPF32[$8>>2]);
  $82 = Math_fround($80 - $81);
  $83 = Math_fround(HEAPF32[$6>>2]);
  $84 = Math_fround(Math_fround(+2) * $83);
  $85 = Math_fround(HEAPF32[$5>>2]);
  $86 = Math_fround($84 - $85);
  $87 = Math_fround($82 * $86);
  $88 = Math_fround($58 + $87);
  HEAPF32[$4>>2] = $88;
  $89 = Math_fround(HEAPF32[$4>>2]);
  $90 = Math_fround($23 + $89);
  STACKTOP = sp;return (Math_fround($90));
 }
 return Math_fround((Math_fround(+0)));
}

So, thoughts? I'm not really sure how to create a test case that doesn't heavily depend on details of the LLVM optimizer.

kripken · 2017-04-06T02:56:03Z

Interesting. My guess is that there is special code in registerizeHarder to handle the final return in a function, which is special in asm.js, e.g. it must exist, even if right before it is an if-else which never returns, so sounds related to what you see there. One theory is it is removing it, then re-introducing a new return to ensure one exists, and it does that last part wrong.

For testing, I'd say just add to other.test_js_optimizer. Then you just provide the asm.js before and after the js optimizer is run, and the LLVM optimizer is never in the picture.

aidanhs · 2017-04-06T19:42:38Z

Ok, I've added a couple of tests. test-reduce-dead-float-return-output verifies that optimizeFrounds and registerizeHarder combined reduce a dead return Math_fround((Math_fround(+0))); to return f0;. This is unchanged by my patch.

test-no-reduce-dead-float-return-to-nothing-output verifies that a return f0; is left alone by registerizeHarder (i.e. makes sure that my patch fixes this bug).

kripken · 2017-04-07T01:17:53Z

Great, thanks!

We'll need to bump the version number so the c++ optimizer gets rebuilt. I'll do that later (til then the bots might show an error on that test).

kripken · 2017-04-11T20:42:15Z

Looks like this broke other.test_emterpreter. The error contains Assertion failed: [["binary","+",["name","f0"],["name","f0"]],2,0] which looks relevant as well.

kripken · 2017-04-11T20:44:06Z

Hmm, the test may be invalid, this is the right area,

    do_js_test('float', r'''
function _main() {
  var f = f0;
  f = f0 + f0;
  print(f);
}
''', [], '0\n', floaty=True)

Perhaps we are missing the declaration of f0?

kripken · 2017-04-11T20:54:57Z

Ah no, it's something else. PR coming soon.

…w we detect f0 properly (since #5046), so we do notice float operations in more cases, which is why this went unnoticed before. we do all emterpreter math in doubles anyhow, so this just requires us to accept float ops and emit the same double ops

…cessary as no… (#5129) * support float versions of binary operations, which is necessary as now we detect f0 properly (since #5046), so we do notice float operations in more cases, which is why this went unnoticed before. we do all emterpreter math in doubles anyhow, so this just requires us to accept float ops and emit the same double ops * improve emterpreter makeBinary assertion message

Explicitly recognise asm_float_zero

f041b9c

aidanhs mentioned this pull request Mar 17, 2017

-s WASM=1 not working with emterpreter #5031

Closed

aidanhs mentioned this pull request Mar 18, 2017

asm.js output size increase after update to latest incoming #5049

Closed

Add some tests

9017aaa

kripken merged commit 36a62c6 into emscripten-core:incoming Apr 7, 2017

aidanhs deleted the aphs-detect-global-f0 branch April 7, 2017 01:43

kripken mentioned this pull request Apr 11, 2017

Emterpreter: support float versions of binary operations, which is necessary as no… #5129

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Explicitly recognise asm_float_zero #5046

Explicitly recognise asm_float_zero #5046

Uh oh!

aidanhs commented Mar 17, 2017

Uh oh!

kripken commented Mar 17, 2017

Uh oh!

aidanhs commented Apr 6, 2017 •

edited

Loading

Uh oh!

kripken commented Apr 6, 2017

Uh oh!

aidanhs commented Apr 6, 2017

Uh oh!

kripken commented Apr 7, 2017 •

edited

Loading

Uh oh!

kripken commented Apr 11, 2017

Uh oh!

kripken commented Apr 11, 2017

Uh oh!

kripken commented Apr 11, 2017

Uh oh!

Uh oh!

Explicitly recognise asm_float_zero #5046

Explicitly recognise asm_float_zero #5046

Uh oh!

Conversation

aidanhs commented Mar 17, 2017

Uh oh!

kripken commented Mar 17, 2017

Uh oh!

aidanhs commented Apr 6, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kripken commented Apr 6, 2017

Uh oh!

aidanhs commented Apr 6, 2017

Uh oh!

kripken commented Apr 7, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kripken commented Apr 11, 2017

Uh oh!

kripken commented Apr 11, 2017

Uh oh!

kripken commented Apr 11, 2017

Uh oh!

Uh oh!

aidanhs commented Apr 6, 2017 •

edited

Loading

kripken commented Apr 7, 2017 •

edited

Loading