Skip to content

Conversation

juj
Copy link
Collaborator

@juj juj commented Mar 8, 2022

Remove TextDecoderWrapper, and optimize performance of shared->unshared conversions via compile time analysis, and optimize generated code size.

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

var maxPtr = ptr + maxBytesToRead;
for (var end = ptr; !(end >= maxPtr) && HEAPU8[end];) ++end;
return UTF8Decoder.decode(HEAPU8.subarray(ptr, end));
return UTF8Decoder.decode({{{ getUnsharedTextDecoderView('HEAPU8', 'ptr', 'end') }}});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you can just pass a single argument to the function in {{{ }}}. How about something like:

{{{ getUnsharedSubarrayMethod('HEAPU8') }}}(ptr, end)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would save on code size too because in the fallback case the the (ptr, end) is needlessly duplicated in the returned string.

Copy link
Collaborator Author

@juj juj Mar 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you can just pass a single argument to the function in {{{ }}}

Hmm did I miss something here?

Building emcc tests/hello_world.c -o a.js does give

return UTF8Decoder.decode(heapOrArray.subarray(idx, endPtr));

here.

And building in -sTEXTDECODER=2 mode does produce

return UTF8Decoder.decode(heapOrArray.buffer ? heapOrArray.subarray(idx, endPtr) : new Uint8Array(heapOrArray.slice(idx, endPtr)));

So it does seem like multiple arguments work ok, and tests do also pass green?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking you could drop the final two argument to your helper function can't your reduce the output size, and avoid passing extra info to the function that it doesn't actually need to know about.

// Some web functions like TextDecoder.decode() may not work with a view of a       
// SharedArrayBuffer, see https://github.com/whatwg/encoding/issues/172            
// To avoid that, this function allows obtaining an unshared copy of an            
// ArrayBuffer.                                                                    
function getUnsharedTextDecoderView(heap) {                            
 const shared = `${heap}.slice`;                                
  const unshared = `${heap}.subarray`;                           
                                                                                   
  // No need to worry about this in non-shared memory builds                       
  if (!SHARED_MEMORY) return unshared;                                             
                                                                                   
  // If asked to get an unshared view to what we know will be a shared view, or if in -Oz,
  // then unconditionally do a .slice() for smallest code size.                    
  if (SHRINK_LEVEL == 2 || heap == 'HEAPU8') return shared;                        
                                                                                   
  // Otherwise, generate a runtime type check: must do a .slice() if looking at a SAB,
  // or can use .subarray() otherwise.                                             
  return `(${heap}.buffer instanceof SharedArrayBuffer ? ${shared} : ${unshared})`;
}  

Then the (idx, endPtr) part would not be part of the {{{ }}} block. In other words, does the (${start}, ${end}) part need to be part of the runtime helper? I could be wrong .. but it looks like its not needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, now I understand.

I don't think (foo ? bar.slice : bar.subarray)(idx, endPtr) works, since it would splice off the this reference. I.e. the functions .slice and .subarray would get called without bar as this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, fair enough.

lgtm

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you could use bind but that is probably more chars overall?

@juj juj enabled auto-merge (squash) March 9, 2022 11:45
@juj juj merged commit 47a0284 into emscripten-core:main Mar 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants