Description
Motivation
I am using the following pattern to efficiently initialize a large buffer inside wasm memory from JavaScript without copies (see #1079 for full motivation):
#[wasm_bindgen]
pub struct WasmMemBuffer {
    buffer: Vec<u8>,
}

#[wasm_bindgen]
impl WasmMemBuffer {
    #[wasm_bindgen(constructor)]
    pub fn new(byte_length: u32, f: &js_sys::Function) -> Self {
        let buffer = vec![0; byte_length as usize];
        unsafe {
            let array = js_sys::Uint8Array::view(&buffer);
            f.call1(&JsValue::NULL, &JsValue::from(array))
                .expect("The callback function should not throw");
        }
        Self { buffer }
    }
}

fn compute_buffer_hash_impl(data: &[u8]) -> u32 { ... }

#[wasm_bindgen]
pub fn compute_buffer_hash(data: &WasmMemBuffer) -> u32 {
    compute_buffer_hash_impl(&data.buffer)
}
let buffer = new WasmMemBuffer(100000, array => {
    // "array" wraps a piece of wasm memory. Fill it with some values.
    for (let i = 0; i < array.length; i++) {
        array[i] = Math.floor(Math.random() * 256);
    }
});
let hash = compute_buffer_hash(buffer); // No buffer copy when passing this argument. Yay!
buffer.free();
console.log(hash);
There are two problems with this:
- There's a mutation here that the Rust compiler doesn't know about. Uint8Array::view takes an immutable slice ref, but then I go ahead and mutate the contents. Lying to the compiler is dangerous.
- The vec is zero-initialized before its contents are overwritten, which wastes almost 400ms in my use case: https://perfht.ml/2K0rMKw
If I wanted to eliminate the zero initialization, I could do that using reserve + set_len on the vec. However, Rust forbids wrapping uninitialized values in a slice - you have to use raw pointers when initializing uninitialized memory. But the current signature of Uint8Array::view requires me to make a slice.
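To make the raw-pointer route concrete, here is a minimal std-only sketch of filling reserved capacity through a pointer and only then calling set_len. The names `fill_raw` and `make_buffer` are hypothetical and not part of wasm-bindgen or js-sys; `fill_raw` merely stands in for whatever writes the bytes (in the issue's case, the JavaScript callback).

```rust
/// Hypothetical stand-in for the JavaScript callback: writes `len` bytes
/// through a raw pointer without ever forming a slice over the memory.
///
/// # Safety
/// `ptr` must be valid for writes of `len` bytes.
unsafe fn fill_raw(ptr: *mut u8, len: usize) {
    for i in 0..len {
        // `write` stores a byte without reading the old, possibly
        // uninitialized, value at that address.
        unsafe { ptr.add(i).write((i % 251) as u8) };
    }
}

/// Builds a buffer of `len` bytes without zero-initializing it first.
fn make_buffer(len: usize) -> Vec<u8> {
    // Reserve capacity only; the length stays 0, so no byte is initialized.
    let mut buffer: Vec<u8> = Vec::with_capacity(len);
    unsafe {
        // Initialize the first `len` bytes through the raw pointer.
        fill_raw(buffer.as_mut_ptr(), len);
        // Only now is it sound to declare those bytes part of the Vec.
        buffer.set_len(len);
    }
    buffer
}
```

Calling `make_buffer(1000)` yields a 1000-byte vec whose contents were written exactly once. The order matters: set_len comes after the writes, so a slice over the data never exists while any of it is uninitialized.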
Proposed Solution
I propose adding a new method to Uint8Array (and maybe to the other typed array variants):
pub unsafe fn view_mut_raw(ptr: *mut u8, length: usize)
This would let me change my rust code to the following:
#[wasm_bindgen]
pub struct WasmMemBuffer {
    buffer: Vec<u8>,
}

#[wasm_bindgen]
impl WasmMemBuffer {
    #[wasm_bindgen(constructor)]
    pub fn new(byte_length: u32, f: &js_sys::Function) -> Self {
        let mut buffer: Vec<u8> = Vec::new();
        buffer.reserve(byte_length as usize);
        unsafe {
            let array =
                js_sys::Uint8Array::view_mut_raw(buffer.as_mut_ptr(),
                                                 byte_length as usize);
            f.call1(&JsValue::NULL, &JsValue::from(array))
                .expect("The callback function should not throw");
            buffer.set_len(byte_length as usize);
        }
        Self { buffer }
    }
}
It solves the problems: We've eliminated the zero initialization, we're not constructing a slice around uninitialized memory, and we're not lying to the compiler about the mutation anymore.
Alternatives
I'm not aware of any obvious alternatives.
Activity
alexcrichton commented on Jul 8, 2019
Seems reasonable to me to add! I'm not 100% convinced we need this for mutability reasons, it seems a long stretch for anything bad to happen. Having this for uninitialized memory though is more compelling to me and seems pretty reasonable!
That3Percent commented on Jul 18, 2019
In the same vein, I'd like to see a version of copy_to for typed arrays that can take a &mut MaybeUninit<[T]>. Right now the only way to access the memory in a typed array as a slice is through the copy_to function, but it doesn't make sense to have to initialize the memory first. Should that be opened as a separate issue?
That3Percent commented on Jul 18, 2019
Oops, I think I meant &mut [MaybeUninit<T>], sorry.
lshlyapnikov commented on Nov 7, 2019
view_mut_raw has been merged. Do we want the copy_to addressed as part of the same issue?
alexcrichton commented on Nov 7, 2019
Seems reasonable to me to add as well!
copy_to_mem_raw for the typed arrays (Int8Array, Int16Array, etc) #1855
lshlyapnikov commented on Nov 8, 2019
@That3Percent Do you mean you want to be able to pass the uninitialized MaybeUninit to the copy_to_xxx and let it deal with the allocation?
lshlyapnikov commented on Nov 8, 2019
Is there any way to check if MaybeUninit has been initialized?
That3Percent commented on Nov 8, 2019
@ibaryshnikov MaybeUninit deals with memory that is already allocated, but not yet initialized. So, copy_to_xxx would not have to deal with any allocation. It would merely fulfill the contract of initializing the memory by writing valid data to it.
Here's the workflow as it exists today and how it could be improved by the issue.
Task: Given some TypedArray owned by the JavaScript runtime, make its contents readable as a &[T]
Today:
1. Allocate memory matching the size of the TypedArray. This can be done with eg: Vec::with_capacity or similar
2. Initialize the memory by writing zeroes to it
3. Call copy_to_xxx, which overwrites all the zeroes that were just written
Step 2 is what we want to get rid of. Why write 0 when it's going to be overridden immediately? But, it's not possible without invoking Undefined Behavior, because it's not valid to even have a &[u32] over uninitialized memory, even if unused.
Proposed:
1. Allocate memory matching the size of the TypedArray. This can be done with eg: second-stack, RawVec or similar.
2. Call copy_to_xxx, which initializes the memory by virtue of writing to it on the other side of the FFI boundary.
That3Percent commented on Nov 8, 2019
@lshlyapnikov
I just took a look at #1855. I think that what you built there is a decent step toward a more performant version of the existing to_vec method. If you look at what that method does today, it executes steps 1-3 that I outlined - allocate, initialize, overwrite. Whereas your code does the more performant version. The API for copy_to_maybe_uninit does not make a great deal of sense though. (eg: why take a MaybeUninit at all if the method internally allocates a Vec?). I think you should refactor the code you have into to_vec so that everyone can enjoy the performance gains, without the usage of MaybeUninit at all. The copy_to_maybe_uninit method should be modified to accept a (I think) &mut [MaybeUninit<T>] to move control of the allocation/destination to the caller.
lshlyapnikov commented on Nov 16, 2019
@That3Percent does this match your use case?
https://github.com/rustwasm/wasm-bindgen/pull/1855/files#diff-a4985f313137ee0bff2488dce289c41cR238-R251
- MaybeUninit is pointing to an uninitialized Vec.
- sut is copy_to_unsafe
It is not exactly what you asked for, the signature of the function is:
lshlyapnikov commented on Nov 16, 2019
@That3Percent I think you can use the existing copy_to with uninitialized memory. Why do you want another method? The below works with the existing js_sys::Uint8Array::copy_to and uninitialized slice:
Am I missing anything? Here is the entire test case:
lshlyapnikov commented on May 7, 2020
Can anyone review the copy_to_mem_raw #1855 (comment)
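adrian17 commented on Apr 8, 2022
This touches more APIs, for example: glReadPixels's point is to fill a possibly uninitialized buffer, but given the wasm-bindgen WebGL signature:
Currently there's no (legal and sound from Rust's POV) way of avoiding zero-initialization without changing this API.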
's point is to fill a possibly uninitialized buffer, but given the wasm-bindgen WebGL signature:Currently there's no (legal and sound from Rust's POV) way of avoiding zero-initialization without changing this API.