Determinism of the engine #260

Closed
angrymouse opened this issue Feb 10, 2024 · 18 comments
Comments

@angrymouse

Hey! Can quickjs/quickjs-ng act as a deterministic sandbox? (So that the same code always gives the same result for the same input, even if the code's authors try to get different results.) Assume that only deterministic and fully synchronous functions are exposed. Is there any way maliciously constructed code can produce a non-deterministic result?

@saghul
Contributor

saghul commented Feb 11, 2024

I guess the answer is "yes", since one can use Math.random for instance...

@bnoordhuis
Contributor

You can monkey-patch Math.random (and the Date constructor, Date.now, etc.) before you start executing code.
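A minimal sketch of that monkey-patching idea, assuming it runs before any untrusted code does. The names `mulberry32` and `makeDeterministic`, the seed, and the fixed timestamp are hypothetical choices for illustration, not part of QuickJS. It covers only `Math.random` and `Date.now`; the zero-argument `Date` constructor and anything else that reads the clock would need similar treatment.

```javascript
// Hypothetical helper: a small seeded PRNG (mulberry32 style) that can
// stand in for Math.random. Same seed, same sequence.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Hypothetical helper: replace the obvious sources of non-determinism.
// Assumption: this is called by the host before user code is evaluated.
function makeDeterministic(seed, fixedTimeMs) {
  Math.random = mulberry32(seed);
  Date.now = function () {
    return fixedTimeMs;
  };
  // Not handled here: `new Date()` with no arguments, time zones, etc.
}

makeDeterministic(42, 1700000000000);
console.log(Math.random(), Math.random(), Date.now());
```

Calling `makeDeterministic` again with the same seed restarts the same sequence, so two runs of the same script see identical "random" values.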

I wrote a PoC for V8 a few years ago and, as long as you're not dealing with external resources like files or network connections, it's relatively straightforward.

A gotcha I ran into back then was numerical stability of things like Math.atanh on different systems. V8 at the time called out to libc (like quickjs still does) and different libcs have different precision at the edges, sometimes wildly different.
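One way to check for this kind of difference is to print the exact bit pattern of a few results on each target system and compare them. The sketch below is only a probe under that assumption; `float64ToHex` and the sample inputs are hypothetical, and the DataView calls pass an explicit byte order so the hex string does not depend on platform endianness.

```javascript
// Hypothetical helper: hex string of the IEEE-754 double for x,
// most significant byte first.
function float64ToHex(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x, false); // false = big-endian byte order
  const hi = view.getUint32(0, false).toString(16).padStart(8, '0');
  const lo = view.getUint32(4, false).toString(16).padStart(8, '0');
  return hi + lo;
}

// Sample inputs are arbitrary; run this on each system and diff the output.
for (const x of [0.5, 0.999999, 1e-10]) {
  console.log(x, float64ToHex(Math.atanh(x)));
}
```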

@angrymouse
Author

Let's say I removed all of the Math module (and Date and anything else that could introduce non-determinism).
Can non-determinism still happen with things like float multiplication/division or something like that?

@bnoordhuis
Contributor

If you restrict yourself to a single system (os/arch/etc.), I think the answer is 'no'. I can't come up with any counterexamples, at least.

Across systems? Depends on how you define non-determinism.

There can be small observable differences, like the value of Number.MIN_VALUE on systems that don't support subnormals/denormals:

In the IEEE 754-2019 double precision binary representation, the smallest possible value is a denormalized number.

If an implementation does not support denormalized values, the value of Number.MIN_VALUE must be the smallest non-zero positive value that can actually be represented by the implementation.

(from section 21.1.2.9 of the ecmascript specification)
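A small probe for this particular difference might look like the following sketch. `supportsSubnormals` is a hypothetical name; the check simply halves the smallest normal double and sees whether the result is flushed to zero.

```javascript
// Hypothetical probe: does this system keep subnormal doubles?
function supportsSubnormals() {
  const smallestNormal = 2.2250738585072014e-308; // 2^-1022
  // With subnormal support the quotient is a tiny positive number;
  // on a flush-to-zero system it becomes 0.
  return smallestNormal / 2 > 0;
}

console.log(supportsSubnormals(), Number.MIN_VALUE);
```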

@angrymouse
Author

Thank you! That answers my question well.

@juancampa
Contributor

There's at least one other source of non-determinism that I discovered recently. Shape hash values are initialized with the value of a pointer: https://github.com/quickjs-ng/quickjs/blob/229b07b9b2c811eaf84db209a1d6f9e2a8a7b0d9/quickjs.c#L4234

On most platforms, pointer values are non-deterministic. One exception is WebAssembly, where linear memory always starts at address 0x0.

@bnoordhuis
Contributor

That's not observable from JS though (or shouldn't be.)

@Lohann

Lohann commented Jun 23, 2024

Just the fact that this library uses the hardware to compute IEEE-754 floats, and that all numbers in JavaScript are IEEE-754 floats, is already a source of non-determinism:
https://gafferongames.com/post/floating_point_determinism/

for one reason or another it is considered very difficult to get exactly the same result from floating point calculations on two different machines. People even report different results on the same machine from run to run, and between debug and release builds. Other folks say that AMDs give different results to Intel machines, and that SSE results are different from x87.

@angrymouse
Author

Reopening due to issues mentioned above

@angrymouse reopened this Jun 23, 2024
@Lohann

Lohann commented Jun 24, 2024

I haven't tested this with quickjs because I don't have the dev tools on my two machines, but:

To discover the platform endianness:

let uInt32 = new Uint32Array([0x11223344]);
let uInt8 = new Uint8Array(uInt32.buffer);
 
if (uInt8[0] === 0x44) {
    console.log('Little Endian');
} else if (uInt8[0] === 0x11) {
    console.log('Big Endian');
} else {
    console.log('unknown endianness!');
}

The following code gets different results on ARM, x86, etc.:

let nan = new Float32Array([0.0, 1.0, NaN, 0.0]);
nan[1] = nan[1] / nan[3];
nan[0] = nan[0] / nan[3];
nan[3] = nan[0] / nan[0];
let uint8 = new Uint8Array(nan.buffer);
console.log(Array.from(uint8));
// apple silicon: [0, 0, 192, 127, 0, 0, 128, 127, 0, 0, 192, 127, 0, 0, 192, 127]
// amd x86_64:    [0, 0, 192, 255, 0, 0, 128, 127, 0, 0, 192, 127, 0, 0, 192, 255]
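If the goal is only to keep such differences out of data that is compared across machines, one option is to rewrite every NaN to a single bit pattern before the bytes leave the sandbox. This is a sketch under that assumption, not something QuickJS does; `canonicalizeNaNs` is a hypothetical helper and `0x7fc00000` is simply the positive quiet NaN pattern seen in the Apple silicon output above.

```javascript
// Hypothetical helper: force every NaN in a Float32Array to one bit pattern.
function canonicalizeNaNs(f32) {
  // View the same memory as 32-bit integers so the bits can be set exactly.
  const bits = new Uint32Array(f32.buffer, f32.byteOffset, f32.length);
  for (let i = 0; i < bits.length; i++) {
    // NaN: exponent all ones and a non-zero mantissa, sign bit ignored.
    if ((bits[i] & 0x7fffffff) > 0x7f800000) {
      bits[i] = 0x7fc00000;
    }
  }
  return f32;
}

const values = new Float32Array([0.0, 1.0, NaN, 0.0]);
values[0] = values[0] / values[3]; // 0/0, sign of the NaN varies by CPU
canonicalizeNaNs(values);
console.log(Array.from(new Uint8Array(values.buffer)));
```

Infinities are left alone because their mantissa is zero. Note that the byte order of the printed array still depends on platform endianness.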

@chqrlie
Collaborator

chqrlie commented Jun 24, 2024

I haven't tested this with quickjs because I don't have the dev tools on my two machines, but:

To discover the platform endianness:

let uInt32 = new Uint32Array([0x11223344]);
let uInt8 = new Uint8Array(uInt32.buffer);
 
if (uInt8[0] === 0x44) {
    console.log('Little Endian');
} else if (uInt8[0] === 0x11) {
    console.log('Big Endian');
} else {
    console.log('unknown endianness!');
}

This is a feature IMHO.

The following code gets different results on ARM, x86, etc.:

let nan = new Float32Array([0.0 / 0.0, NaN, 0.0, 0.0]);
nan[2] /= nan[1];
nan[3] /= nan[0];
let uint8 = new Uint8Array(nan.buffer);
console.log(Array.from(uint8));
// apple silicon: [0, 0, 192, 127, 0, 0, 192, 127, 0, 0, 192, 127, 0, 0, 192, 127]
// amd x86_64:    [0, 0, 192, 255, 0, 0, 192, 127, 0, 0, 192, 127, 0, 0, 192, 255]

Interesting! Testing for nans for every numeric operation seems wasteful though. How does v8 handle this?

@saghul
Contributor

saghul commented Jun 24, 2024

I think the requirements for their use case deviate from a "traditional" JS engine.

Endianness might not be a problem if you are always deploying on the usual suspect architectures, but the float stuff looks like a different story.

I suppose that one way around both is to run QuickJS compiled to WASM. That way these 2 elements would be deterministic, as determined by the underlying WASM engine, right?

@Lohann

Lohann commented Jun 24, 2024

That way these 2 elements would be deterministic, as determined by the underlying WASM engine, right?

Nope, just the endianness, since wasm enforces little-endian. Wasm doesn't guarantee float determinism across different architectures:
https://github.com/WebAssembly/design/blob/master/Rationale.md#nan-bit-pattern-nondeterminism

@saghul
Contributor

saghul commented Jun 24, 2024

Today I learned :-)

I guess using a soft-float replacement might be the only option.

@guest271314
Contributor

If you can intercept the test you can influence the result

let uInt32 = new Uint32Array([0x11223344]);
let ab = new ArrayBuffer(4);
let view = new DataView(ab);
view.setUint32(0, uInt32[0], false); // write the number itself; false = big-endian byte order
let uInt8 = new Uint8Array(ab);
if (uInt8[0] === 0x44) {
    console.log('Little Endian');
} else if (uInt8[0] === 0x11) {
    console.log('Big Endian');
} else {
    console.log('unknown endianness!');
}
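Building on that, code that needs byte-level access can go through DataView with an explicit byte order, which gives the same bytes on any platform. The helper names below are hypothetical, and this only helps if the host also keeps untrusted code away from multi-byte typed array views such as Uint32Array, which is an assumption about the sandbox setup.

```javascript
// Hypothetical helper: serialize a 32-bit unsigned integer as
// little-endian bytes, regardless of the host CPU's byte order.
function uint32ToBytesLE(value) {
  const view = new DataView(new ArrayBuffer(4));
  view.setUint32(0, value, true); // true = little-endian
  return Array.from(new Uint8Array(view.buffer));
}

// Hypothetical helper: the inverse of uint32ToBytesLE.
function bytesToUint32LE(bytes) {
  const view = new DataView(Uint8Array.from(bytes).buffer);
  return view.getUint32(0, true);
}

console.log(uint32ToBytesLE(0x11223344)); // [68, 51, 34, 17]
console.log(bytesToUint32LE([0x44, 0x33, 0x22, 0x11]) === 0x11223344); // true
```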

@guest271314
Contributor

Re floating point numbers: if a baseline is created for all target platforms and those floating point numbers are then spread into an Array, it is possible to achieve precision and to determine which platform the floating point numbers were created on. See this answer to the question "Number (integer or decimal) to array, array to number (integer or decimal) without using strings". I was working on solving OEIS A217626 directly by determining the exact multiple of the number 9 to determine the Nth lexicographic permutation. It's a closed-loop system where we reduce any integer or decimal to an Array of individual digits, and thereby can add or subtract at each index of the Array, carry over from right to left, and thus achieve precision without overflow.

In pertinent part, in code

if (!int) {
    let e = ~~a;
    d = a - e;
    do {
        if (d < 1) ++i;
        d *= 10;
    } while (!Number.isInteger(d));
}

@bnoordhuis
Contributor

I'm closing this because I don't think there's anything actionable for us maintainers here. And if there is, it's probably better to create new, more focused issues.

W.r.t:

it is considered very difficult to get exactly the same result from floating point calculations on two different machines. People even report different results on the same machine from run to run, and between debug and release builds. Other folks say that AMDs give different results to Intel machines, and that SSE results are different from x87

I'm skeptical about the "run to run" claim. Sure, CPU and compiler bugs exist, but the one time I had someone report something like that to me, it turned out to be a third-party library that called fesetround() without changing it back.

@bnoordhuis closed this as not planned Oct 26, 2024
@Lohann

Lohann commented Nov 6, 2024

I'm skeptical about the "run to run" claim.

@bnoordhuis I read this in this article (which was focused on reproducible physics for games), and I was able to reproduce a few of those non-deterministic results. I believe the "run to run" difference arises when the compiler decides to use SIMD (like AVX-512) for some operations and regular xmm* registers for others, but I think this is less likely to happen in an interpreter.
