Skip to content

[#359] extension: fix typing #361

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions ext/mini_racer_extension/mini_racer_extension.c
Original file line number Diff line number Diff line change
Expand Up @@ -470,7 +470,7 @@ static void des_map_end(void *arg)
pop(arg);
}

static void des_object_ref(void *arg, uint32_t id)
static void des_object_ref(void *arg, long id)
Copy link
Author

@gi gi Jul 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ruby's C API rb_ary_entry uses the type long for the offset: (ref)

Since this value is not used anywhere else in this function, it makes sense to forward this type to any callers within this library.

Even if the passed in values will never be negative within this library, the types should still align.

If long is 32 bits, then a uint32_t argument passed through could overflow.

{
DesCtx *c;
VALUE v;
Expand Down Expand Up @@ -607,7 +607,7 @@ static int serialize1(Ser *s, VALUE refs, VALUE v)
if (sign < 0)
v = rb_big_mul(v, LONG2FIX(-1));
rb_big_pack(v, limbs, countof(limbs));
ser_bigint(s, limbs, countof(limbs), sign);
ser_bigint(s, limbs, sizeof(limbs[0]), countof(limbs), sign);
break;
case T_FIXNUM:
ser_int(s, FIX2LONG(v));
Expand Down Expand Up @@ -651,7 +651,7 @@ static int serialize(Ser *s, VALUE v)
return serialize1(s, rb_hash_new(), v);
}

static struct timespec deadline_ms(int ms)
static struct timespec deadline_ms(int64_t ms)
Copy link
Author

@gi gi Jul 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter ms should be of type int64_t to match the type of the arguments from the two invocations which pass in the Context struct's idle_gc and timeout fields, both of which are declare int64_t. This avoids issues when int is 32 bits.

However, this value is used in operations with struct timespec, which declares:

time_t tv_sec;
long tv_nsec;

So, perhaps it might even be declared of type time_t?

{
static const int64_t ns_per_sec = 1000*1000*1000;
struct timespec t;
Expand Down
36 changes: 22 additions & 14 deletions ext/mini_racer_extension/serde.c
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
#include <err.h>
#include <limits.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
Expand Down Expand Up @@ -33,7 +34,7 @@ static void des_object_begin(void *arg);
static void des_object_end(void *arg);
static void des_map_begin(void *arg);
static void des_map_end(void *arg);
static void des_object_ref(void *arg, uint32_t id);
static void des_object_ref(void *arg, long id);
// des_error_begin: followed by des_object_begin + des_object_end calls
static void des_error_begin(void *arg);
static void des_error_end(void *arg);
Expand All @@ -42,7 +43,7 @@ static void des_error_end(void *arg);
// have to worry about allocation failures for small payloads
typedef struct Buf {
uint8_t *buf;
uint32_t len, cap;
size_t len, cap;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These variables are assigned values of type size_t, so it makes sense to declare them as such. The type of uint32_t could be too narrow (and is on my machine).

uint8_t buf_s[48];
} Buf;

Expand All @@ -54,13 +55,18 @@ typedef struct Ser {
static const uint8_t the_nan[8] = {0,0,0,0,0,0,0xF8,0x7F}; // canonical nan

// note: returns |v| if v in [0,1,2]
static inline uint32_t next_power_of_two(uint32_t v) {
static inline size_t next_power_of_two(size_t v) {
Copy link
Author

@gi gi Jul 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sole invocation of this function passes in an argument of type size_t, so this function should declare the parameter of the same type to simplify things.

The alternative would be to refactor the invocation, but that has a ripple effect because the value is the parameter to int buf_grow(Buf *b, size_t n), which I am sure we don't want to change the type of n.

v -= 1;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
#if SIZE_MAX >= 0xFFFFFFFFU
v |= v >> 16;
#endif
#if SIZE_MAX >= 0xFFFFFFFFFFFFFFFFULL
v |= v >> 32;
#endif
v += 1;
return v;
}
Expand Down Expand Up @@ -240,24 +246,26 @@ static void ser_num(Ser *s, double v)
}

// ser_bigint: |n| is in bytes, not quadwords
static void ser_bigint(Ser *s, const uint64_t *p, size_t n, int sign)
static void ser_bigint(Ser *s, const void *p, size_t stride, size_t n, int sign)
Copy link
Author

@gi gi Jul 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two invocations of this function:

  • ser_bigint(s, limbs, sizeof(limbs[0]), countof(limbs), sign);
  • ser_bigint(s, &t, sizeof(t), sizeof(t), sign);

limbs is declared unsigned long limbs[65];
t is declared uint64_t t;

So, the types of the pointer arguments to ser_bigint are

  • unsigned long *
  • uint64_t *

If long is 32 bits, then these don't match up and there could be serialization issues.

I saw a few options:

  • try to align the types in each of the callers
  • duplicate the function for each type
  • refactor the function to be agnostic to the pointer's word size

The latter was the easiest and most flexible.

{
static const uint8_t zero[16] = {0};

if (*s->err)
return;
if (n % 8) {
if (n % stride) {
snprintf(s->err, sizeof(s->err), "bad bigint");
return;
}
w_byte(s, 'Z');
// chop off high all-zero words
n /= 8;
n /= stride;
while (n--)
if (p[n])
if (memcmp((const uint8_t *)p + n*stride, zero, stride))
break;
if (n == (size_t)-1) {
w_byte(s, 0); // normalized zero
} else {
n = 8*n + 8;
n = (n + 1) * stride;
w_varint(s, 2*n + (sign < 0));
w(s, p, n);
}
Expand All @@ -276,7 +284,7 @@ static void ser_int(Ser *s, int64_t v)
return ser_num(s, v);
t = v < 0 ? -v : v;
sign = v < 0 ? -1 : 1;
ser_bigint(s, &t, sizeof(t), sign);
ser_bigint(s, &t, sizeof(t), sizeof(t), sign);
} else {
w_byte(s, 'I');
w_zigzag(s, v);
Expand Down Expand Up @@ -324,26 +332,26 @@ static void ser_object_begin(Ser *s)
}

// |count| is the property count
static void ser_object_end(Ser *s, uint32_t count)
static void ser_object_end(Ser *s, size_t count)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one invocation of this function in serialize1 passes in a size_t value. It's passed to w_varint which is declared as void w_varint(Ser *s, uint64_t v) with a uint64_t.

Thus, it makes sense to align this function's parameter type with the invocation argument's type of size_t which can easily be widened to a 64-bit uint64_t if size_t happens to be 32 bits.

{
w_byte(s, '{');
w_varint(s, count);
}

static void ser_object_ref(Ser *s, uint32_t id)
static void ser_object_ref(Ser *s, size_t id)
Copy link
Author

@gi gi Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be able to be reverted. However, the invocations of this method are the return value of Ruby'c rb_fix2long function, which obviously returns a long.

Should it be declared as a long instead?
Would you like to cast the invocation arguments to a uint32_t?

Note: the des_object_ref was updated to use long to match Ruby's C API: https://github.com/rubyjs/mini_racer/pull/361/files#r2233122575.

{
w_byte(s, '^');
w_varint(s, id);
}

static void ser_array_begin(Ser *s, uint32_t count)
static void ser_array_begin(Ser *s, size_t count)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same argument as ser_object_end: https://github.com/rubyjs/mini_racer/pull/361/files#r2234665993.

The one invocation of this function [(that is not a constant 2)] in serialize1 passes in a size_t value. It's passed to w_varint which is declared as void w_varint(Ser *s, uint64_t v) with a uint64_t.

Thus, it makes sense to align this function's parameter type with the invocation argument's type of size_t which can easily be widened to a 64-bit uint64_t if size_t happens to be 32 bits.

{
w_byte(s, 'A'); // 'A'=dense, 'a'=sparse
w_varint(s, count); // element count
}

// |count| is the element count
static void ser_array_end(Ser *s, uint32_t count)
static void ser_array_end(Ser *s, size_t count)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same argument as ser_object_end: https://github.com/rubyjs/mini_racer/pull/361/files#r2234665993.

The one invocation of this function [(that is not a constant 2)] in serialize1 passes in a size_t value. It's passed to w_varint which is declared as void w_varint(Ser *s, uint64_t v) with a uint64_t.

Thus, it makes sense to align this function's parameter type with the invocation argument's type of size_t which can easily be widened to a 64-bit uint64_t if size_t happens to be 32 bits.

{
w_byte(s, '$');
w_varint(s, 0); // property count, always zero
Expand Down Expand Up @@ -569,7 +577,7 @@ static int des1(char (*err)[64], const uint8_t **p, const uint8_t *pe,
return bail(err, "negative zero bigint");
if (pe-*p < (int64_t)u)
goto too_short;
des_bigint(arg, *p, u, 1-2*t);
des_bigint(arg, *p, u, (int) (1-2*t));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter only designates the sign here: 1 or -1, but the type of t is uint64_t. Thus, instead of making the parameter of des_bigint unnecessarily wide, it's easy enough to just cast it down explicitly to avoid any potential compiler warnings that can turn into errors.

*p += u;
break;
case 'R': // RegExp, deserialized as string
Expand Down
Loading