Skip to content

Implement rb_hash_bulk_insert #3715

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 12, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ Compatibility:
* Implement `rb_str_strlen()` (#3697, @Th3-M4jor).
* Support `Time.new` with String argument and error when invalid (#3693, @rwstauner).
* Implement `rb_enc_interned_str()` (#3703, @Th3-M4jor).
* Implement `rb_hash_bulk_insert()` (#3705, @Th3-M4jor).

Performance:

Expand Down
2 changes: 1 addition & 1 deletion lib/cext/include/truffleruby/truffleruby-abi-version.h
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,6 @@
// $RUBY_VERSION must be the same as TruffleRuby.LANGUAGE_VERSION.
// $ABI_NUMBER starts at 1 and is incremented for every ABI-incompatible change.

#define TRUFFLERUBY_ABI_VERSION "3.2.4.6"
#define TRUFFLERUBY_ABI_VERSION "3.2.4.7"

#endif
15 changes: 15 additions & 0 deletions spec/ruby/optional/capi/ext/hash_spec.c
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,20 @@ VALUE hash_spec_compute_a_hash_code(VALUE self, VALUE seed) {
return ULONG2NUM(h);
}

VALUE hash_spec_rb_hash_bulk_insert(VALUE self, VALUE array_len, VALUE array, VALUE hash) {
VALUE* ptr;

if (array == Qnil) {
ptr = NULL;
} else {
ptr = RARRAY_PTR(array);
}

long len = FIX2LONG(array_len);
rb_hash_bulk_insert(len, ptr, hash);
return Qnil;
}

void Init_hash_spec(void) {
VALUE cls = rb_define_class("CApiHashSpecs", rb_cObject);
rb_define_method(cls, "rb_hash", hash_spec_rb_hash, 1);
Expand Down Expand Up @@ -162,6 +176,7 @@ void Init_hash_spec(void) {
rb_define_method(cls, "rb_hash_size", hash_spec_rb_hash_size, 1);
rb_define_method(cls, "rb_hash_set_ifnone", hash_spec_rb_hash_set_ifnone, 2);
rb_define_method(cls, "compute_a_hash_code", hash_spec_compute_a_hash_code, 1);
rb_define_method(cls, "rb_hash_bulk_insert", hash_spec_rb_hash_bulk_insert, 3);
}

#ifdef __cplusplus
Expand Down
55 changes: 55 additions & 0 deletions spec/ruby/optional/capi/hash_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -194,6 +194,61 @@
end
end

describe "rb_hash_bulk_insert" do
it 'inserts key-value pairs into the hash' do
arr = [:a, 1, :b, 2, :c, 3]
hash = {}

@s.rb_hash_bulk_insert(arr.length, arr, hash)

hash.should == {a: 1, b: 2, c: 3}
end

it 'overwrites existing keys' do
arr = [:a, 4, :b, 5, :c, 6]
hash = {a: 1, b: 2}

@s.rb_hash_bulk_insert(arr.length, arr, hash)

hash.should == {a: 4, b: 5, c: 6}
end

it 'uses the last key in the array if it appears multiple times' do
arr = [:a, 1, :b, 2, :a, 3]
hash = {}

@s.rb_hash_bulk_insert(arr.length, arr, hash)

hash.should == {a: 3, b: 2}
end

it 'allows the array to be NULL if the length is zero' do
hash = {}

@s.rb_hash_bulk_insert(0, nil, hash)

hash.should == {}
end

it 'does not include any keys after the given length' do
arr = [:a, 1, :b, 2, :c, 3, :d, 4]
hash = {}

@s.rb_hash_bulk_insert(arr.length - 2, arr, hash)

hash.should == {a: 1, b: 2, c: 3}
end

it 'does not modify the hash if the length is zero' do
arr = []
hash = {a: 1, b: 2}

@s.rb_hash_bulk_insert(arr.length, arr, hash)

hash.should == {a: 1, b: 2}
end
end

Copy link
Member

@andrykonchin andrykonchin Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also add the following cases:

  • length isn't even - there should be error raised (I suppose)
  • how duplicated keys (provided in the array) are handled - I suppose they should just override each other
 * What happens for  duplicated keys?  Well it silently discards  older ones to
 * accept the newest (rightmost) one.  This behaviour also mimics repeated call
 * of rb_hash_aset().
  • it's allowed to pass NULL as array if length argument = 0 (as far as it's explicitly stated in a documentation comment
 * @note        `argv` is allowed to be NULL as long as `argc` is zero.

Copy link
Contributor Author

@Th3-M4jor Th3-M4jor Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add tests for the other cases.

As for when the length isn't even, it appears to be that if RUBY_DEBUG is not defined MRI accepts an array with an odd number of elements and will read one past the end. Should we allow the same to maintain compatibility?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. Yes, it makes sense to do the same and to allow passing odd length.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't add a spec for that odd case, because specs should also pass with RUBY_DEBUG, notably ruby-dev-builder depends on that.
@andrykonchin Could you add a commit removing that spec?

We could file an issue at https://bugs.ruby-lang.org/, I think it would make sense, it doesn't seem OK the C API reads past the length it's given.

describe "rb_hash_size" do
it "returns the size of the hash" do
hsh = {fast: 'car', good: 'music'}
Expand Down
8 changes: 8 additions & 0 deletions src/main/c/cext/hash.c
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,14 @@ void rb_hash_foreach(VALUE hash, int (*func)(VALUE key, VALUE val, VALUE arg), V
polyglot_invoke(RUBY_CEXT, "rb_hash_foreach", rb_tr_unwrap(hash), func, (void*)arg);
}

void rb_hash_bulk_insert(long n, const VALUE *values, VALUE hash) {
void* unwrapped_hash = rb_tr_unwrap(hash);

for (long i = 0; i < n; i += 2) {
polyglot_invoke(unwrapped_hash, "[]=", rb_tr_unwrap(values[i]), rb_tr_unwrap(values[i + 1]));
}
}

VALUE rb_hash_size(VALUE hash) {
return RUBY_INVOKE(hash, "size");
}
Expand Down
Loading