More documentation needed for NativePointer #48023

raphaelrobert opened this issue Dec 28, 2021 · 11 comments
Labels
area-vm (use area-vm for VM related issues, including code coverage, and the AOT and JIT backends), library-ffi

Comments

@raphaelrobert

Firstly, thanks for all the excellent work you do!

I have recently started playing with the new NativePointer introduced in https://dart-review.googlesource.com/c/sdk/+/211340. I read through all the documentation and issues I could find, but I still have open questions.

The problem I'm trying to solve is the following:

  • I want to allocate memory on the native heap (Rust in my case)
  • I want to pass ownership of that allocation to Dart as a pointer
  • On the Dart side, I don't expect to be able to do much with the pointer, except use it again in native code without freeing it
  • I want Dart to call a finalizer when the pointer is no longer needed on the Dart side

Why is this important? I think having finalizers and well-understood memory management would allow deep integration between Dart and other languages.

All of the above seems to be doable with NativePointer, but I came across a few things that are not straightforward. This is possibly a lack of understanding on my side, and/or a lack of documentation.

Here are my findings:

  • Returning the NativePointer from Rust (through Dart_PostCObject) causes Dart to see it as an int. The int contains the raw value of the pointer. I haven't seen this documented anywhere and I was a bit surprised by it.
  • The int can be used in subsequent calls to Rust, however, it's really just a reference. On the Rust side I have to call mem::forget() before returning, otherwise Dart will segfault. This makes sense, since the ownership is now fully on the Dart side.
  • In theory, the callback gets called when Dart does GC. However, I did not manage to reliably trigger this. I ran the Dart code with dart run --observe to have access to the debugger where GC can be manually triggered. This had no effect whatsoever. I also tried to allocate several GB on the Rust side and reported the size back to Dart in order to create greater memory pressure. Dart counts the Rust heap in the RSS part of the memory, not its own heap (which actually makes sense). Question: does the reported size have any effect at all?
  • Dart's GC strategy is quite obscure. I expected a manual trigger to call the finalizer, especially once the int was completely out of scope. I even tried doing all of it in a dedicated isolate, to no avail. Question: what exactly needs to be done to convince Dart that the native pointer is no longer needed, so that the finalizer gets called during GC?
  • The only time the callback was actually called was when I inadvertently threw an exception after having received the int. In that instance the Rust callback was executed correctly; I inspected the memory before dropping the object and everything looked good.
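The handoff described in the first two findings can be sketched in plain C, with no Dart API involved. This is only an illustration under assumptions: `Session`, `session_new`, `session_use`, and `session_free` are hypothetical names, and the integer handle plays the role of the int that shows up on the Dart side.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical native object; stands in for whatever the native side owns. */
typedef struct {
  int counter;
} Session;

/* Allocate on the native heap and hand the address out as an integer.
   The native side must NOT free the object here: the integer is now the
   only handle to it (in Rust this is where mem::forget or Box::into_raw
   comes in). */
intptr_t session_new(void) {
  Session* s = malloc(sizeof *s);
  assert(s != NULL);
  s->counter = 0;
  return (intptr_t)s;
}

/* Borrow the object through the integer handle without taking ownership. */
int session_use(intptr_t handle) {
  Session* s = (Session*)handle;
  s->counter += 1;
  return s->counter;
}

/* Reclaim ownership and free. Must run exactly once per handle. */
void session_free(intptr_t handle) {
  free((Session*)handle);
}
```

Nothing in this sketch decides when `session_free` runs; that is exactly the part a finalizer would have to supply.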

In summary, I feel we are close to having a great mechanism. Hopefully it's just a matter of better understanding/documentation.

@lrhn transferred this issue from dart-lang/language Dec 28, 2021
@lrhn added the area-vm and library-ffi labels Dec 28, 2021
@raphaelrobert
Author

Upon further investigation, I could trigger GC just fine by making many allocations on the Dart heap and letting them go out of scope immediately. However, this did not cause the Dart VM to call the finalizer callback for the objects allocated on the native heap.

The following screenshot shows the state after allocating ~10GB in Dart and allocating ~10GB in Rust:
[Screenshot, 2021-12-29 14:43:58]

GC cleaned up everything on the Dart side, but not on the native heap. What still surprises me is that the native allocations are counted towards RSS and not Dart/Flutter Native. So the question boils down to: is the Dart VM counting this incorrectly, or am I reporting it wrongly?

@dcharkes @mraleph do you have any insights?

@fzyzcjy
Contributor

fzyzcjy commented Dec 29, 2021

I vote for its importance as well. This is a critical problem and I am also looking forward to a solution. Without this feature, https://github.com/fzyzcjy/flutter_rust_bridge, the open-source bridge between Dart/Flutter and Rust, cannot implement opaque pointers, which makes it much less expressive.

@mraleph
Member

mraleph commented Dec 29, 2021

[Quick answer: because most of us are on end-of-year holidays].

The API added by @aam would only call the finalizer you supply if the message is not delivered for some reason (e.g. the isolate is shutting down).

FWIW I think the API looks a bit confusing, especially in contrast with the external typed data one, which attaches the finalizer to the object allocated on the receiving end.

The naming clash (native pointer vs. FFI Pointer) is also a bit confusing.

@raphaelrobert
Author

Thanks for the quick answer and no rush, enjoy the holiday season!

For when folks are back:
This explanation fits the observation really well. It seems I misinterpreted the purpose of a NativePointer.
The way it works now means that the memory can never be deallocated (as mentioned, the Dart VM segfaults when trying to do that on the native side). In essence, it behaves like static memory.
I'm now wondering what's next though. Will this become possible when the Dart-side finalizers have landed? Should this be a new feature request? Or should we just try to work with what's there and come up with ways to still make this useful?

@aam
Contributor

aam commented Dec 29, 2021

@raphaelrobert wrote

It seems I misinterpreted the purpose of a NativePointer.

Do you know you can attach a finalizer to native bytes when you send them via Dart_PostCObject as external typed data? See #47270 for example. Dart will end up with a typed data object associated with the native bytes, and the finalizer will be invoked once the typed data object is GC'ed.

@mraleph wrote

The API added by @aam would only call the finalizer you supply if the message is not delivered for some reason (e.g. the isolate is shutting down).

cc @a-siva who actually added this.

@raphaelrobert
Author

Do you know you can attach a finalizer to native bytes when you send them via Dart_PostCObject as external typed data? See #47270 for example. Dart will end up with a typed data object associated with the native bytes, and the finalizer will be invoked once the typed data object is GC'ed.

Thanks, yes, that's the functionality I am after, with the exception that the data shouldn't be typed or exposed to Dart. I assumed NativePointer could be used as an opaque pointer for data that is only to be inspected by native code.
It might be possible to misuse external typed data for this (e.g. a byte array the size of the native object), but it feels like a bit of a hack (especially when the exact memory footprint is not known, as is usually the case with Rust).

@aam
Contributor

aam commented Dec 29, 2021

I see. Having the Dart VM know the size of externally allocated memory associated with Dart objects can be useful for GC purposes: the VM can make better GC decisions if it knows which potentially collectable objects are holding on to external memory.

You might address the misuse concern by limiting the visibility of the external typed data received from the receive port: you can immediately put it inside another opaque Dart object of your own and use that everywhere instead.

@raphaelrobert
Author

I see. Having the Dart VM know the size of externally allocated memory associated with Dart objects can be useful for GC purposes: the VM can make better GC decisions if it knows which potentially collectable objects are holding on to external memory.

So the size can already be reported when sending the native pointer to Dart:

struct {
  intptr_t ptr;
  intptr_t size;
  Dart_HandleFinalizer callback;
} as_native_pointer;

This could be used by the GC. I understand that it's not the case right now, and also that the callback serves a different purpose than the finalizer that gets attached to external typed data. My question is: would it be possible to align the behavior so that native pointer objects would be treated similarly to external typed data? This wouldn't preclude the original use case of the callback in case the isolate shuts down too early. Two changes would be required for that:

  • intptr_t size should be considered by the GC to determine memory pressure
  • The callback function should be attached as a finalizer to the Dart object (which currently gets exposed as an int, so it might not be straightforward)

That being said, since I still don't fully understand the current purpose of native pointers my suggestions might not make sense. @a-siva maybe you can shed some light on that?

You might address the misuse concern by limiting the visibility of the external typed data received from the receive port: you can immediately put it inside another opaque Dart object of your own and use that everywhere instead.

Right, that would be the obvious workaround.

@aam
Contributor

aam commented Dec 29, 2021

That being said, since I still don't fully understand the current purpose of native pointers my suggestions might not make sense.

There is no NativePointer Dart class available to Dart programs, so you can't have an instance of such a thing in a Dart program's heap. As you saw, it shows up as an int in Dart, which is not something that is GC'ed.
The VM-internal NativePointer C++ entity was introduced to associate a finalizer with pointers (to IO-related, runtime-specific structures) so that they are cleaned up if the Dart program fails to clean them up (due to abnormal termination, for example). The description of https://dart-review.googlesource.com/c/sdk/+/211340 talks about it.

TypedData, on the other hand, is an example of a Dart class that you can have instances of and associate finalizers with. Those finalizers are triggered by the Dart VM's GC once those instances become unreachable.
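The difference between the two finalizer behaviors described in this thread can be modeled with a small toy in plain C. It is a sketch of the semantics as explained above, not VM code: `toy_finalizer`, `toy_post_native_pointer`, `toy_post_typed_data`, `toy_collect`, and `ToyTypedData` are all hypothetical names, and `toy_finalizer` is a simplified stand-in rather than the real `Dart_HandleFinalizer` type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in callback type; NOT the real Dart_HandleFinalizer. */
typedef void (*toy_finalizer)(void* peer);

/* Counts finalizer invocations so the behavior can be observed. */
int finalizer_calls = 0;

/* Example finalizer: frees the native allocation and counts the call. */
void free_peer(void* peer) {
  free(peer);
  finalizer_calls += 1;
}

/* Native-pointer case: on delivery the receiver gets a bare integer and
   the callback is dropped. The callback runs only when the message cannot
   be delivered, in which case 0 is returned. */
intptr_t toy_post_native_pointer(void* ptr, toy_finalizer cb, bool delivered) {
  assert(cb != NULL);
  if (!delivered) {
    cb(ptr);
    return 0;
  }
  return (intptr_t)ptr;
}

/* External-typed-data case: the receiver gets an object that carries the
   finalizer along with the data. */
typedef struct {
  void* data;
  toy_finalizer callback;
} ToyTypedData;

ToyTypedData toy_post_typed_data(void* ptr, toy_finalizer cb) {
  ToyTypedData obj = {ptr, cb};
  return obj;
}

/* Models the GC collecting an unreachable typed data object: the
   finalizer runs once, and later calls are no-ops. */
void toy_collect(ToyTypedData* obj) {
  if (obj->data != NULL) {
    obj->callback(obj->data);
    obj->data = NULL;
  }
}
```

In this model a delivered native pointer never triggers `free_peer`, which matches the observation earlier in the thread that GC on the Dart side did not release the native allocations.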

@fzyzcjy
Contributor

fzyzcjy commented Dec 30, 2021

Here are my two cents about using TypedData to store a "pointer": You know, in Rust/C++/C/..., an object usually refers to some other objects. For example (let's write in C so people without Rust background can understand):

typedef struct {
  char* name;  /* points to a separate heap allocation */
} MyObject;

MyObject* obj = malloc(sizeof *obj);
obj->name = create_a_very_very_long_string_that_is_1GB_big();

Here, if I want to expose MyObject as a Uint8List, it can only be a few bytes long, since sizeof(MyObject) is just the size of one pointer (8 bytes on a 64-bit platform). But we all know that this MyObject actually occupies 1GB of memory because of its name field.

The Dart VM should know that this object occupies 1GB of memory so that it can pace its GC accordingly. However, exposing it as a Uint8List with length 1000000000 is definitely wrong: accessing index 123456, for example, touches invalid memory and can segfault.
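The gap between the two sizes can be made concrete with a small self-contained sketch. The helpers `my_object_new`, `my_object_shallow_size`, `my_object_deep_size`, and `my_object_free` are hypothetical, and the deep size computed here ignores allocator overhead, so treat it as a lower bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
  char* name;  /* points to a separate heap allocation */
} MyObject;

/* Hypothetical helper: build an object whose name is name_len chars long. */
MyObject* my_object_new(size_t name_len) {
  MyObject* obj = malloc(sizeof *obj);
  assert(obj != NULL);
  obj->name = malloc(name_len + 1);
  assert(obj->name != NULL);
  memset(obj->name, 'x', name_len);
  obj->name[name_len] = '\0';
  return obj;
}

/* Size of the struct itself: the only contiguous region that could
   safely be exposed as a byte array. */
size_t my_object_shallow_size(const MyObject* obj) {
  (void)obj;
  return sizeof(MyObject);
}

/* Struct plus the string it points to (terminator included): closer to
   the number a GC would want to know about. */
size_t my_object_deep_size(const MyObject* obj) {
  return sizeof(MyObject) + strlen(obj->name) + 1;
}

/* Free the pointed-to string first, then the struct. */
void my_object_free(MyObject* obj) {
  free(obj->name);
  free(obj);
}
```

For an object built with `my_object_new(1024 * 1024)`, the shallow size is one pointer while the deep size is slightly over 1 MiB. Only the shallow region is valid to index, yet only the deep figure reflects the real memory cost.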

@raphaelrobert
Author

Thanks for all the insights @aam! It appears NativePointer really serves a different purpose. This leaves us with the option to (mis)use external typed data as a workaround. It has some downsides though:

  • The content is inspectable (and modifiable) in Dart
  • The size might be incorrect, if it contains chained pointers to further allocations. The size can only be reported for contiguous chunks of memory.
  • The native code cannot really get a pointer back to the Uint8List, which was a strong prerequisite
