
[ffi] Creating an external string with a Dart function from within Dart causes a crash. #50452


Closed
modulovalue opened this issue Nov 12, 2022 · 8 comments
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-ffi type-question A question about expected behavior or functionality

Comments

@modulovalue
Contributor

I'm trying to create an external string so I can manage the storage of a String outside of Dart's heap.

Consider the following example which allocates some memory and calls into Dart_NewExternalUTF16String:

import 'dart:ffi';
import 'package:ffi/ffi.dart' as ffi;

void main() {
  const chars = 5;
  print(
    "> " +
        (DynamicLibrary.executable().lookupFunction<C_Side, Dart_Side>(
          'Dart_NewExternalUTF16String',
        )(
          () {
            final mem = ffi.malloc.allocate<Uint16>(chars * sizeOf<Uint16>());
            mem.asTypedList(chars).fillRange(0, chars, 68);
            return mem;
          }(),
          chars,
          nullptr,
          0,
          Pointer.fromFunction<Dart_HandleFinalizer_Fn>(
            my_finalizer,
          ),
        ) as String),
  );
}

void my_finalizer(
  final Pointer<Void> a,
  final Pointer<Void> b,
) {
  print(
    "Running finalizer.",
  );
}

typedef Dart_HandleFinalizer_Fn = Void Function(Pointer<Void>, Pointer<Void>);

typedef Dart_HandleFinalizer = Pointer<NativeFunction<Dart_HandleFinalizer_Fn>>;

typedef C_Side = Handle Function(
  Pointer<Uint16>,
  Int64,
  Pointer<NativeType>,
  Int64,
  Dart_HandleFinalizer,
);

typedef Dart_Side = Object Function(
  Pointer<Uint16>,
  int,
  Pointer<NativeType>,
  int,
  Dart_HandleFinalizer,
);

I get a string as expected i.e. the print expression outputs 'DDDDD', but the program crashes afterwards with:

> DDDDD

===== CRASH =====
si_signo=Segmentation fault: 11(11), si_code=1, si_addr=0x1069fa000
version=2.19.0-255.2.beta (beta) (Tue Oct 4 13:45:53 2022 +0200) on "macos_x64"
pid=12890, thread=775, isolate_group=main(0x7fe88f018400), isolate=(nil)(0x0)
os=macos, arch=x64, comp=no, sim=no
isolate_instructions=1040e1f40, vm_instructions=1040e1f40
pc 0x00000001069fa000 fp 0x00007ffeebb83820 Unknown symbol
pc 0x000000010427c1dd fp 0x00007ffeebb83890 dart::Isolate::LowLevelCleanup(dart::Isolate*)+0x26d
pc 0x000000010427d900 fp 0x00007ffeebb842d0 dart::Isolate::Shutdown()+0x1c0
pc 0x0000000104855abb fp 0x00007ffeebb84810 Dart_ShutdownIsolate+0xcb
pc 0x00000001040ba163 fp 0x00007ffeebb848b0 dart::bin::RunMainIsolate(char const*, char const*, dart::bin::CommandLineOptions*)+0x2e3
pc 0x00000001040baed2 fp 0x00007ffeebb849f0 dart::bin::main(int, char**)+0x662
pc 0x00000001040bbd69 fp 0x00007ffeebb84a00 main+0x9
pc 0x00007fff736c7085 fp 0x00007ffeebb84a10 start+0x1
-- End of DumpStackTrace
Abort trap: 6

I remember reading somewhere that native finalizers can't call into Dart code. Is that what causes this error?
Does this restriction still apply? (I presume this doesn't have to be the case, since there are now Finalizers in Dart that support Dart functions.)
Or am I using this API incorrectly?

@lrhn lrhn added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. type-question A question about expected behavior or functionality library-ffi labels Nov 12, 2022
@dcharkes
Contributor

I remember reading somewhere that native finalizers can't call into Dart code. Is that what causes this error?
Does this restriction still apply?

This still applies. Native Finalizers are run eagerly on garbage collection and run on isolate shutdown. They cannot run Dart code because they run eagerly during GCs in synchronous code.

Finalizers do not run on isolate shutdown (we're assuming you're cleaning up Dart things in finalizers, and if an isolate shuts down, all the Dart objects are gone). Moreover, Finalizers that run Dart code are postponed until the next yield point (e.g. await) in your code, so that running them doesn't break the synchronous code execution semantics in Dart. (For example, if you load a field from an object twice in a row, the second load is optimized away. However, if a GC ran between the two loads and a finalizer set that field to a different value, that optimization would be wrong.)

If you want to free your native string, you should define your finalization function in C/C++ and use a native finalizer.
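A minimal sketch of what such a C finalizer could look like. It assumes the buffer itself is passed as the `peer` argument of Dart_NewExternalUTF16String (the original example passes nullptr). The typedef below is a local mirror of the Dart_HandleFinalizer signature so the snippet compiles without the SDK headers; `alloc_utf16`, `free_utf16_finalizer`, and `freed_buffers` are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Local mirror of the Dart_HandleFinalizer signature from dart_api.h
 * (an assumption, so this sketch builds without the Dart SDK headers). */
typedef void (*handle_finalizer_fn)(void* isolate_callback_data, void* peer);

/* Hypothetical counter, present only to make the effect observable. */
static int freed_buffers = 0;

/* Hypothetical helper: allocates `length` UTF-16 code units set to `unit`. */
uint16_t* alloc_utf16(size_t length, uint16_t unit) {
  uint16_t* buf = (uint16_t*)malloc(length * sizeof(uint16_t));
  if (buf == NULL) return NULL;
  for (size_t i = 0; i < length; i++) buf[i] = unit;
  return buf;
}

/* Native finalizer: frees the buffer and never calls back into Dart. */
void free_utf16_finalizer(void* isolate_callback_data, void* peer) {
  (void)isolate_callback_data; /* unused in this sketch */
  free(peer);
  freed_buffers++;
}
```

On the Dart side, the address of `free_utf16_finalizer` would presumably be looked up from the native library and passed in place of `Pointer.fromFunction(...)`, with the buffer pointer supplied as the peer.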

@modulovalue
Contributor Author

Thank you, that makes sense to me.

I'd like to manage the storage that underlies the external string from within Dart code (multiple isolates will need to create external strings over the same native memory, so I plan to implement some GC scheme myself) and this motivates the following question:

To report to my Dart code whether an external string has been finalized, I'd have to leave some information behind from within native code and periodically poll that information from within Dart code, correct? Or do you perhaps see a better option that involves less native code?

@dcharkes
Contributor

dcharkes commented Nov 14, 2022

multiple isolates will need to create external strings over the same native memory

That sounds like you need a ref-count in C++ and a Dart_HandleFinalizer attached for each isolate. That finalizer will then decrement the count and, if it's 0, free the native string. (Make sure to use atomics for increasing and decreasing the ref-count.)
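A rough sketch of that ref-counting scheme, written in C11 with `<stdatomic.h>` rather than C++. All names (`shared_utf16`, `shared_utf16_retain`, `shared_utf16_release`, `regions_freed`) are hypothetical, and the release function simply follows the two-pointer shape of a Dart_HandleFinalizer; each isolate would retain once before creating its external string and pass the struct as the peer.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical shared buffer with an atomic reference count. */
typedef struct {
  atomic_int ref_count;
  size_t length;
  uint16_t* units;
} shared_utf16;

/* Hypothetical counter, present only to make frees observable. */
static atomic_int regions_freed = 0;

shared_utf16* shared_utf16_new(size_t length, uint16_t unit) {
  shared_utf16* s = (shared_utf16*)malloc(sizeof *s);
  if (s == NULL) return NULL;
  s->units = (uint16_t*)malloc(length * sizeof(uint16_t));
  if (s->units == NULL) {
    free(s);
    return NULL;
  }
  for (size_t i = 0; i < length; i++) s->units[i] = unit;
  s->length = length;
  atomic_init(&s->ref_count, 0);
  return s;
}

/* Call once per isolate, before it creates its external string. */
void shared_utf16_retain(shared_utf16* s) {
  atomic_fetch_add(&s->ref_count, 1);
}

/* Finalizer-shaped release: the last owner frees the native memory. */
void shared_utf16_release(void* isolate_callback_data, void* peer) {
  (void)isolate_callback_data; /* unused in this sketch */
  shared_utf16* s = (shared_utf16*)peer;
  /* atomic_fetch_sub returns the previous value, so 1 means "now zero". */
  if (atomic_fetch_sub(&s->ref_count, 1) == 1) {
    free(s->units);
    free(s);
    atomic_fetch_add(&regions_freed, 1);
  }
}
```

Because the decrement and the zero check happen in one atomic operation, two isolates finalizing at the same time cannot both conclude that they are the last owner.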

To report to my Dart code whether an external string has been finalized.

The Dart_HandleFinalizer is run when Dart collects the external-string wrapper object, so you can't access the string from Dart anymore. Why would you need to report to the Dart code that the external string has been finalized?

@modulovalue
Copy link
Contributor Author

modulovalue commented Nov 14, 2022

I've tried to clarify my plan with the following diagram.


         Isolate A owns this natively allocated memory.
                              ^      
                              |
             ,----------------'----------------,
             |                                 |
             v                                 v
            [ 'F', 'o', 'o', ' ', 'B', 'a', 'r' ] <- (natively allocated array of e.g. uint16)
             ^         ^    ^                  ^
             |         |    |                  |
             '-------,-|----'                  |
                     | '-------------,---------'
                     |               |
                     |               |
                     |               |
                     |               '> * Isolate C tells isolate A that it 
                     |                    would like to depend on this region
                     |                  * Isolate C creates an external string 
                     |                    from this region. 
                     |
                     '> * Isolate B tells isolate A that it 
                          would like to depend on this region
                        * Isolate B creates an external string 
                          from this region. 

* Isolate A reduces the reference counter for `'o', ' ', 'B', 'a', 'r'`
  when either 
   - Isolate C reports it doesn't need it anymore (i.e. the finalizer 
     for the external string was invoked in Isolate C) or
   - when Isolate C exits.

* Isolate A reduces the reference counter for `'F', 'o', 'o'`
  when either 
   - Isolate B reports it doesn't need it anymore (i.e. the finalizer 
     for the external string was invoked in Isolate B) or 
   - when Isolate B exits.
   
* Isolate A may free 'Foo Bar' when neither itself, nor any other Isolate depends on it anymore.

@dcharkes
Contributor

Thanks for the diagram @modulovalue! This clarifies your goal.

that involves less native code

I believe it might be the wrong design goal to try to manage native memory with Dart code instead of in native code.

Assuming that all memory is one contiguous region that cannot be freed with sub-regions, it doesn't matter that a specific isolate only depends on a sub-region. So a normal ref-count on the whole region would still be simpler than trying to keep track of sub-regions. Is this assumption correct @modulovalue?

I strongly suggest doing the refcounting and freeing in native code.

(If you really want to do refcounting in Dart: The Dart_HandleFinalizers of the helper isolates should use Dart_PostCObject to send a message to the main isolate. The main isolate should have a HashMap</*external*/ String, int> refCount, and when the refCount reaches zero, drop the entry from the HashMap. Having the string in the HashMap keeps it alive, so once it is removed it can be GCed if it is no longer in use in the main isolate. And the Dart_HandleFinalizer on the string from the main isolate should do the actual native free. But with the native ports, this will be more native code than refcounting in C in the first place.)

@modulovalue
Contributor Author

Thank you for the help so far dcharkes!

Assuming that all memory is one contiguous region that cannot be freed with sub-regions,

I believe your assumption is not correct for my particular use case, because I have a piece table where sub-regions may depend on other sub-regions from within a single isolate.

(Perhaps my descriptions so far were oversimplified, as I don't expect people to be familiar with piece tables, but I plan to share the whole piece table between isolates to have fast access to a version tree of changes. I'll unfortunately need to store ref counters over regions, but I already have the necessary infrastructure for that in place in the form of interval trees. All that I'm missing now is zero-cost strings across isolates backed by native memory, and reliable ref counting.)

@modulovalue
Contributor Author

The Dart_HandleFinalizers of the helper isolates should use Dart_PostCObject to send a message to the main isolate.

Thank you, I hadn't considered that yet. I'll try that.

@modulovalue
Contributor Author

It worked, thank you @dcharkes!

Your suggestion has led me down the right path:
I'm now using Dart_PostInteger to pass around a pointer to a struct from the native finalizer on the C side to a RawReceivePort on the Dart side.
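The thread does not show the final code, so the following is only a guess at its general shape. The finalizer receives a struct as its peer and posts that struct's address as an integer to a port. `post_integer_fn` is a local stand-in with the same shape as Dart_PostInteger (port id plus int64 message) so the sketch compiles without the SDK headers; `finalizer_note`, `notify_finalizer`, and `record_post` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in with the shape of Dart_PostInteger(port_id, message); in a real
 * build this would be the function from the Dart native API (assumption). */
typedef bool (*post_integer_fn)(int64_t port_id, int64_t message);

/* Hypothetical struct handed to the finalizer as its peer. */
typedef struct {
  int64_t port_id;      /* native port of the receiving RawReceivePort */
  int64_t region_id;    /* hypothetical identifier of the finalized region */
  post_integer_fn post; /* how to deliver the notification */
} finalizer_note;

/* Finalizer-shaped function: posts the struct's address as an integer.
 * It never runs Dart code itself; the receiver handles the message later. */
void notify_finalizer(void* isolate_callback_data, void* peer) {
  (void)isolate_callback_data; /* unused in this sketch */
  finalizer_note* note = (finalizer_note*)peer;
  note->post(note->port_id, (int64_t)(intptr_t)note);
}

/* Demonstration-only recorder standing in for the real post function. */
static int64_t last_port = 0;
static int64_t last_message = 0;

static bool record_post(int64_t port_id, int64_t message) {
  last_port = port_id;
  last_message = message;
  return true;
}
```

In this sketch the receiving side would turn the integer back into a pointer, read `region_id`, and then be responsible for freeing the struct; that ownership rule is an assumption, not something stated in the thread.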


Since this issue does not report any unexpected behavior, but a possible inconvenience, and because dcharkes has provided a workaround, I'm going to close it now.
