
[perf] AOT performance regression when "fully instantiated" constant functions are present. #53571


Open
modulovalue opened this issue Sep 20, 2023 · 3 comments
Labels:
- area-vm: Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.
- P3: A lower priority bug or feature request
- triaged: Issue has been triaged by sub team
- type-performance: Issue relates to performance or code size

@modulovalue
Contributor

modulovalue commented Sep 20, 2023

Consider the following results from the micro benchmark at the bottom of this issue description.

Notice how the "on function" variants (other than the manually monomorphized ones) are significantly slower under AOT than under JIT.

The presence of the "instantiated" variants causes the "on function • ... • generic" variants to regress heavily when running on the VM AOT (~50ms JIT vs. ~200ms AOT). When the "instantiated" variants are removed from the benchmark, the regression disappears and the measurements are closer to what one would expect.

Dart SDK version: 3.2.0-edge.88b07ba64e07fb0eef9c777769917c65fdf1832e (be) (Tue Sep 19 21:57:39 2023 +0000) on "macos_arm64"

AOT

dart compile exe --enable-experiment=inline-class dispatch_bench.dart

./dispatch_bench.exe

via class • on class with instantiated mixin: 8ms
via extension type • on class with instantiated mixin: 10ms

via class • on function • with closure • generic: 212ms
via extension type • on function • with closure • generic: 171ms

via class • on function • with const function • generic: 219ms
via extension type • on function • with const function • generic: 172ms

via class • on function • with const function • instantiated: 213ms
via extension type • on function • with const function • instantiated: 172ms

via class • on function • with delegate • generic: 231ms
via extension type • on function • with delegate • generic: 230ms

via class • on function • with delegate • instantiated: 231ms
via extension type • on function • with delegate • instantiated: 229ms

via class • on function • manually monomorphized: 6ms
via extension type • on function • manually monomorphized: 9ms

JIT

dart --enable-experiment=inline-class dispatch_bench.dart

via class • on class with instantiated mixin: 26ms
via extension type • on class with instantiated mixin: 23ms

via class • on function • with closure • generic: 65ms
via extension type • on function • with closure • generic: 64ms

via class • on function • with const function • generic: 56ms
via extension type • on function • with const function • generic: 55ms

via class • on function • with const function • instantiated: 56ms
via extension type • on function • with const function • instantiated: 55ms

via class • on function • with delegate • generic: 27ms
via extension type • on function • with delegate • generic: 23ms

via class • on function • with delegate • instantiated: 20ms
via extension type • on function • with delegate • instantiated: 18ms

via class • on function • manually monomorphized: 23ms
via extension type • on function • manually monomorphized: 22ms

dispatch_bench.dart
void main() {
  const size = 10000000;
  final datasetClass = List.generate(size, (final a) => SomeClass(foo: a));
  final datasetExtensionType = List.generate(size, (final a) => SomeExtensionType(a));
  final sw = Stopwatch();
  void measure(
    final String name,
    final void Function() fn,
  ) {
    sw.reset();
    sw.start();
    fn();
    sw.stop();
    print("$name: ${sw.elapsedMilliseconds}ms");
  }
  final viaClass = RunClass();
  final viaExtensionType = RunExtensionType();
  measure(
    "via class • on class with instantiated mixin",
    () => viaClass.execute(datasetClass),
  );
  measure(
    "via extension type • on class with instantiated mixin",
    () => viaExtensionType.execute(datasetExtensionType),
  );
  print("");
  measure(
    "via class • on function • with closure • generic",
    () => run<SomeClass>(datasetClass, (final a) => a.foo),
  );
  measure(
    "via extension type • on function • with closure • generic",
    () => run<SomeExtensionType>(datasetExtensionType, (final a) => a.foo),
  );
  print("");
  measure(
    "via class • on function • with const function • generic",
    () => run<SomeClass>(datasetClass, SomeClass.fooSelector),
  );
  measure(
    "via extension type • on function • with const function • generic",
    () => run<SomeExtensionType>(datasetExtensionType, SomeExtensionType.fooSelector),
  );
  print("");
  measure(
    "via class • on function • with const function • instantiated",
    () => runInstantiatedToSomeClass(datasetClass, SomeClass.fooSelector),
  );
  measure(
    "via extension type • on function • with const function • instantiated",
    () => runInstantiatedToSomeExtensionType(datasetExtensionType, SomeExtensionType.fooSelector),
  );
  print("");
  measure(
    "via class • on function • with delegate • generic",
    () => runWithDelegate<SomeClass>(datasetClass, const SumDelegateClassImpl()),
  );
  measure(
    "via extension type • on function • with delegate • generic",
    () => runWithDelegate<SomeExtensionType>(datasetExtensionType, const SumDelegateExtensionTypeImpl()),
  );
  print("");
  measure(
    "via class • on function • with delegate • instantiated",
    () => runWithDelegateInstantiatedToSomeClass(datasetClass, const SumDelegateClassImpl()),
  );
  measure(
    "via extension type • on function • with delegate • instantiated",
    () => runWithDelegateInstantiatedToSomeExtensionType(datasetExtensionType, const SumDelegateExtensionTypeImpl()),
  );
  print("");
  measure(
    "via class • on function • manually monomorphized",
    () => runClass(datasetClass),
  );
  measure(
    "via extension type • on function • manually monomorphized",
    () => runExtensionType(datasetExtensionType),
  );
  print("");
}

class RunExtensionType with Run<SomeExtensionType> {
  @override
  int sum(final SomeExtensionType v) => v.foo;
}

class RunClass with Run<SomeClass> {
  @override
  int sum(final SomeClass v) => v.foo;
}

extension type SomeExtensionType(int foo) {
  static int fooSelector(
    final SomeExtensionType a,
  ) => a.foo;
}

class SomeClass {
  static int fooSelector(
    final SomeClass a,
  ) => a.foo;
  
  final int foo;

  const SomeClass({
    required this.foo,
  });
}

mixin Run<T> {
  int sum(
    final T v,
  );

  int execute(
    final List<T> tree,
  ) {
    int total = 0;
    for (final a in tree) {
      total += sum(a);
    }
    return total;
  }
}

const runInstantiatedToSomeClass = run<SomeClass>;

const runInstantiatedToSomeExtensionType = run<SomeExtensionType>;

int run<T>(
  final List<T> data,
  final int Function(T) sum,
) {
  int total = 0;
  for (final a in data) {
    total += sum(a);
  }
  return total;
}

const runWithDelegateInstantiatedToSomeClass = runWithDelegate<SomeClass>;

const runWithDelegateInstantiatedToSomeExtensionType = runWithDelegate<SomeExtensionType>;

int runWithDelegate<T>(
  final List<T> data,
  final SumDelegate<T> delegate,
) {
  int total = 0;
  for (final a in data) {
    total += delegate.sum(a);
  }
  return total;
}

abstract class SumDelegate<T> {
  int sum(T v);
}

class SumDelegateClassImpl implements SumDelegate<SomeClass> {
  const SumDelegateClassImpl();

  @override
  int sum(SomeClass v) => v.foo;
}

class SumDelegateExtensionTypeImpl implements SumDelegate<SomeExtensionType> {
  const SumDelegateExtensionTypeImpl();

  @override
  int sum(SomeExtensionType v) => v.foo;
}

int runClass(
  final List<SomeClass> data,
) {
  int total = 0;
  for (final a in data) {
    total += a.foo;
  }
  return total;
}

int runExtensionType(
  final List<SomeExtensionType> data,
) {
  int total = 0;
  for (final a in data) {
    total += a.foo;
  }
  return total;
}

@lrhn added the area-vm and type-performance labels on Sep 20, 2023
@a-siva
Contributor

a-siva commented Sep 20, 2023

//cc @alexmarkov

@alexmarkov
Contributor

The "instantiated" variants of the benchmark take tear-offs of the run and runWithDelegate functions, which hinders the AOT compiler's ability to infer the actual types of their parameters. As a result, list iteration within those functions is not fully inlined and is much slower.

Improvements in the AOT compiler which would help with this case are tracked in #39692.
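
To make the term concrete, here is a minimal sketch of an instantiated tear-off next to a direct generic call. It is not the benchmark from this issue: `Item`, `runItems`, and the tiny dataset are hypothetical names chosen for illustration. Both calls compute the same sum; the only difference is that the second goes through a constant closure over `run`, which is the pattern the comment above describes.

```dart
// Hypothetical names throughout; a reduced illustration, not the benchmark.
class Item {
  final int foo;
  const Item(this.foo);

  static int fooSelector(Item a) => a.foo;
}

int run<T>(List<T> data, int Function(T) sum) {
  var total = 0;
  for (final a in data) {
    total += sum(a);
  }
  return total;
}

// Instantiated tear-off: a constant closure over `run` with T fixed to Item.
const runItems = run<Item>;

void main() {
  final data = List.generate(5, (i) => Item(i));

  // Direct call: the callee and the argument types are visible at the call site.
  final direct = run<Item>(data, Item.fooSelector);

  // Call through the tear-off: `run` is reached via a closure value.
  final viaTearOff = runItems(data, Item.fooSelector);

  print('$direct $viaTearOff'); // prints "10 10"
}
```

The sketch only shows what the two call shapes look like; it makes no claim about how the compiler treats either one beyond what is stated in the comment above.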

@alexmarkov
Contributor

Unfortunately, #39692 was not enough to optimize this benchmark, so I'm going to reopen this issue. It looks like we might need to add escape analysis to figure out all possible callers of tear-offs in this case.
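
As a rough illustration of why knowing every caller matters (a sketch under assumptions, not a description of the compiler's actual analysis): once a tear-off is stored in a value, it can be invoked from anywhere with any argument that fits its static type. In the hypothetical example below, `runInts` and `identity` are made-up names, and the same tear-off receives two different concrete `List<int>` implementations, so the parameter's concrete type cannot be read off a single call site.

```dart
import 'dart:typed_data';

int run<T>(List<T> data, int Function(T) sum) {
  var total = 0;
  for (final a in data) {
    total += sum(a);
  }
  return total;
}

// Hypothetical instantiated tear-off, analogous to the ones in the benchmark.
const runInts = run<int>;

int identity(int x) => x;

void main() {
  // The tear-off is now just a function value and can flow anywhere.
  final int Function(List<int>, int Function(int)) f = runInts;

  // Same static type (List<int>), different concrete list classes.
  print(f([1, 2, 3], identity)); // prints 6
  print(f(Uint8List.fromList([1, 2, 3]), identity)); // prints 6
}
```

If an analysis could establish that a tear-off only ever reaches call sites passing one concrete list type, the body of `run` could presumably be specialized for it; that is the assumption this sketch is meant to illustrate.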

@alexmarkov reopened this on Aug 2, 2024
@alexmarkov added the P3 and triaged labels and removed the closed-duplicate label on Aug 2, 2024