Start working on a benchmark for the spread desugaring. #210

Merged
merged 1 commit into from
Feb 8, 2019

munificent
Member

I'm not entirely sure if this measures what we want it to, but I wanted
to get something started.

I'll put some numbers over on #209.

Member

@leafpetersen leafpetersen left a comment

LGTM, thanks!

@mraleph
Member

mraleph commented Feb 7, 2019

Please consider using benchmark_harness package

@munificent
Member Author

> Please consider using benchmark_harness package

Is that the preferred way to benchmark dart2js as well?

@munificent munificent merged commit 50527ce into master Feb 8, 2019
@munificent munificent deleted the benchmark-spread branch February 8, 2019 21:34
@leafpetersen
Member

> Please consider using benchmark_harness package

@mraleph Please consider fixing benchmark_harness package.... :)

@lrhn
Member

lrhn commented Feb 11, 2019

The question here is what we are measuring for: The quickest way to iterate over a list, or the quickest way to iterate over an iterable that might be a list?

So, I made a variant of the benchmark which also has to handle iterables (see below). I also increased the limitMs and now count elements added, not spread operations.

The result is that addAll tends to beat the other strategies from roughly 10-20 elements upward, while plain iteration is faster for small lists. We don't have a corpus of spreads to optimize against, but I'd expect lengths of 1 to 20 to be the most prevalent cases (otherwise-unknown data with a limited length probably follows some sort of power law).

The only other operation that gets close to beating iteration is "resize and set". Even there addAll is smarter, because it checks for EfficientLengthIterable, not just List, and resizes for all of those (so it resizes for the set, but not for the where-iterable).

My conclusion from that benchmark is to use iteration, except that we recognize platform types where we can do better, and where nobody can detect that we cheat.

import 'dart:collection';

const trialMs = 500;
const lengths = [0, 1, 2, 5, 10, 20, 50, 100, 1000];

var csv = StringBuffer();

class CustomList<T> extends ListBase<T> {
  final List<T> _inner;

  int get length => _inner.length;

  set length(int value) => _inner.length = value;

  CustomList(this._inner);

  T operator [](int index) => _inner[index];

  void operator []=(int index, T value) => _inner[index] = value;
}

void main() {
  for (var length in lengths) {
    var baseline = runBench("iterate", length, iterate);
    runBench("List for", length, addList, baseline);
    runBench("resize and set", length, resizeAndSet, baseline);
    runBench("addAll()", length, addAll, baseline);
    runBench("forEach()", length, forEach, baseline);
    print("");
  }

  //  print("");
  //  print(csv);
}

bool _kTrue(Object o) => true;

double runBench(String name, int length,
    void Function(Iterable<String>, List<String>) action,
    [double baseline]) {
  var from = <String>[];
  for (var i = 0; i < length; i++) {
    from.add(String.fromCharCode(i % 26 + 65));
  }

  var froms = [from, CustomList(from), from.where(_kTrue), from.toSet()];

  var rate = benchBest(froms, action);

  if (baseline == null) {
    print("${length.toString().padLeft(4)} ${name.padRight(15)} "
        "${rate.toStringAsFixed(2).padLeft(10)} adds/ms "
        "                 ${'-' * 20}");
  } else {
    var comparison = rate / baseline;
    var bar = "=" * (comparison * 20).toInt();
    if (comparison > 4.0) bar = "!!!";
    print("${length.toString().padLeft(4)} ${name.padRight(15)} "
        "${rate.toStringAsFixed(2).padLeft(10)} adds/ms "
        "${comparison.toStringAsFixed(2).padLeft(6)}x baseline $bar");
  }

  csv.writeln("$length,$name,$rate");
  return rate;
}

/// Runs [bench] a number of times and returns the best (highest) result.
double benchBest(List<Iterable<String>> froms,
    void Function(Iterable<String>, List<String>) action) {
  var best = 0.0;
  for (var i = 0; i < 4; i++) {
    var result = bench(froms, action);
    if (result > best) best = result;
  }

  return best;
}

/// Spreads each iterable in [froms] into the middle of a list using [action].
///
/// Returns the total number of elements in the resulting lists per millisecond
/// of elapsed time. Higher is better.
double bench(List<Iterable<String>> froms,
    void Function(Iterable<String>, List<String>) action) {
  var elapsed = 0;
  var count = 0;
  var watch = Stopwatch()..start();
  do {
    for (int j = 0; j < 5000; j++) {
      for (var i = 0; i < froms.length; i++) {
        var from = froms[i];
        var to = <String>["a", "b"];

        action(from, to);
        to.add("b");
        to.add("c");

        count += to.length;
      }
    }
    elapsed = watch.elapsedMilliseconds;
  } while (elapsed < trialMs);

  return count / elapsed;
}

void iterate(Iterable<String> from, List<String> to) {
  for (var e in from) {
    to.add(e);
  }
}

void addList(Iterable<String> from, List<String> to) {
  var length = from.length;
  if (from is List<String>) {
    for (var i = 0; i < length; i++) {
      to.add(from[i]);
    }
  } else {
    for (var s in from) to.add(s);
  }
}

void resizeAndSet(Iterable<String> from, List<String> to) {
  var length = from.length;
  var j = to.length;
  to.length = to.length + length;
  if (from is List<String>) {
    for (var i = 0; i < length; i++) {
      to[j++] = from[i];
    }
  } else {
    for (var s in from) to[j++] = s;
  }
}

void addAll(Iterable<String> from, List<String> to) {
  to.addAll(from);
}

void forEach(Iterable<String> from, List<String> to) {
  var temp = to;
  from.forEach((s) {
    temp.add(s);
  });
  temp = null;
}
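
Since the comparison only means something if every strategy builds the same list, a quick equivalence check before timing can be useful. The following is a minimal sketch, not part of the benchmark above: `SpreadAction`, `mismatches`, and `_sameElements` are hypothetical names introduced here, and plain iteration is assumed to be the reference behavior.

```dart
/// Signature shared by the spread strategies in the benchmark.
typedef SpreadAction = void Function(Iterable<String> from, List<String> to);

/// Returns the names of the strategies in [actions] whose output for [from]
/// differs from what plain iteration produces.
///
/// Hypothetical helper, assumed for illustration only.
List<String> mismatches(
    Map<String, SpreadAction> actions, Iterable<String> from) {
  // Reference result: the same two-element prefix the benchmark uses,
  // followed by the elements of [from] in iteration order.
  var expected = <String>['a', 'b'];
  for (var e in from) {
    expected.add(e);
  }

  var failing = <String>[];
  actions.forEach((name, action) {
    var actual = <String>['a', 'b'];
    action(from, actual);
    if (!_sameElements(expected, actual)) failing.add(name);
  });
  return failing;
}

/// Element-wise comparison of two lists.
bool _sameElements(List<String> a, List<String> b) {
  if (a.length != b.length) return false;
  for (var i = 0; i < a.length; i++) {
    if (a[i] != b[i]) return false;
  }
  return true;
}
```

Passing a map such as `{'addAll': addAll, 'resize and set': resizeAndSet}` together with each entry of `froms` should yield an empty list when all strategies agree.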

@munificent
Member Author

> The question here is what we are measuring for: The quickest way to iterate over a list, or the quickest way to iterate over an iterable that might be a list?

I was going for "the quickest way to iterate over something we statically know is a List".

Benchmarking the case where we check to see if it's a list at runtime is another good point in the space to explore.

@leafpetersen
Member

FYI: as part of the discussion on package_benchmark, I hacked this code to add a second call to elapsedMilliseconds on every iteration to get a ballpark of the overhead. It looks like about 30% for the small sizes on the AOT on my mac, assuming I did everything correctly.

This suggests that a non-trivial part of what is being measured here is fixed overhead. I don't think it materially affects the direction of the outcome, but it will dampen any signal you're getting (since more of what you're measuring is invariant across configurations).
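
The overhead estimate described above can be approximated along the following lines. This is a rough sketch, not the code that was actually run: `ratePerMs` and `timerOverheadFraction` are hypothetical helpers, the 200 ms default trial length is an arbitrary assumption, and it assumes the stopwatch is read once per iteration in the normal configuration.

```dart
/// Runs [body] repeatedly for at least [trialMs] milliseconds, reading the
/// stopwatch [readsPerIteration] times after each call (must be at least 1,
/// otherwise the loop never observes the elapsed time).
///
/// Returns iterations per millisecond. Hypothetical helper.
double ratePerMs(void Function() body, int readsPerIteration,
    {int trialMs = 200}) {
  var watch = Stopwatch()..start();
  var iterations = 0;
  var elapsed = 0;
  do {
    body();
    iterations++;
    for (var r = 0; r < readsPerIteration; r++) {
      elapsed = watch.elapsedMilliseconds;
    }
  } while (elapsed < trialMs);
  return iterations / elapsed;
}

/// Estimates the fraction of each iteration spent reading the stopwatch.
///
/// With w = work per iteration and t = cost of one read:
///   1 / once  = w + t
///   1 / twice = w + 2t
/// so t = 1 / twice - 1 / once, and the fraction is t / (w + t) = t * once.
double timerOverheadFraction(void Function() body) {
  var once = ratePerMs(body, 1);
  var twice = ratePerMs(body, 2);
  var timerCost = 1 / twice - 1 / once;
  return timerCost * once;
}
```

The result is noisy and can even come out slightly negative on a loaded machine, so it is only a ballpark figure, in the same spirit as the estimate above.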

@lrhn
Member

lrhn commented Feb 12, 2019

The question here was which strategy the implementation must use to do the spread.

"The quickest way to iterate over something we statically know is a List" corresponds to the option of doing something faster if the static type of the spread expression is a list type.

My benchmark corresponds to the option of doing something faster if the run-time type of the spread expression is a list type.

The third option is to not do anything faster, always use Iterable-iteration for list/set spreads, but recognize when we are spreading a platform type and optimize that (which will likely get us close to the addAll speed when all the spread collections are platform ones).
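
As a rough illustration of how the three options differ, here is a sketch of what each desugaring might look like if written by hand. The function names are hypothetical and this is not the actual compiler output. The platform-type recognition in the third option would live inside the implementation, so it is only noted in a comment.

```dart
/// Option 1: the static type of the spread operand is a List, so the
/// desugaring can use an indexed loop without any run-time check.
void spreadStaticList<T>(List<T> from, List<T> to) {
  for (var i = 0; i < from.length; i++) {
    to.add(from[i]);
  }
}

/// Option 2: the static type is only Iterable, so check the run-time type
/// and fall back to iteration when the operand is not a List.
void spreadRuntimeCheck<T>(Iterable<T> from, List<T> to) {
  if (from is List<T>) {
    // `from` is promoted to List<T> inside this branch.
    for (var i = 0; i < from.length; i++) {
      to.add(from[i]);
    }
  } else {
    for (var e in from) {
      to.add(e);
    }
  }
}

/// Option 3: always use Iterable iteration. An implementation could still
/// special-case its own platform collections internally, where user code
/// cannot observe the difference; that part is not shown here.
void spreadIterate<T>(Iterable<T> from, List<T> to) {
  for (var e in from) {
    to.add(e);
  }
}
```

All three append the same elements in the same order for well-behaved collections; they differ in which members of the operand get called, which is what a user-defined collection could observe.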
