
Range performance fixes #9


Merged: 5 commits into openannotation:master on May 3, 2015
Conversation

@wjt (Contributor) commented Apr 30, 2015

I'm using Annotator on pages with really large tables, highlighting potentially thousands of rows of text. Selecting any amount of text which spans more than one row was extremely expensive. NormalizedRange#limit and #getTextNodes were the main culprits according to Chrome's profiler. These patches make them an order of magnitude faster for me.

Happy to write more tests if you're not convinced that this new logic is right! Also, the existing code is inconsistent in whether it uses $.contains or util.contains; shout if there's some hidden reason to use one or the other that I've not seen.

Agh, while writing this pull request I've noticed that I introduced a use of childNodes in #limit -- based on the comment in #getTextNodes I guess I need to remove that for IE9's sake.

@@ -189,13 +189,33 @@ class Range.NormalizedRange
   #
   # Returns updated self or null.
   limit: (bounds) ->
-    nodes = $.grep this.textNodes(), (node) ->
-      node.parentNode == bounds or $.contains(bounds, node.parentNode)
+    if @commonAncestor == bounds or $.contains(bounds, @commonAncestor)
Member review comment:

I would make this === for clarity.

@wjt (Contributor, author) replied:

I've not written CoffeeScript before yesterday, but the docs suggest that == in CoffeeScript compiles to === in JS.
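The docs are right on this point: CoffeeScript's `==` compiles to JavaScript's strict `===`, so the loose, coercing comparison is never reachable from CoffeeScript source. A minimal JavaScript illustration of the difference:

```javascript
// JavaScript has two equality operators. CoffeeScript's `==` always
// compiles to the strict `===` below -- the loose, coercing `==`
// does not appear in compiled CoffeeScript output.
const loose = (1 == '1');   // true: `==` coerces the string to a number
const strict = (1 === '1'); // false: `===` compares type and value
```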

@tilgovi (Member) commented May 1, 2015 via email

@wjt (Contributor, author) commented May 1, 2015

To no-one's great surprise, using TreeWalker is a bit faster (or no slower) and more concise. I didn't know about it, thanks! I also found a bug in my open-coded implementation of #limit – the tests now cover that.

@wjt commented May 1, 2015

(Let me know if you would like the intermediate patches squashed away.)

@tilgovi (Member) commented May 3, 2015

That's excellent. Squash away and I'll be glad to merge it!

@tilgovi commented May 3, 2015

If you could also remove the flatten function, which is no longer used, that would be excellent.

wjt added 5 commits May 3, 2015 10:03
Previously, only the left-hand side was tested; and grandchildren of the
bounds were not tested at all. With [] indicating a text node:

    <h1>[My Heading]</h1>
    <p>
      [My paragraph]
      <span>
        [ conti]
        [nues]
      </span>
    </p>
    <p>[Another paragraph begins]</p>

I wrote an implementation where, given a range ending in the second
paragraph, limiting it to the first paragraph would incorrectly update
the `@end` to ` conti` not `nues`. The old test case did not catch it.
Traversing the whole tree is unnecessary -- and costly if the
@commonAncestor is <body> and you have many nodes in your document.

All we actually need to do is update @start and @end to point at the
first and last text nodes in the bounds, if they currently fall outside
it.

* master: 1.61s ±2.26% (6 runs sampled) -- but this is a little unfair
  since it was implemented in terms of a very slow getTextNodes call
* this implementation: 0.01ms ±2.91% (85 runs sampled)
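The approach this commit describes can be sketched in plain JavaScript. This is a simplified model, not Annotator's actual code: `range` and `isInside` are hypothetical stand-ins for the real DOM objects and containment test.

```javascript
// Simplified model of the fast #limit described above: rather than
// filtering every text node under the range, only move start/end
// inward to the first and last text nodes that lie inside the bounds.
function limit(range, isInside) {
  const nodes = range.textNodes;                     // ordered text nodes
  const first = nodes.find(isInside);                // first node in bounds
  if (first === undefined) return null;              // range wholly outside
  const last = [...nodes].reverse().find(isInside);  // last node in bounds
  return { ...range, start: first, end: last };
}

// Example: four text nodes, only the middle two fall inside the bounds.
const range = { textNodes: ['a', 'b', 'c', 'd'], start: 'a', end: 'd' };
const inBounds = (n) => n === 'b' || n === 'c';
const limited = limit(range, inBounds);
// limited.start === 'b', limited.end === 'c'
```

The key property is that only the ordered list of text nodes is consulted, never the whole subtree under the common ancestor, which is where the old cost came from.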
The old implementation was really expensive on a range covering many
rows of table. I think it's because an array was being allocated at
every level of the tree, then flattened out into the level above.

Instead here we just do a depth-first search for text nodes and build up
a single array as we go.  In a little benchmark with a range covering
every cell of a 5000×3 table:

* master: 1.57s ±1.14% (6 runs sampled)
* TreeWalker: 17.5ms ±0.74% (72 runs sampled)
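The single-array depth-first collection this commit describes can be sketched as follows. Since TreeWalker only exists in the browser, plain objects stand in for DOM nodes here (text nodes carry `data`, element nodes carry `childNodes`); the names are illustrative, not Annotator's API.

```javascript
// Depth-first collection of text nodes into one shared array --
// no per-level arrays, no flattening step.
function getTextNodes(node, out = []) {
  if ('data' in node) {
    out.push(node);                 // leaf: a text node
  } else {
    for (const child of node.childNodes) {
      getTextNodes(child, out);     // recurse, reusing the same array
    }
  }
  return out;
}

// Roughly <p>My paragraph <span>conti nues</span></p>:
const tree = {
  childNodes: [
    { data: 'My paragraph ' },
    { childNodes: [{ data: 'conti' }, { data: 'nues' }] },
  ],
};
const text = getTextNodes(tree).map((n) => n.data).join('');
// text === 'My paragraph continues'
```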
Per openannotation/annotator#527 . We actually don't need to use global
at all -- we can get the document from the element we're about to walk.
@wjt force-pushed the range-performance branch from 6a37aac to 90c2209 on May 3, 2015 09:10
@wjt commented May 3, 2015

How's that?

@tilgovi commented May 3, 2015

Looks great :)

tilgovi added a commit that referenced this pull request May 3, 2015
@tilgovi tilgovi merged commit 062cd91 into openannotation:master May 3, 2015
@mwidner commented Nov 17, 2015

@tilgovi @wjt How challenging do you think it'd be to backport this to 1.2.x? I'm seeing the same performance issue there and can't upgrade to 2.0 yet. Thanks.

@tilgovi (Member) commented Nov 17, 2015

@mwidner there's a v1.2.x branch on annotator so we have a place to backport fixes and make maintenance releases. Do you think you could extract this into a PR there?

@tilgovi commented Nov 17, 2015

It shouldn't be too hard because when this change happened the files were still coffee and still had the same names and paths, so it might be as simple as taking a copy of the annotator repo, branching off v1.2.x, adding this repo as a remote and cherry-picking these changes.

@mwidner commented Nov 17, 2015

Done! Thanks for the pointers.

@wjt wjt deleted the range-performance branch May 7, 2017 09:50
@wjt wjt restored the range-performance branch May 7, 2017 09:50
@wjt wjt deleted the range-performance branch January 19, 2019 21:53