Optimization that in my tests shows 2.7 sec -> 2.2 sec #312

psprint · 2016-05-11T19:35:24Z

To play around with this, clone my repo with extracted parsers:

https://github.com/psprint/zsh-tools/

and compare files: parse_len.zsh (current code) and parse_new.zsh (the optimization from this PR). Example run:

time ./parse_len.zsh parse_bash.zsh > out_len.txt

make test doesn't work on my machine; however, I compared the outputs of the extracted parsers and they are identical.

Also, while coding it didn't feel like the indexing was the problem. This speedup is for a large file (I parse the parser itself). I'm not sure whether pasting a one-liner will show any difference, but it should.

psprint · 2016-05-11T19:40:09Z

If someone coded a $REPLY-using type upstream (no fork), we would gain 0.8 sec and go below 1.5 sec in the test.

P.S. As for one-liners (I mean pasting a screen-wide, i.e. not very short, command), the type fork might have more impact there. Even a command as simple as "mplayer -fs 15.avi" takes three loop runs, meaning three forks.
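To illustrate what "$REPLY-using" means here, below is a minimal sketch of the general convention: a function hands its result back in the REPLY parameter, so the caller does not need a command substitution (which runs in a subshell). The helper classify_word and its hard-coded word lists are hypothetical and exist only for this illustration; this is not the upstream type builtin, nor the highlighter's code.

```shell
# Hypothetical helper (an assumption for illustration only): classify a word
# and return the answer in $REPLY instead of printing it on stdout.
classify_word() {
  case $1 in
    if|then|else|fi|for|while|do|done) REPLY=reserved ;;
    cd|echo|printf|read|type)          REPLY=builtin ;;
    *)                                 REPLY=other ;;
  esac
}

# Wrapper that prints the result, so it can be captured with $(...).
classify_word_forking() {
  classify_word "$1"
  printf '%s\n' "$REPLY"
}

# Capturing output: the command substitution runs in a subshell, one per call.
kind_forked=$(classify_word_forking echo)

# Reading $REPLY: the function runs in the current shell, no subshell needed.
classify_word echo
kind_direct=$REPLY

printf '%s %s\n' "$kind_forked" "$kind_direct"
```

Both calls yield the same classification; the difference is only in how the result travels back to the caller.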


danielshahaf · 2016-05-11T20:50:38Z

Summary from IRC:

- It would be nice to have the extracted parser in-tree, instead of in a separate repository that is manually updated.
- A $REPLY-using type might become moot after "Avoid forks to improve performance (especially on Cygwin)" (#298) is merged.
- I wonder if this is clearer:

diff --git a/highlighters/main/main-highlighter.zsh b/highlighters/main/main-highlighter.zsh
index a0f8dba..b001422 100644
--- a/highlighters/main/main-highlighter.zsh
+++ b/highlighters/main/main-highlighter.zsh
@@ -199,6 +199,10 @@ _zsh_highlight_main_highlighter()
   #
   local this_word=':start:' next_word
   integer in_redirection
+  # Processing buffer
+  local proc_buf="$buf"
+  # Starting position in previous loop run
+  integer prev_end_pos=0
   for arg in ${interactive_comments-${(z)buf}} \
              ${interactive_comments+${(zZ+c+)buf}}; do
     if (( in_redirection )); then
@@ -234,11 +238,11 @@ _zsh_highlight_main_highlighter()
       # indistinguishable from 'echo foo echo bar' (one command with three
       # words for arguments).
       local needle=$'[;\n]'
-      integer offset=${${buf[start_pos+1,len]}[(i)$needle]}
+      integer offset=${proc_buf[(i)$needle]}
       (( start_pos += offset - 1 ))
       (( end_pos = start_pos + $#arg ))
     else
-      ((start_pos+=(len-start_pos)-${#${${buf[start_pos+1,len]}##([[:space:]]|\\[[:space:]])#}}))
+      ((start_pos+=(len-start_pos)-${#${proc_buf##([[:space:]]|\\[[:space:]])#}}))
       ((end_pos=$start_pos+${#arg}))
     fi

@@ -454,6 +458,9 @@ _zsh_highlight_main_highlighter()
       this_word=':start:'
     fi
     start_pos=$end_pos
+    # start_pos-prev_start_pos is: how many chars did we advance in last loop run
+    proc_buf="${proc_buf[end_pos - prev_end_pos + 1, -1]}"
+    prev_end_pos=end_pos
     (( in_redirection == 0 )) && this_word=$next_word
   done
 }
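The idea behind the patch can be sketched outside the highlighter. The toy tokenizer below finds word positions twice: once by re-slicing the full buffer from the current position on every iteration, and once by keeping a working copy (named proc_buf, after the patch) and chopping off what has already been consumed. It is a simplified illustration under stated assumptions: words are separated by plain spaces only, positions are zero-based and half-open, and the function names are made up for this example. It shows that the two approaches report the same positions; it does not measure speed.

```shell
# Variant A: take a fresh slice of the whole tail on every iteration.
positions_by_slicing() {
  local buf="$1" out="" rest lead word
  local pos=0 len=${#1}
  while [ "$pos" -lt "$len" ]; do
    rest="${buf:$pos}"          # copy of everything from pos onwards
    lead="${rest%%[! ]*}"       # leading run of spaces
    rest="${rest#"$lead"}"
    [ -n "$rest" ] || break     # only trailing spaces were left
    word="${rest%% *}"          # up to the next space
    pos=$(( pos + ${#lead} ))
    out="$out$pos-$(( pos + ${#word} )) "
    pos=$(( pos + ${#word} ))
  done
  REPLY="$out"
}

# Variant B: keep a working buffer and chop off the consumed prefix.
positions_by_chopping() {
  local proc_buf="$1" out="" lead word
  local pos=0
  while [ -n "$proc_buf" ]; do
    lead="${proc_buf%%[! ]*}"
    proc_buf="${proc_buf#"$lead"}"
    [ -n "$proc_buf" ] || break
    word="${proc_buf%% *}"
    pos=$(( pos + ${#lead} ))
    out="$out$pos-$(( pos + ${#word} )) "
    pos=$(( pos + ${#word} ))
    proc_buf="${proc_buf#"$word"}"   # drop what this iteration consumed
  done
  REPLY="$out"
}

positions_by_slicing 'echo   foo  bar';  sliced=$REPLY
positions_by_chopping 'echo   foo  bar'; chopped=$REPLY
printf '%s\n%s\n' "$sliced" "$chopped"   # both lines: 0-4 7-10 12-15
```

In variant B the position counter is still tracked, but the buffer that gets searched shrinks as the loop advances, which is the property the patch relies on.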

danielshahaf · 2016-05-11T20:51:56Z

There are arrlen improvements being discussed upstream, but they haven't been merged to master (let alone released), so we can't rely on them.

psprint · 2016-05-12T08:14:35Z

I don't think it's cleaner. The difference between start_pos and prev_start_pos says how many characters we advanced. I'm not sure how to see this in end_pos and prev_end_pos. The goal might be to move the chop-off block to the end of main-highlighter function, that's fine, but I'm more puzzled by end_pos than by the start_pos. The code works correctly; however, it runs in 2.9s. I tested multiple times and it's 2.2s vs 2.9s. That's a puzzle, as the code should be equivalent. Maybe the location at the end changes something in heap management; I don't know (yet).

psprint · 2016-05-12T08:29:05Z

I've extended the comment on the chop-off line. That said, we might be able to better catch "what's already processed" and change the line, dunno

danielshahaf · 2016-05-12T10:18:02Z

> The goal might be to move the chop-off block to the end of main-highlighter function, that's fine, but I'm more puzzled by end_pos than by the start_pos.

The goal, as I said on IRC, is code clarity. And I'm surprised that you're confused by the use of $end_pos: your version of the patch simply does end_pos - start_pos + 1 using the values of these two parameters in the previous iteration. (This is due to the assignment start_pos=$end_pos at line 456 of zsh-syntax-highlighting/highlighters/main/main-highlighter.zsh in 62f1c10, which you yourself pointed out yesterday on IRC.)

Incidentally, simply pointing out the above (either in the code itself, or in code comments) would probably achieve the code clarity I'm seeking.

Anyway, let's hash out these micro refactorings on IRC, that's a better medium than issue comments.

No idea about the 2.2s v. 2.9s difference; it's probably worth investigating, although it may be an upstream issue.

> I've extended the comment on the chop-off line. That said, we might be able to better catch "what's already processed" and change the line, dunno

I don't understand the last sentence?

danielshahaf added a commit that referenced this pull request May 11, 2016

tests: New test to capture off-by-ones.

62f1c10

Inspired by #312.

danielshahaf added the Improvement label May 11, 2016

danielshahaf added the performance label May 12, 2016

psprint mentioned this pull request May 12, 2016

Optimization that in my tests shows 2.7s -> 2.2s #315

Closed

psprint closed this May 12, 2016
