Improve performance of search in the list of the magic strings. #1441

robertsipka · 2016-11-17T14:46:34Z

Benchmark	RSS (bytes)	Perf (sec)
3d-cube.js	57344 -> 57344 : 0.000%	0.904 -> 0.896 : +0.930%
3d-raytrace.js	151552 -> 155648 : -2.703%	1.073 -> 1.070 : +0.308%
access-binary-trees.js	53248 -> 53248 : 0.000%	0.564 -> 0.559 : +0.745%
access-fannkuch.js	20480 -> 20480 : 0.000%	2.238 -> 2.195 : +1.904%
access-nbody.js	28672 -> 28672 : 0.000%	1.082 -> 1.068 : +1.258%
bitops-3bit-bits-in-byte.js	16384 -> 16384 : 0.000%	0.590 -> 0.584 : +1.037%
bitops-bits-in-byte.js	16384 -> 16384 : 0.000%	0.879 -> 0.870 : +1.012%
bitops-bitwise-and.js	16384 -> 16384 : 0.000%	1.048 -> 1.039 : +0.835%
bitops-nsieve-bits.js	155648 -> 155648 : 0.000%	1.426 -> 1.411 : +1.079%
controlflow-recursive.js	81920 -> 81920 : 0.000%	0.396 -> 0.388 : +2.042%
crypto-aes.js	90112 -> 90112 : 0.000%	0.984 -> 0.971 : +1.381%
crypto-md5.js	163840 -> 163840 : 0.000%	0.670 -> 0.668 : +0.237%
crypto-sha1.js	110592 -> 110592 : 0.000%	0.656 -> 0.650 : +0.840%
date-format-tofte.js	40960 -> 40960 : 0.000%	0.801 -> 0.765 : +4.474%
date-format-xparb.js	40960 -> 40960 : 0.000%	0.410 -> 0.402 : +2.002%
math-cordic.js	20480 -> 20480 : 0.000%	1.324 -> 1.308 : +1.207%
math-partial-sums.js	16384 -> 16384 : 0.000%	0.735 -> 0.736 : -0.182%
math-spectral-norm.js	24576 -> 24576 : 0.000%	0.574 -> 0.569 : +0.975%
string-base64.js	143360 -> 139264 : +2.857%	1.683 -> 1.579 : +6.208%
string-fasta.js	32768 -> 32768 : 0.000%	1.252 -> 1.195 : +4.503%
Geometric mean:	+0.012%	+1.652%

Binary sizes (bytes)
7131243:152612
e38bf80:152580

zherczeg · 2016-11-17T14:48:32Z

Impressive!

zherczeg · 2016-11-17T19:54:51Z

jerry-core/lit/lit-magic-strings.inc.h

- * ascii and non-ascii groups.
+ * These strings must be ascii strings and needs to be defined in
+ * lexicographical order. If non-ascii strings will be ever needed,
+ * a divider will be added to separate ascii and non-ascii groups.


I don't think we need any dividers. So this could be removed from the comment.

zherczeg · 2016-11-17T19:55:01Z

jerry-core/lit/lit-magic-strings.c

+      }
+      else if (id_size > string_size)
+      {
+        last = middle -1;


Space before 1

zherczeg · 2016-11-17T19:55:34Z

jerry-core/lit/lit-magic-strings.c


-      return true;
+    int compare = memcmp (magic_strings[middle], string_p, string_size);
+    if (compare == 0)


newline before if

zherczeg · 2016-11-17T19:57:36Z

jerry-core/lit/lit-magic-strings.c

+    if (compare == 0)
+    {
+      lit_utf8_size_t id_size = lit_zt_utf8_string_size (magic_strings[middle]);
+      if (id_size == string_size)


I would simply check that magic_strings[middle][string_size] == '\0' which is faster than getting the string size.
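A minimal sketch of the two length tests being contrasted here, for illustration only. The helper names are hypothetical and are not part of the patch. The single-byte test assumes that the first `string_size` bytes of the magic string were already compared equal to an input that contains no `\0`, so index `string_size` is still inside the magic string's storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length test that walks the whole magic string. */
static bool
same_length_via_strlen (const char *magic_p, size_t string_size)
{
  assert (magic_p != NULL);
  return strlen (magic_p) == string_size;
}

/* Hypothetical helper: length test that reads a single byte.
 * Precondition (assumed, not checked): the first string_size bytes of
 * magic_p are non-zero, e.g. because they matched a NUL-free input. */
static bool
same_length_via_terminator (const char *magic_p, size_t string_size)
{
  assert (magic_p != NULL);
  return magic_p[string_size] == '\0';
}
```

Both helpers give the same answer whenever the precondition holds; the second one avoids scanning the magic string a second time.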

zherczeg · 2016-11-17T19:58:33Z

jerry-core/lit/lit-magic-strings.inc.h

- * ascii and non-ascii groups.
+ * These strings must be ascii strings and needs to be defined in
+ * lexicographical order. If non-ascii strings will be ever needed,
+ * a divider will be added to separate ascii and non-ascii groups.


But I would also add that the NUL character cannot be part of a magic string, because it must be the terminator character of every magic string.
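The constraints named in the comment (ASCII only, lexicographical order) could be verified by a small debug check along these lines. This is a sketch under assumptions: `magic_table_is_valid` is a hypothetical name, and the table is modelled as a plain array of C strings rather than the project's own types. An embedded NUL cannot be detected this way, since it is indistinguishable from the terminator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical debug check: returns true if every entry is non-NULL,
 * contains only 7-bit ASCII bytes, and the entries are in strictly
 * ascending byte-wise order (which also rules out duplicates). */
static bool
magic_table_is_valid (const char *const *table_p, size_t count)
{
  assert (table_p != NULL);

  for (size_t i = 0; i < count; i++)
  {
    if (table_p[i] == NULL)
    {
      return false;
    }

    for (const char *ch_p = table_p[i]; *ch_p != '\0'; ch_p++)
    {
      if ((unsigned char) *ch_p > 0x7F)
      {
        return false; /* non-ASCII byte */
      }
    }

    if (i > 0 && strcmp (table_p[i - 1], table_p[i]) >= 0)
    {
      return false; /* out of order or duplicate */
    }
  }

  return true;
}
```

A binary search over the table is only correct while this ordering holds, so a check like this would fit naturally into a debug build or a unit test.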

robertsipka · 2016-11-18T03:51:48Z

Thanks, I've updated this patch.

zherczeg · 2016-11-18T05:43:51Z

jerry-core/lit/lit-magic-strings.c

-    if (lit_compare_utf8_string_and_magic_string (string_p, string_size, id))
+    int middle = ((first + last) / 2); /**< mid point of search */
+
+    int compare = memcmp (magic_strings[middle], string_p, string_size);


I just realized that this is unsafe if the input string contains \0 which is allowed.

We should figure out something clever, not just a brute force size check.

Why don't you use 'strcmp' ?
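One possible shape for a comparison that tolerates `\0` in the input and never reads past the end of a magic string is sketched below, together with a binary search that uses it. This is an illustration, not the code that was merged: `compare_magic_with_sized` and `find_magic` are hypothetical names, the table is a plain array of C strings, and the table is assumed to be sorted byte-wise with no `\0` inside any entry.

```c
#include <assert.h>
#include <stddef.h>

/* Compare a NUL-terminated magic string with a sized input.
 * Returns <0 if the magic string sorts before the input, 0 if they are
 * equal, >0 if it sorts after. Stops at the magic string's terminator,
 * so it never reads beyond the magic string's storage. */
static int
compare_magic_with_sized (const char *magic_p,
                          const unsigned char *string_p,
                          size_t string_size)
{
  for (size_t i = 0; i < string_size; i++)
  {
    unsigned char magic_ch = (unsigned char) magic_p[i];

    if (magic_ch == '\0')
    {
      return -1; /* magic string is a proper prefix of the input */
    }

    if (magic_ch != string_p[i])
    {
      return (magic_ch < string_p[i]) ? -1 : 1;
    }
  }

  /* All string_size bytes matched; equal only if the magic string ends here. */
  return (magic_p[string_size] == '\0') ? 0 : 1;
}

/* Binary search over a sorted table; returns the index or -1. */
static int
find_magic (const char *const *table_p,
            int count,
            const unsigned char *string_p,
            size_t string_size)
{
  assert (table_p != NULL && string_p != NULL);

  int first = 0;
  int last = count - 1;

  while (first <= last)
  {
    int middle = first + (last - first) / 2;
    int compare = compare_magic_with_sized (table_p[middle], string_p, string_size);

    if (compare == 0)
    {
      return middle;
    }

    if (compare > 0)
    {
      last = middle - 1; /* magic string sorts after the input */
    }
    else
    {
      first = middle + 1;
    }
  }

  return -1;
}
```

With a table such as `{ "Array", "call", "caller", "length" }`, the input `"call\0er"` of size 7 is rejected, while the first 4 bytes of `"caller"` are found as `call`.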

LaszloLango · 2016-11-18T07:29:12Z

jerry-core/lit/lit-magic-strings.c

@@ -145,22 +145,45 @@ lit_is_utf8_string_magic (const lit_utf8_byte_t *string_p, /**< utf-8 string */
                          lit_utf8_size_t string_size, /**< string size in bytes */
                          lit_magic_string_id_t *out_id_p) /**< [out] magic string's id */
 {
-  /* TODO: Improve performance of search */
+  static const lit_utf8_byte_t * const magic_strings[] JERRY_CONST_DATA =


This array duplicates the array in lit_get_magic_string_utf8. Will the compiler recognize that they are the same? If it doesn't, this might increase the binary size.
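One way to sidestep the question, rather than relying on the compiler to merge two identical function-local arrays, is to keep a single table at file scope and let both functions refer to it. The sketch below only illustrates that layout: the names and the three-entry table are hypothetical, and the lookup uses a linear scan for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single table at file scope, shared by both functions
 * below, so only one copy of the pointer array is defined. */
static const char *const magic_strings[] = { "Array", "call", "length" };

#define MAGIC_STRING_COUNT (sizeof (magic_strings) / sizeof (magic_strings[0]))

/* Id -> string, reading the shared table. */
static const char *
get_magic_string (size_t id)
{
  assert (id < MAGIC_STRING_COUNT);
  return magic_strings[id];
}

/* String -> id (or -1), reading the same shared table. */
static int
find_magic_string (const char *string_p)
{
  assert (string_p != NULL);

  for (size_t id = 0; id < MAGIC_STRING_COUNT; id++)
  {
    if (strcmp (magic_strings[id], string_p) == 0)
    {
      return (int) id;
    }
  }

  return -1;
}
```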

LaszloLango · 2016-11-18T07:33:44Z

jerry-core/lit/lit-magic-strings.c

    {
-      *out_id_p = id;
+      if (magic_strings[middle][string_size] == '\0')


If we use 'strcmp', then we don't need this sanity check.

robertsipka · 2016-11-20T20:17:05Z

Thanks for the comments. I've updated the patch and rebased it onto master.

LaszloLango · 2016-11-21T07:36:50Z

jerry-core/lit/lit-magic-strings.c

 */
-bool
+int
 lit_compare_utf8_string_and_magic_string (const lit_utf8_byte_t *string_p, /**< utf-8 string */


This function is a duplicate of 'strcmp', I don't see any difference.

There are two crucial differences that require this function to exist:

The input string can contain \0, which is allowed.

The actual size of the input string can differ from the size we receive. We extract only a part of the string and receive the size of that part.

All of the magic strings are ASCII, and we use this to improve performance in various places of the code. This means a magic string cannot contain '\0'.

Use 'strncmp' then.

strncmp is not good either. Take the 'call' magic string, for example. Suppose we receive a string that starts with the character 'c' and whose size is 1. The result will be zero, which in our case would mean that we found a magic string.
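The prefix problem described here can be reproduced with a few lines of standard C. The two helpers are hypothetical and exist only to contrast a bare `strncmp` test with one that also requires the lengths to agree; the second uses the plain size check for clarity, not because the patch does it this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Bare strncmp test: reports a match whenever the input is a prefix
 * of the magic string. */
static bool
matches_with_strncmp (const char *magic_p, const char *string_p, size_t string_size)
{
  assert (magic_p != NULL && string_p != NULL);
  return strncmp (magic_p, string_p, string_size) == 0;
}

/* Exact test: the magic string must have exactly string_size bytes and
 * all of them must equal the input bytes. Because the lengths are
 * checked first, memcmp never reads past the magic string. */
static bool
matches_exactly (const char *magic_p, const char *string_p, size_t string_size)
{
  assert (magic_p != NULL && string_p != NULL);
  return strlen (magic_p) == string_size
         && memcmp (magic_p, string_p, string_size) == 0;
}
```

For the magic string `call` and the one-byte input `c`, `matches_with_strncmp` reports a match while `matches_exactly` does not.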

LaszloLango

LGTM

zherczeg · 2016-11-23T14:33:35Z

LGTM

robertsipka · 2016-11-23T14:34:01Z

I made some improvements: the patch achieves similar performance and is safer than the first draft.
@LaszloLango: Please, check it again.

Benchmark	RSS (bytes)	Perf (sec)
3d-cube.js	57344 -> 57344 : 0.000%	0.870 -> 0.872 : -0.193%
3d-raytrace.js	180224 -> 208896 : -15.909%	1.056 -> 1.049 : +0.653%
access-binary-trees.js	53248 -> 53248 : 0.000%	0.565 -> 0.561 : +0.757%
access-fannkuch.js	20480 -> 20480 : 0.000%	2.186 -> 2.189 : -0.141%
access-nbody.js	28672 -> 28672 : 0.000%	1.077 -> 1.075 : +0.172%
bitops-3bit-bits-in-byte.js	16384 -> 16384 : 0.000%	0.593 -> 0.584 : +1.536%
bitops-bits-in-byte.js	16384 -> 16384 : 0.000%	0.876 -> 0.875 : +0.209%
bitops-bitwise-and.js	16384 -> 16384 : 0.000%	1.061 -> 1.054 : +0.617%
bitops-nsieve-bits.js	155648 -> 155648 : 0.000%	1.433 -> 1.412 : +1.466%
controlflow-recursive.js	81920 -> 81920 : 0.000%	0.394 -> 0.394 : +0.007%
crypto-aes.js	90112 -> 90112 : 0.000%	0.963 -> 0.954 : +0.960%
crypto-md5.js	163840 -> 163840 : 0.000%	0.671 -> 0.671 : +0.050%
crypto-sha1.js	110592 -> 110592 : 0.000%	0.662 -> 0.651 : +1.662%
date-format-tofte.js	40960 -> 40960 : 0.000%	0.780 -> 0.747 : +4.189%
date-format-xparb.js	40960 -> 40960 : 0.000%	0.410 -> 0.400 : +2.353%
math-cordic.js	20480 -> 20480 : 0.000%	1.316 -> 1.309 : +0.508%
math-partial-sums.js	16384 -> 16384 : 0.000%	0.754 -> 0.730 : +3.180%
math-spectral-norm.js	24576 -> 24576 : 0.000%	0.586 -> 0.579 : +1.136%
string-base64.js	139264 -> 139264 : 0.000%	1.699 -> 1.556 : +8.434%
string-fasta.js	32768 -> 32768 : 0.000%	1.274 -> 1.216 : +4.591%
Geometric mean:	-0.741%	+1.629%

Binary sizes (bytes)
cf7b7a1:152576
9a480ac:152544

JerryScript-DCO-1.0-Signed-off-by: Robert Sipka [email protected]

LaszloLango

Still LGTM

LaszloLango added the enhancement and performance labels Nov 17, 2016

LaszloLango added this to the Engine optimization & enhancement milestone Nov 17, 2016

robertsipka force-pushed the improve_performance_of_search branch 2 times, most recently from fa390a4 to 8bf7ecd on November 17, 2016 16:34

zherczeg reviewed Nov 17, 2016

zherczeg reviewed Nov 17, 2016

robertsipka force-pushed the improve_performance_of_search branch from 8bf7ecd to 9168e3d on November 18, 2016 03:51

zherczeg reviewed Nov 18, 2016

LaszloLango reviewed Nov 18, 2016

robertsipka force-pushed the improve_performance_of_search branch from 9168e3d to 24ba199 on November 20, 2016 20:04

LaszloLango requested changes Nov 21, 2016

LaszloLango approved these changes Nov 21, 2016

robertsipka force-pushed the improve_performance_of_search branch 2 times, most recently from 8f2645d to 9a480ac on November 23, 2016 14:17

robertsipka force-pushed the improve_performance_of_search branch 2 times, most recently from df3be3c to 11af13b on November 23, 2016 14:55

Improve performance of search in the list of the magic strings.

545d260

JerryScript-DCO-1.0-Signed-off-by: Robert Sipka [email protected]

robertsipka force-pushed the improve_performance_of_search branch from 11af13b to 545d260 on November 23, 2016 16:46

LaszloLango approved these changes Nov 24, 2016

zherczeg merged commit b2e1223 into jerryscript-project:master Nov 24, 2016

robertsipka deleted the improve_performance_of_search branch March 16, 2017 08:18
