-
Notifications
You must be signed in to change notification settings - Fork 28
Implement BaseNode.visit API #68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Here's a WIP I did to investigate some performance optimizations in compare-locales, the comment is obviously wrong: def traverse(node, fun):
"""Postorder-traverse this node and apply `fun` to all child nodes.
Traverse this node depth-first applying `fun` to subnodes and leaves.
Children are processed before parents (postorder traversal).
Return a new instance of the node.
"""
def visit(value):
"""Call `fun` on `value` and its descendants."""
if isinstance(value, ftl.BaseNode):
traverse(value, fun)
return
if isinstance(value, list):
map(visit, value)
else:
fun(value)
if not fun(node):
return
# If signaled by the visitor, descend into this node
props = vars(node).items()
for name, value in props:
visit(value) cc @jotes, as he was looking for something light-weight. |
@Pike Thanks, I'll take a look after bug 1407016 lands on staging/prod. |
@stasm @Pike I've created a small benchmark to get some baseline numbers in order to verify if my changes will affect the performance. It looks like the lack of cloning improves the performance by 5-10% on my machine. Thanks in advance for all your support :-) |
There's two things here: One is that microbenchmarks are always tricky. A bit more realistic could be to run the tests over fluent.js/tools/perf/workload-low.ftl. The other thing is that my example code has an early return. For example, my word-counter code on my branch looks like def count_words(self):
if self._word_count is None:
self._word_count = 0
def count_words(node):
if isinstance(node, ftl.TextElement):
self._word_count += len(node.value.split())
return isinstance(node, (ftl.Message, ftl.Term, ftl.Attribute, ftl.VariantList, ftl.Pattern, ftl.Placeable, ftl.SelectExpression, ftl.Variant))
traverse(self.root_node, count_words)
return self._word_count If you'd adapt that code to run over the workload resource to count all words, I'd expect to see a noticeable difference. Ride-along question, you added a call to |
@Pike thanks! I'll change my branchmark to use |
My goal for filing this issue was to offer an API suitable for applying the visitor pattern to the AST. This could be helpful for many different use-cases which need to walk the tree of nodes. I don't know however if it would be faster than what @Pike suggested in #68 (comment). Here's a rough draft of what I had in mind. Alternatively, rather than using diff --git a/fluent/syntax/ast.py b/fluent/syntax/ast.py
index 7ecd182..ff3db7e 100644
--- a/fluent/syntax/ast.py
+++ b/fluent/syntax/ast.py
@@ -43,16 +43,28 @@ def scalars_equal(node1, node2, ignored_fields):
class BaseNode(object):
"""Base class for all Fluent AST nodes.
All productions described in the ASDL subclass BaseNode, including Span and
Annotation. Implements __str__, to_json and traverse.
"""
+ def visit(self, visitor, state=None, cont=None):
+ try:
+ fn = getattr(visitor, self.__class__.__name__)
+ fn(self, state, cont)
+ except AttributeError:
+ pass
+
+ def walk(self, visitor, state=None):
+ def cont(node, state):
+ return node.visit(visitor, state, cont)
+ return cont(self, state)
+
def traverse(self, fun):
"""Postorder-traverse this node and apply `fun` to all child nodes.
Traverse this node depth-first applying `fun` to subnodes and leaves.
Children are processed before parents (postorder traversal).
Return a new instance of the node.
""" Then, it would be possible to write custom visitors, specialized for their particular purposes. For instance, the following visitors prints names of variables. from fluent.syntax import ast, parse
# Methods for Terms, SelectExpressions and Variants omitted for clarity.
class Visitor(object):
def Resource(self, node, state, cont):
for entry in node.body:
cont(entry, state)
def Message(self, node, state, cont):
cont(node.value, state)
def Pattern(self, node, state, cont):
for element in node.elements:
cont(element, state)
def Placeable(self, node, state, cont):
cont(node.expression, state)
def VariableReference(self, node, state, cont):
print(node.id.name)
resource = parse("hello = Hi, {$username}!")
resource.walk(Visitor())
# → "username" One possible gain is that this approach avoids a lot of |
@stasm 🙇♂️ I wrote similar visitor in my past, I'll ping you tomorrow with PoC of PR. |
I'd prefer a generic visitor to be there to subclass, like in https://docs.python.org/2/library/ast.html#ast.NodeVisitor. Which might get back into introspection to make maintenance easier. There are two parts to that: One is if iteration is opt-in or opt-out. I think that opt-out is easier to implement for subclassing. The other is that of implementation difficulty and maintenance of clients. With opt-in, you need to understand each node on the way to where you're going, with opt-out, you don't. Also, in case of changes to the AST, you need to rewrite your code for each change to each node you're interested in. |
That's a great suggestion. Even in my minimal example the |
BaseNode.traverse
can be costly because it clones the entire tree under the node it ran on. We should add a new method just for non-destructive visiting. This will be useful for theequals
methods, as well as for thecompare-locales
checks.The text was updated successfully, but these errors were encountered: