Skip to content

Provide actual literal_eval() function. Adapted from standard library's. #1347

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

tristanlatr
Copy link
Contributor

Provide actual literal_eval() function. Adapted from standard library's.
Use that function in inference.py instead of the dummy _to_literal() function.

Type of Changes

Type
✨ New feature
🔨 Refactoring (tiny bit)

@Pierre-Sassoulas Pierre-Sassoulas added the Enhancement ✨ Improvement to a component label Jan 13, 2022
@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.10.0 milestone Jan 13, 2022
Copy link
Collaborator

@DanielNoord DanielNoord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tristanlatr. However, I don't really see why it is necessary to maintain our own helper function? If I replace helpers.literal_eval in the tests for ast.literal_eval it works as well?

Wouldn't it also work if we accepted str for _to_literal as well and just passed the string instead of node.as_string() to ast.literal_eval?

@@ -114,6 +114,8 @@ def __init__(
class AstroidSyntaxError(AstroidBuildingError):
"""Exception class used when a module can't be parsed."""

# this should probably be a SyntaxError subclass.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also re-raise it from TypeError, ValueError and LookupError. So it doesn't seem to only be a SyntaxError?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, this is out of the scope if this PR anyway.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't the comment then not also be out of scope?

self.assertRaises(ValueError, helpers.literal_eval, "2+3")

def test_literal_eval_complex(self):
# Issue #4907
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which issue is this referring to?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python's

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps add a reference to it then. Referring to issues from other projects/repo without any context can be confusing I think.

if isinstance(node_or_string, str):
_node = builder.parse(node_or_string.lstrip(" \t")).body
if len(_node) != 1:
raise ValueError(f"expected only one expression, found {len(_node)}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this covered? I don't see a test for it, or am I overlooking something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a test

if isinstance(node_or_string, str):
_node = builder.parse(node_or_string.lstrip(" \t")).body
if len(_node) != 1:
raise ValueError(f"expected only one expression, found {len(_node)}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise ValueError(f"expected only one expression, found {len(_node)}")
raise ValueError(f"Expected only one expression, got {len(_node)}")


def _raise_malformed_node(node):
msg = "malformed node or string"
lno = getattr(node, "lineno", None)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be willing to check if this is indeed necessary for astroid nodes? I would assume we initialise lineno on every node, even on Module after #1262.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For nodes created from scratch it might be necessary.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That shouldn't be the case. The init of each node should handle setting lineno. If they do not so this would be a bug and should be fixed.

and node.func.name == "set"
and node.args == node.keywords == []
):
return set()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't tested. Seeing as this is apparently problematic in ast (see python/cpython@4fcf5c1) I wonder how we handle dict()? Perhaps we should add a test as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set() is a special case because we can't create empty set other wise.
I don;t think we should add this kind of support for dict() or list() since we can use {} and [].

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah. Still, if we're going to maintain our own function shouldn't we add some obvious extra functionality while we're at it?

Copy link
Contributor Author

@tristanlatr tristanlatr Jan 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get your point, we could handle empty calls to list, dict, tuple, if it's only that, I'm ok implementing it.

But in general, I think we should stay the closest possible from ast module, and let users call this function with a possible inferred node.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it makes sense to add those three as well. There is a big chance that we will run out of sync with the "upstream" version of this due to limited maintenance time. Makes sense imo to add some slight improvements while we're looking at it.

@tristanlatr
Copy link
Contributor Author

If I replace helpers.literal_eval in the tests for ast.literal_eval it works as well?

No, astroid nodes and ast nodes are not 100% compatible.

Wouldn't it also work if we accepted str for _to_literal as well and just passed the string instead of node.as_string() to ast.literal_eval?

Yes but the cool thing with helpers.literal_eval is that we can evaluate NodeNG without converting them to string before (which is hightly innefficient)!

I don't really see why it is necessary to maintain our own helper function?

I'ts necessary if we don't want to convert back to string all NodeNG we might want to evaluate...

@DanielNoord
Copy link
Collaborator

If I replace helpers.literal_eval in the tests for ast.literal_eval it works as well?

No, astroid nodes and ast nodes are not 100% compatible.

The tests you added do work with ast.literal_eval.

Wouldn't it also work if we accepted str for _to_literal as well and just passed the string instead of node.as_string() to ast.literal_eval?

Yes but the cool thing with helpers.literal_eval is that we can evaluate NodeNG without converting them to string before (which is hightly innefficient)!

I'ts necessary if we don't want to convert back to string all NodeNG we might want to evaluate...

Hmm, that's indeed a benefit. However, I wonder if this warrants having to maintain this ourselves. We only use _to_literal in the inference of nodes.Compare. I'll leave this decision to others though. Perhaps it would be good if we had some actual data just how inefficient _to_literal is right now.

@tristanlatr
Copy link
Contributor Author

I agree it could work by doing something like:

def _dummy_literal_eval(node: Union[str, NodeNG]) -> Any:
    return ast.literal_eval(node if isinstance(node, str) else node.as_string())

But maybe some speed data comparaison will make you prefer my proposition:

>>> import timeit
>>> timeit.timeit('literal_eval(node)', setup='import json;from pathlib import Path;from astroid.helpers import literal_eval;from astroid.builder import extract_node;node = extract_node(repr(json.load(Path("./10kSamplejson.json").open("r"))))', number=20)
0.9220764310000007
>>> timeit.timeit('_dummy_literal_eval(node)', setup='import json;from pathlib import Path;from astroid.helpers import _dummy_literal_eval;from astroid.builder import extract_node;node = extract_node(repr(json.load(Path("./10kSamplejson.json").open("r"))))', number=20)
7.558820483999995

I'v used this big testing json: the JSON file is here https://docs.google.com/uc?export=download&id=1RAp9w12hZkmLt2vJcAjdT-UtC3KpFyq4

@DanielNoord
Copy link
Collaborator

I'd like someone else's opinion on this. I'm -0.5 to it.

Obviously this would be a performance improvement and it might make sense to add this. However, since we're using this only in one place ourselves and we're basically copying the function from ast I keep wondering whether the improvement outweighs potentially missing out on any "upstream" fixes. Especially considering the clear lack of active (and more importantly experienced) maintainers we're currently experiencing.
I'm certainly not going to block this as I do see the benefit, but I'd like someone else's opinion on this matter before merging/finalising this!

(Thanks @tristanlatr for the ping, I should have made this clearer 😄)

@Pierre-Sassoulas
Copy link
Member

Opened #1361 to discuss this.

Comment on lines +378 to +385
if node.func.name == "set":
return set()
if node.func.name == "tuple":
return tuple()
if node.func.name == "list":
return list()
if node.func.name == "dict":
return dict()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if node.func.name == "set":
return set()
if node.func.name == "tuple":
return tuple()
if node.func.name == "list":
return list()
if node.func.name == "dict":
return dict()
if node.func.name == "set":
return set()
elif node.func.name == "tuple":
return tuple()
elif node.func.name == "list":
return list()
elif node.func.name == "dict":
return dict()
else:
_raise_malformed_node(node)

Copy link
Member

@cdce8p cdce8p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm with @DanielNoord on this on. It isn't clear to me why this would be useful. Even if it provides a small performance benefit, the cost of maintaining it is higher IMO. Especially considering literal_eval is only used in the inference of Compare nodes. With that kind of limited scope, it's unlikely that we would see any performance improvements.

If you do have a use-case for it, I would be interested to hear it. Maybe as response here? #1361 (comment)

@cdce8p
Copy link
Member

cdce8p commented Jan 31, 2022

I would like to close this one. As I explained in #1361 (comment), I don't think the helper would provide value for astroid at the current time. Furthermore, it would increase the maintenance burden if a new Python version is released and something changes.

IMO every time there is an alternative in the stdlib, it's better to use that than to implement your own version for it. In this case the workaround with ast.literal_eval(node.as_string()) is good enough.

@DanielNoord
Copy link
Collaborator

Seeing as this discussion seems to have died down I think I'm going to close this. There seems to be somewhat of a consensus (from the maintainer's side) that this is not something we would want to add right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Discussion 🤔 Enhancement ✨ Improvement to a component
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants