Skip to content

Create AssignName nodes for ClassDef and FunctionDef #1390

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

cdce8p
Copy link
Member

@cdce8p cdce8p commented Feb 11, 2022

Description

A first attempt to add better position information for names in ClassDef and FunctionDef.
Could be used to resolve: pylint-dev/pylint#5466

At the moment, this is only a proof of concept!

During my last comments I suggested replacing the name attribute with a dedicated (AssignName) node. This would be a large breaking change. @DanielNoord Helped convince me that backwards compatible solutions would be better and much easier to implement in pylint. So instead of changing an existing attribute, I've added a new one name_node.

To get the actual position within the source code, I've used the tokenize module. The code itself works already. The required change in pylint would be to replace the ClassDef and FunctionDef nodes in error messages with the new name_node.

An example how it will look, here with the missing-docstring error.

Screen Shot 2022-02-11 at 01 13 11

While looking at that, I was wondering if it would be better to highlight a bit more, i.e. include class / def? After all, in general those are errors on the class / function and not the name itself. The parsing is already in place, so it would be quite easy to modify.

If we decide to that, it's probably better to store the position information inside an attribute instead of a dedicated node.

We could also think about the implementation in pylint again. My initial idea to replace all nodes in add_message calls might not be the best way to do things after all. I think @DanielNoord suggested it (maybe in a different context), we could probably modify the add_message function itself to use the new position attribute for ClassDef and FunctionDef nodes.

/CC: @Pierre-Sassoulas @DanielNoord

Edit
The alternative:

Screen Shot 2022-02-11 at 02 04 22

Type of Changes

Type
✨ New feature

Copy link
Collaborator

@DanielNoord DanielNoord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea a lot. Smart use of tokenize.

Perhaps some pre-mature comments, but thought I would make them while I was looking anyway.

Two additional comments:

  1. With respect to pylint, I indeed think an isintance check on add_message makes the most sense. This make sure we never forget to pass name_node when we're dealing with these types of nodes, which is likely to happen down the line. Performance impact should be negligible.
  2. I'm wondering if we should include everything until the : in this name_node? Technically, that is all part of the assignment right? And we might need access to the arguments or typing included down the line. Perhaps we should create an Assign node for the complete assignment and then class ClassName as AssignName within that? Or that might be overcomplicating too much.

self,
manager: AstroidManager,
parser_module: Optional[ParserModule] = None,
data: Optional[str] = None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the typing you added in astroid/builder.py this should be str right? I don't see any other place where TreeRebuilder is instantiated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, inside astroid it will always be str. Technically, you could consider TreeRebuilder part of the public interface however. Adding a new required argument would then be a breaking change.

Defining it as Optional isn't too bad considering everything. The name_node / position needs to be Optional, so a small check if not self._data doesn't really hurt.

if not self._data:
return None
end_lineno: Optional[int] = getattr(node, "end_lineno", None)
if node.body:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will need to see how this interacts with docstrings etc. Those are part of the body in ast but not in astroid. You probably thought about this though :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

node here is the ast one. So that should work. AFAIK each function / class body needs at least one node to be valid Python (either a docstring, pass, or something else). In any case if body should somehow be empty, it would fallback to the end_lineno of the node itself. Or if that also didn't work None, which is essentially self._data[node.lineno - 1 : ].

search_token_name = "def"
else:
search_token_name = "class"
token_found: bool = False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is bool necessary here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary, I find it helpful never the less :)

@Pierre-Sassoulas
Copy link
Member

I think it's better to highlight class and def too, as Marc said not doing that make you think there's a problem with the class or function name. Another commenter made a point in the original thread that if we can't pinpoint the problem, a zero length error at the beggining of the class probably make more sense.

The missing docstring example is illustrating that well : Highlighting the whole class make the most sense but is too disruptive, highlighting another part of the class makes you think there is a problem in this part, but highlighting the real problem is impossible as it's something that is missing in the class. So imo highlighting a zero length warning at the beggining of the node is the right compromise (for message that are a full line long or more).

Also I think defining the boundary of the warnings is a decision to make in pylint and not in astroid. Astroid should permit to recover line and column boundary of the node (and of element inside the node) , then add_message display nodes the way it see fits.

@cdce8p
Copy link
Member Author

cdce8p commented Feb 11, 2022

With respect to pylint, I indeed think an isintance check on add_message makes the most sense. This make sure we never forget to pass name_node when we're dealing with these types of nodes, which is likely to happen down the line. Performance impact should be negligible.

Agreed. Although we could probably add an option to add_message to overwrite that behavior. Some errors should be emitted for the whole function. E.g. invalid-overridden.

We could also think about if it would make sense adding that position attribute to the Node base class. Although initially only FunctionDef and ClassDef will use it, other block nodes could benefit too. I was especially thinking about match and matchcases. Instead of the isinstance check, we could then just use is not None.

I'm wondering if we should include everything until the : in this name_node? Technically, that is all part of the assignment right? And we might need access to the arguments or typing included down the line. Perhaps we should create an Assign node for the complete assignment and then class ClassName as AssignName within that? Or that might be overcomplicating too much.

I don't think that makes sense. The arguments are already included in other child nodes and Assign / AssignName should only be a name, not include something else. As for the error range, consider that especially for function definitions it's common to split them over multiple lines. Highlighting all of them doesn't really make sense.

I think it's better to highlight class and def too, as Marc said not doing that make you think there's a problem with the class or function name. Another commenter made a point in the original thread that if we can't pinpoint the problem, a zero length error at the beggining of the class probably make more sense.

Although zero length ranges / positions work, they aren't ideal for highlighting. At least VS Code chooses to only highlight a single char in that case. Let's try including the keywords with the name first. We all seem to like that one.

Also I think defining the boundary of the warnings is a decision to make in pylint and not in astroid. Astroid should permit to recover line and column boundary of the node (and of element inside the node) , then add_message display nodes the way it see fits.

That would be the case either way. The issue at the moment is that pylint doesn't have access to the exact keyword - name position / range. That's what this PR would solve.

@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.10.0 milestone Feb 11, 2022
@Pierre-Sassoulas Pierre-Sassoulas added the Enhancement ✨ Improvement to a component label Feb 11, 2022
@cdce8p
Copy link
Member Author

cdce8p commented Feb 11, 2022

Closing it in favor of #1393 to preserve the idea and discussion here.

@Pierre-Sassoulas
Copy link
Member

Let's try including the keywords with the name first. We all seem to like that one.

Please note that the comment I was talking about is already highly up-voted for a 5 days old comment. It says:

Old behavior (highlight just the beginning of the affected line) was concise and effective enough. Please at least provide an option to change this behavior via settings. Thanks!

I asked kovla what he thinks about the proposition, let's see.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement ✨ Improvement to a component
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Better error ranges for classes and functions
3 participants