Associate comment of first statement properly in a Module
In a local scope, e.g. inside a function, class, if condition. The comment is associated with the statement correctly.
In [10]: code = """
...: def f():
...: # abc
...: import d
...: """
In [11]: cst.parse_module(code)
Out[11]:
Module(
body=[
FunctionDef(
name=Name(
value='f',
lpar=[],
rpar=[],
),
...
body=IndentedBlock(
body=[
SimpleStatementLine(
body=[
Import(
names=[
ImportAlias(
name=Name(
value='d',
lpar=[],
rpar=[],
),
asname=None,
comma=MaybeSentinel.DEFAULT,
),
],
semicolon=MaybeSentinel.DEFAULT,
whitespace_after_import=SimpleWhitespace(
value=' ',
),
),
],
leading_lines=[
EmptyLine(
indent=True,
whitespace=SimpleWhitespace(
value='',
),
comment=Comment(
value='# abc',
),
newline=Newline(
value=None,
),
),
],
trailing_whitespace=TrailingWhitespace(
whitespace=SimpleWhitespace(
value='',
),
comment=None,
newline=Newline(
value=None,
),
),
),
],
...
)
However, in a module, the comment is parsed as module headers. So some codemods may insert statement between the comment and the first statement, e.g. EnsureImportPresentCommand .
In [8]: code = """
...: # abc
...: import d
...: """
In [9]: cst.parse_module(code)
Out[9]:
Module(
body=[
SimpleStatementLine(
body=[
Import(
names=[
ImportAlias(
name=Name(
value='d',
lpar=[],
rpar=[],
),
asname=None,
comma=MaybeSentinel.DEFAULT,
),
],
semicolon=MaybeSentinel.DEFAULT,
whitespace_after_import=SimpleWhitespace(
value=' ',
),
),
],
leading_lines=[],
trailing_whitespace=TrailingWhitespace(
whitespace=SimpleWhitespace(
value='',
),
comment=None,
newline=Newline(
value=None,
),
),
),
],
header=[
EmptyLine(
indent=True,
whitespace=SimpleWhitespace(
value='',
),
comment=None,
newline=Newline(
value=None,
),
),
EmptyLine(
indent=True,
whitespace=SimpleWhitespace(
value='',
),
comment=Comment(
value='# abc',
),
newline=Newline(
value=None,
),
),
],
footer=[],
encoding='utf-8',
default_indent=' ',
default_newline='\n',
has_trailing_newline=True,
)
It's probably not easy to differentiate whether a comment is for the first statement or not to handle it properly when parsing a module. It could be solved by building a helper to post processing the CST by giving some hints (e.g. # lint-ignore comment should be associate with the first statement).
CC @zsol @thatch
What if we take as headers only comments starting with #! or #\s*@ and also comments that are not followed by a statement?, in other words any of the following cases:
#!/usr/bin/env python
# @oncall xxx
# @oncall xxx
# some comments with no immediate statements after
# going to the header.
# statement comment
import xxx
What if we take as headers only comments starting with
#!or#\s*@and also comments that are not followed by a statement?, in other words any of the following cases:
That sounds like a great idea! We can try implement this and run on existing code to see if it works well.
I don't know that there's a perfect, general, easily-configurable way to handle this. For context, I'm assuming this is trying to solve the special case of https://github.com/Instagram/Fixit/pull/143 in libcst itself, which would be great.
My proposal would be not specialcase @ because of @manual autodeps. Just #! (in the first line) and PEP 263 comments (in the first two lines). Are you already working on a PR for this?
Was pointed at another example today where the first line was a # pyre-fixme... comment being considered module header.