
Associate comment of first statement properly in a Module

Open jimmylai opened this issue 5 years ago • 4 comments

In a local scope (e.g. inside a function, class, or if block), the comment is correctly associated with the statement:

In [10]: code = """
    ...: def f():
    ...:   # abc
    ...:   import d
    ...: """

In [11]: cst.parse_module(code)
Out[11]:
Module(
    body=[
        FunctionDef(
            name=Name(
                value='f',
                lpar=[],
                rpar=[],
            ),
            ...
            body=IndentedBlock(
                body=[
                    SimpleStatementLine(
                        body=[
                            Import(
                                names=[
                                    ImportAlias(
                                        name=Name(
                                            value='d',
                                            lpar=[],
                                            rpar=[],
                                        ),
                                        asname=None,
                                        comma=MaybeSentinel.DEFAULT,
                                    ),
                                ],
                                semicolon=MaybeSentinel.DEFAULT,
                                whitespace_after_import=SimpleWhitespace(
                                    value=' ',
                                ),
                            ),
                        ],
                        leading_lines=[
                            EmptyLine(
                                indent=True,
                                whitespace=SimpleWhitespace(
                                    value='',
                                ),
                                comment=Comment(
                                    value='# abc',
                                ),
                                newline=Newline(
                                    value=None,
                                ),
                            ),
                        ],
                        trailing_whitespace=TrailingWhitespace(
                            whitespace=SimpleWhitespace(
                                value='',
                            ),
                            comment=None,
                            newline=Newline(
                                value=None,
                            ),
                        ),
                    ),
                ],
                ...
)

However, at module level, the comment is parsed as part of the module header. As a result, some codemods, e.g. EnsureImportPresentCommand, may insert a statement between the comment and the first statement.

In [8]: code = """
   ...: # abc
   ...: import d
   ...: """

In [9]: cst.parse_module(code)
Out[9]:
Module(
    body=[
        SimpleStatementLine(
            body=[
                Import(
                    names=[
                        ImportAlias(
                            name=Name(
                                value='d',
                                lpar=[],
                                rpar=[],
                            ),
                            asname=None,
                            comma=MaybeSentinel.DEFAULT,
                        ),
                    ],
                    semicolon=MaybeSentinel.DEFAULT,
                    whitespace_after_import=SimpleWhitespace(
                        value=' ',
                    ),
                ),
            ],
            leading_lines=[],
            trailing_whitespace=TrailingWhitespace(
                whitespace=SimpleWhitespace(
                    value='',
                ),
                comment=None,
                newline=Newline(
                    value=None,
                ),
            ),
        ),
    ],
    header=[
        EmptyLine(
            indent=True,
            whitespace=SimpleWhitespace(
                value='',
            ),
            comment=None,
            newline=Newline(
                value=None,
            ),
        ),
        EmptyLine(
            indent=True,
            whitespace=SimpleWhitespace(
                value='',
            ),
            comment=Comment(
                value='# abc',
            ),
            newline=Newline(
                value=None,
            ),
        ),
    ],
    footer=[],
    encoding='utf-8',
    default_indent='    ',
    default_newline='\n',
    has_trailing_newline=True,
)

When parsing a module, it's probably not easy to tell whether a comment belongs to the first statement. It could be solved by building a helper that post-processes the CST based on some hints (e.g. a # lint-ignore comment should be associated with the first statement).
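A minimal sketch of such a hint-based helper, working on plain values rather than on real CST nodes. Each list entry stands in for the `comment` of one header `EmptyLine` (`None` for a blank line). The function name `split_by_hints` and the `DEFAULT_HINTS` tuple are hypothetical, and wiring the result back into `Module.header` and the first statement's `leading_lines` is left out.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical hint prefixes. "# lint-ignore" is the example from this issue;
# the rest of the design is an assumption, not LibCST behavior.
DEFAULT_HINTS = ("# lint-ignore",)


def split_by_hints(
    header_comments: Sequence[Optional[str]],
    hints: Sequence[str] = DEFAULT_HINTS,
) -> Tuple[List[Optional[str]], List[Optional[str]]]:
    """Split header lines into (kept header, lines for the first statement).

    Walking backwards from the first statement, a line is moved to the
    statement for as long as it is a comment starting with one of the hints.
    The first blank line or non-matching comment stops the walk.
    """
    split = len(header_comments)
    while split > 0:
        comment = header_comments[split - 1]
        if comment is None or not comment.startswith(tuple(hints)):
            break
        split -= 1
    return list(header_comments[:split]), list(header_comments[split:])


# A blank line, an ordinary comment, then a hinted comment.
kept, moved = split_by_hints([None, "# abc", "# lint-ignore: X"])
```

With these inputs only the hinted comment moves to the statement, and an unhinted `# abc` stays in the header, which matches the current parser behavior for anything the hints do not cover.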

CC @zsol @thatch

jimmylai, Oct 20 '20 19:10

What if we take as headers only comments starting with #! or #\s*@, and also comments that are not followed by a statement? In other words, any of the following cases:

#!/usr/bin/env python
# @oncall xxx
# @oncall xxx
# some comments with no immediate statements after
# going to the header.

# statement comment
import xxx
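One way to read this rule, sketched on the raw source lines that precede the first statement (comments and blank lines only). The names `ALWAYS_HEADER` and `split_header` are hypothetical, and this is an interpretation of the proposal, not an existing LibCST API.

```python
import re
from typing import List, Sequence, Tuple

# Comments that always stay in the header under this proposal: "#!" and "#\s*@".
ALWAYS_HEADER = re.compile(r"#!|#\s*@")


def split_header(lines: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Split the lines before the first statement into (header, leading lines).

    The run of comments that directly touches the first statement (no blank
    line in between) is attached to the statement. The walk stops at a blank
    line or at a comment matching ALWAYS_HEADER.
    """
    split = len(lines)
    while split > 0:
        line = lines[split - 1].strip()
        if not line or ALWAYS_HEADER.match(line):
            break
        split -= 1
    return list(lines[:split]), list(lines[split:])


example = [
    "#!/usr/bin/env python",
    "# @oncall xxx",
    "# @oncall xxx",
    "# some comments with no immediate statements after",
    "# going to the header.",
    "",
    "# statement comment",
]
header, leading = split_header(example)
```

On the example above, only `# statement comment` ends up in `leading`; everything before the blank line stays in the header.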

Kronuz, Oct 20 '20 19:10

> What if we take as headers only comments starting with #! or #\s*@, and also comments that are not followed by a statement? In other words, any of the following cases:

That sounds like a great idea! We can try implementing this and running it on existing code to see if it works well.

jimmylai, Oct 20 '20 19:10

I don't know that there's a perfect, general, easily-configurable way to handle this. For context, I'm assuming this is trying to solve the special case of https://github.com/Instagram/Fixit/pull/143 in libcst itself, which would be great.

My proposal would be to not special-case @, because of @manual autodeps. Treat only #! (on the first line) and PEP 263 comments (in the first two lines) as header. Are you already working on a PR for this?
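A rough sketch of this stricter rule, again on the raw lines before the first statement. The regular expression is the encoding-declaration pattern given in PEP 263; the function name `split_header_strict` is hypothetical. Because the header has to be a prefix of the file, a PEP 263 comment on line two pulls line one into the header as well, which is an assumption of this sketch.

```python
import re
from typing import List, Sequence, Tuple

# Encoding-declaration pattern from PEP 263.
PEP263 = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def split_header_strict(lines: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Keep only a first-line "#!" and PEP 263 comments (lines 1-2) as header.

    Every other comment, including "# @..." ones, goes to the first statement.
    """
    split = 0
    if lines and lines[0].startswith("#!"):
        split = 1
    # Only the first two lines may carry an encoding declaration.
    for index in range(split, min(2, len(lines))):
        if PEP263.match(lines[index]):
            split = index + 1
    return list(lines[:split]), list(lines[split:])


header, leading = split_header_strict(
    [
        "#!/usr/bin/env python",
        "# -*- coding: utf-8 -*-",
        "# @manual autodeps",
        "# statement comment",
    ]
)
```

Here the shebang and the coding line stay in the header, while `# @manual autodeps` and `# statement comment` are attached to the first statement.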

thatch, Nov 22 '20 18:11

I was pointed at another example today where the first line was a # pyre-fixme... comment that was being treated as a module header.

thatch, Nov 30 '20 21:11