cpython icon indicating copy to clipboard operation
cpython copied to clipboard

Readline completion of module names in import statements

Open vadmium opened this issue 10 years ago • 19 comments

BPO 25419
Nosy @vadmium, @serhiy-storchaka, @MojoVampire
Dependencies
  • bpo-16182: readline: Wrong tab completion scope indices in Unicode terminals
  • Files
  • complete-import.patch
  • complete-import.v2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2015-10-16.11:16:48.120>
    labels = ['type-feature', 'library']
    title = 'Readline completion of module names in import statements'
    updated_at = <Date 2016-05-25.08:03:01.876>
    user = 'https://github.com/vadmium'
    

    bugs.python.org fields:

    activity = <Date 2016-05-25.08:03:01.876>
    actor = 'martin.panter'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2015-10-16.11:16:48.120>
    creator = 'martin.panter'
    dependencies = ['16182']
    files = ['40794', '42986']
    hgrepos = []
    issue_num = 25419
    keywords = ['patch']
    message_count = 6.0
    messages = ['253072', '253095', '253104', '253115', '253132', '266320']
    nosy_count = 3.0
    nosy_names = ['martin.panter', 'serhiy.storchaka', 'josh.r']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue25419'
    versions = ['Python 3.6']
    

    Linked PRs

    • gh-129329
    • gh-133358
    • gh-134181
    • gh-134680
    • gh-138268
    • gh-138943
    • gh-139461
    • gh-143436
    • gh-143438

    vadmium avatar Oct 16 '15 11:10 vadmium

    As mentioned in bpo-25209, here is a patch to complete module and attribute names in import statements. This is a rewrite of some personal code I have had kicking around for a few years, but I have added extra features (parsing “import x as alias”, supporting “from .relative import x”).

    All the existing completions should continue to work as long as the statement immediately preceding the cursor does not look like an “import” statement. When an import statement is detected, these other completions are disabled.

    When alternative completions are displayed, my code appends a dot (.) indicator to package names, but this is not inserted in the input line. Maybe people think this is a good idea or maybe not.

    This illustrates what it looks like in action. Text in square brackets was automatically completed:

    Python 3.6.0a0 (qbase qtip readline/complete-import.patch tip:65d2b721034a, Oct 16 2015, 10:02:3) 
    [GCC 5.1.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import io, h<Tab><Tab>
    hashlib  heapq    hmac     html.    http.    
    >>> import io, htm[l].<Tab><Tab>
    html.entities  html.parser    
    >>> import io, html.e[ntities]
    >>> from html import e<Tab><Tab>
    entities  escape    
    >>> from html import es[cape]
    

    The code works around bpo-16182 (byte vs code point indexing) in a very simplistic way. It probably won’t give useful completions (if any) for non-ASCII input involving imports.

    The patch currently does not complete built-in module names (sys.builtin_module_names). This should not be hard to add, but I need to decide whether the support should be added specially in rlcompleter, or generally in pkgutil.iter_modules() instead.

    I also have code to complete the following, which I could try to make patches for if there is interest:

    • keyword argument names to function calls, e.g. open("file", enc[oding=]
    • attributes on constructor return types, e.g. ValueError("msg").__t[raceback__]

    vadmium avatar Oct 16 '15 11:10 vadmium

    I'm reviewing the patch, but it will take a time.

    Wouldn't be simpler to use regular expressions instead of tokenizer?

    For now Completer doesn't depends on the readline module, nor other global state. It can be used for completing in other environment, for example to complete words in IDE. Patched Completer retrieves additional information from the readline module. This can break existing code. It would be nice to decouple Completer from readline. In contrary to user .pythonrc.py file, we are free to invent new interfaces. May be add methods to the Completer class that provides needed additional information (nothing by default), and add Completer's subclass ReadlineCompleter that implements these methods using readline?

    Found presumable bugs:

    "import sy" doesn't suggest completion "sys".

    "import os.p" doesn't suggest completion "os.path".

    serhiy-storchaka avatar Oct 16 '15 20:10 serhiy-storchaka

    Is there a reason ipython's import completer couldn't be borrowed in its entirety? At work, I use a lightly adapted version of the code from ipython to do completion when I'm using the plain interactive interpreter (for whatever reason), and it works just fine.

    Josh: Thanks for pointing out I Python. I haven’t used it much myself, but it does seem to do a similar thing to my proposal. Looks like the relevant code may be around module_completion() at <https://github.com/ipython/ipython/blob/ab929fe/IPython/core/completerlib.py#L209>.

    The “sys” module is one of the builtin modules that I mentioned above. I plan to discuss changing pkgutil.iter_modules() to include it, in a separate bug report.

    The “os.path” case is more awkward. The “os” module is not actually a package. I believe “import os.path” only works because executing the “os” module modifies sys.modules. My code currently avoids importing non-packages, because I thought it would be annoying to accidentally run a script via tab completion (anything not protected by ‘if __name__ == "__main__" ’). On the other hand, I Python happily completes “os.path” (and many more non-submodule names). It also can be tricked into running scripts, e.g. if you do “import idlelib.__main__.” and press Tab. But maybe this is not a real problem, and I should stop being paranoid.

    I tend to avoid regular expressions if practical. But Serhiy you may be right that some simple string matching rules would reduce the need for tokenizing. It looks like I Python only has a few simple rules for the entire input line being “import x” and “from x import y”. The disadvantage is less accurate understanding of more complicated syntax, like “from x import y; from z import (a, bb as b, . . .”. It is a tradeoff between simpler code that only supports basic functionality versus complex code that supports more complete functionality.

    I hear your points about decoupling from Readline and backwards compatibility. I will consider the overall architecture more in a future update. It would be good to allow this stuff to be used in e.g. Idle (though I wouldn’t know where to wire it in myself).

    vadmium avatar Oct 17 '15 02:10 vadmium

    Added few cursory comments on Rietveld.

    "sys" and "os.path" cases are not critical and we can ignore them if it complicates the code. But preserving Completer independence from readline looks more important to me. We can just add methods get_line_buffer(), get_begidx(), get_endidx() and get_completion_type() to Completer, but may be better to add more high level methods.

    serhiy-storchaka avatar Oct 17 '15 15:10 serhiy-storchaka

    I moved all the calls targetting the readline module into a ReadlineCompleter subclass. However the logic for parsing “import” statements still exists in the base Completer class in private methods. An overview of the two classes:

    class Completer:
        def complete(self, text, state):
            self._get_matches(text)
        def _get_matches(text):
            # Only completes global and object.attr names, like before
        def _code_matches(self, code, ...):
            # Completes import statements, otherwise returns (None, ...)
    
    class ReadlineCompleter(Completer):  # New public class
        def complete(self, text, state):
            # Moved Yury’s Tab insertion logic here
            return super().complete(...)
        def _get_matches(text):
            code = readline.get_line_buffer()[:readline.get_endidx()]
            self._code_matches(code)
            super()._get_matches(text)  # Fallback to existing behaviour
    

    Perhaps the _code_matches() and related methods could be turned into more general public APIs, e.g. complete_code(code) -> list of modules, attributes, globals, etc. But that would affect global_matches() and attr_matches(), which I have not touched so far.

    Other changes:

    • Limit underscore-prefixed completions, consistent with bpo-25011; see new _filter_identifiers() method
    • Changed the demo in the documentation; attributes like __doc__ are omitted by default
    • Removed workaround for non-ASCII input in favour of fixing bpo-16182

    vadmium avatar May 25 '16 08:05 vadmium

    I changed my mind. I thought it would be easier to implement this with simple regular expressions and string operations, but handling the corner cases makes it difficult. I now think that using the tokenize module is the only correct solution.

    My apologies, @vadmium. Are you still interesting in continuing this work?

    serhiy-storchaka avatar Mar 28 '24 19:03 serhiy-storchaka

    I don’t want to spend much time on this, but I’m happy for other people to build on my code.

    vadmium avatar May 12 '24 05:05 vadmium