fairseq2 Add support for CLI tab completions

What does this PR do? Please describe: This PR extends the CLI with support for tab completions. Initially, I tried using argcomplete, but it was slow and ran a Fairseq2 process each time I pressed tab, which was far from ideal.

I then discovered shtab, a package that generates shell tab completion scripts efficiently. According to its README:

What: Automatically generate shell tab completion scripts for Python CLI apps

Why: Speed & correctness. Alternatives like argcomplete and pyzshcomplete are slow and have side-effects

How: shtab processes an argparse.ArgumentParser object to generate a tab completion script for your shell

After implementing shtab, tab completion works beautifully. Here are some examples:

$ fairseq2 [TAB]
assets        llama         lm            mt            wav2vec2_asr

$ fairseq2 ll[TAB]
$ fairseq2 llama

It’s especially useful for things like presets. Instead of listing presets and copying them manually, you can now use tab completion:

$ fairseq2 lm [TAB]
chatbot               generate              instruction_finetune  preference_finetune   
$ fairseq2 lm c[TAB]
$ fairseq2 lm chatbot

It also works with arguments:

$ fairseq2 lm chatbot --[TAB]
--cluster               --help                  --model                 --temperature           --top-p                 
--dtype                 --max-gen-len           --seed                  --tensor-parallel-size

Tab completions are specific to each shell and require the installation of corresponding scripts in the shell's directory.

Initially, I intended to extend the project by adding post-install scripts to automatically set up tab completion scripts (b63af9842eaba1521c00dd3f42ec3169b40a9f02). However, I discovered that this approach only worked with zip and tarball distributions, while fairseq2 uses wheel builds. As a result, I reverted that change.

Currently, you can enable tab completions manually by using --print-completions {bash|zsh|tcsh}. The output of this command can be saved to the shell's completion scripts directory for persistence or evaluated directly for the current session.

For example:

eval "$(fairseq2 --print-completion bash)"

Please note: each time the CLI interfaces are modified, the shell completion scripts need to be updated.

Does your PR introduce any breaking changes? If yes, please list them: None.

Check list:

[X] Was the content of this PR discussed and approved via a GitHub issue? (no need for typos or documentation improvements)
[X] Did you read the contributor guideline?
[X] Did you make sure that your PR does only one thing instead of bundling different changes together?
[ ] Did you make sure to update the documentation with your changes? (if necessary)
[ ] Did you write any new necessary tests?
[X] Did you verify new and existing tests pass locally with your changes?
[ ] Did you update the CHANGELOG? (no need for typos, documentation, or minor internal changes)

Aug 11 '24 14:08 ahmedsaed

Most shells provide path completions by default. However, when using custom shell completions, this default behavior is overridden. To specify that an argument should use path completions, you need to use shtab.FILE for file completions and shtab.DIRECTORY for directory completions.

For example:

parser.add_argument(
    "input_dir",
    type=Path,
    help="Checkpoint directory",
).complete = shtab.DIRECTORY  # type: ignore[attr-defined]

parser.add_argument(
    "--config-file",
    dest="config_files",
    metavar="CONFIG_FILE",
    type=Path,
    nargs="*",
    help="YAML configuration file(s)",
).complete = shtab.FILE  # type: ignore[attr-defined]

This configuration ensures that input_dir will provide directory completions and --config-file will provide file completions. I have updated the whole codebase to correctly handle path completions.

I also had to ignore the type hints as I can't seem to find a better way to handle them.

Aug 11 '24 15:08 ahmedsaed

Neat! Happy to merge this once ready. Any reason why it is draft at the moment?

Aug 12 '24 20:08 cbalioglu

Neat! Happy to merge this once ready. Any reason why it is draft at the moment?

I was waiting for further instructions on how to address the post-install issue. If we are not going to handle the shell completions setup automatically (i.e. post install code). Should I update some docs like INSTALL_FROM_SOURCE.md with instructions on how to setup and enable shell completions?

Aug 12 '24 23:08 ahmedsaed

Should I add support for PowerShell and fish shell (if possible)?

Currently supported shells are bash, zsh and tcsh. Which supports almost all Linux systems and macos

Aug 13 '24 15:08 ahmedsaed

I was waiting for further instructions on how to address the post-install issue. If we are not going to handle the shell completions setup automatically (i.e. post install code). Should I update some docs like INSTALL_FROM_SOURCE.md with instructions on how to setup and enable shell completions?

Considering that our CLI commands are populated dynamically during runtime based on the packages installed in the Python environment, the safest would be to let users manually update their bashrc/zhrc once they have their environment fully set up. So adding some brief instructions in README (right after "Installing from Source") makes sense to me.

Should I add support for PowerShell and fish shell (if possible)?

I don't think they are necessary. PowerShell is mainly used on Windows. I haven't seen anyone using it on Linux/macOS. As long as we support bash and zsh, the rest is not really much relevant nowadays.

Aug 20 '24 13:08 cbalioglu

I have added a section in the Readme file on setting up shell completions in https://github.com/facebookresearch/fairseq2/pull/741/commits/4af2be410802a609e16ec79dd46a704efcf583cd. This PR should be ready for merging now.

Aug 20 '24 19:08 ahmedsaed