feat: Add CLI commands for browsing and searching OpenML flows (models)
Metadata
- Reference Issue: https://github.com/openml/openml-python/issues/1486
- New Tests Added: Yes
- Documentation Updated: No (CLI help text serves as documentation)
- Change Log Entry: "Add CLI commands for browsing and searching OpenML flows (models): openml models list, openml models info, and openml models search"
Details
What does this PR implement/fix?
This PR adds three new CLI subcommands under openml models to improve the user experience of the model catalogue:
- openml models list - List flows (models) with optional filtering (tag, uploader, pagination, output format)
- openml models info <flow_id> - Display detailed information about a specific flow
- openml models search <query> - Search flows by name with case-insensitive matching
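For reviewers who want a feel for the command layout, here is a minimal argparse sketch of the three subcommands. It is not the code in this PR: `build_parser` is a hypothetical name, `--offset` and `--uploader` are assumed flag names for the pagination and uploader filters, and the defaults are placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the `openml models` subcommands."""
    parser = argparse.ArgumentParser(prog="openml")
    entities = parser.add_subparsers(dest="entity", required=True)

    models = entities.add_parser("models", help="Browse and search flows (models)")
    actions = models.add_subparsers(dest="action", required=True)

    # openml models list [filters]
    list_cmd = actions.add_parser("list", help="List flows")
    list_cmd.add_argument("--size", type=int, default=10)   # default is an assumption
    list_cmd.add_argument("--offset", type=int, default=0)  # assumed flag name
    list_cmd.add_argument("--tag")
    list_cmd.add_argument("--uploader")                     # assumed flag name
    list_cmd.add_argument("--format", default="table")
    list_cmd.add_argument("--verbose", action="store_true")

    # openml models info <flow_id>
    info_cmd = actions.add_parser("info", help="Show details for one flow")
    info_cmd.add_argument("flow_id", type=int)

    # openml models search <query>
    search_cmd = actions.add_parser("search", help="Search flows by name")
    search_cmd.add_argument("query")
    return parser


args = build_parser().parse_args(
    ["models", "list", "--tag", "sklearn", "--format", "table", "--verbose"]
)
```

Nesting a second `add_subparsers` call under `models` keeps each action's arguments separate, so `info` can require a positional `flow_id` without affecting `list`.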
Why is this change necessary? What is the problem it solves?
Currently, users must write Python code to browse or search OpenML flows, even for simple tasks like listing available models or finding a specific model. This creates a barrier to entry and makes the model catalogue less accessible. Adding CLI commands allows users to interact with the model catalogue directly from the command line without writing code.
This directly addresses the ESoC 2025 goal of "Improving user experience of the model catalogue in AIoD and openML".
How can I reproduce the issue this PR is solving and its solution?
Before (requires Python code):

    import openml

    flows = openml.flows.list_flows(size=10)
    for _, row in flows.iterrows():
        print(row['name'])
After (CLI commands):

    # List first 10 flows
    openml models list --size 10

    # Search for RandomForest models
    openml models search RandomForest

    # Get detailed info about a model
    openml models info 12345

    # List models with a specific tag
    openml models list --tag sklearn --format table --verbose

Implementation Details:
- Added three new functions in openml/cli.py: models_list(), models_info(), and models_search()
- Integrated into main CLI parser with proper argument handling
- Added comprehensive test suite (6 test cases) in tests/test_openml/test_cli.py
- Uses existing openml.flows.list_flows() and openml.flows.get_flow() functions - no changes to core API
- Follows existing CLI patterns (similar to configure command)
- All tests use mocked API calls to avoid requiring server connections
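To illustrate the case-insensitive search and the mocked-API testing approach, here is a rough sketch. It is not the code from this PR: `search_flows` is a hypothetical helper, the listing function is injected so it can be replaced by a mock, and the sample rows are made-up stand-ins (plain dicts rather than the DataFrame that `openml.flows.list_flows()` returns).

```python
from unittest import mock


def search_flows(query, list_flows):
    """Hypothetical helper: keep rows whose name contains `query`, ignoring case."""
    needle = query.casefold()
    return [row for row in list_flows() if needle in str(row["name"]).casefold()]


# Stand-in for the server call; the rows are illustrative sample data only.
fake_list_flows = mock.Mock(
    return_value=[
        {"id": 1, "name": "sklearn.ensemble.RandomForestClassifier"},
        {"id": 2, "name": "sklearn.svm.SVC"},
        {"id": 3, "name": "weka.RandomForest"},
    ]
)

matches = search_flows("randomforest", fake_list_flows)
matched_ids = [row["id"] for row in matches]
```

Because the listing call is a `Mock`, the test can also check that it was invoked exactly once and never needs a server connection.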
Any other comments?
- All pre-commit hooks pass (ruff, mypy, formatting)
- No breaking changes
- Follows project code style and patterns
- Ready for review
Codecov Report
:x: Patch coverage is 49.74093% with 97 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 56.34%. Comparing base (4b1bdf4) to head (6b4c645).
| Files with missing lines | Patch % | Lines |
|---|---|---|
| openml/cli.py | 49.74% | 97 Missing :warning: |
Additional details and impacted files
    @@             Coverage Diff              @@
    ##           develop    #1487       +/-   ##
    ============================================
    - Coverage    79.90%   56.34%   -23.56%
    ============================================
      Files           36       36
      Lines         4349     4538     +189
    ============================================
    - Hits          3475     2557     -918
    - Misses         874     1981    +1107
Thanks! Can you please update it to use the name 'flow' for now? We might introduce a new concept 'model' that is somewhat different from a flow, and we should avoid any confusion here.
Also, would it be possible to generalize this to other entities, e.g. datasets?
So far, we have only used this CLI for managing configurations. It makes sense to add a CLI, but longer term it would be better to have this as a separate repo, so that people don't have to install openml-python and developers don't have to run the full openml-python test suite. It would also keep the openml-python repo more focused.
For now, it would be ok to merge it here (after updating the naming) until it is more complete (with other entities like datasets) and then spin it out as a separate library.
Thanks for the review! I have updated the CLI to use "flow" terminology everywhere and added a flows namespace. I also added a datasets namespace (list/info/search) so the CLI is extensible to other entities. openml/cli.py now exposes openml flows ... and openml datasets ..., and the parser is organized so more entities can be added later. I would be happy to work on anything else you have in mind.
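Roughly, the parser organization looks like the sketch below. This is an illustration rather than the exact code in openml/cli.py: `ENTITIES`, `add_entity`, `make_handlers`, and `build_parser` are hypothetical names, and the handlers only return strings instead of calling the OpenML API.

```python
import argparse

ENTITIES = ("flows", "datasets")  # adding an entity means adding a name here


def make_handlers(name):
    """Placeholder handlers; real ones would call the matching openml module."""
    return {
        "list": lambda args: f"list {name}",
        "info": lambda args: f"info {name} {args.entity_id}",
        "search": lambda args: f"search {name} {args.query!r}",
    }


def add_entity(subparsers, name, handlers):
    """Register `openml <name> list|info|search` and bind each action's handler."""
    entity = subparsers.add_parser(name)
    actions = entity.add_subparsers(dest="action", required=True)

    list_cmd = actions.add_parser("list")
    list_cmd.set_defaults(func=handlers["list"])

    info_cmd = actions.add_parser("info")
    info_cmd.add_argument("entity_id", type=int)
    info_cmd.set_defaults(func=handlers["info"])

    search_cmd = actions.add_parser("search")
    search_cmd.add_argument("query")
    search_cmd.set_defaults(func=handlers["search"])


def build_parser(entities=ENTITIES):
    parser = argparse.ArgumentParser(prog="openml")
    subparsers = parser.add_subparsers(dest="entity", required=True)
    for name in entities:
        add_entity(subparsers, name, make_handlers(name))
    return parser


args = build_parser().parse_args(["datasets", "info", "61"])
result = args.func(args)
```

Dispatching through `set_defaults(func=...)` means the entry point only needs to call `args.func(args)`, whichever entity and action were selected.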