Turn 'pyannote' into an optional dependency

Open olgeni opened this issue 9 months ago • 0 comments

The current dependency on pyannote significantly increases the installation time and complexity of speechmatics-python. Specifically, pyannote pulls in heavy dependencies (numpy, scipy, and others) which, when built from source, require long build times and additional tooling, including a Fortran compiler \o/

├── speechmatics-python v3.0.4
│   ├── docopt v0.6.2
│   ├── httpx[http2] v0.28.1 (*)
│   ├── jiwer v3.1.0
│   │   ├── click v8.1.8
│   │   └── rapidfuzz v3.13.0
│   ├── more-itertools v10.7.0
│   ├── polling2 v0.5.0
│   ├── pyannote-core v5.0.0
│   │   ├── numpy v2.2.5
│   │   ├── scipy v1.15.3
│   │   │   └── numpy v2.2.5
│   │   ├── sortedcontainers v2.4.0
│   │   └── typing-extensions v4.13.2
│   ├── pyannote-database v5.1.3
│   │   ├── pandas v2.2.3
│   │   │   ├── numpy v2.2.5
│   │   │   ├── python-dateutil v2.9.0.post0
│   │   │   │   └── six v1.17.0
│   │   │   ├── pytz v2025.2
│   │   │   └── tzdata v2025.2
│   │   ├── pyannote-core v5.0.0 (*)
│   │   ├── pyyaml v6.0.2
│   │   └── typer v0.15.3
│   │       ├── click v8.1.8
│   │       ├── rich v14.0.0
│   │       │   ├── markdown-it-py v3.0.0
│   │       │   │   └── mdurl v0.1.2
│   │       │   └── pygments v2.19.1
│   │       ├── shellingham v1.5.4
│   │       └── typing-extensions v4.13.2
│   ├── regex v2024.11.6
│   ├── tabulate v0.9.0
│   ├── tenacity v8.2.3
│   ├── toml v0.10.2
│   └── websockets v14.2

Since the pyannote dependency is only used within the asr_metrics module and does not appear necessary for typical usage scenarios of the SDK, making it optional would greatly streamline default installations 😅

So I tried to make it optional and edited the asr_metrics CLI to handle the case where it is missing (and made some tiny fixes).
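
For illustration, here is a minimal sketch of the kind of guard I mean, not the actual patch. The helper name `require_optional` and the extra name `metrics` are hypothetical; the idea is simply to import the optional module lazily and exit with a readable hint when it is absent, instead of failing with a raw traceback.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or exit with an install hint.

    `extra` is the name of a hypothetical packaging extra (for example
    "metrics") that would pull in the optional dependency.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # SystemExit with a string prints the message to stderr and
        # exits with status 1 when it reaches the top level of a CLI.
        raise SystemExit(
            f"'{module_name}' is required for this command but is not installed. "
            f"Try: pip install 'speechmatics-python[{extra}]'"
        ) from exc
```

A CLI entry point could then call something like `require_optional("pyannote.core", "metrics")` at the start of the command, so that a plain install of the SDK never touches pyannote.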

olgeni · May 10 '25 11:05