
from_pretrained load local model

Open stoneLee81 opened this issue 2 years ago • 3 comments

Describe the bug

The code is w = Whisper.from_pretrained('/Users/haowmazs/testdata/whisper.cpp-master/models/ggml-medium.bin')

It throws the exception RuntimeError: '/Users/haowmazs/testdata/whisper.cpp-master/models/ggml-medium.bin' is not a valid preconverted model. Choose one of ['tiny.en', 'tiny', 'base.en', 'base', 'small.en', 'small', 'medium.en', 'medium', 'large-v1', 'large']

To reproduce

No response

Expected behavior

No response

Environment

Python 3.9.7, whispercpp 0.017

stoneLee81, Jun 29 '23 04:06

Same here.

lostz, Jun 30 '23 09:06

Seeing the same. At a glance looks like the problem is this piece of code here:

if model_name not in utils.MODELS_URL and not _os.path.isfile(model_name):
    raise RuntimeError(
        f"'{model_name}' is not a valid preconverted model or a file path. \
            Choose one of {list(utils.MODELS_URL)}"
    )

The second part of this conditional appears to be failing. That is odd, because I tried the same absolute path with os.path.isfile in a local Python shell and it works, even if you import os as _os.
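To see which half of that conditional is responsible, the two checks can be evaluated separately. This is only a sketch: KNOWN_MODELS is a hypothetical stand-in for utils.MODELS_URL, filled with the names from the error message above, and explain_model_check is an illustrative helper, not part of whispercpp.

```python
import os

# Hypothetical stand-in for utils.MODELS_URL, using the names listed in the
# error message from this issue. The real mapping lives inside the library.
KNOWN_MODELS = {
    "tiny.en", "tiny", "base.en", "base", "small.en", "small",
    "medium.en", "medium", "large-v1", "large",
}


def explain_model_check(model_name, known_models=KNOWN_MODELS):
    """Evaluate both halves of the conditional and report each one."""
    is_known_name = model_name in known_models
    is_file = os.path.isfile(model_name)
    return {
        "is_known_name": is_known_name,
        "is_file": is_file,
        # Mirrors: `not in MODELS_URL and not isfile(...)`
        "would_raise": (not is_known_name) and (not is_file),
    }


if __name__ == "__main__":
    # Replace with the path you are passing to from_pretrained.
    print(explain_model_check("base"))
```

If is_file comes back True for your path and the library still raises, the installed code is probably not running this conditional at all.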

Not a huge issue anyway, as you can just download one of the predefined models in MODELS_URL, but hopefully this adds some context.

OS details in case they are related: OS: OSX Ventura 13.0.1, Python version: 3.10.0

EDIT: Cache Workaround

In the meantime, you can just populate the cache yourself with your own model.

This library looks in one of two directories for the models based on the existence of the XDG_DATA_HOME env variable. You can put your local model into this directory and it should work as expected.

# move your model into the cache (create the directory first if it does not exist)
mkdir -p ~/.local/share/whispercpp # OR $XDG_DATA_HOME/whispercpp
cp whisper.cpp/models/ggml-base.bin ~/.local/share/whispercpp/

# then, in Python:
from whispercpp import Whisper

# load the model that is now in the cache by its predefined name
whisper = Whisper.from_pretrained("base")
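The directory choice described in this workaround can be sketched roughly as follows. Both helper names are hypothetical, and the ggml-<name>.bin file naming is an assumption inferred from the cp command above; the library's actual lookup may differ.

```python
import os
from pathlib import Path


def guess_cache_dir(env=None):
    """Return the directory the workaround copies models into.

    Uses $XDG_DATA_HOME/whispercpp when the variable is set, otherwise
    ~/.local/share/whispercpp (assumption based on this thread).
    """
    env = os.environ if env is None else env
    xdg = env.get("XDG_DATA_HOME")
    base = Path(xdg) if xdg else Path.home() / ".local" / "share"
    return base / "whispercpp"


def guess_model_path(name, env=None):
    """Assumed cache location for a predefined model name such as 'base'."""
    return guess_cache_dir(env) / f"ggml-{name}.bin"


if __name__ == "__main__":
    path = guess_model_path("base")
    print(path, "exists" if path.is_file() else "is missing")
```

Running it before calling from_pretrained("base") shows whether the copied file ended up where this sketch expects it.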

shogunpurple, Jul 05 '23 07:07

If anyone still has this issue, it is most likely because the version on PyPI is older than what's in the repo (I noticed the error message is different!).

The source code published at https://pypi.org/project/whispercpp/#files only has:

if model_name not in utils.MODELS_URL:
    raise RuntimeError(
        f"'{model_name}' is not a valid preconverted model. Choose one of {list(utils.MODELS_URL)}"
    )

I fixed the error by installing from git:

pip install git+https://github.com/aarnphm/whispercpp.git -vv
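To confirm which build ended up in your environment, you can query the installed package metadata. This is a minimal sketch using the standard library; installed_version is an illustrative helper, and whether the version string alone distinguishes the PyPI release from a git install is an assumption you should check against your own output.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="whispercpp"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("whispercpp version:", installed_version())
```

If the version looks unchanged after reinstalling, comparing the wording of the error message (with or without "or a file path") is another way to tell the two builds apart.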

marcoacierno, Aug 17 '23 22:08