whispercpp.py

ggml-large.bin doesn't exist anymore on Hugging Face

Open athoune opened this issue 2 years ago • 4 comments

The large model doesn't work:

>>> w = Whisper('large')
Downloading ggml-large.bin...
whisper_init_from_file_no_state: loading model from '/Users/mlecarme/.ggml-models/ggml-large.bin'
whisper_model_load: loading model
whisper_model_load: invalid model data (bad magic)
whisper_init_no_state: failed to load model

You have to pick v1, v2 or v3.

See https://huggingface.co/ggerganov/whisper.cpp/tree/main
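The "bad magic" message means the first four bytes of the downloaded file are not the ones the loader expects, which would fit a download that saved something other than a model now that ggml-large.bin is gone. A minimal sketch for checking a file before loading it, assuming the legacy ggml format starts with the little-endian 32-bit value 0x67676d6c; `has_ggml_magic` is a hypothetical helper, not part of whispercpp.py:

```python
import struct

# Assumption: legacy whisper.cpp model files begin with this 32-bit
# little-endian magic number (the ASCII letters "ggml" as a hex value).
GGML_MAGIC = 0x67676D6C


def has_ggml_magic(path):
    """Return True if the file at `path` starts with the expected magic."""
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        # An empty or truncated file cannot be a valid model.
        return False
    (magic,) = struct.unpack("<I", header)
    return magic == GGML_MAGIC
```

Running it on the path from the log above (`~/.ggml-models/ggml-large.bin`) should tell you whether the cached file is worth keeping or should be deleted and replaced with a v1, v2 or v3 model.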

athoune avatar Dec 04 '23 19:12 athoune

Thank you, that helped!

>>> import whispercpp 
>>> whispercpp.MODELS["ggml-large-v3.bin"] = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3.bin"
>>> w_large = whispercpp.Whisper('large-v3')
Downloading ggml-large-v3.bin...
whisper_init_from_file_no_state: loading model from '/Users/micseydel/.ggml-models/ggml-large-v3.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51866
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 128
whisper_model_load: f16           = 1
whisper_model_load: type          = 5
whisper_model_load: mem required  = 3342.00 MB (+   71.00 MB per decoder)
whisper_model_load: adding 1609 extra tokens
whisper_model_load: model ctx     = 2951.32 MB
whisper_model_load: model size    = 2951.01 MB
whisper_init_state: kv self size  =   70.00 MB
whisper_init_state: kv cross size =  234.38 MB
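The same registration trick should work for the other versioned files. A small sketch that builds the key and URL in the pattern used above; `model_entry` is a hypothetical helper, and it assumes the file you ask for actually exists in the Hugging Face repository linked earlier:

```python
# Base URL taken from the MODELS entry registered above.
BASE_URL = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main"


def model_entry(name):
    """Return (filename, url) for a model name such as 'large-v3'."""
    filename = f"ggml-{name}.bin"
    return filename, f"{BASE_URL}/{filename}"


# Usage with whispercpp (not executed here, requires the package):
#   import whispercpp
#   key, url = model_entry("large-v3")
#   whispercpp.MODELS[key] = url
#   w_large = whispercpp.Whisper("large-v3")
```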

It seems like I must still be doing something wrong, though:

>>> result = w_large.transcribe("/Users/micseydel/transcriptions/2024-08-10/Tom Froese and Michael Levin discuss Tom's Irruption theory.mp4")
Loading data..
Transcribing..
whisper_full_with_state: progress =   5%
whisper_full_with_state: progress =  10%
whisper_full_with_state: progress =  15%
whisper_full_with_state: progress =  20%
whisper_full_with_state: progress =  25%
whisper_full_with_state: progress =  30%
whisper_full_with_state: progress =  35%
whisper_full_with_state: progress =  40%
whisper_full_with_state: progress =  45%
whisper_full_with_state: progress =  50%
whisper_full_with_state: progress =  55%
whisper_full_with_state: progress =  60%
whisper_full_with_state: progress =  65%
whisper_full_with_state: progress =  70%
whisper_full_with_state: progress =  75%
whisper_full_with_state: progress =  80%
whisper_full_with_state: progress =  85%
whisper_full_with_state: progress =  90%
whisper_full_with_state: progress =  95%
whisper_full_with_state: progress = 100%
>>> text = w_large.extract_text(result)
Extracting text...
>>> len(text)
0
>>> type(result)
<class 'int'>
>>> result
0
>>> text
[]

micseydel avatar Aug 11 '24 15:08 micseydel

Hello, is there a way to solve this? I use ggml-large-v3, but it fails with the following error message:

python: whisper.cpp/whisper.cpp:1345: bool whisper_encode_internal(whisper_context&, whisper_state&, int, int): Assertion `mel_inp.n_mel == n_mels' failed.
Aborted (core dumped)
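The assertion compares the number of mel bins in the computed spectrogram with the model's n_mels, and the load log earlier in this thread reports n_mels = 128 for large-v3. A sketch for spotting that mismatch from the load log before transcribing; it assumes the whisper.cpp build in use computes 80 mel bins (an assumption, not something confirmed in this thread), and both function names are hypothetical:

```python
import re

# Assumption: the whisper.cpp build in use produces 80-bin spectrograms.
EXPECTED_N_MELS = 80


def n_mels_from_log(log_text):
    """Extract the n_mels value from whisper_model_load output, or None."""
    match = re.search(r"n_mels\s*=\s*(\d+)", log_text)
    return int(match.group(1)) if match else None


def mel_mismatch(log_text, expected=EXPECTED_N_MELS):
    """Return True if the log reports an n_mels different from `expected`."""
    n_mels = n_mels_from_log(log_text)
    return n_mels is not None and n_mels != expected
```

If the check reports a mismatch, trying large-v2 or a binding built against a newer whisper.cpp seems like the next thing to test.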

1907010218 avatar Nov 23 '24 12:11 1907010218

@1907010218 I was trying to get Whisper working in a Docker container on macOS and abandoned my attempt. I'm not the only one: https://github.com/openai/whisper/discussions/1798#discussioncomment-10300753

micseydel avatar Nov 23 '24 16:11 micseydel