Cannot understand the parameters passed when using ASR

Open deadash opened this issue 1 year ago • 4 comments

Some of the parameters I pass are unclear to me. The result is correct, but it differs from what other clients return.

Before asking:

I have read the Python and JS source code, but I still cannot make sense of the relevant parameters.

What is your question?

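What exactly are the chunk_size and chunk_interval parameters, and how are they derived from sample_rate?

After many attempts the recognition result is always correct, but the server always logs [error] handle_read_frame error: websocketpp.transport:7 (End of File)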

Code

Dependencies:

[dependencies]
futures = "0.3"
anyhow = "1.0"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
tokio = { version = "1.39", features = ["full"] }
tokio-tungstenite = "0.23.1"
hound = "3.5.1"

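// StreamExt comes from `futures` (already a dependency); `tokio-stream` is not listed above.
use futures::{SinkExt, StreamExt};
use serde_json::json; // needed for the json! macro used below
use std::fs::File;
use std::io::BufReader;
use tokio_tungstenite::connect_async;
use tokio_tungstenite::tungstenite::Message;
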

async fn load_and_send_audio(ws_url: &str, wav_path: &str) -> Result<String, Box<dyn std::error::Error>> {
    let wav_name = "demo";

    // Open the WAV file
    let wav_file = File::open(wav_path)?;
    let mut reader = hound::WavReader::new(BufReader::new(wav_file))?;
    let sample_rate = reader.spec().sample_rate;
    let audio_bytes: Vec<u8> = reader.samples::<i16>()
        .map(|s| s.unwrap().to_le_bytes())
        .flatten()
        .collect();

    // Bytes per frame: 60 * chunk_size[1] / chunk_interval milliseconds of 16-bit PCM
    let stride = ((60.0 * 10.0 / 10.0 / 1000.0) * sample_rate as f64 * 2.0) as usize;

    // Connect to WebSocket server
    let (mut ws_stream, _) = connect_async(ws_url).await?;

    // Prepare the initial message
    let initial_message = json!({
        "mode": "offline",
        "chunk_size": [5, 10, 5],
        "chunk_interval": 10,
        "audio_fs": sample_rate,
        "wav_name": wav_name,
        "wav_format": "pcm",
        "is_speaking": true,
        "hotwords": "",
        "itn": true
    });

    // Send initial message
    ws_stream.send(Message::Text(initial_message.to_string())).await?;

    // Send audio data in chunks
    for chunk in audio_bytes.chunks(stride) {
        ws_stream.send(Message::Binary(chunk.to_vec())).await?;
    }

    // Send the end message
    let end_message = json!({
        "chunk_size": [5, 10, 5],
        "wav_name": wav_name,
        "is_speaking": false,
        "chunk_interval": 10,
        "mode": "offline"
    });
    ws_stream.send(Message::Text(end_message.to_string())).await?;

    // Await and process the response
    let response_text = match ws_stream.next().await {
        Some(Ok(Message::Text(response))) => response,
        _ => return Err("Failed to receive a valid response from WebSocket".into()),
    };

    // Close the WebSocket connection gracefully
    ws_stream.close(None).await?;

    Ok(response_text)
}

For example:
    println!("{:?}",
        load_and_send_audio("ws://localhost:10095", "asr_example.wav").await
    );

What have you tried?

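I found that the Python client computes stride = int(60 * 10 / 10 / 1000 * 16000 * 2), where sample_rate happens to be 16000 as well, but why is it computed this way? My understanding was that chunk_size should simply be the stride, i.e. the number of bytes sent in each message, and that is how funasr_client_api.py implements the sending loop. Yet the value passed is [5,10,5], so what do these three numbers actually mean? The general form is stride = int(60 * chunk_size[1] / chunk_interval / 1000 * 16000 * 2)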
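The arithmetic in that formula can be worked through step by step. The sketch below is only a reading of the quoted Python expression, not FunASR's own code: `stride_bytes` and `frame_count` are hypothetical helper names, the 60 ms unit is an assumption inferred from the constant 60 and the division by 1000, and the factor 2 is taken to be the two bytes of a 16-bit mono sample (matching the `i16` samples read above). Under that reading, [5, 10, 5] would describe windows of 300 ms, 600 ms and 300 ms, and only the middle value enters the stride.

```rust
/// Bytes sent per websocket frame, following the quoted formula
/// stride = 60 * chunk_size[1] / chunk_interval / 1000 * sample_rate * 2.
/// Integer arithmetic is used so the result does not depend on float rounding.
fn stride_bytes(chunk_size: [u64; 3], chunk_interval: u64, sample_rate: u64) -> usize {
    const UNIT_MS: u64 = 60; // assumed duration of one chunk_size unit
    const BYTES_PER_SAMPLE: u64 = 2; // 16-bit mono PCM
    // Multiply first, divide last, to avoid truncating intermediate values.
    let numerator = UNIT_MS * chunk_size[1] * sample_rate * BYTES_PER_SAMPLE;
    (numerator / (chunk_interval * 1000)) as usize
}

/// Number of frames needed to send `total_bytes`; the last frame may be shorter.
fn frame_count(total_bytes: usize, stride: usize) -> usize {
    (total_bytes + stride - 1) / stride
}

fn main() {
    // 60 * 10 / 10 = 60 ms per frame; 60 ms at 16 kHz = 960 samples = 1920 bytes.
    let stride = stride_bytes([5, 10, 5], 10, 16_000);
    println!("stride = {} bytes", stride);
    // Five seconds of 16 kHz, 16-bit audio is 160000 bytes.
    println!("frames = {}", frame_count(160_000, stride));
}
```

With chunk_size[1] equal to chunk_interval, as in the request above, the two cancel and every frame carries 60 ms of audio regardless of the other two values.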

What's your environment?

  • OS: Windows
  • FunASR Version: funasr-runtime-sdk-cpu-0.4.5
  • Docker version: funasr-runtime-sdk-cpu-0.4.5

deadash, Aug 15 '24 09:08

I think I understand it a bit better now: in offline mode you have to wait for the end of the message.

deadash, Aug 15 '24 09:08

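Looking closely at the response, is_final is not set to true either. Judging from the code, it just waits for one message and returns as soon as it is received.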

Ok("{\"is_final\":false,\"mode\":\"offline\",\"stamp_sents\":[{\"end\":5195,\"punc\":\"。\",\"start\":880,\"text_seg\":\"欢 迎 大 家 来 体 验 达 摩 院 推 出 的 语 音 识 别 模 型\",\"ts_list\":[[880,1120],[1120,1380],[1380,1540],[1540,1780],[1780,2020],[2020,2180],[2180,2480],[2480,2600],[2600,2780],[2780,3040],[3040,3240],[3240,3480],[3480,3699],[3699,3900],[3900,4180],[4180,4420],[4420,4620],[4620,4780],[4780,5195]]}],\"text\":\"欢迎大家来体验达摩院推出的语音识别模型。\",\"timestamp\":\"[[880,1120],[1120,1380],[1380,1540],[1540,1780],[1780,2020],[2020,2180],[2180,2480],[2480,2600],[2600,2780],[2780,3040],[3040,3240],[3240,3480],[3480,3699],[3699,3900],[3900,4180],[4180,4420],[4420,4620],[4620,4780],[4780,5195]]\",\"wav_name\":\"demo\"}")
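The reply above carries the full transcript while "is_final" is false, so a client that keeps reading until is_final becomes true might never stop in offline mode. A minimal sketch of a stop rule follows. It assumes, without confirmation in this thread, that an offline-mode reply is the single result for the utterance; `reply_is_complete` is a hypothetical helper and not part of FunASR.

```rust
/// Hypothetical helper: decide whether a server reply ends the exchange.
/// Assumption: in "offline" mode the server sends one result per utterance,
/// so that reply is treated as complete even when `is_final` is false.
fn reply_is_complete(mode: &str, is_final: bool) -> bool {
    mode == "offline" || is_final
}

fn main() {
    // The reply quoted above: mode "offline", is_final false.
    println!("{}", reply_is_complete("offline", false));
    // A streaming partial result would keep the read loop going.
    println!("{}", reply_is_complete("online", false));
}
```

The `mode` and `is_final` values would come from the parsed JSON reply; the read loop would call this on each text message and only then close the connection.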

deadash, Aug 15 '24 09:08

Hello, I have also run into the [error] handle_read_frame error: websocketpp.transport:7 (End of File) problem. How did you solve it?

hm-li0420, Feb 05 '25 07:02

+1, I am seeing the same thing here. Did you manage to get it sorted out?

GIEYang, Sep 04 '25 06:09