ThangLD201
I have a speaker diarization dataset in Vietnamese in which the speaker segments of every audio file are already annotated. How should I prepare and process the data to be able to...
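If the goal is to feed the annotations into a standard diarization toolkit, one common first step is to export each file's annotated segments as RTTM reference files. Below is a minimal sketch, assuming the segments are already available in memory as (audio_id, start, duration, speaker) tuples; the tuples, file name, and field values are placeholders, not part of the original dataset.

```
# Hypothetical in-memory annotations: (audio_id, start_sec, duration_sec, speaker)
segments = [
    ("file_001", 0.00, 3.25, "spk_A"),
    ("file_001", 3.25, 2.10, "spk_B"),
]

# One RTTM "SPEAKER" record per segment (start time and duration in seconds),
# which most diarization toolkits accept as reference labels.
with open("reference.rttm", "w", encoding="utf-8") as f:
    for audio_id, start, dur, speaker in segments:
        f.write(
            f"SPEAKER {audio_id} 1 {start:.3f} {dur:.3f} <NA> <NA> {speaker} <NA> <NA>\n"
        )
```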
Hello, I've read and really appreciated your team's excellent work on SRU++. I want to use this architecture in other tasks, but I'm having trouble finding documentation on SRU++,...
### 🚀 The feature
As far as I know, there are no examples or documentation on serving Speech2Text models from Huggingface, such as Wav2Vec2. How could I enable serving with...
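For context, the core inference that a custom serving handler would have to wrap is fairly small. Here is a rough sketch using `Wav2Vec2ForCTC` and `Wav2Vec2Processor` from `transformers`; the checkpoint name and the silent dummy waveform are stand-ins, and this is not tied to any particular serving framework.

```
import numpy as np
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Hypothetical checkpoint; any CTC-style Wav2Vec2 ASR model would do.
model_name = "facebook/wav2vec2-base-960h"
processor = Wav2Vec2Processor.from_pretrained(model_name)
model = Wav2Vec2ForCTC.from_pretrained(model_name).eval()

# One second of silence at 16 kHz stands in for the decoded audio that a
# serving handler would receive in the request body.
waveform = np.zeros(16000, dtype=np.float32)

inputs = processor(waveform, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits
pred_ids = torch.argmax(logits, dim=-1)
transcript = processor.batch_decode(pred_ids)[0]
print(transcript)
```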
I notice that inference with a language model on a large amount of text can be quite slow. In particular, it took me 11 minutes to decode around 4600 lines of...
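Without knowing the exact decoding setup, one thing that usually helps at this scale is batching inputs instead of decoding one line at a time. Below is a minimal sketch, assuming a Hugging Face causal LM as a stand-in for the actual model; the checkpoint name, batch size, and `max_new_tokens` are illustrative only.

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
tokenizer.padding_side = "left"            # needed for batched generation
model = AutoModelForCausalLM.from_pretrained(model_name).to(device).eval()

lines = ["first input line", "second input line", "third input line"]
batch_size = 32

outputs = []
with torch.no_grad():
    for i in range(0, len(lines), batch_size):
        # Tokenize and pad a whole batch of lines, then decode them together,
        # amortizing the per-call overhead across the batch.
        batch = tokenizer(
            lines[i:i + batch_size],
            return_tensors="pt",
            padding=True,
            truncation=True,
        ).to(device)
        generated = model.generate(**batch, max_new_tokens=32)
        outputs.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))

print(len(outputs))
```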
```
./build/bin/main -m /tmp/ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.powerinfer.gguf -n 128 -t 8 --vram-budget 40 -p "Hi. How are you ?"
Log start
main: build = 1560 (2217e7f)
main: built with cc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0...
```
Hi @frederick0329, for sequence tagging (e.g. NER) one would need to predict a label for each token in the sequence for each test sample. In this case, the loss is averaged...
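To make the averaging concrete: for tagging, each non-padded token contributes one cross-entropy term, and those terms are averaged rather than summed per example. Here is a minimal PyTorch sketch with made-up shapes, not tied to any particular codebase.

```
import torch
import torch.nn.functional as F

# Hypothetical shapes: tagger logits over a padded batch.
batch, seq_len, num_labels = 2, 6, 5
logits = torch.randn(batch, seq_len, num_labels)
labels = torch.randint(0, num_labels, (batch, seq_len))
labels[1, 4:] = -100  # padding positions marked with the ignore index

# Per-token cross-entropy, averaged over non-padded tokens only, so every
# real token contributes equally to the loss.
loss = F.cross_entropy(
    logits.view(-1, num_labels),  # (batch * seq_len, num_labels)
    labels.view(-1),              # (batch * seq_len,)
    ignore_index=-100,
    reduction="mean",
)
print(loss.item())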
Hi @yixinL7, I was training a BRIO model on a different dataset (Reddit TIFU) and observed conflicting trends between the MLE and ranking losses. I start from a converged MLE checkpoint....
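For readers following along, the two terms being traded off have roughly this shape: a token-level MLE loss plus a pairwise margin loss over candidate scores ordered by quality. The sketch below is illustrative only; the margin, weight, and loop-based implementation are placeholders, not taken from the BRIO code.

```
import torch

def ranking_loss(cand_scores: torch.Tensor, margin: float = 0.001) -> torch.Tensor:
    """Pairwise margin loss over candidate scores sorted from best to worst.

    cand_scores: (num_candidates,) model scores, index 0 = highest-quality candidate.
    """
    loss = cand_scores.new_zeros(())
    n = cand_scores.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            # A lower-ranked candidate should score below a higher-ranked one
            # by at least (j - i) * margin.
            loss = loss + torch.clamp(cand_scores[j] - cand_scores[i] + (j - i) * margin, min=0)
    return loss

scores = torch.tensor([0.3, 0.1, 0.25], requires_grad=True)
mle_loss = torch.tensor(2.0)                      # stand-in for the token-level NLL
total = mle_loss + 100.0 * ranking_loss(scores)   # hypothetical weighting
print(total.item())
```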
Hi, thanks for the great work! There are a few details regarding the correlation of LaSE in the paper that I did not quite understand. For each target language,...
Hi, I'm using a GPU with 40 GB of VRAM but get an out-of-memory error. The error seems to come from `src/model_bert.py`, Line 339 (alpha_f):
```
_, attention_probs, value_layer = self_outputs
output_head_weights =...
```
Hi @afshinrahimi, @yuan-li, do you still have the raw (un-tokenized) data? Also, which tokenizer did you use for this dataset? I need to work with the raw...