Daisuke Niizumi

45 comments

Hi, thank you for writing the requirements. I wanted to merge it, but I'd like to avoid hard limits on library versions. I'm actually using PyTorch 1.x, though it is specified as...
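One common way to avoid hard version limits is to use bounded or lower-bound version specifiers instead of exact pins. A hypothetical fragment (the actual package list is not visible in the truncated comment):

```
# Hypothetical requirements.txt sketch: a lower bound keeps PyTorch 1.x
# users working while still allowing newer releases; an exact pin
# (torch==2.0.1) would exclude them.
torch>=1.8
```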

Hi, thanks for your interest. I'm glad to hear that the pre-trained weights have been fairly suitable for your tasks so far. It looks like an environment issue, such as the...

Hi, I have summarized a guideline based on my experience: https://github.com/nttcslab/m2d/blob/master/Guide_app.md Based on it, quick comments for your use case are: - Pre-training from scratch using two 8GB GPUs...

> A small addendum for others: to set up distributed mode I had to adapt the command line to `CUDA_VISIBLE_DEVICES=0,1 python3 -m torch.distributed.launch --nproc_per_node=2 train_audio.py ...` > > Without adding...
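The quoted launch command relates to how the training script receives its process rank. As a generic sketch (this is not the repository's actual `train_audio.py` code): `torch.distributed.launch` appends a `--local_rank` flag to the script's arguments, while the newer `torchrun` launcher sets the `LOCAL_RANK` environment variable instead, so a script supporting both launchers might read the rank like this:

```python
import argparse
import os

def get_local_rank(argv=None):
    """Return the local rank under either distributed launcher.

    torch.distributed.launch passes --local_rank=N on the command line;
    torchrun exports LOCAL_RANK instead. Defaults to 0 (single process).
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--local_rank", type=int, default=None)
    # parse_known_args ignores the script's other training flags
    args, _ = parser.parse_known_args(argv)
    if args.local_rank is not None:
        return args.local_rank
    return int(os.environ.get("LOCAL_RANK", 0))
```

With `--nproc_per_node=2`, each of the two processes would see its own rank (0 or 1) and bind to the matching GPU.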

Regarding your questions about the four options: yes, those are your options. It was nice that you figured out the 4th option, and I would also like to recommend a 5th option. First, the...

Regarding the loss values, I have included a log of ICBHI 2017 fur-PT here: [examples/logs/log_m2d_x_vit_base-80x200p16x4-230814-Ddffsd50ks5blr0003bs128a2nr.3-e600.out](examples/logs/log_m2d_x_vit_base-80x200p16x4-230814-Ddffsd50ks5blr0003bs128a2nr.3-e600.out) -> Fixed: now available in [example_logs.zip](https://github.com/nttcslab/m2d/releases/download/v0.1.0/example_logs.zip). As you can see, the loss would be around 0.4 in...

Please find the logs here: [example_logs.zip](https://github.com/nttcslab/m2d/releases/download/v0.1.0/example_logs.zip) I added M2D and M2D-S logs in addition to the M2D-X for ICBHI 2017 log. (I also updated the guide document.) It's a great question....

Hi, thank you for your interest. Quick answer: config/m2d.yaml is in another repository called EVAR. https://github.com/nttcslab/eval-audio-repr/blob/main/config/m2d.yaml EVAR is an evaluation package for audio representations, mainly used in our...

Thank you for your question. We'd like to close this issue now. Please feel free to reopen it whenever you need!

Hi, a quick answer is no. We provide foundation models for general sounds.