Henry Bigelow comments

Results: 30 comments of Henry Bigelow

train command and L2Error is not defined error

Hi @dizwe, apologies for not noting this more clearly in the README, but this project is still under construction and unfortunately too much is changing at the...

Feature extractor

Hi, the model is not ready yet. I'm currently training it using the VQ-VAE bottleneck, and it's only at 5k steps so far, and I've noticed various collapse and other...

Feature extractor

Hi Rasipuram, I am pretty new to this sub-field myself, so unfortunately I don't know many repos except the WaveNet ones, which aren't designed to extract features. I'll keep it...

Feature extractor

Hi Zifan, I'm sorry I cannot be more help here, but I never did succeed in training this model. I tried training it for 10 days on a TPU (full...

Update on the pretrained model?

Hi Pranay, Thanks for your interest. Unfortunately, progress is halted at the moment. I just started a new job, and this was a side project. I do intend to...

histogram_summary -> summary.histogram

Hi Sam, Thanks for letting me know. I'm not surprised actually, since it's a bit of an old model. It's my first pull request, so I needed the practice! Cheers,...

causal-conv1d installation error

I'm not familiar with ninja, but I was able to build `causal_conv1d` from source. First, note that these two commands should produce matching CUDA versions: ```bash python3 -c 'import torch;...

ImportError causal_conv1d_cuda.cpython-310-x86_64-linux-gnu.so undefined symbol

Hi @FloMru, Would [this issue](https://github.com/state-spaces/mamba/issues/55#issuecomment-1858638484) help you? From what I understand, it's required that both torch and causal_conv1d use the same version of CUDA.
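
As a rough illustration of the version check described above, here is a minimal Python sketch. It is not the exact pair of commands from the linked comment (that excerpt is truncated here). It assumes the relevant comparison is between `torch.version.cuda` and the release reported by `nvcc --version`; the helper names `parse_nvcc_release`, `cuda_versions_match`, and `local_nvcc_release` are hypothetical and made up for this example.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'major.minor' CUDA release from `nvcc --version` text."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(torch_cuda: Optional[str], nvcc_cuda: Optional[str]) -> bool:
    """Return True only when both versions are known and agree on major.minor."""
    if not torch_cuda or not nvcc_cuda:
        return False
    return torch_cuda.split(".")[:2] == nvcc_cuda.split(".")[:2]


def local_nvcc_release() -> Optional[str]:
    """Run `nvcc --version` if nvcc is on PATH; otherwise return None."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    try:
        import torch  # optional: only needed for the live check

        torch_cuda = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        torch_cuda = None
    nvcc_cuda = local_nvcc_release()
    print(f"torch CUDA: {torch_cuda}, nvcc CUDA: {nvcc_cuda}")
    if cuda_versions_match(torch_cuda, nvcc_cuda):
        print("versions match")
    else:
        print("versions differ or could not be determined")
```

If the two versions differ, a build of `causal_conv1d` from source may link against a different CUDA runtime than the one torch was built with, which is one plausible cause of an undefined-symbol import error.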

Visualization of Delta (post-Softplus) values during the Induction Task

Hi Karami, The experiment I did is in [mamba-recall](https://github.com/hrbigelow/mamba-recall). Hopefully that can get you the answers you need. It's been a while so I don't remember just now, but if I...

Visualization of Delta (post-Softplus) values during the Induction Task

Hi Mahdi, That's a very good point. Yes, indeed, I trained only on the recall token prediction. I interpreted the phrase "trained on the induction head task" to mean actually...