Maykeye

Results: 31 comments of Maykeye

After running some `git bisect`, it seems problems were introduced at commit 1566d8e34425937e6182d4425e7b85c370912d64 (Add model settings to the Models tab)

I had the same question [yesterday](https://discuss.huggingface.co/t/why-models-llama-in-particular-upcasts-softmax-to-fp32/44787). Can we make it optional? At least BF16 softmax is good enough. And by "good enough" I mean it "does not crash" at long context...
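The trade-off above can be sketched as a small helper. This is a minimal illustration, not Llama's actual code; the function name `softmax_maybe_upcast` is hypothetical, and the assumption is only that the upcast path mirrors the common pattern of computing softmax in fp32 and casting back:

```python
import torch

def softmax_maybe_upcast(scores: torch.Tensor, upcast: bool = True) -> torch.Tensor:
    """Attention softmax, with the fp32 upcast made optional."""
    if upcast:
        # Upcast to fp32 for numerical stability, then cast back to the
        # input dtype. This temporarily holds an fp32 copy of the scores,
        # which hurts memory use at long context.
        return torch.softmax(scores, dim=-1, dtype=torch.float32).to(scores.dtype)
    # Stay in the input dtype (e.g. bfloat16): less precise, but no fp32 copy.
    return torch.softmax(scores, dim=-1)

scores = torch.randn(2, 8, 128, 128, dtype=torch.bfloat16)
probs = softmax_maybe_upcast(scores, upcast=False)
assert probs.dtype == torch.bfloat16  # never left bf16
```

Making `upcast` a flag would let long-context users pick memory savings over the last bit of precision.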

Yes and quantized models produce noticeably different results.

Custom types aren't added as often as they are used, so having a diagnostic would be good. When I mistyped "FlOAT" after copying a node that makes a const int, I...

```
INFO:     Started server process [1310434]
INFO:     Waiting for application startup.
torch found: /home/fella/src/sd/sd/lib/python3.11/site-packages/torch/lib
torch set
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
max_tokens=4100...
```

> Good idea! We haven't experimented with it. To be honest I think these differences tend to get washed out with scale, but maybe not. [Paper in point](https://arxiv.org/pdf/2310.04564.pdf). Switching activation...

> What is the meaning behind them being good places to inject adapters?

Long story: arxiv:1902.00751. Short story: if LoRA replaces `XW` with `XW + XAB` and sees itself more...
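The `XW + XAB` replacement mentioned above can be shown as a minimal sketch, assuming the standard LoRA recipe (frozen base weight, low-rank `A` and `B` with `B` zero-initialized); the class name `LoRALinear` is illustrative, not from any particular library:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """y = x @ W^T + (x @ A @ B) * scale, with W frozen and A, B trainable."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weight W
        # Low-rank factors: A is small random, B is zero, so at init the
        # adapter contributes nothing and the layer behaves exactly like W.
        self.A = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, base.out_features))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.A @ self.B) * self.scale

layer = LoRALinear(nn.Linear(16, 32))
x = torch.randn(4, 16)
# B starts at zero, so the adapted layer initially matches the base layer.
assert torch.allclose(layer(x), layer.base(x))
```

Only `A` and `B` receive gradients, which is why the choice of *where* to place such adapters (arxiv:1902.00751) matters so much.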

> Can you explain the use case here. Would this be like if the model is handling topic a, we're using and updating state a for each inference?

Yes, manual cache...

Similar. It's about manual control over every aspect of the cache (and hence the state) of the model. [The model itself](https://github.com/state-spaces/mamba/blob/009bec5ee37f586844a3fc89c040a9c1a9d8badf/mamba_ssm/models/mixer_seq_simple.py#L233) uses InferenceParams.
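The per-topic usage described above can be sketched generically. This is not the `mamba_ssm` API; `ManualStateCache` and `step` are hypothetical stand-ins, assuming only that the model exposes its recurrent state as a tensor the caller can swap in and out:

```python
import torch

class ManualStateCache:
    """Keep one recurrent state per topic; the caller, not the model,
    decides which state each inference reads and updates."""
    def __init__(self):
        self.states: dict[str, torch.Tensor] = {}

    def get(self, topic: str, d_state: int = 16) -> torch.Tensor:
        # Fresh zero state the first time a topic is seen.
        return self.states.setdefault(topic, torch.zeros(d_state))

    def put(self, topic: str, state: torch.Tensor) -> None:
        self.states[topic] = state

def step(state: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    # Stand-in for one recurrence step: decay old state, mix in new input.
    return 0.9 * state + 0.1 * x

cache = ManualStateCache()
s = cache.get("topic_a")          # load state for topic a
s = step(s, torch.ones(16))       # run one inference step on it
cache.put("topic_a", s)           # store the updated state for next time
```

Handling "topic b" would simply mean `cache.get("topic_b")`, leaving topic a's state untouched.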