vody-am
@lvhan028 Google Translate has been working well, so yes, I will give it a shot :joy:
@lvhan028 while I am here -- is there anything special one has to set in order to use multiple GPUs? So far I have experimented with one container per GPU, but it...
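For the one-container-per-GPU setup mentioned above, the usual mechanism is NVIDIA's `CUDA_VISIBLE_DEVICES` environment variable; a minimal sketch (the GPU index `"1"` is just an example, not something from this thread):

```python
import os

# Restrict this process/container to a single GPU. This must be set before
# any CUDA-using framework (torch, etc.) is imported, otherwise the
# framework may have already enumerated all devices.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# Frameworks imported after this point see exactly one device, exposed as
# device index 0 inside the process.
```

In a container setup the same variable is typically passed at launch time (e.g. `docker run -e CUDA_VISIBLE_DEVICES=1 ...`) rather than set in code.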
@irexyc your observation is correct: limiting the batch size to `3` appears to alleviate this issue. It would be a good idea to make this a configurable parameter.
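As a sketch of what "configurable" could mean here, requests would be split into chunks no larger than a user-supplied cap; `chunk_requests` and `max_batch_size` are illustrative names, not an existing parameter of the project discussed in this thread:

```python
def chunk_requests(requests, max_batch_size=3):
    """Split pending requests into batches of at most max_batch_size.

    max_batch_size defaults to 3, the value that alleviated the issue above.
    """
    return [requests[i:i + max_batch_size]
            for i in range(0, len(requests), max_batch_size)]

# Seven pending requests get split into batches of 3, 3, and 1.
batches = chunk_requests(list(range(7)), max_batch_size=3)
```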
Oh, I also have an interest in reading SentencePiece tokenizers, in order to invoke the SigLIP text transformer in Rust! EDIT: using the library mentioned by Eric...
@erikreed I got it working with the base (not chat) model. The evaluation scripts serve as good examples, e.g.:

```py
from transformers import AutoModelForCausalLM, AutoTokenizer
import time
import torch

MODEL_ID...
```
Perhaps relevant under a separate issue, but I would like to chime in that I could help with measurement and performance work if some issues are created and/or a discussion is...
Confirming that `go build` with the above patch produces a working version on M4.