Varun Srivastava
Speedup is about 10x for me on an M1. Definitely huge. Not sure how embeddings will compare to inference in terms of GPU optimization but I think there is huge...
exciting news!
Fantastic news! Just played around and it's working well on my M1. Will follow up to see if I can help with errors.
@do-me can you tell me how to update the GitHub Action as well for my fork of semantic-finder? ty
Will look into adding it if I get a chance this week.
I'll take a look at the dot product today. Have you seen the FAISS library at all? Some of these algorithms might be able to offer terrific implementations for clustering...
I agree a lean database seems quite suitable. It would be nice to make a local-storage JS library with "guarantees" like limits on the total memory that can be used....
Just checking, are the files that [Jhnbsomersxkhi2](https://huggingface.co/Jhnbsomersxkhi2) contributed live anywhere? I think I'll definitely start trying to contribute to that HF repo
For the GET API we can get the size of the default model in terms of number of fp32 and fp16 parameters, and from here we could compute the model,...
Yes that's a great idea! What're the storage limits for Hugging Face? And yes we can get the regular size of the model. Check out the above JS query and...
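A rough sketch of the size estimate mentioned above, going from fp32 and fp16 parameter counts to bytes. The helper name `estimateModelBytes` and the input shape (a plain map of dtype to parameter count) are assumptions for illustration, not what the GET API actually returns, and the result covers raw weights only (no tokenizer or config files).

```typescript
// Bytes per parameter for the two dtypes discussed in the thread.
const BYTES_PER_PARAM: Record<string, number> = { fp32: 4, fp16: 2 };

// Hypothetical helper: estimate raw weight size in bytes
// from per-dtype parameter counts, e.g. { fp32: 1000, fp16: 500 }.
function estimateModelBytes(paramCounts: Record<string, number>): number {
  let total = 0;
  for (const dtype in paramCounts) {
    const width = BYTES_PER_PARAM[dtype];
    if (width === undefined) {
      throw new Error("unknown dtype: " + dtype);
    }
    total += paramCounts[dtype] * width;
  }
  return total;
}
```

For example, a model with 10 fp32 parameters and 10 fp16 parameters would come out at 10 × 4 + 10 × 2 = 60 bytes; dividing by 1024 ** 2 gives MiB.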