9 comments by chenzujie

I have a similar problem: using `webSettings.setUserAgent` to change the user-agent works, and the http/js/css requests are sent with the right user-agent, but when playing mp4 or mp3 in the WebView, the user-agent is changed by...

> > AutoTokenizer
> >
> > You can check the number of input tokens the following way; note that anything over 512 tokens is currently truncated.
> >
> > ```
> > from transformers import AutoTokenizer
> > tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-large-zh-v1.5')
> > input_l = len(tokenizer.encode("hello"))...
> > ```
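The 512-token cutoff mentioned above can be illustrated without downloading a model. This is a minimal sketch: `count_and_truncate` is a hypothetical helper, and the list of integers merely stands in for the ids returned by `tokenizer.encode`.

```
def count_and_truncate(token_ids, max_length=512):
    """Return the original token count and the ids cut to max_length."""
    return len(token_ids), token_ids[:max_length]

# Stand-in for tokenizer.encode(text) on a long input.
ids = list(range(600))
n, kept = count_and_truncate(ids)
print(n, len(kept))  # 600 512
```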

```
tokenizer = AutoTokenizer.from_pretrained('./bge-large-zh')
input_l = len(tokenizer.encode(str))
print(input_l)
```

Written this way, it seems to work.

> Are you self hosting it? Will take a look into it.

Yes, self-hosting.

> See FAQ-2: https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/baai_general_embedding#frequently-asked-questions

I have a question about FAQ-2. It uses two arrays:

```
sentences_1 = ["样例数据-1", "样例数据-2"]
sentences_2 = ["样例数据-3", "样例数据-4"]
```

I modified the data above and tried it, and the result was:

```
[[0.8384 0.7036]
 [0.745  0.8286]]
```

Is this comparing 样例数据-1 against 3 and 4, and 样例数据-2 against 3 and 4, giving four similarity scores? I don't quite understand how the arrays map to the results.
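The row/column mapping asked about here can be reproduced in plain Python. The vectors below are made-up stand-ins for the normalized embeddings the model would return; row i of the result scores sentences_1[i] against every entry of sentences_2.

```
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Made-up unit vectors standing in for model.encode(sentences_1/2).
embeddings_1 = [[1.0, 0.0], [0.0, 1.0]]
embeddings_2 = [[0.8, 0.6], [0.6, 0.8]]

# Equivalent of embeddings_1 @ embeddings_2.T:
# similarity[i][j] compares sentences_1[i] with sentences_2[j].
similarity = [[dot(a, b) for b in embeddings_2] for a in embeddings_1]
print(similarity)  # [[0.8, 0.6], [0.6, 0.8]]
```

So a 2-element list against a 2-element list yields a 2x2 matrix, one score per pair.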

> @TChengZ The first row [0.8384 0.7036] is the similarity of "样例数据-1" against sentences_2, and the second row is the similarity of "样例数据-2" against sentences_2.

One more question: the FAQ computes similarity directly as

```
similarity = embeddings_1 @ embeddings_2.T
```

Is this the same as computing cosine similarity myself?
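Since BGE embeddings are L2-normalized, the dot product in the FAQ and an explicit cosine computation should agree. A small check with made-up vectors (no model involved):

```
import math

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return num / den

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

# Arbitrary example vectors.
a, b = [3.0, 4.0], [1.0, 2.0]
an, bn = normalize(a), normalize(b)
dot = sum(x * y for x, y in zip(an, bn))
# For normalized vectors, the plain dot product equals cosine similarity.
assert abs(dot - cosine(a, b)) < 1e-9
```

If the embeddings were not normalized, the two would differ by the product of the vector norms.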

```
# -*- coding: utf-8 -*-
from FlagEmbedding import FlagModel

model = FlagModel('/xxx/bge-m3',
                  query_instruction_for_retrieval="答案比较",
                  use_fp16=True)  # Setting use_fp16 to True speeds up computation with a slight performance degradation

def get_similarity(sentences_1,...
```
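The snippet above is cut off mid-definition, so how `get_similarity` continues is not shown. One plausible completion is sketched here with the encoder passed in as a parameter, so the FlagModel dependency can be swapped for a toy stand-in; the `encode` parameter and the toy table are assumptions, not the original code.

```
def get_similarity(sentences_1, sentences_2, encode):
    """Score every sentence in sentences_1 against every one in sentences_2.

    Assumes encode() returns one L2-normalized vector per sentence, so a
    plain dot product is a cosine-similarity score.
    """
    e1, e2 = encode(sentences_1), encode(sentences_2)
    return [[sum(x * y for x, y in zip(a, b)) for b in e2] for a in e1]

# Toy encoder standing in for model.encode.
toy = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
scores = get_similarity(["a"], ["a", "b"], lambda xs: [toy[x] for x in xs])
print(scores)  # [[1.0, 0.0]]
```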

> > ```
> > # -*- coding: utf-8 -*-
> > from FlagEmbedding import FlagModel
> > model = FlagModel('/xxx/bge-m3',
> >     query_instruction_for_retrieval="答案比较",
> >     use_fp16=True)  # Setting use_fp16...
> > ```