Li Yongkang
Hello, have you solved your problem? I want to know how to deal with datasets downloaded from the Internet and how to split them up.
@mumuyanyan
> Thanks for your interest. I fixed the issue in the mat file a while ago. I think you are probably using an old version of the mat file....
> @luyug I think I have figured this problem out, thanks. But during my experiment, I found that the loss is very difficult to converge; here is my log:...
Hi all, I am facing the same issue: the speed is very slow. I also observed that my GPU memory was only a few GB in use, even though I...
I’ve created a [code repository](https://github.com/liyongkang123/extended_beir_datasets) to deal with this. Anyone who needs it can use it.
Hi Nandan, I’m not sure if this is the right approach. If there are cases where a `doc_id` is the same as a `query_id` in the corpus, I wonder whether...
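A quick way to surface such collisions is to intersect the two ID sets. This is a minimal sketch, not BEIR's actual loading code; the `corpus` and `queries` dicts below are illustrative stand-ins for BEIR's in-memory format (dicts keyed by ID):

```python
# Minimal sketch: find IDs that appear both as a doc_id and as a query_id.
# `corpus` and `queries` are illustrative dicts keyed by ID, mirroring
# the shape BEIR uses after loading a dataset.
corpus = {"d1": {"text": "a document"}, "q7": {"text": "another document"}}
queries = {"q7": "a query whose id collides with a doc_id", "q8": "a normal query"}

# Any ID present in both sets could confuse retrieval evaluation.
overlap = set(corpus) & set(queries)
print(sorted(overlap))  # → ['q7']
```

Running a check like this before evaluation makes it easy to see whether the overlap case actually occurs in a given dataset.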
Hi Nandan, I also found the following difference:
```python
retriever = EvaluateRetrieval(model, score_function=score_function, k_values=[1, 5, 10, 50, 100, 1000])  # "dot" for dot product, "cos_sim" for cosine similarity
results = retriever.encode_and_retrieve(corpus, queries, encode_output_path=embedding_save_path,...
```
Hi Nandan, I can try doing it this way. However, I'm not sure when the normalization is performed. We have two strategies:
1. Perform normalization before saving the embeddings, then...
Thanks, I submitted pull request #161 here, and I hope it can be merged as soon as possible.