
HIST: Missing part of the code for generating stock2concept data

Open smarkovichgolan opened this issue 3 years ago • 1 comments

❓ Questions and Help

Hello. In the HIST example, the code that generates the stock2concept data is missing, i.e., the code that produces examples/benchmarks/HIST/data/csi300_stock2concept.npy. Please add it to the repository.

Thank you.

smarkovichgolan avatar Sep 01 '22 05:09 smarkovichgolan

I'm also working on the re-construction of the stock-concept matrix from external resources.

However, since quite a lot of guesswork would be involved, I don't think my reconstruction can reproduce the results reported in the paper.
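To make the reconstruction idea concrete, here is a minimal sketch of turning a list of (stock, concept) tags into a binary matrix. The layout (rows are stocks, columns are concepts, entries are 0/1), the function name `build_stock2concept`, the tickers, and the concept names are all assumptions for illustration; none of this is confirmed to match how csi300_stock2concept.npy was actually produced.

```python
import numpy as np

def build_stock2concept(stocks, memberships):
    """Return (matrix, concepts) where matrix[i, j] == 1.0 if stocks[i]
    is tagged with concepts[j]. Hypothetical helper, not part of qlib."""
    stock_idx = {s: i for i, s in enumerate(stocks)}
    # Keep only concepts that tag at least one stock in the universe.
    concepts = sorted({c for s, c in memberships if s in stock_idx})
    concept_idx = {c: j for j, c in enumerate(concepts)}
    matrix = np.zeros((len(stocks), len(concepts)), dtype=np.float32)
    for stock, concept in memberships:
        if stock in stock_idx:  # skip stocks outside the universe
            matrix[stock_idx[stock], concept_idx[concept]] = 1.0
    return matrix, concepts

# Made-up universe and tags, standing in for an external data source.
stocks = ["SH600000", "SH600519", "SZ000001"]
memberships = [
    ("SH600000", "banking"),
    ("SZ000001", "banking"),
    ("SH600519", "liquor"),
    ("SZ300750", "batteries"),  # not in the universe, so it is ignored
]
matrix, concepts = build_stock2concept(stocks, memberships)
```

The resulting array could then be written with `np.save`, but the stock ordering and the concept list would still have to be guessed to line up with what the model expects.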

pop0121 avatar Sep 15 '22 01:09 pop0121

After reviewing the code line-by-line, I'm quite convinced that the code of HIST in qlib's repo has some errors.

However, the key to reproducing the paper's results lies in rebuilding the predefined concept matrix, which relies on some third-party data source.

I also believe that the .npy data provided in the repo, which contains a single snapshot of the concept matrix, may not reflect how the stocks were connected historically and may therefore introduce backtest bias.
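The snapshot concern can be illustrated with a small sketch. It assumes a hypothetical table of concept memberships with start and end dates (the record format, the function name `concept_matrix_as_of`, the tickers, and the concepts are all made up) and compares the matrix as of an early backtest date with the latest one. Any entry present only in the latest matrix is information a backtest on the earlier date should not have seen.

```python
from datetime import date
import numpy as np

def concept_matrix_as_of(stocks, concepts, records, as_of):
    """Binary stock-by-concept matrix using only memberships active on
    `as_of`. Hypothetical helper; an `end` of None means still active."""
    stock_idx = {s: i for i, s in enumerate(stocks)}
    concept_idx = {c: j for j, c in enumerate(concepts)}
    matrix = np.zeros((len(stocks), len(concepts)), dtype=np.float32)
    for stock, concept, start, end in records:
        active = start <= as_of and (end is None or as_of < end)
        if active and stock in stock_idx and concept in concept_idx:
            matrix[stock_idx[stock], concept_idx[concept]] = 1.0
    return matrix

stocks = ["SH600000", "SH600519", "SZ000001"]
concepts = ["banking", "e-commerce"]
records = [
    ("SH600000", "banking", date(2010, 1, 1), None),
    ("SZ000001", "banking", date(2010, 1, 1), None),
    ("SH600519", "e-commerce", date(2021, 6, 1), None),  # tag added late
]

past = concept_matrix_as_of(stocks, concepts, records, date(2015, 1, 1))
latest = concept_matrix_as_of(stocks, concepts, records, date(2022, 1, 1))
leaked = latest - past  # 1.0 marks links unknown on the earlier date
```

Using one matrix per rebalancing date instead of a single file would avoid this, at the cost of needing dated membership data from the third-party source.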

pop0121 avatar Nov 30 '22 10:11 pop0121

This issue is stale because it has been open for three months with no activity. Remove the stale label or comment on the issue; otherwise, it will be closed in 5 days.

github-actions[bot] avatar Feb 28 '23 12:02 github-actions[bot]