HIST: Missing part of the code for generating stock2concept data
❓ Questions and Help
Hello. In the HIST algorithm, the code for generating the stock2concept data is missing, i.e., the code that generates examples/benchmarks/HIST/data/csi300_stock2concept.npy. Please add it to the repository.
Thank you.
I'm also working on reconstructing the stock-concept matrix from external resources.
However, since quite a lot of guesswork would be involved, I don't think my reconstruction can reproduce the results reported in the paper.
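For anyone attempting the same reconstruction, here is a minimal sketch of how a binary stock-concept matrix could be assembled from (stock, concept) pairs. The stock codes, concept names, and the `build_stock2concept` helper are all hypothetical, and the layout (rows are stocks, columns are concepts, entries are 0/1) is an assumption; the thread does not document the actual layout of csi300_stock2concept.npy.

```python
import numpy as np
import pandas as pd

# Hypothetical (stock, concept) pairs; real ones would come from an external source.
pairs = pd.DataFrame({
    "stock": ["SH600000", "SH600000", "SZ000001", "SZ000002"],
    "concept": ["banking", "state_owned", "banking", "real_estate"],
})


def build_stock2concept(pairs, stocks, concepts):
    """Return a 0/1 float32 matrix of shape (len(stocks), len(concepts)).

    Assumed layout: row i is stocks[i], column j is concepts[j].
    """
    # Count co-occurrences, then align to the requested stock and concept order.
    table = pd.crosstab(pairs["stock"], pairs["concept"])
    table = table.reindex(index=stocks, columns=concepts, fill_value=0)
    return (table.to_numpy() > 0).astype(np.float32)


# Fix the row and column order explicitly so the matrix can be matched
# back to stock codes later. SZ000063 has no concept and gets an all-zero row.
stocks = ["SH600000", "SZ000001", "SZ000002", "SZ000063"]
concepts = sorted(pairs["concept"].unique())
stock2concept = build_stock2concept(pairs, stocks, concepts)

# The result could then be written with np.save("my_stock2concept.npy", stock2concept).
```

Keeping the `stocks` and `concepts` lists alongside the saved array matters, because a bare .npy file carries no row or column labels.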
After reviewing the code line by line, I'm quite convinced that the HIST code in qlib's repo has some errors.
However, the key to reproducing the paper's results lies in rebuilding the predefined concept matrix, which relies on a third-party data source.
I also believe that the .npy data provided in the repo, which holds a single snapshot of the concept matrix, may not reflect how the stocks were connected historically and may introduce backtest bias.
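To illustrate the snapshot concern, here is a rough sketch of a point-in-time alternative: if concept memberships came with start and end dates, a separate matrix could be built for each backtest date so that only memberships known on that date are used. The `membership` table, its column names, and `concept_matrix_as_of` are hypothetical; nothing here is taken from the qlib code or the paper.

```python
import numpy as np
import pandas as pd

# Hypothetical membership intervals. `end` is NaT while the stock is still a member.
membership = pd.DataFrame({
    "stock": ["A", "A", "B", "C"],
    "concept": ["x", "y", "x", "y"],
    "start": pd.to_datetime(["2015-01-01", "2019-06-01", "2016-03-01", "2014-01-01"]),
    "end": pd.to_datetime(["2018-12-31", None, None, "2017-05-31"]),
})


def concept_matrix_as_of(membership, date, stocks, concepts):
    """Build a 0/1 stock-by-concept matrix using only memberships active on `date`."""
    date = pd.Timestamp(date)
    is_active = (membership["start"] <= date) & (
        membership["end"].isna() | (membership["end"] >= date)
    )
    active = membership[is_active]

    stock_pos = {s: i for i, s in enumerate(stocks)}
    concept_pos = {c: j for j, c in enumerate(concepts)}
    matrix = np.zeros((len(stocks), len(concepts)), dtype=np.float32)
    for s, c in zip(active["stock"], active["concept"]):
        if s in stock_pos and c in concept_pos:
            matrix[stock_pos[s], concept_pos[c]] = 1.0
    return matrix


stocks = ["A", "B", "C"]
concepts = ["x", "y"]
m_2016 = concept_matrix_as_of(membership, "2016-06-01", stocks, concepts)
m_2020 = concept_matrix_as_of(membership, "2020-01-01", stocks, concepts)
```

In this toy data the two dates give different matrices: stock A moves from concept x to y and stock C drops out of y. A single snapshot applied to the whole backtest period cannot represent changes like these.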
This issue is stale because it has been open for three months with no activity. Remove the stale label or comment on the issue; otherwise, it will be closed in 5 days.