Add support for category data type
Description
See https://github.com/microsoft/qlib/issues/1249. This PR adds support for storing data of string or category type.
If a column is of type string or category, its values are stored in the following steps:
- Store the list of unique values for the column (path: `qlib_dir/categories/column_name.txt`).
- Convert each value in the column to its index in the list from the previous step.
- Store the converted column in bin format.
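The storage steps above can be sketched in plain Python. This is a minimal illustration, not the code in this PR: the helper names `encode_category_column` and `dump_column`, the first-seen ordering of the unique values, the one-value-per-line layout of the text file, the `column_name.bin` location, and the little-endian float32 encoding of the indices are all assumptions made for the example.

```python
import struct
import tempfile
from pathlib import Path


def encode_category_column(values):
    """Return (categories, codes) for a string/category column.

    categories: unique values in first-seen order (assumed ordering).
    codes: for each input value, its index in `categories`.
    """
    index_of = {}
    codes = []
    for value in values:
        if value not in index_of:
            index_of[value] = len(index_of)
        codes.append(index_of[value])
    return list(index_of), codes


def dump_column(qlib_dir, column_name, values):
    """Write the category list as text and the indices as binary (hypothetical layout)."""
    categories, codes = encode_category_column(values)

    # Step 1: the list of unique values, one per line (assumed file layout).
    cat_path = Path(qlib_dir) / "categories" / f"{column_name}.txt"
    cat_path.parent.mkdir(parents=True, exist_ok=True)
    cat_path.write_text("\n".join(categories), encoding="utf-8")

    # Steps 2-3: the indices, packed as little-endian float32 (assumed encoding).
    bin_path = Path(qlib_dir) / f"{column_name}.bin"
    bin_path.write_bytes(struct.pack(f"<{len(codes)}f", *map(float, codes)))
    return cat_path, bin_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cat_path, bin_path = dump_column(tmp, "sector", ["tech", "bank", "tech", "energy"])
        print(cat_path.read_text(encoding="utf-8").splitlines())
        print(struct.unpack("<4f", bin_path.read_bytes()))
```

Keeping the unique values in a separate text file means the binary column holds only numbers, so it can share the storage path used for numeric features.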
You can query a string or category column in the following ways:
- Query the index: `D.features(instruments, ["$column"])`
- Query the value (using the `Cat` operator): `D.features(instruments, ["Cat($column)"])`
For specific usage, please refer to the test case (`tests/test_category_data.py`).
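Conceptually, querying the value instead of the index means mapping each stored index back through the category list. The sketch below shows that lookup in plain Python. It is not the `Cat` operator's implementation: `decode_category_codes` is a hypothetical helper, and returning `None` for missing (NaN) entries is an assumption.

```python
import math


def decode_category_codes(codes, categories):
    """Map stored indices back to their original category values.

    Missing entries (None or NaN) are returned as None (assumed behaviour).
    """
    decoded = []
    for code in codes:
        if code is None or (isinstance(code, float) and math.isnan(code)):
            decoded.append(None)
        else:
            decoded.append(categories[int(code)])
    return decoded


if __name__ == "__main__":
    categories = ["tech", "bank", "energy"]
    print(decode_category_codes([0.0, 1.0, float("nan"), 2.0], categories))
```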
Motivation and Context
How Has This Been Tested?
- [ ] Pass the test by running: `pytest qlib/tests/test_all_pipeline.py` under the upper directory of `qlib`.
- [x] If you are adding a new feature, test on your own test scripts.
Screenshots of Test Results (if appropriate):
- Pipeline test:
- Your own tests:
Types of changes
- [ ] Fix bugs
- [x] Add new feature
- [ ] Update documentation
Notes
Related issue: https://github.com/microsoft/qlib/issues/232.