
highfreq data get calendar failure

Open 2young-2simple-sometimes-naive opened this issue 3 years ago • 3 comments

🐛 Bug Description

The program raises ValueError: could not convert string to Timestamp when fetching a high-frequency calendar. If the data has a frequency of 1 day, it works fine.

To Reproduce

import qlib
qlib.init(provider_uri="qlib_data/30min", dataset_cache=None, custom_ops=[], expression_cache=None, region=qlib.config.REG_US)
from qlib.data import D
trade_calendar = D.calendar(start_time="2000-01-01", end_time="2100-12-31", freq="30min")

Expected Behavior

Screenshot

[3723:MainThread](2022-07-07 00:59:52,410) INFO - qlib.Initialization - [config.py:413] - default_conf: client.
[3723:MainThread](2022-07-07 00:59:52,634) INFO - qlib.Initialization - [__init__.py:74] - qlib successfully initialized based on client settings.
[3723:MainThread](2022-07-07 00:59:52,634) INFO - qlib.Initialization - [__init__.py:76] - data_path={'__DEFAULT_FREQ': PosixPath('/data/hf/qlib_data/30min')}
[3723:MainThread](2022-07-07 00:59:53,406) ERROR - qlib.workflow - [utils.py:41] - An exception has been raised[ValueError: could not convert string to Timestamp].
  File "./test.py", line 7, in <module>
    trade_calendar = D.calendar(start_time=START_TIME, end_time="2100-12-31", freq="day", future=False)
  File "/data/hf/.venv/lib/python3.8/site-packages/pyqlib-0.8.6.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 1146, in calendar
    return Cal.calendar(start_time, end_time, freq, future=future)
  File "/data/hf/.venv/lib/python3.8/site-packages/pyqlib-0.8.6.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 90, in calendar
    _calendar, _calendar_index = self._get_calendar(freq, future)
  File "/data/hf/.venv/lib/python3.8/site-packages/pyqlib-0.8.6.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 173, in _get_calendar
    _calendar = np.array(self.load_calendar(freq, future))
  File "/data/hf/.venv/lib/python3.8/site-packages/pyqlib-0.8.6.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 659, in load_calendar
    backend_obj = self.backend_obj(freq=freq, future=future).data
  File "/data/hf/.venv/lib/python3.8/site-packages/pyqlib-0.8.6.99-py3.8-linux-x86_64.egg/qlib/data/storage/file_storage.py", line 132, in data
    np.array(list(map(pd.Timestamp, _calendar))), self._freq_file, self.freq, self.region
  File "pandas/_libs/tslibs/timestamps.pyx", line 1399, in pandas._libs.tslibs.timestamps.Timestamp.__new__
  File "pandas/_libs/tslibs/conversion.pyx", line 408, in pandas._libs.tslibs.conversion.convert_to_tsobject
  File "pandas/_libs/tslibs/conversion.pyx", line 652, in pandas._libs.tslibs.conversion._convert_str_to_tsobject
ValueError: could not convert string to Timestamp
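The traceback above fails while each calendar entry is being converted to a timestamp, so one way to narrow the problem down is to check which raw entries cannot be parsed. The following is a minimal sketch, assuming the calendar is a plain-text file with one timestamp per line (qlib data directories usually keep it under calendars/<freq>.txt, but verify this for your setup). find_bad_calendar_lines is a hypothetical helper, not part of qlib, and datetime.fromisoformat is used only as a stand-in parser so the sketch has no third-party dependencies.

```python
from datetime import datetime
from pathlib import Path
from typing import Callable, List, Tuple


def find_bad_calendar_lines(
    path: Path, parse: Callable[[str], object] = datetime.fromisoformat
) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every non-blank line that `parse` rejects.

    Hypothetical helper for illustration; it is not a qlib API.
    """
    bad = []
    with open(path, encoding="utf-8") as fp:
        for lineno, raw in enumerate(fp, start=1):
            text = raw.strip()
            if not text:
                continue  # ignore blank lines
            try:
                parse(text)
            except ValueError:
                bad.append((lineno, text))
    return bad


# Example call (the path is an assumption based on the provider_uri in this issue):
# find_bad_calendar_lines(Path("qlib_data/30min/calendars/30min.txt"))
```

Passing parse=pd.Timestamp instead of the default would test the entries against the same conversion that fails in the traceback.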

Environment

Linux x86_64 Linux-4.18.0-147.el8.x86_64-x86_64-with-glibc2.2.5 #1 SMP Wed Dec 4 21:51:45 UTC 2019

Python version: 3.8.6 (default, Oct 22 2020, 17:03:03) [GCC 9.3.0]

Qlib version: 0.8.6.99
numpy==1.23.0
pandas==1.4.3
scipy==1.8.1
requests==2.28.1
sacred==0.8.2
python-socketio==5.7.0
redis==4.3.4
python-redis-lock==3.7.0
schedule==1.1.0
cvxpy==1.2.1
hyperopt==0.1.2
fire==0.4.0
statsmodels==0.13.2
xlrd==2.0.1
plotly==5.9.0
matplotlib==3.5.2
tables==3.7.0
pyyaml==6.0
mlflow==1.27.0
tqdm==4.64.0
loguru==0.6.0
lightgbm==3.3.2
tornado==6.2
joblib==1.1.0
fire==0.4.0
ruamel.yaml==0.17.21

Additional Notes

I ran into this problem too:
  File "/home/anaconda3/envs/Qlib/lib/python3.8/site-packages/pyqlib-0.8.6.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 90, in calendar
    _calendar, _calendar_index = self._get_calendar(freq, future)
  File "/home/anaconda3/envs/Qlib/lib/python3.8/site-packages/pyqlib-0.8.6.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 173, in _get_calendar
    _calendar = np.array(self.load_calendar(freq, future))
  File "/home/anaconda3/envs/Qlib/lib/python3.8/site-packages/pyqlib-0.8.6.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 672, in load_calendar
    return [pd.Timestamp(x) for x in backend_obj]
  File "/home/anaconda3/envs/Qlib/lib/python3.8/site-packages/pyqlib-0.8.6.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 672, in <listcomp>
    return [pd.Timestamp(x) for x in backend_obj]
  File "pandas/_libs/tslibs/timestamps.pyx", line 1399, in pandas._libs.tslibs.timestamps.Timestamp.__new__
  File "pandas/_libs/tslibs/conversion.pyx", line 408, in pandas._libs.tslibs.conversion.convert_to_tsobject
  File "pandas/_libs/tslibs/conversion.pyx", line 652, in pandas._libs.tslibs.conversion._convert_str_to_tsobject
ValueError: could not convert string to Timestamp

I'm not sure whether this is a qlib problem. I used the downloaded cn_data_1min and added freq: "1min" to the YAML, and I keep getting this error as well.

NotDefinedDN, Jul 22 '22 07:07

import qlib
from qlib.data import D
qlib.init(provider_uri="~/.qlib/qlib_data/cn_data_1min", region="cn")
inst = D.list_instruments(D.instruments("all"), freq="1min", as_list=True)
df = D.features(inst[:100], ["$close"], freq="1min")

Same problem here.

My workaround:
Locate data.py:
.\Lib\site-packages\qlib\data\data.py
Modify the return value of load_calendar:
    def load_calendar(self, freq, future):
        import re  # added: needed for the cleanup in the return statement
        try:
            backend_obj = self.backend_obj(freq=freq, future=future).data
        except ValueError:
            if future:
                get_module_logger("data").warning(
                    f"load calendar error: freq={freq}, future={future}; return current calendar!"
                )
                get_module_logger("data").warning(
                    "You can get future calendar by referring to the following document: "
                )
                backend_obj = self.backend_obj(freq=freq, future=False).data
            else:
                raise

        # Changed: strip the extra characters ([ ] , ') so each entry can be converted to a timestamp
        return [pd.Timestamp(re.sub(r"[\[\],']", "", x)) for x in backend_obj]
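
To illustrate what the modified return line does, here is a minimal standalone sketch of the same cleanup. It assumes the failing calendar entries look like stringified lists (for example "['2020-01-02 09:30:00']"), which is what the stripped characters suggest but is not confirmed in this thread. clean_calendar_entry is a hypothetical name, and datetime.fromisoformat stands in for pd.Timestamp to keep the sketch dependency-free.

```python
import re
from datetime import datetime

# Same character class as the workaround: square brackets, commas, single quotes.
_EXTRA_CHARS = re.compile(r"[\[\],']")


def clean_calendar_entry(raw: str) -> str:
    """Remove list-style punctuation from a calendar entry (hypothetical helper)."""
    return _EXTRA_CHARS.sub("", raw)


raw_entry = "['2020-01-02 09:30:00']"  # assumed shape of a malformed entry
cleaned = clean_calendar_entry(raw_entry)
parsed = datetime.fromisoformat(cleaned)  # stand-in for pd.Timestamp(cleaned)
print(cleaned, parsed)
```

This only masks the malformed strings at conversion time; it does not address why the storage backend produced them.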

wangze586, Jul 27 '22 08:07

It is fixed now. Please refer to the main branch: https://github.com/microsoft/qlib/blob/main/qlib/data/storage/file_storage.py#L105

you-n-g, Aug 05 '22 09:08