
YahooFinanceProcessor failure to download the market data

Open randheerDas opened this issue 1 year ago • 3 comments

I am running the latest version of FinRL_PortfolioAllocation_NeurIPS_2020 from the FinRL-Tutorials repository: https://github.com/AI4Finance-Foundation/FinRL-Tutorial

I am facing an issue while executing the code to download market data.

Following is the error:

1 Failed download: ['AXP']: Exception('%ticker%: No price data found, symbol may be delisted (1d 2008-01-01 00:00:00 -> 2008-01-02 00:00:00)')

This keeps repeating and never ends.

I suspect there is a bug either in FinRL or in yfinance. My old tutorial runs fine.

Following is my current configuration:

Python 3.10, finrl 0.3.6, and yfinance 0.2.38.

Could someone from the FinRL team help me with this? I am working towards giving a demonstration to our team.

Screenshot

image

randheerDas avatar May 26 '24 13:05 randheerDas

The days you listed are weekends, so there wouldn't be any data for them.

dytsou avatar Jun 08 '24 16:06 dytsou

I have the same issue, regardless of the dates. The Yahoo Finance integration seems to be broken. I have no problem using Yahoo Finance outside of FinRL. I also tried single tickers, and that does not work either.

Axacon-eks avatar Jun 11 '24 12:06 Axacon-eks

I'm having this issue too with this notebook:

With AXP it goes through all the years I specify: Image

But when it gets to the next one, it starts to loop with another error:

Image

I'll see if I can apply a temporary fix to my installation so that I can download the data properly, and report back on whether it works.

EDIT: Okay, false alarm. I saw in another thread that the error only appears when the date it's trying to download falls on a weekend or holiday, so... there's that.

EDIT 2: It does download data, but at some point yfinance raises a JSONDecodeError and stays that way for a while, even though it keeps downloading. The real issue for me is that it couldn't download all the data. I got 26 tickers out of 30, and the shape of the data after feature engineering was much smaller than expected:

Image

While this shape is from the tutorial:

Image

I looked through the yfinance GitHub, but the fix for this dates back to 2022 and no longer works, at least for me.
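To see exactly which tickers went missing (the 26-out-of-30 situation above), a small check like the following may help. It is only a sketch: the ticker lists are made-up example values, and it assumes the downloaded frame exposes the ticker of each row in a column such as `tic`, which you would pass in as `downloaded`.

```python
def missing_tickers(requested, downloaded):
    """Return the requested tickers that never appear in the downloaded data."""
    present = set(downloaded)
    # Keep the order of the original request so the result is easy to read.
    return [ticker for ticker in requested if ticker not in present]


# Hypothetical example values. In a notebook, `downloaded` would typically be
# something like df["tic"].unique(), assuming the frame has a `tic` column.
requested = ["AXP", "AAPL", "MSFT", "JPM"]
downloaded = ["AAPL", "AAPL", "MSFT"]

print(missing_tickers(requested, downloaded))  # prints ['AXP', 'JPM']
```

Retrying only the tickers returned here is usually cheaper than rerunning the whole download.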

EDIT 3: I had been on an older version of yfinance (0.2.45 or so) and am now on the latest, 0.2.52, but I'm still getting less data, plus API rate limits. The shape is larger, nearly 32,000 or so, but still far from the tutorial's.

EDIT 4: Okay, since I'm pretty stupid but have the gift of perseverance, I found a kind of workaround: modifying the yfinance library with code I saw in an issue on the yfinance GitHub, which adds a rate limiter of one request per second.

[related issue](https://github.com/ranaroussi/yfinance/issues/2125)

The easiest way to do it is to have VS Code installed and add the directory where your yfinance library is installed to the workspace. In my case I'm using Anaconda for virtual envs, so the packages are in python3.10/site-packages. I added that folder to my workspace and can modify it as I please from there. There I modified the data.py file by adding this at the beginning:

    from requests_ratelimiter import LimiterSession, Limiter, RequestRate
    from pyrate_limiter import Duration, RequestRate

Obviously you need requests_ratelimiter and pyrate_limiter installed, with pip or whatever you prefer.

Then modify the `__init__` method inside the YfData class, just under user_agent_headers. Mine now looks like this:

    def __init__(self, session=None):
        self._crumb = None
        self._cookie = None
        # Default to using 'basic' strategy
        self._cookie_strategy = 'basic'
        # If it fails, then fallback method is 'csrf'
        # self._cookie_strategy = 'csrf'

        self._cookie_lock = threading.Lock()

        # Added on 27-01-2025, from here
        # Define the rate limit: one request per second
        history_rate = RequestRate(1, Duration.SECOND)
        limiter = Limiter(history_rate)

        # Use LimiterSession instead of requests.Session()
        self._session = session or LimiterSession(limiter=limiter)
        self._session.headers.update(self.user_agent_headers)

        self._set_session(self._session)

After that, I usually restart the kernel in my Jupyter notebook and rerun the code. If you work with envs, you may need to deactivate and activate again (sometimes deactivating twice with conda, just to make sure), because that can give you headaches too.

Sorry for the weird format of the code, can't get it to be all in code format somehow.

I really like the idea of learning FinRL, so when I follow the tutorials I'd really like to get the same results they show. If you were having this issue like me, I hope this helps.

Admittedly, with the rate limit at one request per second, downloading many years of data takes longer and is a bit overkill, but if the tutorial calls for it I don't mind, since it lets me learn.
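For anyone who would rather not edit files in site-packages, the same "at most one call per second" idea can be sketched with the standard library alone. This is not the yfinance patch above, just a minimal illustration of the technique; `Throttle`, `throttled`, and `fetch_one_ticker` are hypothetical names, and `fetch_one_ticker` stands in for whatever function performs one download.

```python
import threading
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._lock = threading.Lock()
        self._last_call = None

    def wait(self):
        # The lock makes the spacing hold even when several threads download.
        with self._lock:
            if self._last_call is not None:
                remaining = self.min_interval - (time.monotonic() - self._last_call)
                if remaining > 0:
                    time.sleep(remaining)
            self._last_call = time.monotonic()


def throttled(func, throttle):
    """Wrap func so that every call first waits for the throttle."""
    def wrapper(*args, **kwargs):
        throttle.wait()
        return func(*args, **kwargs)
    return wrapper


def fetch_one_ticker(ticker):
    # Hypothetical stand-in for a real download call.
    return f"data for {ticker}"


# One call per second, matching the rate used in the patch above.
slow_fetch = throttled(fetch_one_ticker, Throttle(min_interval=1.0))
```

Calling `slow_fetch("AXP")` and then `slow_fetch("AAPL")` spaces the two calls at least one second apart. The trade-off is the same as with the patch: downloads get slower in exchange for fewer rate-limit errors.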

Keep in mind you'll still get the error about a ticker not being found when it tries to download data for a holiday or weekend. If you're not sure, check the dates in those errors and you'll see that they correspond to days when the market is closed.

So yeah, it seems Yahoo Finance is rate limiting.

The thing is, Jupyter becomes a bit unresponsive while downloading the data and is eating about 10 GB of my RAM, but that's on me.

EDIT 5: For me it slows down after running for a while and becomes laggy, mostly because of the prints. I added this line at the very beginning of the cell where the data is downloaded:
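Checking those dates by hand gets tedious, so here is a rough sketch of automating it. It pulls the start date out of an error message shaped like the one in the original post and tests whether it falls on a weekend or in a small holiday set. The helper names are hypothetical, and `EXAMPLE_HOLIDAYS` is a deliberately incomplete assumption for illustration; a real check would need a full exchange calendar.

```python
import re
from datetime import date, datetime

# Illustrative only: a real check needs the exchange's full holiday calendar.
EXAMPLE_HOLIDAYS = {date(2008, 1, 1)}  # New Year's Day 2008


def start_date_from_error(message):
    """Extract the start date from a '(1d YYYY-MM-DD ... -> ...)' error message."""
    match = re.search(r"\(1d (\d{4}-\d{2}-\d{2})", message)
    if match is None:
        raise ValueError("no date range found in message")
    return datetime.strptime(match.group(1), "%Y-%m-%d").date()


def market_closed(day, holidays=EXAMPLE_HOLIDAYS):
    """True if day is a Saturday, a Sunday, or one of the listed holidays."""
    return day.weekday() >= 5 or day in holidays


message = (
    "1 Failed download: ['AXP']: Exception('%ticker%: No price data found, "
    "symbol may be delisted (1d 2008-01-01 00:00:00 -> 2008-01-02 00:00:00)')"
)
day = start_date_from_error(message)
print(day, market_closed(day))
```

If `market_closed` returns `False` for a failing date, the error is probably not just a closed-market day and is worth a closer look.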

%%capture

and that suppresses the printed output.

My Jupyter Notebook crashed a couple of times, so I'm now running it in JupyterLab. It's an MSI laptop from 2017 with an i5-7300HQ, 32 GB of RAM, and an SSD. Can't crunch much with 4 cores/4 threads, folks.
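Outside a notebook cell, or when you still want to look at the output afterwards, a similar effect can be had with `contextlib.redirect_stdout` from the standard library. This is a sketch with a hypothetical `noisy_download` function standing in for the real download call. It only captures what goes through `print` to stdout; messages written to stderr or through the logging module are not affected.

```python
import contextlib
import io


def noisy_download():
    # Hypothetical stand-in for a download call that prints a lot.
    for ticker in ("AXP", "AAPL"):
        print(f"downloading {ticker}")
    return 2


buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    # Everything printed inside this block goes into the buffer instead.
    count = noisy_download()

# The captured text is still available if something needs checking later.
captured = buffer.getvalue()
```

Keeping the text in `captured` means the prints no longer slow the notebook down, but the error messages can still be searched afterwards.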

v4lt4ru5 avatar Jan 23 '25 23:01 v4lt4ru5