Add sector and industry data
How do I get sector and industry information for a stock? It is normally displayed on a stock's profile page on Yahoo Finance. For example, in Python (yfinance) it would be:
import yfinance
tickerdata = yfinance.Ticker('IFT.NZ')
s = tickerdata.info['sector']
i = tickerdata.info['industry']
I was hoping that some query would result in a HashMap with this kind of data, but I haven't found it yet.
This would be a good addition if it doesn't exist.
I am not aware of any API call that provides this kind of data. Do you know whether yfinance uses an API call or a web page scraper to get it?
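I turned on debug logging with yfinance:
import logging
import yfinance as yf
# Send DEBUG-level log records (including yfinance's HTTP calls) to the console
logging.basicConfig(level=logging.DEBUG)
tickerdata = yf.Ticker('IFT.NZ')
# Accessing .info triggers the HTTP requests
tickerdata.info['sector']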
and got the following:
DEBUG:yfinance:get_raw_json(): https://query2.finance.yahoo.com/v10/finance/quoteSummary/IFT.NZ
DEBUG:yfinance:Entering get()
DEBUG:yfinance:url=https://query2.finance.yahoo.com/v10/finance/quoteSummary/IFT.NZ
DEBUG:yfinance:params={'modules': 'financialData,quoteType,defaultKeyStatistics,assetProfile,summaryDetail', 'corsDomain': 'finance.yahoo.com', 'formatted': 'false', 'symbol': 'IFT.NZ'}
DEBUG:yfinance: Entering _get_cookie_and_crumb()
DEBUG:yfinance:cookie_mode = 'basic'
DEBUG:yfinance: Entering _get_cookie_and_crumb_basic()
DEBUG:peewee:('CREATE TABLE IF NOT EXISTS "_cookieschema" ("strategy" VARCHAR(255) NOT NULL PRIMARY KEY, "fetch_date" DATETIME NOT NULL, "cookie_bytes" BLOB NOT NULL) WITHOUT ROWID', [])
DEBUG:peewee:('SELECT "t1"."strategy", "t1"."fetch_date", "t1"."cookie_bytes" FROM "_cookieschema" AS "t1" WHERE ("t1"."strategy" = ?) LIMIT ? OFFSET ?', ['basic', 1, 0])
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): fc.yahoo.com:443
DEBUG:urllib3.connectionpool:https://fc.yahoo.com:443 "GET / HTTP/1.1" 404 4744
DEBUG:peewee:('DELETE FROM "_cookieschema" WHERE ("_cookieschema"."strategy" = ?)', ['basic'])
DEBUG:peewee:('BEGIN', None)
DEBUG:peewee:('INSERT INTO "_cookieschema" ("strategy", "fetch_date", "cookie_bytes") VALUES (?, ?, ?)', ['basic', '2024-07-09T12:46:50.536081', <memory at 0x7f4a29800c40>])
DEBUG:yfinance:fetched basic cookie = <Cookie A3=d=AQABBHqIjGYCEGsb3P5Y03FULtSqbj3k8qoFEgEBAQHZjWaWZoq6NuUA_eMAAA&S=AQAAAoeKahXerXaTZh6md65dRyI for .yahoo.com/>
DEBUG:yfinance:reusing cookie
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): query1.finance.yahoo.com:443
DEBUG:urllib3.connectionpool:https://query1.finance.yahoo.com:443 "GET /v1/test/getcrumb HTTP/1.1" 200 11
DEBUG:yfinance:crumb = 'h/Kzjue1Bod'
DEBUG:yfinance: Exiting _get_cookie_and_crumb_basic()
DEBUG:yfinance: Exiting _get_cookie_and_crumb()
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): query2.finance.yahoo.com:443
DEBUG:urllib3.connectionpool:https://query2.finance.yahoo.com:443 "GET /v10/finance/quoteSummary/IFT.NZ?modules=financialData%2CquoteType%2CdefaultKeyStatistics%2CassetProfile%2CsummaryDetail&corsDomain=finance.yahoo.com&formatted=false&symbol=IFT.NZ&crumb=h%2FKzjue1Bod HTTP/1.1" 200 2362
DEBUG:yfinance:response code=200
DEBUG:yfinance:Exiting get()
DEBUG:yfinance:Entering get()
DEBUG:yfinance:url=https://query1.finance.yahoo.com/ws/fundamentals-timeseries/v1/finance/timeseries/IFT.NZ?symbol=IFT.NZ&type=trailingPegRatio&period1=1704758400&period2=1720569600
DEBUG:yfinance:params=None
DEBUG:yfinance: Entering _get_cookie_and_crumb()
DEBUG:yfinance:cookie_mode = 'basic'
DEBUG:yfinance: Entering _get_cookie_and_crumb_basic()
DEBUG:yfinance:reusing cookie
DEBUG:yfinance:reusing crumb
DEBUG:yfinance: Exiting _get_cookie_and_crumb_basic()
DEBUG:yfinance: Exiting _get_cookie_and_crumb()
DEBUG:urllib3.connectionpool:https://query1.finance.yahoo.com:443 "GET /ws/fundamentals-timeseries/v1/finance/timeseries/IFT.NZ?symbol=IFT.NZ&type=trailingPegRatio&period1=1704758400&period2=1720569600&crumb=h%2FKzjue1Bod HTTP/1.1" 200 99
DEBUG:yfinance:response code=200
DEBUG:yfinance:Exiting get()
I don't know, but this may give a clue. I tried to follow along by sending requests to the same URLs, but I got lost within the first couple of requests. I have not looked at the Python code yet.
If I nut it out I'll let you know.
I was away for two weeks, but will have a look at it in the next few days.
It's not easy to tell from this log which steps are actually required, and if I try the URLs from the log directly I just get error messages. I wonder what all that cookie manipulation is for. Maybe it would help to capture the traffic when accessing this metadata.
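One detail that is easy to miss in the log: the final quoteSummary request is not the bare URL from the "url=" line, but that URL plus the logged params and a crumb value. A minimal sketch of how that query string is assembled, using only the Python standard library. The function name build_quote_summary_url is hypothetical, and the crumb used in the example is simply the one printed in the log above, which will almost certainly not be valid for another session:

```python
from urllib.parse import quote, urlencode

# Host and path as they appear in the debug log.
BASE_URL = "https://query2.finance.yahoo.com/v10/finance/quoteSummary/"


def build_quote_summary_url(symbol, modules, crumb):
    """Rebuild the logged request URL (hypothetical helper, not part of yfinance)."""
    params = {
        "modules": ",".join(modules),
        "corsDomain": "finance.yahoo.com",
        "formatted": "false",
        "symbol": symbol,
        "crumb": crumb,  # fetched per session; the log value is only an example
    }
    # urlencode percent-encodes ',' as %2C and '/' as %2F, matching the log.
    return BASE_URL + quote(symbol, safe="") + "?" + urlencode(params)


url = build_quote_summary_url(
    "IFT.NZ",
    ["financialData", "quoteType", "defaultKeyStatistics", "assetProfile", "summaryDetail"],
    "h/Kzjue1Bod",
)
print(url)
```

Requesting this URL without the matching cookie from the same session would presumably still fail, which may be why pasting the logged URLs into a browser or curl only produces errors.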
I was actually able to figure this out and am working on the implementation. The cookie and crumb are basically a session management tool. I copied the Python repo and added some prints, and basically all of those endpoints return nothing unless you include both the crumb and the cookie in the request.
I was able to get sector/industry info on AAPL, along with C-suite information, using Rust and reqwest alone, so adding it to this crate should be easy now that I know how it works.
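For anyone who wants to reproduce the flow outside yfinance, here is a rough sketch of the three requests seen in the log, written with the Python standard library only. It is not the code from the PR and not yfinance's implementation. The function names, the User-Agent value, and the assumed JSON layout (quoteSummary -> result[0] -> assetProfile) are my assumptions, and Yahoo may change or block any of this at any time:

```python
import json
import urllib.error
import urllib.request
from http.cookiejar import CookieJar
from urllib.parse import quote, urlencode

# Assumption: a browser-like User-Agent is needed; the exact value is a guess.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def make_opener():
    """Opener that keeps cookies between requests, like a browser session."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(CookieJar()))


def http_get(opener, url):
    request = urllib.request.Request(url, headers=HEADERS)
    with opener.open(request, timeout=10) as response:
        return response.read().decode("utf-8")


def fetch_crumb(opener):
    """Steps 1 and 2 from the log: obtain a cookie, then ask for a crumb."""
    try:
        http_get(opener, "https://fc.yahoo.com")
    except urllib.error.HTTPError:
        pass  # the log shows a 404 here; the cookie is what matters
    return http_get(opener, "https://query1.finance.yahoo.com/v1/test/getcrumb")


def extract_sector_industry(payload):
    """Pull (sector, industry) out of a quoteSummary response (assumed layout)."""
    results = (payload.get("quoteSummary") or {}).get("result") or []
    profile = results[0].get("assetProfile", {}) if results else {}
    return profile.get("sector"), profile.get("industry")


def fetch_sector_industry(symbol):
    """Step 3 from the log: query quoteSummary with the cookie and the crumb."""
    opener = make_opener()
    crumb = fetch_crumb(opener)
    query = urlencode({"modules": "assetProfile", "crumb": crumb})
    url = (
        "https://query2.finance.yahoo.com/v10/finance/quoteSummary/"
        + quote(symbol, safe="")
        + "?"
        + query
    )
    return extract_sector_industry(json.loads(http_get(opener, url)))
```

Calling fetch_sector_industry("IFT.NZ") should then return a (sector, industry) tuple if the endpoints still behave as they did in the log. The same three steps map fairly directly onto a reqwest client with a cookie store enabled.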
Just made a (draft) PR for this
This was just merged here: https://github.com/xemwebe/yahoo_finance_api/pull/58
so this issue can be closed.