get_dataframe throws KeyError 'date' when one or more tickers has no data to return in the timeframe

Open satyan-g opened this issue 5 years ago • 4 comments

  • Tiingo Python version: 0.13.0
  • Python version: 3.8.2
  • Operating System: OSX

Description

When I try to pull data for a valid ticker but an invalid timeframe (i.e., the ticker did not have pricing between the specified 'startDate' and 'endDate'), it throws a KeyError 'date'.

The graceful behavior for this scenario should be an empty column for RMG. Otherwise, it defeats the purpose of making a single call for multiple tickers if we have to check the valid date range for each ticker prior to making the call.

What I Did

ticker_history_df = client.get_dataframe(['GOOGL', 'RMG'],
                                         startDate='2018-05-15',
                                         endDate='2018-05-31',
                                         metric_name='adjClose',
                                         frequency='daily')

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/Users/satya/anaconda/lib/python3.5/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3077             try:
-> 3078                 return self._engine.get_loc(key)
   3079             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'date'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-26-8fc023f2d2cc> in <module>()
      3                                          endDate='2018-05-31',
      4                                          metric_name='adjClose',
----> 5                                          frequency='daily')

/Users/satya/Projects/github/tiingo-python/tiingo/api.py in get_dataframe(self, tickers, startDate, endDate, metric_name, frequency, fmt)
    298                 for stock in tickers:
    299                     ticker_series = self._request_pandas(
--> 300                         ticker=stock, params=params, metric_name=metric_name)
    301                     ticker_series = ticker_series.rename(stock)
    302                     prices = pd.concat([prices, ticker_series], axis=1, sort=True)

/Users/satya/Projects/github/tiingo-python/tiingo/api.py in _request_pandas(self, ticker, metric_name, params)
    192             df = pd.DataFrame(response.json())
    193 
--> 194         df.set_index('date', inplace=True)
    195 
    196         if metric_name is not None:

/Users/satya/anaconda/lib/python3.5/site-packages/pandas/core/frame.py in set_index(self, keys, drop, append, inplace, verify_integrity)
   3907                 names.append(None)
   3908             else:
-> 3909                 level = frame[col]._values
   3910                 names.append(col)
   3911                 if drop:

/Users/satya/anaconda/lib/python3.5/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2686             return self._getitem_multilevel(key)
   2687         else:
-> 2688             return self._getitem_column(key)
   2689 
   2690     def _getitem_column(self, key):

/Users/satya/anaconda/lib/python3.5/site-packages/pandas/core/frame.py in _getitem_column(self, key)
   2693         # get column
   2694         if self.columns.is_unique:
-> 2695             return self._get_item_cache(key)
   2696 
   2697         # duplicate columns & possible reduce dimensionality

/Users/satya/anaconda/lib/python3.5/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
   2487         res = cache.get(item)
   2488         if res is None:
-> 2489             values = self._data.get(item)
   2490             res = self._box_item_values(item, values)
   2491             cache[item] = res

/Users/satya/anaconda/lib/python3.5/site-packages/pandas/core/internals.py in get(self, item, fastpath)
   4113 
   4114             if not isna(item):
-> 4115                 loc = self.items.get_loc(item)
   4116             else:
   4117                 indexer = np.arange(len(self.items))[isna(self.items)]

/Users/satya/anaconda/lib/python3.5/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3078                 return self._engine.get_loc(key)
   3079             except KeyError:
-> 3080                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   3081 
   3082         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'date'

satyan-g, Dec 14 '20 05:12
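The failure at the bottom of the traceback can be reproduced without an API key. The sketch below assumes the endpoint returned an empty JSON list for RMG; the report does not show the response body, so that part is a guess.

```python
import pandas as pd

# Assumption: this is what response.json() yielded for 'RMG' in the report.
empty_payload = []

# A DataFrame built from an empty list has no rows and no columns,
# so there is no 'date' column to promote to the index.
df = pd.DataFrame(empty_payload)
print(df.columns.tolist())

error = None
try:
    df.set_index('date', inplace=True)
except KeyError as exc:
    # The exact message differs between pandas versions, but the type is KeyError.
    error = exc
    print("KeyError:", exc)
```

If the assumption holds, any ticker with no rows in the requested window would hit the same code path, regardless of the other tickers in the list.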

@satyan-g thanks for the detailed bug report!

The graceful behavior for this scenario should be an empty column for RMG. Otherwise, it defeats the purpose of making a single call for multiple tickers if we have to check the valid date range for each ticker prior to making the call.

This is a good suggestion, I'll do some thinking about how to accommodate this in a future release.

hydrosquall, Dec 19 '20 20:12

Perhaps a minimalistic "fix" could be to insert a check before calling set_index on line 194 of api.py:

if df.empty:
    return pd.Series([], dtype=float)

Unfortunately can't test since I hit the max symbol limit for this month...

usnigg, Jan 03 '21 16:01

Hi all, I recently faced a quite similar issue where line 194 of api.py threw the error: {KeyError}"None of ['date'] are in the columns". It seems that in this case the 'date' column is in fact already set as the datetime index. I fixed it with the input from @usnigg and the following:

if df.empty:
    return pd.Series([], dtype=float)
if 'date' in df.columns:
    df.set_index('date', inplace=True)

maxko37, Mar 03 '21 08:03
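The two guards suggested in the comments above can be tried offline. This is a rough sketch, not the library's code: `records_to_series` is a hypothetical stand-in for the tail of `_request_pandas`, and the price values are made up for illustration.

```python
import pandas as pd

def records_to_series(records, metric_name):
    """Hypothetical helper mirroring the guarded logic proposed in this thread."""
    df = pd.DataFrame(records)
    # Guard 1: no rows for this ticker in the window, return an empty series.
    if df.empty:
        return pd.Series([], dtype=float)
    # Guard 2: only promote 'date' to the index if it is still a column.
    if 'date' in df.columns:
        df.set_index('date', inplace=True)
    return df[metric_name]

# Made-up payloads: one ticker with data, one with none.
payloads = {
    'GOOGL': [
        {'date': '2018-05-15', 'adjClose': 1.0},
        {'date': '2018-05-16', 'adjClose': 2.0},
    ],
    'RMG': [],
}

series = [
    records_to_series(records, 'adjClose').rename(ticker)
    for ticker, records in payloads.items()
]
# Outer-joining on the index leaves the empty ticker as an all-NaN column.
prices = pd.concat(series, axis=1, sort=True)
print(prices)
```

With these guards the ticker without data shows up as a column of NaN values, which matches the "empty column for RMG" behavior requested in the report.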

This error is also thrown if the list of tickers includes recently de-listed stocks due to acquisitions, for example. Try WRI or WORK.
I think if the JSON is an empty list, then the stock should just be skipped.

rossgbaker2, Aug 09 '21 13:08
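The alternative of skipping tickers whose JSON is an empty list might look roughly like this. `drop_empty_payloads` is a hypothetical helper, and the payload dict stands in for already-decoded responses with made-up values; nothing here calls the Tiingo API.

```python
def drop_empty_payloads(payloads):
    """Split decoded JSON payloads into tickers with data and tickers to skip."""
    kept = {}
    skipped = []
    for ticker, rows in payloads.items():
        if rows:
            # Non-empty list of price records: keep for DataFrame construction.
            kept[ticker] = rows
        else:
            # Empty list: no data in the requested window, so skip the ticker.
            skipped.append(ticker)
    return kept, skipped

# Made-up decoded responses; WRI and WORK are the tickers named in the comment.
payloads = {
    'GOOGL': [{'date': '2018-05-15', 'adjClose': 1.0}],
    'WRI': [],
    'WORK': [],
}

kept, skipped = drop_empty_payloads(payloads)
print(sorted(kept))   # ['GOOGL']
print(skipped)        # ['WRI', 'WORK']
```

Returning the skipped tickers alongside the kept ones lets the caller decide whether to warn, retry with a wider date range, or ignore them. The trade-off against the empty-column approach is that skipped tickers disappear from the resulting frame entirely.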