atr (and others) indicators not working with resample
Hi, I'm using minute level data:
Dataframe sample
Open High Low Close
Date
2020-01-01 22:02:00+00:00 1.32463 1.32464 1.32462 1.32463
2020-01-01 22:03:00+00:00 1.32463 1.32466 1.32462 1.32466
2020-01-01 22:04:00+00:00 1.32466 1.32466 1.32463 1.32463
2020-01-01 22:05:00+00:00 1.32465 1.32466 1.32462 1.32462
2020-01-01 22:06:00+00:00 1.32462 1.32470 1.32462 1.32463
... ... ... ... ...
2020-01-29 23:55:00+00:00 1.30208 1.30208 1.30208 1.30208
2020-01-29 23:56:00+00:00 1.30207 1.30208 1.30207 1.30208
2020-01-29 23:57:00+00:00 1.30208 1.30208 1.30208 1.30208
2020-01-29 23:58:00+00:00 1.30208 1.30208 1.30203 1.30203
2020-01-29 23:59:00+00:00 1.30202 1.30207 1.30202 1.30207
I need the average daily range, so I thought I could just resample the atr to daily frequency. So I followed the documentation:
def init(self):
    # Average daily range
    self.adr = resample_apply('D', ta.atr, self.data.High, self.data.Low, self.data.Close)
Error output:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
File ~/my_strat/.venv/lib/python3.12/site-packages/backtesting/backtesting.py:150, in Strategy.I(self, func, name, plot, overlay, color, scatter, *args, **kwargs)
149 try:
--> 150 value = func(*args, **kwargs)
151 except Exception as e:
File ~/my_strat/.venv/lib/python3.12/site-packages/backtesting/lib.py:322, in resample_apply.<locals>.wrap_func(resampled, *args, **kwargs)
321 # Resample back to data index
--> 322 if not isinstance(result.index, pd.DatetimeIndex):
323 result.index = resampled.index
AttributeError: 'numpy.ndarray' object has no attribute 'index'
The above exception was the direct cause of the following exception:
RuntimeError Traceback (most recent call last)
Cell In[205], line 73
70 bt = Backtest(data, smr_01, margin = 1/100)
71 # import time
72 # start_time = time.time()
---> 73 stats = bt.run()
74 # end_time = time.time()
75
76 # time_taken = end_time - start_time
(...) 82 # print(f"Number of candlesticks: {num_candlesticks}")
83 # print(f"Candlesticks per second: {candlesticks_per_second}")
84 bt.plot()
File ~/my_strat/.venv/lib/python3.12/site-packages/backtesting/backtesting.py:1296, in Backtest.run(self, **kwargs)
1293 broker: _Broker = self._broker(data=data)
1294 strategy: Strategy = self._strategy(broker, data, kwargs)
-> 1296 strategy.init()
1297 data._update() # Strategy.init might have changed/added to data.df
1299 # Indicators used in Strategy.next()
Cell In[205], line 39, in smr_01.init(self)
37 self.range_25, self.range_50, self.range_75 = self.I(range_levels, self.daily_high, self.daily_low, overlay=True)
38 # Average daily range
---> 39 self.adr = resample_apply('D', ta.atr, self.data.High, self.data.Low, self.data.Close)
40 # self.adr = resample_apply('D', ta.sma, self.data.Close, 14)#, self.data.Low, self.data.Close)
42 self.adr2 = self.I(average_daily_range, self.data.df)
File ~/my_strat/.venv/lib/python3.12/site-packages/backtesting/lib.py:330, in resample_apply(rule, func, series, agg, *args, **kwargs)
326 return result
328 wrap_func.__name__ = func.__name__
--> 330 array = strategy_I(wrap_func, resampled, *args, **kwargs)
331 return array
File ~/my_strat/.venv/lib/python3.12/site-packages/backtesting/backtesting.py:152, in Strategy.I(self, func, name, plot, overlay, color, scatter, *args, **kwargs)
150 value = func(*args, **kwargs)
151 except Exception as e:
--> 152 raise RuntimeError(f'Indicator "{name}" error. See traceback above.') from e
154 if isinstance(value, pd.DataFrame):
155 value = value.values.T
RuntimeError: Indicator "atr(H[D],L,C)" error. See traceback above.
However I don't get any error with:
self.sma = resample_apply('D', ta.sma, self.data.Close, 14)
So, again following the docs, I tried doing it myself:
def average_daily_range(df, period):
    df_resampled = df.resample('D', label='right').agg({'High': 'max', 'Low': 'min', 'Close': 'last'})
    print(df_resampled)
    df_resampled.dropna()  # note: the result is discarded here, so the NaN rows remain
    atr = ta.atr(df_resampled['High'], df_resampled['Low'], df_resampled['Close'], period)
    atr = atr.reindex(df.index).ffill()
    return atr
class smr_01(Strategy):
    def init(self):
        # Average daily range
        # self.adr = resample_apply('D', ta.atr, self.data.High, self.data.Low, self.data.Close)
        self.sma = resample_apply('D', ta.sma, self.data.Close, 14)
        self.adr = self.I(average_daily_range, self.data.df, 14)
Resampled df:
High Low Close
Date
2020-01-02 00:00:00+00:00 1.32608 1.32457 1.32497
2020-01-03 00:00:00+00:00 1.32661 1.31152 1.31467
2020-01-04 00:00:00+00:00 1.31600 1.30531 1.30787
2020-01-05 00:00:00+00:00 NaN NaN NaN
2020-01-06 00:00:00+00:00 1.30855 1.30633 1.30768
2020-01-07 00:00:00+00:00 1.31785 1.30638 1.31711
2020-01-08 00:00:00+00:00 1.32120 1.30948 1.31134
2020-01-09 00:00:00+00:00 1.31694 1.30799 1.31051
2020-01-10 00:00:00+00:00 1.31233 1.30126 1.30691
2020-01-11 00:00:00+00:00 1.30968 1.30422 1.30569
2020-01-12 00:00:00+00:00 NaN NaN NaN
2020-01-13 00:00:00+00:00 1.30441 1.30287 1.30432
2020-01-14 00:00:00+00:00 1.30450 1.29608 1.29859
2020-01-15 00:00:00+00:00 1.30329 1.29542 1.30211
2020-01-16 00:00:00+00:00 1.30582 1.29850 1.30392
2020-01-17 00:00:00+00:00 1.30828 1.30252 1.30760
2020-01-18 00:00:00+00:00 1.31184 1.30050 1.30058
2020-01-19 00:00:00+00:00 NaN NaN NaN
2020-01-20 00:00:00+00:00 1.30071 1.29915 1.30051
2020-01-21 00:00:00+00:00 1.30132 1.29617 1.30035
2020-01-22 00:00:00+00:00 1.30831 1.29952 1.30451
2020-01-23 00:00:00+00:00 1.31525 1.30343 1.31435
2020-01-24 00:00:00+00:00 1.31508 1.30966 1.31186
2020-01-25 00:00:00+00:00 1.31739 1.30565 1.30701
2020-01-26 00:00:00+00:00 NaN NaN NaN
2020-01-27 00:00:00+00:00 1.30799 1.30606 1.30606
2020-01-28 00:00:00+00:00 1.31050 1.30395 1.30588
2020-01-29 00:00:00+00:00 1.30649 1.29752 1.30231
2020-01-30 00:00:00+00:00 1.30273 1.29892 1.30207
Now I have a working ATR resampled to daily... But there is a problem. As you may have noticed, both sma and atr are resampled daily with a period of 14:
As you see, the ATR starts on 20 Jan 2020 at 12:00, while the SMA starts on 17 Jan 2020 at 12:00. So am I doing something wrong, or is it the library that should be updated?
Packages version:
Package Version
----------------------- -----------
asttokens 3.0.0
backtesting 0.6.3
bokeh 3.6.3
comm 0.2.2
contourpy 1.3.1
cycler 0.12.1
debugpy 1.8.13
decorator 5.2.1
executing 2.2.0
fonttools 4.56.0
ipykernel 6.29.5
ipython 9.0.0
ipython-pygments-lexers 1.1.1
jedi 0.19.2
jinja2 3.1.6
jupyter-client 8.6.3
jupyter-core 5.7.2
kiwisolver 1.4.8
markupsafe 3.0.2
matplotlib 3.10.1
matplotlib-inline 0.1.7
mplfinance 0.12.10b0
nest-asyncio 1.6.0
numpy 2.2.3
packaging 24.2
pandas 2.2.3
pandas-ta 0.3.14b0
parso 0.8.4
pexpect 4.9.0
pillow 11.1.0
platformdirs 4.3.6
prompt-toolkit 3.0.50
psutil 7.0.0
ptyprocess 0.7.0
pure-eval 0.2.3
pygments 2.19.1
pyparsing 3.2.1
python-dateutil 2.9.0.post0
pytz 2025.1
pyyaml 6.0.2
pyzmq 26.2.1
setuptools 76.0.0
six 1.17.0
stack-data 0.6.3
tornado 6.4.2
traitlets 5.14.3
tzdata 2025.1
wcwidth 0.2.13
xyzservices 2025.1.0
Just to add:
On the left is atr applied to daily data, on the right is the data on the left converted back to minute data, i.e. the following command that I found in the docs:
atr = atr.reindex(df.index).ffill()
As you see, the daily atr starts on 18 Jan while the one sampled back to 1m data starts on 20 Jan... though the values are the same.
According to docs:
Notice label='right'. If it were set to 'left' (default), the strategy would exhibit look-ahead bias.
But doing:
data2 = data.resample('D',label='right').agg({'Open': 'first',
'High': 'max',
'Low': 'min',
'Close': 'last'})#.dropna()
print(data2)
Gives as output
Open High Low Close
Date
2020-01-02 1.32463 1.32608 1.32457 1.32497
2020-01-03 1.32497 1.32661 1.31152 1.31467
2020-01-04 1.31466 1.31600 1.30531 1.30787 --> Error, no data for this day (saturday)
2020-01-05 NaN NaN NaN NaN
2020-01-06 1.30808 1.30855 1.30633 1.30768
2020-01-07 1.30767 1.31785 1.30638 1.31711
2020-01-08 1.31708 1.32120 1.30948 1.31134
2020-01-09 1.31135 1.31694 1.30799 1.31051
2020-01-10 1.31047 1.31233 1.30126 1.30691
2020-01-11 1.30691 1.30968 1.30422 1.30569 --> Error, no data for this day (saturday)
2020-01-12 NaN NaN NaN NaN
2020-01-13 1.30347 1.30441 1.30287 1.30432
2020-01-14 1.30432 1.30450 1.29608 1.29859
2020-01-15 1.29858 1.30329 1.29542 1.30211
2020-01-16 1.30211 1.30582 1.29850 1.30392
2020-01-17 1.30395 1.30828 1.30252 1.30760
2020-01-18 1.30758 1.31184 1.30050 1.30058 --> Error, no data for this day (saturday)
2020-01-19 NaN NaN NaN NaN
2020-01-20 1.29915 1.30071 1.29915 1.30051
2020-01-21 1.30052 1.30132 1.29617 1.30035
2020-01-22 1.30037 1.30831 1.29952 1.30451
2020-01-23 1.30451 1.31525 1.30343 1.31435
2020-01-24 1.31435 1.31508 1.30966 1.31186
2020-01-25 1.31185 1.31739 1.30565 1.30701 --> Error, no data for this day (saturday)
2020-01-26 NaN NaN NaN NaN
2020-01-27 1.30799 1.30799 1.30606 1.30606
2020-01-28 1.30606 1.31050 1.30395 1.30588
2020-01-29 1.30588 1.30649 1.29752 1.30231
2020-01-30 1.30230 1.30273 1.29892 1.30207
Indeed if I check:
print(data[data.index.date == pd.to_datetime('2020-01-04').date()])
Empty DataFrame
Columns: [Open, High, Low, Close]
Index: []
Instead, if I don't specify the label:
```python
data2 = data.resample('D').agg({'Open': 'first',
                                'High': 'max',
                                'Low': 'min',
                                'Close': 'last'}).dropna()
```
everything works well, and the sma and atr indicators now start on the same day, which is the intended result. Hence the final resampled atr function would be:
def average_daily_range(df, period):
    # df_resampled = df.resample('D', label='right').agg({'High': 'max', 'Low': 'min', 'Close': 'last'})
    df_resampled = df.resample('D').agg({'High': 'max', 'Low': 'min', 'Close': 'last'})
    df_resampled.dropna(inplace=True)
    atr = ta.atr(df_resampled['High'], df_resampled['Low'], df_resampled['Close'], period)
    # Reindex against the frame passed in, not a global `data`
    atr = atr.reindex(df.index).ffill()
    return atr
So there are two problems that emerged from this issue:
- Why the ta.atr function doesn't work with resample_apply?
- Why according to documentation I must use label=right even though doing so produces a bad result not properly aligned with original data?
- Why the ta.atr function doesn't work with resample_apply?
AttributeError: 'numpy.ndarray' object has no attribute 'index'
Seems to break here:
https://github.com/kernc/backtesting.py/blob/b1a869c67feb531f97bef8769aee09d26a5e0288/backtesting/lib.py#L319-L328
Following the logic, I think the inner branch is missing a trailing else: raise ... clause.
What (shape) does your ta.atr(self.data.High, self.data.Low, self.data.Close) actually return?
- Why according to documentation I must use label=right even though doing so produces a bad result not properly aligned with original data?
Use of label='right' ensures the labeled bin consists only of data of the preceding period, e.g. when there is no data on a Saturday, the value for Sunday is empty/nan. This prevents look-ahead bias where value at time t inadvertently incorporates future values.
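To make the labelling concrete, here is a minimal sketch on synthetic data (the timestamps and values are made up for illustration, not taken from this issue). It assumes a feed that trades on Friday, is silent on Saturday, and reopens late on Sunday. With label='right', Friday's bin is stamped Saturday 00:00, and the empty Saturday bin is what appears as NaN under Sunday:

```python
import numpy as np
import pandas as pd

# Synthetic minute closes: Friday 00:00-22:59 UTC, nothing on Saturday,
# then Sunday 22:00-23:59 UTC (hypothetical session times).
fri = pd.date_range("2020-01-03 00:00", "2020-01-03 22:59", freq="min", tz="UTC")
sun = pd.date_range("2020-01-05 22:00", "2020-01-05 23:59", freq="min", tz="UTC")
idx = fri.append(sun)
close = pd.Series(np.arange(len(idx), dtype=float), index=idx)

# Same bins in both cases; only the timestamp attached to each bin differs.
left = close.resample("D").last()                  # stamped with the bin start
right = close.resample("D", label="right").last()  # stamped with the bin end

print(pd.DataFrame({"left": left, "right": right}))
```

The values per bin are identical; a right label only says "this is what was known by the time this timestamp was reached".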
ta.atr comes from pandas-ta library. ATR is just a single column, like ta.sma.
Use of label='right' ensures the labeled bin consists only of data of the preceding period, e.g. when there is no data on a Saturday, the value for Sunday is empty/nan. This prevents look-ahead bias where value at time t inadvertently incorporates future values.
Yes, but in my data there should be data on Sunday. My minute data (forex) runs from Sunday 22:00 to Friday 23:00. So the fact that the resampled version using label='right' leaves Sunday empty and Saturday populated looks wrong and inconsistent with the original data. Am I missing something here?
Thanks for your kind response and for this amazing library.
ta.atr comes from pandas-ta library. ATR is just a single column
What are its shape and ndim? It looks like this condition holds: ta.atr(h, l, c).ndim not in (1, 2), whereas it shouldn't!
Just to add: ...
On the left is atr applied to daily data, on the right is the data on the left converted back to minute data, i.e. the following command that I found in the docs:
atr = atr.reindex(df.index).ffill()
Note, the lib actually does: https://github.com/kernc/backtesting.py/blob/b1a869c67feb531f97bef8769aee09d26a5e0288/backtesting/lib.py#L329-L330 Does this help?
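For illustration, a plain reindex(df.index).ffill() silently drops any daily label that is not itself present in the minute index (for example a Saturday 00:00 label when the feed stops on Friday evening). One way around that is to reindex on the union of both indexes before forward-filling. The sketch below uses synthetic timestamps, and whether it matches the linked library lines exactly is an assumption:

```python
import pandas as pd

# Minute index with a weekend gap (hypothetical session times).
fri = pd.date_range("2020-01-17 00:00", "2020-01-17 22:59", freq="min", tz="UTC")
sun = pd.date_range("2020-01-19 22:00", "2020-01-19 23:59", freq="min", tz="UTC")
minute_index = fri.append(sun)

# One daily value labelled Saturday 00:00, a timestamp absent from minute_index.
daily = pd.Series([1.0], index=pd.DatetimeIndex(["2020-01-18 00:00"], tz="UTC"))

# The label is not in minute_index, so the value is lost before ffill runs.
naive = daily.reindex(minute_index).ffill()

# Keep the daily label while forward-filling, then drop it again.
via_union = (daily.reindex(minute_index.union(daily.index))
                  .ffill()
                  .reindex(minute_index))

print(naive.notna().sum(), via_union.notna().sum())
```

With the union step, the Saturday value becomes visible from the first Sunday bar onwards instead of disappearing.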
Hello, so:
atr = ta.atr(data.High, data.Low, data.Close)
print(type(atr))
print(atr.shape)
Produces:
<class 'pandas.core.series.Series'>
(31669,)
So why doesn't it work if I do: self.adr = resample_apply('D', ta.atr, self.data.High, self.data.Low, self.data.Close)?
Also the problem about label='right' persists.
Doing more tests... Resampling candlestick data from 1min to 5min using the suggested label='right':
data5m = data.resample('5min', label='right').agg({
"Open": "first",
"High": "max",
"Low": "min",
"Close": "last",
})
Produces:
As you see, the end result is a candlestick that starts at 22:05 but contains data from 22:00 to 22:04. This is wrong because the index refers to the Open datetime, hence the row 22:05 in a 5min timeframe should have:
- Open: market price at 22:05:00
- High: highest market price from 22:05:00 to 22:09:59
- Low: lowest market price from 22:05:00 to 22:09:59
- Close: market price at 22:09:59.
Indeed, if I don't use label='right', i.e.:
data5m = data.resample('5min').agg({
"Open": "first",
"High": "max",
"Low": "min",
"Close": "last",
})
I obtain the intended outcome:
So the resample_apply function:
- Should avoid using label='right'(?)
  - If so, also the documentation snippet should be changed.
- Should be fixed in order to work with indicators like ta.atr.
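As a small check of what label='right' does and does not change, here is a sketch on ten synthetic one-minute closes (made-up values). The bins contain the same bars either way; only the timestamp attached to each bin moves from the bar's open time to its close time:

```python
import pandas as pd

# Ten synthetic one-minute closes, 22:00 to 22:09 UTC.
idx = pd.date_range("2020-01-01 22:00", periods=10, freq="min", tz="UTC")
close = pd.Series([float(i) for i in range(10)], index=idx)

ohlc = ["first", "max", "min", "last"]
opened_at = close.resample("5min").agg(ohlc)                 # labels 22:00, 22:05
closed_at = close.resample("5min", label="right").agg(ohlc)  # labels 22:05, 22:10

print(opened_at)
print(closed_at)
```

So the row stamped 22:05 under label='right' is the bar that opened at 22:00, timestamped at the moment it is complete.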
Edit: I double checked with TradingView charts and they are identical, proving even further we don't need `label='right'`
Resampled data chart:
Tradingview:
As you see the end result is a candlestick that starts at 22:05 but contains data from 22:00 to 22:04. This is wrong
Thanks for the illustrative example!
You don't learn that the supposed 22:00:00 bar closed at 1.20032 until 22:04:00 bar closes! I don't know what labeling your data source uses, but plotting that info anytime before the complete end of bar 22:04:00 would introduce look-ahead bias.
Likewise, applying a simple passthrough function:
from backtesting import Strategy, Backtest
from backtesting.lib import resample_apply
from backtesting.test import EURUSD

class S(Strategy):
    def init(self):
        # Pass-through indicator: plots the daily-resampled Close as is
        resample_apply('1d', lambda x: x, self.data.Close, color='blue')

    def next(self):
        pass

bt = Backtest(EURUSD, S)
_ = bt.run()
bt.plot()
you can see it uses previous complete bar's value as the current value. Had it used the current bar's (potentially incomplete) value, this would introduce look-ahead bias and would redraw / repaint / mislead, like TradingView does.
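The same point can be checked numerically without the library. This is a sketch on two days of synthetic, steadily rising minute closes (made-up data); it forward-fills daily values back onto the minute index and compares what a bar at noon on the second day would see under each labelling:

```python
import numpy as np
import pandas as pd

# Two full days of synthetic minute closes that rise by 1 every minute.
idx = pd.date_range("2020-01-06", periods=2 * 1440, freq="min", tz="UTC")
close = pd.Series(np.arange(len(idx), dtype=float), index=idx)

def back_to_minutes(daily: pd.Series) -> pd.Series:
    """Forward-fill daily values onto the minute index."""
    return daily.reindex(close.index.union(daily.index)).ffill().reindex(close.index)

right_min = back_to_minutes(close.resample("D", label="right").last())
left_min = back_to_minutes(close.resample("D").last())

noon_day2 = pd.Timestamp("2020-01-07 12:00", tz="UTC")
print("actual close at noon:", close[noon_day2])
print("label='right' sees:  ", right_min[noon_day2])  # previous day's final close
print("label='left' sees:   ", left_min[noon_day2])   # the same day's final close
```

Because the series only rises, any value larger than the actual close at noon must have come from later in the day, which is exactly the look-ahead the left label introduces.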
This part of the issue is "works-as-planned" / wontfix.
- Should be fixed in order to work with indicators like ta.atr.
AttributeError: 'numpy.ndarray' object has no attribute 'index'
Please provide the following output:
>>> atr = ta.atr(h, l, c)
>>> atr.__class__.__mro__
>>> atr
>>> np.ndim(atr)
Dear kernc, yes, I see your point of view. I have always used the TradingView and MT4/5 way of labeling data; I have always been aware of repainting and learned to account for it in my tests.
You don't learn that the supposed 22:00:00 bar closed at 1.20032 until 22:04:00 bar closes!
Yes, the idea is that the Open price is fixed and High, Low and Close change with every new tick until the candle closes. I see that other data providers, like Bloomberg, label the data on the close.
This part of the issue is "works-as-planned" / wontfix.
That's perfectly fine, I'll account for this. Thanks for taking the time to explain :)
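For reference, a dependency-free sketch of a daily ATR that accounts for this. The function name daily_atr_sma is hypothetical, the input is synthetic, and the smoothing is a plain rolling mean of the true range, which may differ from the default smoothing of ta.atr in pandas-ta. It keeps label='right' and forward-fills via the union of both indexes, so each minute bar only sees fully completed days:

```python
import numpy as np
import pandas as pd

def daily_atr_sma(df: pd.DataFrame, period: int = 14) -> pd.Series:
    """Daily ATR (simple rolling mean of true range) aligned to df.index."""
    daily = (df.resample("D", label="right")
               .agg({"High": "max", "Low": "min", "Close": "last"})
               .dropna())  # drop empty bins such as weekends
    prev_close = daily["Close"].shift(1)
    true_range = pd.concat([
        daily["High"] - daily["Low"],
        (daily["High"] - prev_close).abs(),
        (daily["Low"] - prev_close).abs(),
    ], axis=1).max(axis=1)
    atr = true_range.rolling(period).mean()
    # Keep daily labels that are missing from the minute index while filling.
    return atr.reindex(df.index.union(atr.index)).ffill().reindex(df.index)

# Synthetic minute OHLC over five days (a seeded random walk, not real prices).
idx = pd.date_range("2020-01-06", periods=5 * 1440, freq="min", tz="UTC")
rng = np.random.default_rng(0)
c = 1.3 + rng.normal(0, 1e-4, len(idx)).cumsum()
df = pd.DataFrame({"Open": c, "High": c + 1e-4, "Low": c - 1e-4, "Close": c}, index=idx)

adr = daily_atr_sma(df, period=2)
print(adr.first_valid_index())
```

With period=2 the first value appears only once two full days have completed, i.e. at the first bar of the third day.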
Please provide the following output:
atr = ta.atr(data.High, data.Low, data.Close)
print(atr.__class__.__mro__)
print(atr)
print(np.ndim(atr))
(<class 'pandas.core.series.Series'>, <class 'pandas.core.base.IndexOpsMixin'>, <class 'pandas.core.arraylike.OpsMixin'>, <class 'pandas.core.generic.NDFrame'>, <class 'pandas.core.base.PandasObject'>, <class 'pandas.core.accessor.DirNamesMixin'>, <class 'pandas.core.indexing.IndexingMixin'>, <class 'object'>)
Date
2020-01-01 22:02:00 NaN
2020-01-01 22:03:00 NaN
2020-01-01 22:04:00 NaN
2020-01-01 22:05:00 NaN
2020-01-01 22:06:00 NaN
...
2020-04-30 23:55:00 0.000128
2020-04-30 23:56:00 0.000122
2020-04-30 23:57:00 0.000125
2020-04-30 23:58:00 0.000127
2020-04-30 23:59:00 0.000125
Name: ATRr_14, Length: 123800, dtype: float64
1
Well, that's confusing. If the result object is already a Series, there's no way I see for it to crash with AttributeError: 'numpy.ndarray' object has no attribute 'index' ... 🤔
Well... if you manage to find some time, you can try it for yourself:
pip install pandas-ta
import pandas_ta as ta
from backtesting import Strategy
import backtesting as bt
from backtesting.lib import resample_apply
# Import some minute-level data
class myStrat(Strategy):
    def init(self):
        self.atr = resample_apply('D', ta.atr, self.data.High, self.data.Low, self.data.Close)
...
Sorry if the snippet is not 100% accurate, I'm on my phone. Anyway I'm sure you get what I'm trying to say.