
doc error: MFI function should not have an *unstable* period

Open mw66 opened this issue 4 years ago • 3 comments

https://github.com/mrjbq7/ta-lib/issues/435

https://github.com/mrjbq7/ta-lib/blob/master/docs/func_groups/momentum_indicators.md

MFI - Money Flow Index

NOTE: The MFI function has an unstable period.

real = MFI(high, low, close, volume, timeperiod=14)

However, compare the description of TA_SetUnstablePeriod (https://ta-lib.org/d_api/ta_setunstableperiod.html)

with how the MFI is calculated:

https://www.investopedia.com/terms/m/mfi.asp

How to Calculate the Money Flow Index

There are several steps for calculating the Money Flow Index. If doing it by hand, using a spreadsheet is recommended.

1. Calculate the Typical Price for each of the last 14 periods.
2. For each period, mark whether the typical price was higher or lower than the prior period. This will tell you whether Raw Money Flow is positive or negative.
3. Calculate Raw Money Flow by multiplying the Typical Price by Volume for that period. Use negative or positive numbers depending on whether the period was up or down (see step above).
4. Calculate the Money Flow Ratio by adding up all the positive money flows over the last 14 periods and dividing it by the negative money flows for the last 14 periods.
5. Calculate the Money Flow Index (MFI) using the ratio found in step four.
6. Continue doing the calculations as each new period ends, using only the last 14 periods of data.
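
For reference, the steps above correspond to the commonly quoted MFI formulas, written here for a 14-period window. The symbols are my own notation, not taken from the ta-lib docs: H, L, C, V are high, low, close and volume, and [.] is 1 when the condition holds and 0 otherwise.

```latex
\begin{align*}
TP_t  &= \frac{H_t + L_t + C_t}{3} \\
RMF_t &= TP_t \cdot V_t \\
MFR_t &= \frac{\sum_{i=t-13}^{t} RMF_i \,[TP_i > TP_{i-1}]}
              {\sum_{i=t-13}^{t} RMF_i \,[TP_i < TP_{i-1}]} \\
MFI_t &= 100 - \frac{100}{1 + MFR_t}
\end{align*}
```

Every term on the right-hand side depends only on bars t-14 through t, which is the point of the argument that follows.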

It's more like an SMA that needs just 1 extra look-back bar (to tell whether the first typical price in the window is up or down versus the previous typical price), rather than an EMA (which "remembers" the effect of every price on the current EMA value, all the way back to the very start).
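
To illustrate the SMA vs EMA distinction without ta-lib, here is a minimal sketch on synthetic prices. The `sma` and `ema` helpers are hypothetical and written only for this example (the EMA is seeded with the first input value, which is an assumption and not necessarily what ta-lib does). The same final 26 bars are computed from 40 and from 50 bars of history.

```python
import numpy as np

def sma(x, n):
    """Simple moving average: each output uses only the last n inputs."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    for i in range(n - 1, len(x)):
        out[i] = x[i - n + 1 : i + 1].mean()
    return out

def ema(x, n):
    """Exponential moving average, seeded with the first input (assumption)."""
    x = np.asarray(x, dtype=float)
    alpha = 2.0 / (n + 1)
    out = np.empty(len(x))
    out[0] = x[0]
    for i in range(1, len(x)):
        # Each value folds in the previous EMA, so the seed never fully drops out.
        out[i] = alpha * x[i] + (1 - alpha) * out[i - 1]
    return out

rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(size=50))  # synthetic random walk

tail = 40 - 14  # bars that are valid in the shorter run
sma_diff = np.abs(sma(prices[-40:], 14)[-tail:] - sma(prices[-50:], 14)[-tail:])
ema_diff = np.abs(ema(prices[-40:], 14)[-tail:] - ema(prices[-50:], 14)[-tail:])
print("SMA max diff:", sma_diff.max())
print("EMA max diff:", ema_diff.max())
```

The SMA difference should be zero (or at rounding level), while the EMA difference depends on where the history starts. That start-dependence is what an unstable period is meant to cover.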

I think this is a doc error: The MFI function has no unstable period.

mw66 avatar Jun 26 '21 18:06 mw66

Python test code is here:

https://github.com/mrjbq7/ta-lib/issues/435#issuecomment-868955590

OK, I have shown my point theoretically (in the OP).

Now I have just run the following test, which shows (actually proves) that the MFI function has no unstable period (up to numerical rounding).

import numpy as np
import pandas as pd
import talib

def test():
  # Compare how RSI and MFI react to the amount of history they are given.
  fn = "SPY.csv"
  df = pd.read_csv(fn)
  # Back-adjust OHLC prices with the adjusted-close ratio.
  df["ratio"] = df["Adj Close"] / df["Close"]
  df["Open"] = df["Open"] * df["ratio"]
  df["High"] = df["High"] * df["ratio"]
  df["Low"] = df["Low"] * df["ratio"]
  df["Close"] = df["Close"] * df["ratio"]
  h = np.array(df["High"])
  l = np.array(df["Low"])
  c = np.array(df["Close"])
  v = np.array(df["Volume"], dtype=np.double)

  # Number of bars past the default 14-bar lookback in the shorter (40-bar) run.
  tail = 40 - 14

  # RSI on the last 40 vs the last 50 bars.
  rsi = []
  for n in [40, 50]:
    r = talib.RSI(c[-n:])
    print(r)
    rsi.append(r)
  diff = np.abs(rsi[0][-tail:] - rsi[1][-tail:])
  print(np.max(diff), np.mean(diff))  # 2.6424354952679963 1.1047087679412708

  # MFI on the last 40 vs the last 50 bars.
  mfi = []
  for n in [40, 50]:
    m = talib.MFI(h[-n:], l[-n:], c[-n:], v[-n:])
    print(m)
    mfi.append(m)
  diff = np.abs(mfi[0][-tail:] - mfi[1][-tail:])
  print(np.max(diff), np.mean(diff))  # 1.4210854715202004e-14 5.738999019600809e-15
  assert np.all(np.isclose(mfi[0][-tail:], mfi[1][-tail:]))  # pass!

if __name__ == "__main__":
  test()

As you can see, the RSI diff (max and mean) is quite big because of the EMA kind of memory, an inherent difference caused by the algorithm. The MFI diff, by contrast, is very small: it should be exactly 0, and the remaining difference comes from numerical stability, i.e. rounding error that depends on the order of operations.

You can try this code yourself.

mw66 avatar Jun 26 '21 18:06 mw66

Hey @mingwugmail, I cloned the repo from SourceForge to GitHub to make sure I can add the project as a git submodule. I have no plan to maintain this project, however. If you are interested in maintaining this project, or if you can reach the original developer, I'd like to transfer the ownership of the git org.

mckelvin avatar Jul 02 '21 02:07 mckelvin

Thanks for reporting this.

You are correct, although the fix I propose is more complicated than fixing the comment :smile:

A "stable" function must return the exact same value to pass the ta-lib automated tests... Because I chose an MFI implementation that introduces some imprecision, I decided to flag it as "unstable".

With that in mind, you are correct that the cause for "instability" is not the same as with, say, RSI.

As you hinted, the MFI implementation subtracts values that were previously added to the same variable, and this introduces a bit of "noise". This is the mind-boggling "floating point epsilon" problem. The imprecision is insignificant, but from the test perspective it is not "exactly" stable.
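
A minimal sketch of that effect, not the actual ta-lib C code: the hypothetical `running_window_sum` below keeps one running total, adding the newest value and subtracting the oldest. The flows are synthetic numbers in a plausible "price times volume" range. Starting the running total 10 bars earlier can leave different rounding residue in the same windows.

```python
import math
import numpy as np

def running_window_sum(x, n):
    """Sum of the last n values, kept as one running total (add new, subtract old)."""
    out = np.full(len(x), np.nan)
    total = 0.0
    for i, value in enumerate(x):
        total += value
        if i >= n:
            total -= x[i - n]  # rounding error from earlier additions is not undone exactly
        if i >= n - 1:
            out[i] = total
    return out

rng = np.random.default_rng(0)
flows = rng.uniform(1e8, 1e10, size=50)  # synthetic raw money flows (assumption)

n, tail = 14, 26
from_40 = running_window_sum(flows[-40:], n)[-tail:]
from_50 = running_window_sum(flows[-50:], n)[-tail:]

# Correctly rounded reference sum for each of the same 26 windows.
exact = np.array([
    math.fsum(flows[i - n + 1 : i + 1])
    for i in range(len(flows) - tail, len(flows))
])

rel_start_diff = np.max(np.abs(from_40 - from_50) / exact)
rel_error = np.max(np.abs(from_50 - exact) / exact)
print("relative diff between the two starts:", rel_start_diff)
print("relative error vs exact sum:", rel_error)
```

Both numbers should sit around machine epsilon: negligible for trading purposes, but enough to fail a test that demands bit-for-bit equality.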

What to do?

I think the real fix would be to re-implement MFI with a different algorithm that guarantees stability rather than prioritizing speed... after all, CPUs are significantly faster than when this was first implemented :wink:
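
One possible shape for such a re-implementation, sketched in Python rather than ta-lib's C and not taken from the ta-lib source: recompute both sums from scratch for every window, so each output depends only on the bars inside its window. The name `mfi_windowed` is hypothetical, and returning 0 when there is no money flow at all is my assumption.

```python
import math
import numpy as np

def mfi_windowed(high, low, close, volume, timeperiod=14):
    """MFI with per-window recomputation: O(n * timeperiod), but no carried-over state."""
    high, low, close, volume = (
        np.asarray(a, dtype=float) for a in (high, low, close, volume)
    )
    tp = (high + low + close) / 3.0  # typical price
    raw = tp * volume                # raw money flow
    # Flow j belongs to bar j + 1, because bar 0 has no previous typical price.
    pos = np.where(tp[1:] > tp[:-1], raw[1:], 0.0)
    neg = np.where(tp[1:] < tp[:-1], raw[1:], 0.0)

    out = np.full(len(tp), np.nan)
    for i in range(timeperiod, len(tp)):
        # Bars i - timeperiod + 1 .. i map to flow indices i - timeperiod .. i - 1.
        p = math.fsum(pos[i - timeperiod : i])
        n = math.fsum(neg[i - timeperiod : i])
        total = p + n
        # 100 * p / (p + n) is algebraically 100 - 100 / (1 + p / n).
        out[i] = 100.0 * p / total if total > 0 else 0.0  # zero-flow case: assumption
    return out

# Synthetic OHLCV data, only to exercise the function.
rng = np.random.default_rng(0)
close = 100 + np.cumsum(rng.normal(size=60))
high = close + rng.uniform(0, 1, size=60)
low = close - rng.uniform(0, 1, size=60)
volume = rng.uniform(1e6, 1e7, size=60)

tail = 40 - 14
a = mfi_windowed(high[-40:], low[-40:], close[-40:], volume[-40:])[-tail:]
b = mfi_windowed(high[-50:], low[-50:], close[-50:], volume[-50:])[-tail:]
print(np.array_equal(a, b))
```

Because `math.fsum` returns a correctly rounded sum of the same inputs, the two runs agree bit for bit, which is the property the automated tests ask of a "stable" function. The trade-off is roughly `timeperiod` times more additions per bar.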

Once re-implemented, the "unstable" flag could then be turned off.

mario4tier avatar Oct 17 '24 03:10 mario4tier