
Check performance of PanelOLS and AbsorbingLS in large models

Open bashtage opened this issue 5 years ago • 2 comments

Check performance and defer expensive operations

bashtage avatar Jan 04 '21 17:01 bashtage

Hi @bashtage! Have there been any improvements on this? I also wanted to raise this issue. On my fairly large dataset, fitting an FE panel model runs ~20 times slower than doing OLS manually by subtracting the corresponding group means and calling np.linalg.solve. Is there a particular reason for that?
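For reference, the manual approach described above can be sketched roughly as follows. This is a minimal illustration and not the commenter's actual code: the function name `within_ols`, the simulated data, and the group sizes are all assumptions. It returns point estimates only, with no standard errors and no input checks, which is part of why it is cheap.

```python
import numpy as np


def within_ols(y, X, groups):
    """One-way fixed-effects (within) estimator via group demeaning.

    Hypothetical helper for illustration; point estimates only.
    """
    # Map arbitrary group labels to integer codes 0..G-1.
    _, codes = np.unique(groups, return_inverse=True)
    codes = codes.ravel()
    counts = np.bincount(codes)

    def demean(a):
        # Group sums via bincount, then broadcast group means back to rows.
        means = np.bincount(codes, weights=a) / counts
        return a - means[codes]

    y_dm = demean(y)
    X_dm = np.column_stack([demean(X[:, j]) for j in range(X.shape[1])])
    # Solve the normal equations (X'X) b = X'y on the demeaned data.
    return np.linalg.solve(X_dm.T @ X_dm, X_dm.T @ y_dm)


# Simulated data (assumed sizes): 1000 groups with 5 observations each.
rng = np.random.default_rng(0)
n_groups, n_per = 1000, 5
groups = np.repeat(np.arange(n_groups), n_per)
alpha = rng.normal(size=n_groups)[groups]  # group effect, correlated with X
X = rng.normal(size=(groups.size, 2)) + alpha[:, None]
beta_true = np.array([1.0, -0.5])
y = X @ beta_true + alpha + 0.1 * rng.normal(size=groups.size)

beta_hat = within_ols(y, X, groups)
print(beta_hat)
```

Timing a function like this against the library call on the same data is one way to make the reported slowdown reproducible.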

I also wanted to ask whether multiprocessing is used here. I could not figure it out, but it seems that all of my cores are in use.

OleksiiRomanko avatar Mar 29 '21 15:03 OleksiiRomanko

There is no multiprocessing, but there should be multithreading. Can you post an example with a simulated dataset that resembles the one you are fitting (similar group structure), along with the command you are running?
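A reproducible example of the kind requested might look like the sketch below. The panel dimensions, column names, and coefficients are placeholders chosen for illustration and should be changed to match the real group structure. The model call assumes the `PanelOLS(dependent, exog, entity_effects=True)` interface from `linearmodels.panel` and is skipped if the package is not installed.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(12345)
# Hypothetical sizes; scale these up to mimic the real dataset.
n_entities, n_periods = 2000, 10

entity = np.repeat(np.arange(n_entities), n_periods)
time = np.tile(np.arange(n_periods), n_entities)
effect = rng.normal(size=n_entities)[entity]  # entity fixed effect

# linearmodels expects a two-level (entity, time) MultiIndex.
index = pd.MultiIndex.from_arrays([entity, time], names=["entity", "time"])
df = pd.DataFrame(
    {
        "x1": rng.normal(size=entity.size) + effect,
        "x2": rng.normal(size=entity.size),
    },
    index=index,
)
df["y"] = 1.0 * df["x1"] - 0.5 * df["x2"] + effect + rng.normal(size=len(df))

try:
    from linearmodels.panel import PanelOLS
except ImportError:
    PanelOLS = None  # linearmodels not installed; only the data is built

if PanelOLS is not None:
    # The command being timed: one-way entity fixed effects.
    res = PanelOLS(df["y"], df[["x1", "x2"]], entity_effects=True).fit()
    print(res.params)
```

Posting the data-generating code together with the exact fit command lets others time the same model on the same group structure.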

One reason it might be expensive is that it performs more checks than are strictly necessary, and it may also create new data structures. This is a cost, but the benefit is large in terms of long-term maintainability and keeping bugs shallow. In complicated models, those with 2 effects and really large datasets (e.g., 5 million+ rows with many relatively small groups, 1 million+ groups), performance should be nearly identical to the best methods available.

bashtage avatar Mar 29 '21 15:03 bashtage