MemoryError in econml.dml

Open hhu1 opened this issue 5 years ago • 2 comments

I have a dataset with about 6.3 million observations (rows), 10 treatments and 4 effect modifiers. I am using the econml.dml.LinearDML estimator like this:

    est = LinearDMLCateEstimator(model_t=MultiOutputRegressor(CatBoostRegressor()), model_y=CatBoostRegressor())
    est.fit(Y, T, X, W, inference='statsmodels')

It throws a MemoryError (see below). It seems that the inference step creates a large matrix that exhausts memory. If I instead use econml.dml.KernelDML, the memory consumption is even larger.

Is there a more memory-efficient way to do this that would avoid the error?

  File "/local/home/haohupku/run.py", line 162, in _get_coef
    est.fit(Y, T, X, W, inference='statsmodels')
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/utilities.py", line 1212, in m
    return to_wrap(*args, **kwargs)
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/dml.py", line 608, in fit
    inference=inference)
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/utilities.py", line 1212, in m
    return to_wrap(*args, **kwargs)
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/dml.py", line 488, in fit
    inference=inference)
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/utilities.py", line 1212, in m
    return to_wrap(*args, **kwargs)
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/_rlearner.py", line 322, in fit
    inference=inference)
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/utilities.py", line 1212, in m
    return to_wrap(*args, **kwargs)
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/cate_estimator.py", line 104, in call
    m(self, Y, T, *args, **kwargs)
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/_ortho_learner.py", line 551, in fit
    sample_var=self._subinds_check_none(sample_var, fitted_inds))
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/_ortho_learner.py", line 621, in _fit_final
    sample_var=sample_var))
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/_rlearner.py", line 107, in score
    effects = self._model_final.predict(X).reshape((-1, Y_res.shape[1], T_res.shape[1]))
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/dml.py", line 217, in predict
    prediction = self._model.predict(self._combine(None if X is None else X2, T, fitting=False))
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/dml.py", line 161, in _combine
    return cross_product(F, T)
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/utilities.py", line 293, in cross_product
    return _apply(cross, XS)
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/utilities.py", line 233, in _apply
    result = op(*XS)
  File "/home/haohupku/projects/pkgs/miniconda/envs/py36/lib/python3.6/site-packages/econml/utilities.py", line 292, in cross
    return reshape(reduce(np.multiply, XS), (n, -1))
MemoryError: Unable to allocate 19.7 GiB for an array with shape (53004520, 10, 5) and data type float64

hhu1, Nov 29 '20 04:11
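
For reference, the 19.7 GiB in the error message follows directly from the reported shape and dtype. The short check below only redoes that arithmetic; the shape is copied from the MemoryError, and the variable names are illustrative. The last dimension of 5 is presumably the 4 effect modifiers plus an intercept column, though the thread does not confirm this.

```python
import math

# Shape and dtype taken from the MemoryError message in the traceback.
shape = (53004520, 10, 5)
bytes_per_float64 = 8

n_elements = math.prod(shape)              # total number of array entries
n_bytes = n_elements * bytes_per_float64   # float64 uses 8 bytes per entry
gib = n_bytes / 2**30                      # convert bytes to GiB

print(f"{n_elements:,} elements -> {gib:.1f} GiB")
# prints: 2,650,226,000 elements -> 19.7 GiB
```

So the failure is a single dense allocation of roughly 2.65 billion float64 values, not a gradual leak.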

Try setting linear_first_stages=False.

If True, we create some extra featurizations to ensure consistency of the estimation, and I suspect the creation of these extra features is the problem.

vsyrgkanis, Nov 29 '20 05:11
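
Applied to the call from the question, the suggestion above might look like the sketch below. It assumes the same EconML version as in the traceback, where the class is named LinearDMLCateEstimator and accepts a linear_first_stages argument; later releases may name or handle this differently. The build_estimator helper is hypothetical and exists only to keep the example self-contained.

```python
def build_estimator():
    """Rebuild the estimator from the question with linear_first_stages disabled."""
    # Imports are local so the sketch can be read without the packages installed.
    from catboost import CatBoostRegressor
    from sklearn.multioutput import MultiOutputRegressor
    from econml.dml import LinearDMLCateEstimator  # class name used in the question

    return LinearDMLCateEstimator(
        model_y=CatBoostRegressor(),
        model_t=MultiOutputRegressor(CatBoostRegressor()),
        linear_first_stages=False,  # the flag suggested in the reply above
    )

# Usage, with Y, T, X, W defined as in the question:
# est = build_estimator()
# est.fit(Y, T, X, W, inference='statsmodels')
```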

This solved it for me, as I was running into the same error. Would it not make more sense to set it to False by default, or at least to choose some threshold for n*d1*d2 above which linear_first_stages defaults to False?

EgorKraevTransferwise, Jan 29 '22 18:01
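
The threshold idea in the last comment could be sketched roughly as follows, reading nd1d2 as the product of the three array dimensions. This is a hypothetical heuristic, not EconML behavior: the function name, the 2 GiB default budget, and the float64 item size are all assumptions made for illustration.

```python
def choose_linear_first_stages(n, d1, d2, max_bytes=2 * 2**30, itemsize=8):
    """Hypothetical heuristic: return True only if a dense (n, d1, d2) array
    of `itemsize`-byte values fits within `max_bytes`."""
    return n * d1 * d2 * itemsize <= max_bytes


# The shape from the MemoryError needs about 19.7 GiB, so the flag is turned off.
print(choose_linear_first_stages(53004520, 10, 5))   # False
# A small problem stays under the budget, so the flag stays on.
print(choose_linear_first_stages(1000, 10, 5))       # True
```

A caller would pass the result as linear_first_stages when constructing the estimator; a sensible budget depends on the machine and is left as a parameter here.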