EconML LinearDRIV effect vs effect_interval shape inconsistency

Output array shapes are inconsistent for LinearDRIV.effect() and LinearDRIV.effect_interval()[0]. Should make these consistent.

Seems to occur when one of either Y, T, or Z is an array instead of a vector.

Repro:

import numpy as np
import pandas as pd
from econml.iv.dr import LinearDRIV

df = pd.DataFrame(np.random.normal(size = (100, 6)))

Y = df[[0]]
T = df[[1]]
X = df[[2,3,4]]
Z = df[[5]]

est = LinearDRIV().fit(Y=Y, T=T, Z=Z, X=X)

eff = est.effect(X)
lb, ub = est.effect_interval(X)

print(eff.shape)
print(lb.shape)

Oct 10 '22 16:10 fverac

Hi @kbattocchi, I would like to take this issue.

Aug 18 '24 14:08 ssemov

I believe the issue is with the _BaseDRIVModelFinal class. For example, I can replicate the shape inconsistency with SparseLinearDRIV too.

I avoid the shape inconsistency issue, if I modify the predict method in _BaseDRIVModelFinal as below.

This works because it makes eff = self.const_marginal_effect(X) with dimensions (m,1) instead of (m,1,1). Then the effect method returns an output array with dimensions (m,).

@kbattocchi and @fverac does this look like a reasonable approach?

One concern I have is that this conditional check might miss some T and Y dimension configurations of the _BaseDRIV class though I wasn't able to reproduce any.

        X = self._transform_X(X, fitting=False)
        # handle df input dimensions
        if self.d_y == (1,) and self.d_t == (1,):
            return self._model_final.predict(X).reshape((-1,) + self.d_y + self.d_t).reshape(-1,1)
        else:
            return self._model_final.predict(X).reshape((-1,) + self.d_y + self.d_t)

Aug 27 '24 19:08 ssemov

I believe the issue is with the _BaseDRIVModelFinal class. For example, I can replicate the shape inconsistency with SparseLinearDRIV too.

I avoid the shape inconsistency issue, if I modify the predict method in _BaseDRIVModelFinal as below.

This works because it makes eff = self.const_marginal_effect(X) with dimensions (m,1) instead of (m,1,1). Then the effect method returns an output array with dimensions (m,).

@kbattocchi and @fverac does this look like a reasonable approach?

One concern I have is that this conditional check might miss some T and Y dimension configurations of the _BaseDRIV class though I wasn't able to reproduce any.
        X = self._transform_X(X, fitting=False)
        # handle df input dimensions
        if self.d_y == (1,) and self.d_t == (1,):
            return self._model_final.predict(X).reshape((-1,) + self.d_y + self.d_t).reshape(-1,1)
        else:
            return self._model_final.predict(X).reshape((-1,) + self.d_y + self.d_t)

@ssemov Thanks for volunteering to look into this. It might help to outline the intended shapes of these methods:

.effect should return a matrix of shape (n_obs, d_y). Each element in the .effect_interval tuple should also be of the same shape.

.constant_marginal_effect should return a matrix of shape (n_obs, d_y, d_f_t) where d_f_t==d_t is there is no treatment featurizer (as in our repro). Each element in the .constant_marginal_effect_interval tuple should also be of the same shape.

.marginal_effect should return a matrix of shape (n_obs, d_y, d_t). Each element in the .marginal_effect_interval tuple should also be of the same shape.

Judging from the above specification, I would say that in the original repro, the shape of eff is correct in that it is (100, 1) i.e. (n_obs, d_y) and that the shape of lb is what needs to be aligned to match eff, rather than the other way around.

I think your fix makes eff become shape (100,) for the original repro code, which is not aligned with our specification. Moving forward, I would recommend looking for a fix that corrects/updates the shape output of LinearDRIV's effect_interval, rather than the shape of .effect. Probably somewhere in the inference object that LinearDRIV uses.

Hope this helps!

P.S. extended the repro code a little to demonstrate more

df = pd.DataFrame(np.random.normal(size = (100, 6)))

Y = df[[0]]
T = df[[1]]
X = df[[2,3,4]]
Z = df[[5]]

est = LinearDRIV().fit(Y=Y, T=T, Z=Z, X=X)

eff = est.effect(X)
cme = est.const_marginal_effect(X)
me = est.marginal_effect(T, X)

lb, ub = est.effect_interval(X)
cme_lb, cme_ub = est.const_marginal_effect_interval(X)
me_lb, me_ub = est.marginal_effect_interval(T, X)


print('effect shapes. all as expected')
print(eff.shape)
print(cme.shape)
print(me.shape)

print('interval lower bound shapes: Note that the effect lower bound shape is the only one inconsistent with the "point estimate" equivalent')
print(lb.shape)
print(cme_lb.shape)
print(me_lb.shape)

Aug 27 '24 21:08 fverac

Thanks, @fverac! This is super helpful. I looked into your suggestion, and indeed the error is with the inference object LinearDRIV uses.

LinearDRIV uses the StatsModelsInference class, which inherits its effect_interval method from LinearModelFinalInference.

That outputs a vector of dimensions (m,) instead of a matrix (m,dy). .effect has dimensions (m,dy), so we get the inconsistency.

This only happens when Y is an array. When Y is a vector (only 1 outcome), both .effect and .effect_interval have dimensions (m,).

I'll send a fix.

Sep 04 '24 19:09 ssemov