grf Seeking Advice on Causal Inference for Treatment Effect Prediction (Small Sample, Genomic Covariates)

Hello, I’ve been studying causal inference recently, but I’m still unsure how to approach my analysis properly, so I would really appreciate your guidance. I’m working with the following dataset and aim to answer this question:

Goal: For each individual, can we predict whether Treatment A or Treatment B would be more effective?

Dataset Summary: N = 88 patients

Treatment assignment: A or B (binary)

Outcome: binary response (1 = favorable response, 0 = unfavorable)

Covariates:

A binary variable for the presence of a specific gene mutation

A continuous variable for the expression level of a specific gene

Questions:

Since this is a small dataset (n=88), would it still make sense to split the data into training and test sets, as in conventional supervised learning workflows?

I am considering using causal_forest() from the grf package to estimate individual treatment effects (ITEs).

After estimating the ITEs, is it reasonable to decide:

ITE > 0 => Prefer Treatment A

ITE < 0 => Prefer Treatment B

Is this interpretation valid and commonly used in practice?
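To make the rule concrete, here is a minimal sketch in Python. It assumes each estimated effect is defined as "A minus B" on the favorable-response scale (so the sign only means what is stated above if Treatment A is coded as the treated arm) and that each estimate comes with a standard error. The `recommend` function, the 1.96 cutoff, the "inconclusive" label, and the three example estimates are all hypothetical and are not part of grf.

```python
def recommend(tau_hat, std_err, z=1.96):
    """Map an estimated effect (A minus B) and its standard error to a label.

    Assumption: a recommendation is made only when the approximate 95%
    interval excludes zero; otherwise the estimate is treated as inconclusive.
    """
    lower = tau_hat - z * std_err
    upper = tau_hat + z * std_err
    if lower > 0:
        return "prefer A"
    if upper < 0:
        return "prefer B"
    return "inconclusive"


# Hypothetical (effect, standard error) pairs for three patients.
estimates = [(0.30, 0.10), (-0.25, 0.08), (0.05, 0.20)]
labels = [recommend(tau, se) for tau, se in estimates]
print(labels)
```

With n = 88, many per-patient intervals may well cover zero, in which case a pure sign rule would be reporting noise rather than a real preference.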

I’m aware that with such a small sample size, variance and overfitting could be major issues. If there are any recommendations regarding cross-validation strategies, feature regularization, or alternative models (e.g., T-Learner, S-Learner), I’d love to hear them.

Thank you very much in advance for your help!

Apr 12 '25 09:04 oghzzang

Hi @oghzzang,

That’s a great question. With such a small sample size, there are unfortunately limits to what machine learning approaches that rely on train/test evaluation can achieve. One alternative is to use cross-fold evaluation specifically tailored for treatment effect heterogeneity, as described here: https://grf-labs.github.io/grf/articles/rate_cv.html. If there are very many covariates, but you think only a few genes matter, then a CATE method that leverages sparsity could be worth trying, such as a lasso-based R-learner (https://github.com/xnie/rlearner).

Another option, especially if there are very few covariates, is to take a classical approach: define a few pre-specified subgroups based on scientific interest, estimate the average treatment effects (ATEs) within each subgroup, and then assess whether they differ, adjusting for multiple testing.
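A minimal sketch of that classical approach, in plain Python, might look as follows. Everything specific here is an assumption made for illustration: the cell counts are invented (they are not the poster's data), the "high expression" split stands in for a cutoff fixed before looking at outcomes, treatment is assumed to be randomized so that a difference in response rates estimates the ATE, the tests use a normal approximation that is rough at this sample size, and Holm is only one of several possible multiple-testing adjustments.

```python
import math

# Hypothetical cell counts, invented for illustration only:
# (treatment, mutated, high_expression, n_patients, n_responders)
CELLS = [
    ("A", 1, 1, 11, 8), ("A", 1, 0, 11, 7),
    ("A", 0, 1, 11, 5), ("A", 0, 0, 11, 5),
    ("B", 1, 1, 11, 4), ("B", 1, 0, 11, 4),
    ("B", 0, 1, 11, 6), ("B", 0, 0, 11, 5),
]


def expand(cells):
    """Turn cell counts into one record per patient."""
    records = []
    for treatment, mutated, high_expr, n, responders in cells:
        for i in range(n):
            records.append({
                "treatment": treatment,
                "mutated": mutated,
                "high_expr": high_expr,
                "response": 1 if i < responders else 0,
            })
    return records


def ate(records):
    """Return (rate_A - rate_B, standard error) under a normal approximation."""
    estimate, variance = 0.0, 0.0
    for arm, sign in (("A", 1), ("B", -1)):
        y = [r["response"] for r in records if r["treatment"] == arm]
        p = sum(y) / len(y)
        estimate += sign * p
        variance += p * (1 - p) / len(y)
    return estimate, math.sqrt(variance)


def interaction_test(records, key):
    """Compare the ATE in the key == 1 subgroup with the key == 0 subgroup."""
    ate1, se1 = ate([r for r in records if r[key] == 1])
    ate0, se0 = ate([r for r in records if r[key] == 0])
    z = (ate1 - ate0) / math.sqrt(se1 ** 2 + se0 ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return ate1, ate0, p_value


def holm(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted


records = expand(CELLS)
results = {key: interaction_test(records, key) for key in ("mutated", "high_expr")}
adjusted = holm([results[key][2] for key in results])
for (key, (ate1, ate0, p_value)), p_adj in zip(results.items(), adjusted):
    print(f"{key}: ATE(1)={ate1:+.3f} ATE(0)={ate0:+.3f} p={p_value:.3f} Holm p={p_adj:.3f}")
```

The number of subgroup comparisons should be fixed in advance, since the adjustment only accounts for the tests that are actually listed.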

Apr 29 '25 01:04 erikcs

@erikcs

Thank you for the excellent response. It was really helpful.

Thanks!

Apr 30 '25 08:04 oghzzang