cluster-experiments

Add synthetic control as analysis method

Open Gabrielcidral1 opened this issue 2 years ago • 2 comments

As described by @david26694 we want to: "Add a wrapper to a synthetic control implementation that gives p-values, this should allow us to treat synthetic control as just another analysis method and check if it has higher power than simpler things"

This involved the creation of a new analysis class called SyntheticControlAnalysis, similar to the other types of analysis.
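
For readers unfamiliar with how a synthetic control can yield a p-value, here is a minimal sketch of one common approach, placebo (permutation) inference. It is not necessarily what SyntheticControlAnalysis does internally. The function name `placebo_p_value` and the add-one correction are assumptions made for illustration: the effect estimated for the real treated cluster is ranked against effects estimated when each control cluster is treated as a placebo.

```python
def placebo_p_value(treated_effect, placebo_effects):
    """Two-sided placebo p-value (hypothetical helper, not the library API).

    treated_effect: estimated effect for the real treated cluster.
    placebo_effects: effects estimated by pretending each control cluster
        was treated and refitting the synthetic control on the others.
    """
    # Count placebo runs whose effect is at least as extreme as the real one.
    at_least_as_extreme = sum(
        1 for effect in placebo_effects if abs(effect) >= abs(treated_effect)
    )
    # The add-one correction counts the real unit among the permutations,
    # so the p-value can never be exactly zero.
    return (1 + at_least_as_extreme) / (1 + len(placebo_effects))


if __name__ == "__main__":
    print(placebo_p_value(2.0, [0.1, -0.5, 3.0, 1.0]))
```

With a p-value in hand, the method can be plugged into a simulation loop like any other analysis, which is what makes a power comparison against simpler methods possible.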

However, to perform the analysis correctly, we need the pre-experiment data as part of the fit to find the weights (fit_synthetic). In the implementation in main, the PowerAnalysis class accepts a pre_experiment param, but it is only used for CUPED purposes (where we add a column to df). Therefore, I had to create a new class called PowerAnalysisWithPreExperimentData, where the pre-experiment df is also available.
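
To illustrate why the pre-experiment data is needed, here is a minimal sketch of fitting synthetic control weights on the pre-period only. It is not the code in this PR: the function names, the plain-list data layout, and the projected gradient descent solver are all assumptions chosen to keep the example dependency-free. The weights are constrained to be non-negative and to sum to one, as in the usual formulation.

```python
def project_to_simplex(values):
    """Euclidean projection onto {w : w_j >= 0, sum(w) == 1}."""
    ordered = sorted(values, reverse=True)
    running_sum = 0.0
    shift = 0.0
    for count, value in enumerate(ordered, start=1):
        running_sum += value
        candidate = (running_sum - 1.0) / count
        if value - candidate > 0:
            shift = candidate
    return [max(v - shift, 0.0) for v in values]


def fit_synthetic_weights(treated_pre, donors_pre, n_iter=5000):
    """Minimise sum_t (treated_t - sum_j w_j * donor_jt)^2 over the simplex.

    treated_pre: pre-period outcomes of the treated cluster.
    donors_pre: one list of pre-period outcomes per donor (control) cluster.
    Assumes at least one donor value is non-zero.
    """
    n_donors = len(donors_pre)
    weights = [1.0 / n_donors] * n_donors
    # Twice the squared Frobenius norm bounds the gradient's Lipschitz
    # constant, so 1 / lipschitz is a safe (if conservative) step size.
    lipschitz = 2.0 * sum(x * x for donor in donors_pre for x in donor)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        residuals = [
            sum(w * donor[t] for w, donor in zip(weights, donors_pre)) - y
            for t, y in enumerate(treated_pre)
        ]
        gradient = [
            2.0 * sum(r * x for r, x in zip(residuals, donor))
            for donor in donors_pre
        ]
        weights = project_to_simplex(
            [w - step * g for w, g in zip(weights, gradient)]
        )
    return weights


def synthetic_series(weights, donors):
    """Weighted combination of donor series, e.g. over the experiment period."""
    n_periods = len(donors[0])
    return [
        sum(w * donor[t] for w, donor in zip(weights, donors))
        for t in range(n_periods)
    ]


if __name__ == "__main__":
    donors_pre = [[1, 2, 3, 4], [2, 1, 4, 3], [5, 5, 5, 5]]
    treated_pre = [1.7, 1.3, 3.7, 3.3]  # built as 0.3 * donor 0 + 0.7 * donor 1
    print(fit_synthetic_weights(treated_pre, donors_pre))
```

The weights are fitted only on pre-experiment outcomes and then applied to the donors' experiment-period outcomes. The gap between the treated cluster and that synthetic series is the effect estimate, which is why the power analysis needs access to both dataframes.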

Furthermore, I had to create another splitter called PredefinedTreatmentClustersSplitter, as we want only one cluster to be assigned as treatment and the rest as control. This was done to simplify the logic and to be more consistent with the usual application of synthetic control.
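
As a rough illustration of what such a splitter does (a sketch only, not the PredefinedTreatmentClustersSplitter code), the assignment is deterministic rather than random. The function name, the list-of-dicts row layout, and the "treatment"/"control" labels below are assumptions made for the example.

```python
def assign_predefined_treatment(rows, cluster_col, treatment_clusters,
                                treatment_col="treatment"):
    """Label rows by cluster membership (hypothetical helper).

    rows: list of dicts, one per observation.
    cluster_col: key holding the cluster identifier.
    treatment_clusters: clusters that must receive treatment; every other
        cluster becomes control (a potential donor for the synthetic control).
    """
    treated = set(treatment_clusters)
    return [
        {
            **row,
            treatment_col: "treatment" if row[cluster_col] in treated else "control",
        }
        for row in rows
    ]


if __name__ == "__main__":
    data = [
        {"city": "Madrid", "orders": 10},
        {"city": "Lisbon", "orders": 7},
        {"city": "Rome", "orders": 9},
    ]
    for row in assign_predefined_treatment(data, "city", ["Lisbon"]):
        print(row)
```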

The following will not be implemented in this PR:

  • Allow power analysis with more than 1 treatment cluster
  • Run it from power config
  • Graphs on synthetics and donors
  • Parallel execution for p value calculation

Gabrielcidral1 avatar Apr 10 '24 19:04 Gabrielcidral1


Codecov Report

Attention: Patch coverage is 95.14563% with 5 lines in your changes missing coverage. Please review.

Project coverage is 96.77%. Comparing base (d5a4977) to head (e0b4c1c). Report is 15 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| cluster_experiments/experiment_analysis.py | 93.22% | 4 Missing :warning: |
| cluster_experiments/power_analysis.py | 92.85% | 1 Missing :warning: |


Additional details and impacted files
@@            Coverage Diff             @@
##             main     #168      +/-   ##
==========================================
- Coverage   96.93%   96.77%   -0.17%     
==========================================
  Files           9       10       +1     
  Lines        1078     1179     +101     
==========================================
+ Hits         1045     1141      +96     
- Misses         33       38       +5     


codecov-commenter avatar Apr 11 '24 12:04 codecov-commenter

In the notebook, I think it'd be cool to compare the power lines and point estimate distributions of clustered OLS and synthetic control.

david26694 avatar May 01 '24 07:05 david26694

(last suggestions and we merge)

david26694 avatar Jun 17 '24 09:06 david26694