distributed-learning-contributivity

Contrib dist stat

Open arthurPignet opened this issue 4 years ago • 1 comments

New contributivity measurement based on statistical distances between two distributions:

  • The partner-specific probability distribution of the label given the input (estimated via maximum likelihood on the partner's data);
  • The latent joint probability distribution of the label given the input (estimated via maximum likelihood on the joint dataset).

This difference between the two distributions is interpreted as noise, which allows us to use a multi-headed adaptation of the s-model method to the multi-partner case to estimate and quantify this pseudo-noise.
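As a rough illustration of the pseudo-noise idea (this is not the FedSmodel code from this PR), one can picture each partner as owning a row-stochastic matrix that maps the latent label distribution to its partner-specific one. The names `apply_partner_heads`, `latent_probs` and `partner_noise` are hypothetical, and the matrices are hand-picked toy values.

```python
import numpy as np


def apply_partner_heads(latent_probs, partner_noise):
    """Map latent label probabilities to one distribution per partner.

    latent_probs:  shape (n_samples, n_classes), rows sum to 1.
    partner_noise: shape (n_partners, n_classes, n_classes); each
                   partner_noise[p] is row-stochastic, with entry [i, j]
                   read as P(partner label = j | latent label = i).
    Returns an array of shape (n_partners, n_samples, n_classes).
    """
    latent_probs = np.asarray(latent_probs, dtype=float)
    partner_noise = np.asarray(partner_noise, dtype=float)
    # For each partner p: partner_probs[p] = latent_probs @ partner_noise[p]
    return np.einsum("nk,pkj->pnj", latent_probs, partner_noise)


# Toy values (assumptions, for illustration only).
latent_probs = np.array([[0.9, 0.1], [0.2, 0.8]])
partner_noise = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # partner 0: identity, i.e. no pseudo-noise
    [[0.7, 0.3], [0.4, 0.6]],  # partner 1: noticeable label flipping
])
partner_probs = apply_partner_heads(latent_probs, partner_noise)
```

With an identity matrix the partner-specific distribution coincides with the latent one, so any distance between the two is zero; the further a partner's matrix is from the identity, the larger the gap being measured.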

Computing these contributivity metrics only requires inference on the trained model (trained via FedSmodel).

The additional computational cost is thus negligible. The method doesn't need a 'perfect', global test dataset.

For now 3 distances are implemented:

  • Kullback-Leibler divergence
  • Hellinger metric
  • Bhattacharyya distance
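For reference, here is a minimal NumPy sketch of these three distances between two discrete label distributions, using their standard textbook definitions. It is not the implementation from this PR: the function names and the `eps` clipping are assumptions made for the example.

```python
import numpy as np


def _as_distribution(p, eps=1e-12):
    """Clip away exact zeros (to avoid log(0) and 0/0), then renormalize."""
    p = np.clip(np.asarray(p, dtype=float), eps, None)
    return p / p.sum()


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) = sum_i p_i * log(p_i / q_i)."""
    p, q = _as_distribution(p), _as_distribution(q)
    return float(np.sum(p * np.log(p / q)))


def hellinger_distance(p, q):
    """Hellinger distance: sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))**2)."""
    p, q = _as_distribution(p), _as_distribution(q)
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance: -log(BC), with BC = sum_i sqrt(p_i * q_i)."""
    p, q = _as_distribution(p), _as_distribution(q)
    bc = np.sum(np.sqrt(p * q))
    # Guard against BC landing slightly above 1 through rounding.
    return float(-np.log(min(bc, 1.0)))
```

Note that the Kullback-Leibler divergence is asymmetric, so the argument order matters, whereas the Hellinger and Bhattacharyya distances are symmetric. The Hellinger distance is also bounded in [0, 1], which makes it easier to compare across partners.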

These metrics are tested on the reference scenarios; see the Colab notebook: https://colab.research.google.com/drive/1DN1lLdd1b1ZmttmEiQKpx8xW5guEf_f_?usp=sharing

TODO

  • [ ] Add doc
  • [x] Investigate the s-model bug when using the Advanced or Flexible splitter
  • [x] Handle dict contributivity score for result dataframe
  • [ ] Investigate std computations

arthurPignet avatar May 22 '21 19:05 arthurPignet

Codecov Report

Merging #346 (078cbca) into master (ecc3ea8) will decrease coverage by 0.19%. The diff coverage is 80.37%.

@@            Coverage Diff             @@
##           master     #346      +/-   ##
==========================================
- Coverage   80.68%   80.49%   -0.20%     
==========================================
  Files          15       15              
  Lines        3045     3128      +83     
==========================================
+ Hits         2457     2518      +61     
- Misses        588      610      +22     
| Impacted Files | Coverage Δ |
|---|---|
| mplc/multi_partner_learning/__init__.py | 100.00% <ø> (ø) |
| mplc/multi_partner_learning/basic_mpl.py | 84.98% <ø> (-0.29%) :arrow_down: |
| mplc/multi_partner_learning/fast_mpl.py | 61.09% <55.31%> (-0.81%) :arrow_down: |
| mplc/contributivity.py | 77.23% <100.00%> (+0.67%) :arrow_up: |
| mplc/scenario.py | 83.27% <100.00%> (+0.77%) :arrow_up: |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update ecc3ea8...078cbca.

codecov-commenter avatar May 23 '21 15:05 codecov-commenter