Transfer Learning Guide
What does this PR do?
Adds a Transfer Learning guide which includes:
- Loading a pre-trained model
- Doing parameter surgery
- Freezing layers and implementing differential learning rates using `optax.multi_transform`
Live preview: https://flax--2394.org.readthedocs.build/en/2394/guides/transfer_learning.html
Notes
- I couldn't use `vit_jax` because of `jax` version conflicts, so I used a model from HuggingFace's `transformers` library. The only quirk is that `transformers` downgrades `jax`, so a fix is added to upgrade it on CI.
- Expanded guide with a complete example moved to #2429
Codecov Report
Merging #2394 (2047050) into main (521f516) will decrease coverage by 0.00%. The diff coverage is n/a.
@@ Coverage Diff @@
## main #2394 +/- ##
==========================================
- Coverage 79.47% 79.46% -0.01%
==========================================
Files 49 49
Lines 5204 5202 -2
==========================================
- Hits 4136 4134 -2
Misses 1068 1068
| Impacted Files | Coverage Δ | |
|---|---|---|
| `flax/linen/stochastic.py` | 96.42% <0.00%> (-0.24%) | :arrow_down: |
Hey @jheek, using jupytext we sync the `.ipynb` and `.md` files: the `.ipynb` files will be rendered, while the `.md` files are there to ease the review process.
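For context, jupytext pairing is driven by a small config; a rough sketch is below (the exact file location and formats string used in this repo are assumptions, not taken from the PR):

```toml
# jupytext.toml (hypothetical values, not the repo's actual config)
# Pairs each notebook with a Markdown twin that jupytext keeps in sync.
formats = "ipynb,md"
```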
@8bitmp3 thanks a lot for the feedback! It was really useful, I've made the stylistic changes and will keep them in mind for future guides.
Reviewed and updated in https://github.com/cgarciae/flax/pull/1. LMKWYT!