openfl icon indicating copy to clipboard operation
openfl copied to clipboard

Update TensorFlow Task Runner and related workspaces

Open kminhta opened this issue 1 year ago • 3 comments

Related Issue: #973

Summary: This PR aims to update the TensorFlow Task Runner to use Keras as the high-level API, which is in line with best practices as well as updates existing TF workspaces. This enables the usage of non-legacy optimizers (which will be deprecated in future versions of TF/Keras)

Specifically, this PR:

  • Creates a new TensorFlowTaskRunner class in openfl.federated.task.runner_tf which borrows heavily from the KerasTaskRunner task. Major difference is in handling the weights of the optimizer which was necessitated by the removal of the .get_weight() and .weights() attributes from the optimizer. This new TensorFlowTaskRunner extracts weights from the .variables() attribute
    • Also updated the train and validation task names to train_validation and task_validation to be consistent with the torch taskrunner
  • Archived old TensorFlowTaskRunner as TensorFlowTaskRunner_v1 within openfl.federated.task.runner_tf and updated the __init__ files to make it callable. Rationale is to avoid any breaking changes for tutorials or upstream applications that still relied on the low-level TF taskrunner. This can be removed entirely in a future release as needed
    • Also updated the train and validation task names to train_validation and task_validation to be consistent with the torch taskrunner
  • Created a new tf_cnn_mnist workspace and updated the torch_cnn_histology workspace to run on the new TensorFlowTaskRunner using the src/dataloader.py and src/taskrunner.py convention.
    • update to TensorFlow v2.15.1 (latest TensorFlow to not use Keras v3.x by default)
  • Minor tf_3dunet_brats to use new TensorFlowTaskRunner (did not make changes to src files because I did not have Brats3D dataset to verify a large update
  • Minor updates to tf_2dunet to run on archived TensorFlowTaskRunner_v1

Future work

  • Consolidation step still needed:
    • Migrate all tf_2d_unet from TensorFlowTaskRunner_v1 to new TensorFlowTaskRunner
    • Migrate all keras workspaces from KerasTaskRunner to new TensorFlowTaskRunner and remove/archive KerasTaskRunner
  • Look into updated TensorFlowTaskRunner to run on TF v2.16+ with Keras 3.x (this may need some large changes to weight handling that will likely not have backwards compatibility)

kminhta avatar Jun 06 '24 22:06 kminhta

Most of my comments are nitpicks around naming/formatting and/or structure. Please disposition as you find relevant. Also, do we need TF v1 (tf.Session API) task runners? These are ancient APIs that none of the community uses. Does OpenFL have users on this legacy TF API?

MasterSkepticista avatar Jun 08 '24 07:06 MasterSkepticista

Most of my comments are nitpicks around naming/formatting and/or structure. Please disposition as you find relevant. Also, do we need TF v1 (tf.Session API) task runners? These are ancient APIs that none of the community uses. Does OpenFL have users on this legacy TF API?

@MasterSkepticista I agree the TF v1 / Legacy task runners can be removed at this point.

psfoley avatar Jun 12 '24 15:06 psfoley

Thanks @MasterSkepticista @psfoley ! I will get back to addressing these comments once I'm done with some other pressing tasks. I appreciate the review

Most of my comments are nitpicks around naming/formatting and/or structure. Please disposition as you find relevant. Also, do we need TF v1 (tf.Session API) task runners? These are ancient APIs that none of the community uses. Does OpenFL have users on this legacy TF API?

@MasterSkepticista I agree the TF v1 / Legacy task runners can be removed at this point.

Great, I am also in agreement with removing the old TF runner. Actually, one reason why I did not propose it directly in this PR was because the tf2dunet workspaces uses the old TF runner. I can update it, but since it uses the BraTS dataset, I will have to get approval so I can test it

kminhta avatar Jun 12 '24 20:06 kminhta

Thank you @kta-intel. This was completed by https://github.com/securefederatedai/openfl/pull/1174.

tanwarsh avatar Dec 16 '24 07:12 tanwarsh