Conversion of fetch to PyTorch
**Is your feature request related to a problem? Please describe.**
This is part of the discussion about moving from Keras 2 to Keras 3; see https://github.com/devanshkv/fetch/issues/36#issuecomment-2350962626
**Describe the solution you'd like**
I have begun the conversion to PyTorch, but I have some questions about your original implementation. Feedback on this conversion would be much appreciated. My implementation can be found here: https://github.com/aweaver1fandm/fetch/tree/pytorch
The implementation is not yet complete, but it has reached a point where I can't proceed further until I'm confident it's moving in the right direction.
- Is the transfer-training procedure (found in `fetch/transfer_train.py`) correct? I was unsure about the number of epochs to train for after a layer is unfrozen. I tried to implement what you wrote in the paper, but it was a bit vague about the layer-unfreezing step.
- Does the data preparation (`_data_from_h5` in `fetch/pulsar_data.py`) match what you originally did?
- In the paper you talk about unfreezing layers, but based on the figure in that paper the term "layer" is ambiguous and depends on the architecture of the CNN; for example, it means something different for DenseNet architectures than for VGG architectures. (a) Is that a correct interpretation of what you meant? (b) Do the custom unfreeze functions in `fetch/model.py` capture that meaning?
- Based on your code, it seems that Gaussian noise is added only to the frequency data. Is that correct?
- Do the model setups for transfer-training an individual CNN (see `TorchVisionModel` in `fetch/model.py`) look correct in terms of matching your modifications? For transfer training I deliberately used an output layer with one neuron instead of the two you used. This seems to be the right approach, since at this point it is a binary classification problem and the single output is the probability that the candidate is a pulsar.
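To make the first and third questions concrete, here is how I currently understand the staged unfreezing. This is only a sketch of my interpretation, not the paper's procedure: the tiny backbone stands in for a pretrained CNN, and the number of stages and the granularity of a "layer" are assumptions on my part.

```python
import torch
import torch.nn as nn

# Minimal sketch of staged unfreezing during transfer training.
# The small backbone stands in for a pretrained CNN (VGG, DenseNet, ...);
# stage count and "layer" granularity are illustrative assumptions.
backbone = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
head = nn.Linear(16, 1)  # freshly initialised single-logit head
model = nn.Sequential(backbone, head)

# Stage 0: freeze the whole backbone so only the new head trains.
for p in backbone.parameters():
    p.requires_grad = False

def unfreeze_last(backbone: nn.Sequential, n: int) -> None:
    """Unfreeze the last n parameterised modules of the backbone.

    What counts as a "layer" is architecture-dependent: for VGG it is
    natural to step over conv layers, for DenseNet over dense blocks.
    """
    param_mods = [m for m in backbone if list(m.parameters())]
    for m in param_mods[-n:]:
        for p in m.parameters():
            p.requires_grad = True

# Unfreeze one more backbone layer per stage, training for a fixed
# number of epochs between stages (training loop omitted).
for stage in range(1, 3):
    unfreeze_last(backbone, stage)
    # ... train for some fixed number of epochs here ...
```

If this staging (head first, then progressively deeper layers) matches what you did, the remaining question is just how many epochs each stage should run.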
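For the noise question, this is the behaviour I believe I see in your code, sketched as a standalone augmentation step: noise is added to the frequency-time array only, while the DM-time array passes through unchanged. The `std` value and tensor shapes are placeholder assumptions.

```python
import torch

# Sketch of the augmentation as I understand it: Gaussian noise on the
# frequency-time data only; DM-time data returned unchanged.
# std and shapes are assumptions for illustration.
def augment(freq_data: torch.Tensor, dm_data: torch.Tensor, std: float = 0.01):
    noisy_freq = freq_data + torch.randn_like(freq_data) * std
    return noisy_freq, dm_data

freq = torch.zeros(1, 256, 256)
dm = torch.zeros(1, 256, 256)
noisy_freq, same_dm = augment(freq, dm)
```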
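And for the output-layer question, this toy example shows why I think the single neuron is equivalent to your two-neuron setup: one logit trained with `BCEWithLogitsLoss` is the binary counterpart of two outputs trained with `CrossEntropyLoss`, and `sigmoid(logit)` is then the pulsar probability. The numbers are made up for illustration.

```python
import torch
import torch.nn as nn

# One-logit binary head: BCEWithLogitsLoss replaces the two-output
# softmax + CrossEntropyLoss setup; sigmoid gives the probability.
logits = torch.tensor([[0.3], [-1.2]])   # model output, shape (batch, 1)
targets = torch.tensor([[1.0], [0.0]])   # 1 = pulsar, 0 = non-pulsar
loss = nn.BCEWithLogitsLoss()(logits, targets)
probs = torch.sigmoid(logits)            # per-candidate pulsar probability
```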
I've pulled the training and test data from http://astro.phys.wvu.edu/fetch/. I do an 85/15 random split of the training data into training and validation sets, and use the test data for evaluation once the model is trained.
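The 85/15 split is done with `torch.utils.data.random_split`; the `TensorDataset` below is a dummy stand-in for the FETCH training candidates, and the seed is just to keep the split reproducible.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Dummy dataset standing in for the FETCH training candidates.
full = TensorDataset(torch.zeros(1000, 1), torch.zeros(1000))
n_train = int(0.85 * len(full))          # 850 of 1000 candidates
train_set, val_set = random_split(
    full,
    [n_train, len(full) - n_train],
    generator=torch.Generator().manual_seed(42),  # reproducible split
)
```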