
FETCH hangs in the middle of classification.

Open astrogewgaw opened this issue 11 months ago • 0 comments

We are using FETCH as part of the transient search pipeline for the SPOTLIGHT project (a commensal survey for FRBs/pulsars at the GMRT). We are currently facing an issue where FETCH often hangs in the middle of classification. This happens even though:

  1. The number of candidates is not very large.
  2. The model is run on an NVIDIA A100, with 80 GB of GPU memory.

Unfortunately, we have not been able to reproduce the bug reliably. It currently seems to happen at random and does not appear to be triggered by any particular candidate. We verified the latter by rerunning FETCH on the same candidate, which then completes successfully. Any idea what could be causing the issue? I am using Python 3.10.14 with tensorflow v2.15.0.post1 and keras v2.15.0, since higher versions just do not work. I am aware that the bug will be difficult to solve without a reproducible case (as far as we can see), but I thought I would still open an issue so that we can at least discuss the possible causes.
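Since the hang cannot be reproduced on demand, one option is to record where the process is stuck the next time it happens. Below is a minimal sketch using Python's standard `faulthandler` module, which can dump the stack of every thread once a timeout expires. The names `hang_watchdog` and `classify_batch`, the log path, and the timeout value are all hypothetical and are not part of FETCH; `classify_batch` only stands in for whatever call performs the classification in the pipeline.

```python
import faulthandler
import time
from contextlib import contextmanager


@contextmanager
def hang_watchdog(timeout_s, log_path):
    """Dump the tracebacks of all threads to log_path if the body of the
    `with` block is still running after timeout_s seconds (hypothetical helper)."""
    log_file = open(log_path, "a")
    # repeat=True writes a fresh dump every timeout_s seconds while the hang lasts.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log_file)
    try:
        yield
    finally:
        # Cancel the pending dump before closing the file it writes to.
        faulthandler.cancel_dump_traceback_later()
        log_file.close()


def classify_batch(batch):
    """Hypothetical stand-in for the real classification call."""
    time.sleep(0.01)
    return [0.0 for _ in batch]


# Example usage (assumed values; pick a timeout well above a normal run):
#
#     with hang_watchdog(timeout_s=600, log_path="fetch_hang.log"):
#         probabilities = classify_batch(candidates)
```

If the run hangs, the log shows the Python frame each thread was in at the time, which should at least indicate whether the process is stuck in data loading, in the prediction call, or elsewhere. The dump only covers Python frames, so a hang inside native GPU code appears as the Python call that entered it.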

astrogewgaw, Feb 19 '25 10:02