maes_microscopy

Request for details on JUMP-CP preprocessing

Open jasperhyp opened this issue 1 year ago • 0 comments

Hi! Thank you for sharing this excellent model and open-sourcing the RxRx dataset and associated resources. I found the paper and the Hugging Face documentation very insightful. While reviewing these materials, I noticed that the open-source model was trained on JUMP-CP data alongside the RxRx datasets, with random image crops used during training.

Given that JUMP-CP images originate from a different institute and imaging system than the RxRx data, could you please provide more details on how the JUMP-CP data was processed before training? For example, was it corrected for variations in background intensity, thresholded, or processed in some other way? After that, how were the images brought to 256×256 while maintaining a similar magnification: were they downsampled, or just randomly cropped at the original resolution? Understanding this would be very valuable for my research, as I aim to use your model for inference on another Cell Painting dataset that may differ even more from the training data.
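To make the question concrete, the two options I can imagine could be sketched roughly as below. This is only an illustration using NumPy, not a description of how the model was actually trained: the function names `random_crop` and `downsample_mean`, the `(C, H, W)` array layout, and the downsampling factor of 2 are all my own assumptions.

```python
import numpy as np


def random_crop(img, size=256, rng=None):
    """Option A: take a size x size window at native resolution.

    `img` is assumed to be a (C, H, W) array; the layout is an assumption.
    """
    rng = np.random.default_rng() if rng is None else rng
    _, h, w = img.shape
    if h < size or w < size:
        raise ValueError("image is smaller than the requested crop")
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    return img[:, top:top + size, left:left + size]


def downsample_mean(img, factor=2):
    """Option B (first step): average factor x factor pixel blocks.

    The factor is hypothetical; it would depend on the ratio of pixel
    sizes between the two imaging systems. Returns a float array.
    """
    c, h, w = img.shape
    # Trim edges so both dimensions are divisible by the factor.
    h2, w2 = (h // factor) * factor, (w // factor) * factor
    blocks = img[:, :h2, :w2].reshape(
        c, h2 // factor, factor, w2 // factor, factor
    )
    return blocks.mean(axis=(2, 4))
```

With option A the model sees pixels at the source magnification; with option B one would call `downsample_mean` first and then `random_crop` on the result. Knowing which of these (if either) matches your pipeline is what I am after.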

Additionally, if possible, could you share a Python script or workflow for mapping raw Cell Painting images (e.g., TIFF files such as those in JUMP-CP) to the model's input format? This would greatly facilitate applying your model to similar datasets.
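As a starting point, here is a minimal sketch of the kind of mapping I have in mind, assuming each channel has already been read from its TIFF into a 2-D NumPy array (for example with a TIFF reader of your choice). The percentile bounds, the `uint8` output, the channel order, and the helper names `to_uint8` and `stack_channels` are all guesses on my part, not something taken from your code or documentation.

```python
import numpy as np


def to_uint8(channel, low_pct=0.1, high_pct=99.9):
    """Rescale one 2-D channel to uint8 by clipping at two percentiles.

    The percentile bounds are assumptions, not documented values.
    """
    lo, hi = np.percentile(channel, [low_pct, high_pct])
    if hi <= lo:
        # Constant (or near-constant) channel: nothing to rescale.
        return np.zeros(channel.shape, dtype=np.uint8)
    scaled = (channel.astype(np.float32) - lo) / (hi - lo)
    return (np.clip(scaled, 0.0, 1.0) * 255).round().astype(np.uint8)


def stack_channels(channels):
    """Stack a list of 2-D channel images into a (C, H, W) uint8 array.

    The channel order is whatever order the caller passes in; the order
    the model expects is one of the things I would like to confirm.
    """
    return np.stack([to_uint8(ch) for ch in channels], axis=0)
```

If the real preprocessing differs (for instance, illumination correction before rescaling, or a different intensity normalization), a pointer to the correct steps would be very helpful.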

Thank you for your time and consideration. I appreciate your efforts in making your work accessible!

jasperhyp · Jan 08 '25 02:01