
MONICA: Benchmarking on Long-tailed Medical Image Classification

Introduction

We build a unified codebase called Medical OpeN-source Long-taIled ClassifiCAtion (MONICA), which implements over 30 methods developed for long-tailed learning, evaluated on 12 long-tailed medical datasets covering 6 medical domains.

Installation

First, clone the repo and cd into the directory:

git clone this repo.
cd MONICA

Then create a conda env from the provided environment file and activate it:

conda env create -f MONICA.yml
conda activate MONICA

1. Prepare Datasets

Data Download

| Domain | Dataset | Link | License |
| --- | --- | --- | --- |
| Dermatology | ISIC2019 | https://challenge.isic-archive.com/data/#2019 | CC BY-NC |
| Dermatology | DermaMNIST | https://medmnist.com/ | CC BY-NC 4.0 |
| Ophthalmology | ODIR | https://www.kaggle.com/datasets/andrewmvd/ocular-disease-recognition-odir5k | not specified |
| Ophthalmology | RFMiD | https://ieee-dataport.org/open-access/retinal-fundus-multi-disease-image-dataset-rfmid | CC BY-NC 4.0 |
| Radiology | OrganA/C/SMNIST | https://medmnist.com/ | CC BY 4.0 |
| Radiology | CheXpert | https://stanfordmlgroup.github.io/competitions/chexpert/ | Stanford University Dataset Research Use Agreement |
| Pathology | PathMNIST | https://medmnist.com/ | CC BY 4.0 |
| Pathology | WILDS-Camelyon17 (In Progress) | https://wilds.stanford.edu/datasets/ | CC0 1.0 |
| Hematology | BloodMNIST | https://medmnist.com/ | CC BY 4.0 |
| Histology | TissueMNIST | https://medmnist.com/ | CC BY 4.0 |
| Gastroenterology | KVASIR | https://www.kaggle.com/datasets/meetnagadia/kvasir-dataset | ODbL 1.0 |

Please follow the licenses of the original datasets.

Image Preprocessing for Non-Image Datasets

If you download from the links above, most of the evaluated datasets are already provided in image format. The MedMNIST data, however, must be converted to individual image files for the unified training pipeline. Please see ./utils/process_medmnist.ipynb for reference.

import cv2

# Convert one MedMNIST split (here the training set) to individual JPEGs.
# `train_images`, `train_labels`, `target_dir`, and `dataset` are defined
# in ./utils/process_medmnist.ipynb.
for idx in range(train_images.shape[0]):
  img = train_images[idx]
  label = train_labels[idx][0]
  # MedMNIST arrays are RGB; swap channels so cv2.imwrite (which expects
  # BGR input) stores the colors correctly.
  img_bgr = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
  save_name = 'train_%s_%s.jpg' % (idx, label)
  cv2.imwrite('%s/%s' % (target_dir + dataset, save_name), img_bgr)

To match the pre-split set in ./numpy/medmnist, all the images in MedMNIST will be stored in the format {split}_{image_idx}_{label}.jpg.
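Since the split/index/label are all encoded in the filename, they can be recovered by parsing the name. A minimal sketch (this helper is hypothetical, not part of the MONICA codebase):

```python
import os

def parse_medmnist_name(filename):
    """Parse a saved MedMNIST image name of the form
    {split}_{image_idx}_{label}.jpg back into its parts.
    Hypothetical helper for illustration only."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    split, idx, label = stem.split('_')
    return split, int(idx), int(label)
```

For example, `parse_medmnist_name('train_12_3.jpg')` returns `('train', 12, 3)`.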

2. The Structure of Pre-Split Numpy Files

For the unified benchmark, we have pre-split the train/val/test sets, which are stored in numpy files.

There are four numpy files for each dataset/setting.

Take the ISIC dataset as an example: there are three split files, each named with the split 'train/val/test' followed by the imbalance ratio 100/200/500, e.g., train_100.npy.

We also have one dictionary file mapping image names to labels, e.g., dic.npy.

In ./dataset/dataloader.py, you can find how we access the image_name and its label:

# Load the {image_name: label} dictionary and the name list for this split.
self.np_dict = np.load(dict_path, allow_pickle=True).item()
self.np_path = np.load(np_path, allow_pickle=True)
self.img_name = []
self.img_label = []
for name in self.np_path:
  self.img_name.append(name)
  self.img_label.append(self.np_dict[name])  # look up the label by image name
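Long-tailed methods typically also need the per-class sample counts implied by this dictionary. A small sketch of computing the class distribution and imbalance ratio from it (this helper is illustrative, not part of the MONICA codebase):

```python
from collections import Counter

def class_distribution(np_dict):
    """Per-class sample counts and head/tail imbalance ratio from a
    {image_name: label} dictionary like the one stored in dic.npy.
    Illustrative helper only."""
    counts = Counter(np_dict.values())
    imbalance_ratio = max(counts.values()) / min(counts.values())
    return counts, imbalance_ratio
```

For a toy dictionary with four class-0 and two class-1 images, this returns counts `{0: 4, 1: 2}` and an imbalance ratio of 2.0.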

Customized Datasets

You can prepare your own datasets in the following format:

import numpy as np
import random

# Build a toy dataset: 1000 image names with random labels from 10 classes.
data_name = ['image_%s' % i for i in range(1000)]
label = np.random.randint(0, 10, 1000)
dic = {data_name[i]: label[i] for i in range(1000)}

# Shuffle, then take a 700/100/200 train/val/test split.
random.shuffle(data_name)
train = data_name[:700]
val = data_name[700:800]
test = data_name[800:1000]

# np.save appends the .npy extension automatically.
np.save('train', train)
np.save('val', val)
np.save('test', test)
np.save('dic', dic)

Then replace img_path, np_path, and dict_path in the config files with the paths to your numpy files.
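To verify that custom files were written in a form the dataloader can consume, a quick round-trip check can be run in a temporary directory (illustrative; it mirrors how ./dataset/dataloader.py loads the files):

```python
import os
import tempfile

import numpy as np

# Round-trip sanity check: save a name list and a name->label dictionary,
# then reload them the same way ./dataset/dataloader.py does.
names = ['image_0', 'image_1', 'image_2']
dic = {'image_0': 0, 'image_1': 1, 'image_2': 0}

with tempfile.TemporaryDirectory() as d:
    np.save(os.path.join(d, 'train'), names)
    np.save(os.path.join(d, 'dic'), dic)
    # The dictionary is stored as a 0-d object array, hence allow_pickle
    # and .item() on load.
    loaded_names = np.load(os.path.join(d, 'train.npy'), allow_pickle=True)
    loaded_dic = np.load(os.path.join(d, 'dic.npy'), allow_pickle=True).item()

labels = [loaded_dic[n] for n in loaded_names]
```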

3. Start Training

Config Files

Take isic_GCL_2nd as an example (only the important hyperparameters are shown).

general: # Define some general parameters.
  img_size: 224
  seed: 1
  num_classes: 8
  dataset_name: 'isic'
  method: 'GCL_2nd'

model:
  if_resume: True # If True, the model will load the weights from resume_path.
  resume_path: './outputs/isic/100_GCL_224_resnet50_True_256_1_50/best.pt' # Ignored if if_resume is False.
  if_freeze_encoder: True # Ignored if if_resume is False.
  model_name: resnet50
  pretrained: True # Whether to load the default pretrained weights provided by timm (from Hugging Face).


datasets:
  sampler: GCL # Sampler strategy.
  img_path: '/mnt/sda/datasets/isic2019/train/' # The image is loaded as `img_path + name stored in np_path`. You can leave this blank if the full path is stored in np_path.
  train:
    np_path: './numpy/isic/train_100.npy'
    dict_path: './numpy/isic/dic.npy' # Make sure the image names are consistent with the keys stored in the dict file.
  val:
    np_path: './numpy/isic/val_100.npy'
    dict_path: './numpy/isic/dic.npy'
  test:
    np_path: './numpy/isic/test_100.npy'
    dict_path: './numpy/isic/dic.npy'
  transforms:
    train: 'strong'
    val_test: 'crop'
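The if_resume / if_freeze_encoder options support two-stage methods (e.g., the GCL second stage): reload the stage-one checkpoint and train only a new classifier head on frozen features. A minimal PyTorch sketch of the freezing logic (the toy encoder and dimensions below are placeholders, not MONICA's actual model code):

```python
import torch
import torch.nn as nn

class FrozenEncoderClassifier(nn.Module):
    """Sketch of the if_freeze_encoder behavior: gradients flow only
    through a fresh linear head, never through the pretrained encoder."""
    def __init__(self, encoder, feat_dim, num_classes):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False  # freeze stage-one weights
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        with torch.no_grad():
            feats = self.encoder(x)
        return self.head(feats)

# Toy stand-in encoder; a real run would load e.g. a ResNet-50 from best.pt.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16))
model = FrozenEncoderClassifier(encoder, feat_dim=16, num_classes=8)
logits = model(torch.randn(2, 3, 8, 8))  # shape: (batch, num_classes)
```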

Support Methods

| Methods | Paper (link TBD) | Official Code (TBD) |
| --- | --- | --- |
| ERM (Cross-entropy) | NA | |
| Re-sampling | NA | |
| Re-weighting | NA | |
| MixUp | mixup: Beyond empirical risk minimization | |
| Focal Loss | Focal loss for dense object detection | |
| Classifier Re-training | Decoupling representation and classifier for long-tailed recognition | |
| T-Norm | Decoupling representation and classifier for long-tailed recognition | |
| LWS | Decoupling representation and classifier for long-tailed recognition | |
| KNN | Decoupling representation and classifier for long-tailed recognition | |
| CBLoss | Class-balanced loss based on effective number of samples | |
| CBLoss_Focal | Class-balanced loss based on effective number of samples | |
| LADELoss | Disentangling label distribution for long-tailed visual recognition | |
| LDAM | Learning imbalanced datasets with label-distribution-aware margin loss | |
| Logits Adjust Loss | Long-tail learning via logit adjustment | |
| Logits Adjust Posthoc | Long-tail learning via logit adjustment | |
| PriorCELoss | Disentangling label distribution for long-tailed visual recognition | |
| RangeLoss | Range loss for deep face recognition with long-tailed training data | |
| SEQLLoss | Equalization loss for long-tailed object recognition | |
| VSLoss | Label-imbalanced and group-sensitive classification under overparameterization | |
| WeightedSoftmax | Deep long-tailed learning: A survey | |
| BalancedSoftmax | Balanced meta-softmax for long-tailed visual recognition | |
| De-Confound | Long-tailed classification by keeping the good and removing the bad momentum causal effect | |
| DisAlign | Distribution alignment: A unified framework for long-tail visual recognition | |
| GCL (first stage) | Long-tailed visual recognition via Gaussian clouded logit adjustment | |
| GCL (second stage) | Long-tailed visual recognition via Gaussian clouded logit adjustment | |
| MiSLAS | Improving calibration for long-tailed recognition | |
| RSG | RSG: A simple but effective module for learning imbalanced datasets | |
| SADE | Long-tailed recognition by routing diverse distribution-aware experts | |
| SAM | Sharpness-aware minimization for efficiently improving generalization | |
| BBN | BBN: Bilateral-branch network with cumulative learning for long-tailed visual recognition | |

| Methods | Paper (link TBD) | Official Code (TBD) |
| --- | --- | --- |
| BYOL | Bootstrap your own latent: A new approach to self-supervised learning | |
| MOCOv2 | Improved baselines with momentum contrastive learning | |
| MAE (RetFound) | RETFound: a foundation model for generalizable disease detection from retinal images | |
| CAEv2 (PanDerm) | A general-purpose multimodal foundation model for dermatology | |
| DINOv2 (TBD) | DINOv2: Learning robust visual features without supervision | |

Support Backbones

| Backbones | Paper |
| --- | --- |
| ResNet | Deep residual learning for image recognition |
| ViT | An image is worth 16x16 words: Transformers for image recognition at scale |
| Swin Transformer | Swin transformer: Hierarchical vision transformer using shifted windows |
| ConvNext | A convnet for the 2020s |
| RetFound | RETFound: a foundation model for generalizable disease detection from retinal images |
| PanDerm | A general-purpose multimodal foundation model for dermatology |

Note: If you apply foundation models, please modify the model path and use the backbone of the foundation model as the model name, e.g., ViT for RetFound.
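Loading a foundation-model checkpoint into a plain backbone usually relies on `load_state_dict(..., strict=False)`, so that pretraining-only keys (e.g., a MAE decoder) are skipped while matching weights are copied. A minimal sketch with a toy module standing in for the real backbone (the checkpoint keys below are fabricated for illustration):

```python
import torch
import torch.nn as nn

# Toy backbone standing in for e.g. a ViT; the Linear layer's parameters
# are named '1.weight' and '1.bias' inside the Sequential.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 4 * 4, 8))

# Simulated foundation checkpoint: matching weights plus one extra
# pretraining-only key that the backbone does not have.
ckpt = {
    '1.weight': torch.zeros(8, 48),
    '1.bias': torch.zeros(8),
    'decoder.head.weight': torch.ones(1),
}

# strict=False copies the matching keys and reports the rest instead of
# raising an error.
missing, unexpected = backbone.load_state_dict(ckpt, strict=False)
```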

Script Training

For non-MedMNIST training, use a command like:

python main.py --config ./configs/isic/100/isic_ERM.yml

For MedMNIST training, please see the ./train.sh script for reference.

Citation

If you find this code useful, we kindly request that you cite our papers:

@article{ju2024monica,
  title={MONICA: Benchmarking on Long-tailed Medical Image Classification},
  author={Ju, Lie and Yan, Siyuan and Zhou, Yukun and Nan, Yang and Xing, Xiaodan and Duan, Peibo and Ge, Zongyuan},
  journal={arXiv preprint arXiv:2410.02010},
  year={2024}
}
@inproceedings{ju2022flexible,
  title={Flexible sampling for long-tailed skin lesion classification},
  author={Ju, Lie and Wu, Yicheng and Wang, Lin and Yu, Zhen and Zhao, Xin and Wang, Xin and Bonnington, Paul and Ge, Zongyuan},
  booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
  pages={462--471},
  year={2022},
  organization={Springer}
}

Disclaimer

This repository is provided for research purposes only. The datasets used in this project are either publicly available under their respective licenses or referenced from external sources. Redistribution of data files included in this repository is not permitted unless explicitly allowed by the original dataset licenses.

Data Usage

Please ensure that you comply with the licensing terms of the datasets before using them. The authors are not responsible for any misuse of the data. If you are using any dataset provided or linked in this repository, it is your responsibility to adhere to the license terms provided by the dataset creators.

For questions or concerns, please contact the repository maintainers.