Add OASBUD Dataset (#2371)

Open sputney13 opened this issue 5 years ago • 0 comments

Add OASBUD Dataset

  • Dataset Name: oasbud
  • Issue Reference: https://github.com/tensorflow/datasets/issues/2371
  • dataset_info.json Gist: https://gist.github.com/sputney13/794db3b1c5e894eff00aed7cae86dd09

Description

The Open Access Series of Breast Ultrasonic Data (OASBUD) contains 200 ultrasound scans: two orthogonal scans for each of 100 breast tumors (52 malignant, 48 benign). The data were collected by the Department of Ultrasound at the Institute of Fundamental Technological Research of the Polish Academy of Sciences from patients at the Institute of Oncology (Warsaw). Each scan is stored as an RF data array of shape x by 510, where x depends on scan depth, and comes with a same-size mask that marks the tumor's region of interest. Each tumor was rated on the BI-RADS scale, which describes the probability of lesion malignancy, and classified as malignant or benign. Each of the 100 dataset entries contains the two scans, the two masks, the BI-RADS rating, and the classification. The b_mode configuration converts the two scans into B-mode images using the Hilbert transform, log compression, and dB thresholding method suggested by the dataset authors.
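As a rough illustration of the B-mode processing described above, here is a minimal sketch of envelope detection via the Hilbert transform, followed by log compression and dB thresholding. It is not the code from this PR. The function names `analytic_signal` and `rf_to_b_mode`, the 50 dB dynamic range, and the choice of axis 0 as the depth axis are all assumptions made for the example. The analytic signal is built with a plain NumPy FFT, the same construction that `scipy.signal.hilbert` uses.

```python
import numpy as np


def analytic_signal(rf, axis=0):
    """FFT-based analytic signal along `axis` (hypothetical helper)."""
    n = rf.shape[axis]
    spectrum = np.fft.fft(rf, axis=axis)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    shape = [1] * rf.ndim
    shape[axis] = n
    return np.fft.ifft(spectrum * weights.reshape(shape), axis=axis)


def rf_to_b_mode(rf, dynamic_range_db=50.0):
    """Convert an RF array (depth x scan lines) to a B-mode image in dB.

    Assumes axis 0 is depth; the 50 dB default is illustrative only.
    """
    # Envelope detection: magnitude of the analytic signal.
    envelope = np.abs(analytic_signal(np.asarray(rf, dtype=float), axis=0))
    # Avoid log(0) for silent samples.
    envelope = np.maximum(envelope, np.finfo(float).eps)
    # Log compression relative to the strongest echo (0 dB at the peak).
    log_envelope = 20.0 * np.log10(envelope / envelope.max())
    # dB thresholding: clamp everything below the chosen dynamic range.
    return np.clip(log_envelope, -dynamic_range_db, 0.0)
```

The result has the same shape as the input and values in the range from `-dynamic_range_db` to 0, so it can be rescaled to 8-bit grayscale for display if needed.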

Checklist

  • [x] Address all TODOs
  • [x] Add alphabetized import to subdirectory's __init__.py
  • [x] Run download_and_prepare successfully
  • [x] Add checksums file
  • [x] Properly cite in BibTeX format
  • [x] Add passing test(s)
  • [x] Add test data
  • [x] If using additional dependencies (e.g. scipy), use lazy_imports (if applicable)
  • [x] Add data generation script (not applicable)
  • [x] Lint code

sputney13, Sep 13 '20 22:09