Very Slow Loading of NIfTI (.nii.gz) Files Compared to SimpleITK
Describe the bug There is a significant performance gap between MONAI and SimpleITK when loading NIfTI (.nii.gz) files. In my own benchmark I have found:
- MONAI average load time: 2.0321 seconds
- SimpleITK average load time: 0.1135 seconds
Using cProfile I notice that for monai, the bulk of the loading time is consumed by numpy.asanyarray. Results are the following when running a loop with monai.transforms.LoadImage 10 times:
MONAI Profiling Results
| ncalls | tottime | percall | cumtime | percall | filename:lineno(function) |
|---|---|---|---|---|---|
| 10 | 0.000 | 0.000 | 20.231 | 2.023 | benchmark.py:11(load_with_monai) |
| 10 | 0.016 | 0.002 | 20.212 | 2.021 | monai/transforms/io/array.py:229(call) |
| 10 | 0.000 | 0.000 | 19.959 | 1.996 | monai/data/image_reader.py:927(get_data) |
| 10 | 0.000 | 0.000 | 19.956 | 1.996 | monai/data/image_reader.py:1018(_get_array_data) |
| 60/40 | 17.903 | 0.298 | 19.956 | 0.499 | {built-in method numpy.asanyarray} |
| 10 | 0.000 | 0.000 | 2.053 | 0.205 | nibabel/arrayproxy.py:436(array) |
| 10 | 0.000 | 0.000 | 2.053 | 0.205 | nibabel/arrayproxy.py:413(_get_scaled) |
| 10 | 0.000 | 0.000 | 2.052 | 0.205 | nibabel/arrayproxy.py:389(_get_unscaled) |
| 10 | 0.041 | 0.004 | 2.050 | 0.205 | nibabel/volumeutils.py:392(array_from_file) |
| 10 | 0.052 | 0.005 | 2.008 | 0.201 | {method 'readinto' of '_io._BufferedIOBase' objects} |
SimpleITK Profiling Results
| ncalls | tottime | percall | cumtime | percall | filename:lineno(function) |
|---|---|---|---|---|---|
| 10 | 0.000 | 0.000 | 1.139 | 0.114 | benchmark.py:15(load_with_sitk) |
| 10 | 0.000 | 0.000 | 1.076 | 0.108 | SimpleITK/extra.py:340(ReadImage) |
| 10 | 0.062 | 0.006 | 0.062 | 0.006 | {built-in method numpy.array} |
| 10 | 0.000 | 0.000 | 0.001 | 0.000 | SimpleITK/extra.py:259(GetArrayViewFromImage) |
| 10 | 0.000 | 0.000 | 0.001 | 0.000 | SimpleITK/SimpleITK.py:8443(init) |
| 10 | 0.001 | 0.000 | 0.001 | 0.000 | {built-in method SimpleITK._SimpleITK.new_ImageFileReader} |
| 10 | 0.000 | 0.000 | 0.000 | 0.000 | SimpleITK/extra.py:156(_get_numpy_dtype) |
Note: Times are in seconds. 'ncalls' is the number of calls, 'tottime' is the total time spent in the function (excluding time in subfunctions), 'percall' is the average time spent per call, and 'cumtime' is the cumulative time spent in the function and all subfunctions.
As shown in the profiling results, the majority of time in MONAI is spent in numpy operations, particularly numpy.asanyarray, which takes about 19.956 seconds cumulatively. In contrast, SimpleITK's ReadImage function takes only 1.076 seconds.
To Reproduce
import time
import os
import numpy as np
import monai
import SimpleITK as sitk
def load_with_monai(file_path):
start_time = time.time()
image = monai.transforms.LoadImage()(file_path)
end_time = time.time()
return image, end_time - start_time
def load_with_sitk(file_path):
start_time = time.time()
image = sitk.ReadImage(file_path)
array = sitk.GetArrayFromImage(image)
end_time = time.time()
return array, end_time - start_time
def run_benchmark(file_path, num_iterations=10):
monai_times = []
sitk_times = []
for _ in range(num_iterations):
_, monai_time = load_with_monai(file_path)
monai_times.append(monai_time)
_, sitk_time = load_with_sitk(file_path)
sitk_times.append(sitk_time)
print(f"MONAI average load time: {np.mean(monai_times):.4f} seconds")
print(f"SimpleITK average load time: {np.mean(sitk_times):.4f} seconds")
if __name__ == "__main__":
file_path = "path/to/file.nii.gz"
if not os.path.exists(file_path):
print(f"File not found: {file_path}")
else:
run_benchmark(file_path)
Environment
MONAI version: 1.3.2
Numpy version: 1.26.4
Pytorch version: 2.4.1
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: 59a7211070538586369afd4a01eca0a7fe2e742e
MONAI __file__: C:\Users\<username>\miniconda3\envs\panorama\Lib\site-packages\monai\__init__.py
Optional dependencies:
Pytorch Ignite version: NOT INSTALLED or UNKNOWN VERSION.
ITK version: NOT INSTALLED or UNKNOWN VERSION.
Nibabel version: 5.2.1
scikit-image version: NOT INSTALLED or UNKNOWN VERSION.
scipy version: 1.13.1
Pillow version: 10.4.0
Tensorboard version: NOT INSTALLED or UNKNOWN VERSION.
gdown version: NOT INSTALLED or UNKNOWN VERSION.
TorchVision version: 0.19.1
tqdm version: 4.66.5
lmdb version: NOT INSTALLED or UNKNOWN VERSION.
psutil version: NOT INSTALLED or UNKNOWN VERSION.
pandas version: NOT INSTALLED or UNKNOWN VERSION.
einops version: NOT INSTALLED or UNKNOWN VERSION.
transformers version: NOT INSTALLED or UNKNOWN VERSION.
mlflow version: NOT INSTALLED or UNKNOWN VERSION.
pynrrd version: NOT INSTALLED or UNKNOWN VERSION.
clearml version: NOT INSTALLED or UNKNOWN VERSION.
For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
================================
Printing system config...
================================
`psutil` required for `print_system_info`
================================
Printing GPU config...
================================
Num GPUs: 1
Has CUDA: True
CUDA version: 12.4
cuDNN enabled: True
NVIDIA_TF32_OVERRIDE: None
TORCH_ALLOW_TF32_CUBLAS_OVERRIDE: None
cuDNN version: 90100
Current device: 0
Library compiled for CUDA architectures: ['sm_50', 'sm_60', 'sm_61', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
GPU 0 Name: NVIDIA GeForce RTX 4070
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 46
GPU 0 Total memory (GB): 12.0
GPU 0 CUDA capability (maj.min): 8.9
Additional context Add any other context about the problem here.
import time import os import numpy as np import monai import SimpleITK as sitk import cProfile
Custom LoadImage with prefetching
class FastMONAILoadImage(monai.transforms.LoadImage): def call(self, filename): data = super().call(filename) # Directly extract the image array to bypass unnecessary numpy operations return np.asarray(data[0]) # Only return the image array
def load_with_monai(file_path): start_time = time.time() image = FastMONAILoadImage()(file_path) # Use the optimized loader end_time = time.time() return image, end_time - start_time
def load_with_sitk(file_path): start_time = time.time() image = sitk.ReadImage(file_path) array = sitk.GetArrayFromImage(image) end_time = time.time() return array, end_time - start_time
def run_benchmark(file_path, num_iterations=10): monai_times = [] sitk_times = []
# Profile the MONAI loading process
cProfile.runctx('load_with_monai(file_path)', globals(), locals(), filename="monai_profile.prof")
for _ in range(num_iterations):
_, monai_time = load_with_monai(file_path)
monai_times.append(monai_time)
_, sitk_time = load_with_sitk(file_path)
sitk_times.append(sitk_time)
print(f"MONAI average load time: {np.mean(monai_times):.4f} seconds")
print(f"SimpleITK average load time: {np.mean(sitk_times):.4f} seconds")
if name == "main": file_path = "path/to/file.nii.gz" if not os.path.exists(file_path): print(f"File not found: {file_path}") else: run_benchmark(file_path)
Could you please explain why does 'ncalls' for numpy.asanyarray in MONAI profiling results is shown as 60/40, what does this indicate?
The ncalls value shown as 60/40 in MONAI profiling results means the function numpy.asanyarray was called 60 times in total. Out of these, 40 were direct calls (where it was explicitly invoked), and the remaining 20 were indirect calls (triggered by other functions). On Mon, 30 Dec 2024 at 7:41 PM, 罗崚骁 (LUO Lingxiao) @.***> wrote:
Could you please explain why does 'ncalls' for numpy.asanyarray in MONAI profiling results is shown as 60/40, what does this indicate?
— Reply to this email directly, view it on GitHub https://github.com/Project-MONAI/MONAI/issues/8107#issuecomment-2565534449, or unsubscribe https://github.com/notifications/unsubscribe-auth/ATOZRTPTAWOHDYKO2ERLVET2IFIB3AVCNFSM6AAAAABOY6FZ7KVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKNRVGUZTINBUHE . You are receiving this because you commented.Message ID: @.***>
You can closed the Issue by your end
Hi @ericspod and @virginiafdez, Could you please take a look at this issue? It's the one I mentioned in the meeting but couldn't find at the time. Thanks.
Yes, I will check.
On Wed, 26 Feb 2025 at 10:37 AM, YunLiu @.***> wrote:
Hi @ericspod https://github.com/ericspod and @virginiafdez https://github.com/virginiafdez, Could you please take a look at this issue? It's the one I mentioned in the meeting but couldn't find at the time. Thanks.
— Reply to this email directly, view it on GitHub https://github.com/Project-MONAI/MONAI/issues/8107#issuecomment-2683906363, or unsubscribe https://github.com/notifications/unsubscribe-auth/ATOZRTIE5442GV4JWGLKSTT2RVDYFAVCNFSM6AAAAABOY6FZ7KVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOBTHEYDMMZWGM . You are receiving this because you commented.Message ID: @.***> [image: KumoLiu]KumoLiu left a comment (Project-MONAI/MONAI#8107) https://github.com/Project-MONAI/MONAI/issues/8107#issuecomment-2683906363
Hi @ericspod https://github.com/ericspod and @virginiafdez https://github.com/virginiafdez, Could you please take a look at this issue? It's the one I mentioned in the meeting but couldn't find at the time. Thanks.
— Reply to this email directly, view it on GitHub https://github.com/Project-MONAI/MONAI/issues/8107#issuecomment-2683906363, or unsubscribe https://github.com/notifications/unsubscribe-auth/ATOZRTIE5442GV4JWGLKSTT2RVDYFAVCNFSM6AAAAABOY6FZ7KVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOBTHEYDMMZWGM . You are receiving this because you commented.Message ID: @.***>
The issue was addressed while ago for the similar work can you please the close the call
On Wed, 26 Feb 2025 at 11:04 AM, Kartik Shastrakar < @.***> wrote:
Yes, I will check.
On Wed, 26 Feb 2025 at 10:37 AM, YunLiu @.***> wrote:
Hi @ericspod https://github.com/ericspod and @virginiafdez https://github.com/virginiafdez, Could you please take a look at this issue? It's the one I mentioned in the meeting but couldn't find at the time. Thanks.
— Reply to this email directly, view it on GitHub https://github.com/Project-MONAI/MONAI/issues/8107#issuecomment-2683906363, or unsubscribe https://github.com/notifications/unsubscribe-auth/ATOZRTIE5442GV4JWGLKSTT2RVDYFAVCNFSM6AAAAABOY6FZ7KVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOBTHEYDMMZWGM . You are receiving this because you commented.Message ID: @.***> [image: KumoLiu]KumoLiu left a comment (Project-MONAI/MONAI#8107) https://github.com/Project-MONAI/MONAI/issues/8107#issuecomment-2683906363
Hi @ericspod https://github.com/ericspod and @virginiafdez https://github.com/virginiafdez, Could you please take a look at this issue? It's the one I mentioned in the meeting but couldn't find at the time. Thanks.
— Reply to this email directly, view it on GitHub https://github.com/Project-MONAI/MONAI/issues/8107#issuecomment-2683906363, or unsubscribe https://github.com/notifications/unsubscribe-auth/ATOZRTIE5442GV4JWGLKSTT2RVDYFAVCNFSM6AAAAABOY6FZ7KVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOBTHEYDMMZWGM . You are receiving this because you commented.Message ID: @.***>
I think this is still a relevant issue. I'm not seeing such dramatic differences with current versions of libraries, but MONAI still takes twice as long to read a file. There is a difference that LoadImage produces a tensor with the dimensions unchanged from the file, while SITK produces numpy with dimensions changed by default. This doesn't explain the difference seen however so we should see about updating how we do loading to match what the newer Cupy based code does just with Numpy.
I haven’t tried the latest version as i am focused on the other project so will try to look at it when i get free time
On Wed, 26 Feb 2025 at 8:11 PM, Eric Kerfoot @.***> wrote:
I think this is still a relevant issue. I'm not seeing such dramatic differences with current versions of libraries, but MONAI still takes twice as long to read a file. There is a difference that LoadImage produces a tensor with the dimensions unchanged from the file, while SITK produces numpy with dimensions changed by default. This doesn't explain the difference seen however so we should see about updating how we do loading to match what the newer Cupy based code does just with Numpy.
— Reply to this email directly, view it on GitHub https://github.com/Project-MONAI/MONAI/issues/8107#issuecomment-2685248924, or unsubscribe https://github.com/notifications/unsubscribe-auth/ATOZRTOG3NLCK5B3FC55LXT2RXHBNAVCNFSM6AAAAABOY6FZ7KVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOBVGI2DQOJSGQ . You are receiving this because you commented.Message ID: @.***> [image: ericspod]ericspod left a comment (Project-MONAI/MONAI#8107) https://github.com/Project-MONAI/MONAI/issues/8107#issuecomment-2685248924
I think this is still a relevant issue. I'm not seeing such dramatic differences with current versions of libraries, but MONAI still takes twice as long to read a file. There is a difference that LoadImage produces a tensor with the dimensions unchanged from the file, while SITK produces numpy with dimensions changed by default. This doesn't explain the difference seen however so we should see about updating how we do loading to match what the newer Cupy based code does just with Numpy.
— Reply to this email directly, view it on GitHub https://github.com/Project-MONAI/MONAI/issues/8107#issuecomment-2685248924, or unsubscribe https://github.com/notifications/unsubscribe-auth/ATOZRTOG3NLCK5B3FC55LXT2RXHBNAVCNFSM6AAAAABOY6FZ7KVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOBVGI2DQOJSGQ . You are receiving this because you commented.Message ID: @.***>