python-gdcm

Error running example ConvertMPL.py

Open: bal-agates opened this issue 1 year ago • 5 comments

I downloaded the example ConvertMPL.py from the example link in the readme. This example appears to be Python 2 code, so I first converted it to Python 3 with these changes:

--- a/dicom/ConvertMPL.py
+++ b/dicom/ConvertMPL.py
@@ -52,14 +52,14 @@ def gdcm_to_numpy(image):
     """Converts a GDCM image to a numpy array.
     """
     pf = image.GetPixelFormat().GetScalarType()
-    print 'pf', pf
-    print image.GetPixelFormat().GetScalarTypeAsString()
+    print('pf', pf)
+    print(image.GetPixelFormat().GetScalarTypeAsString())
     assert pf in get_gdcm_to_numpy_typemap().keys(), \
            "Unsupported array type %s"%pf
     d = image.GetDimension(0), image.GetDimension(1)
-    print 'Image Size: %d x %d' % (d[0], d[1])
+    print('Image Size: %d x %d' % (d[0], d[1]))
     dtype = get_numpy_array_type(pf)
-    gdcm_array = image.GetBuffer()
+    gdcm_array = image.GetBuffer().encode("latin-1")
     ## use float for accurate scaling
     result = numpy.frombuffer(gdcm_array, dtype=dtype).astype(float)
     ## optional gamma scaling

The print() changes were straightforward. image.GetBuffer() returns a Python str, while numpy.frombuffer expects a bytes-like object. I chose the "latin-1" codec for the str->bytes conversion because I believe it maps ord values 0..255 to bytes 0x00..0xFF. However, in my test case image.GetBuffer()[14528] has an ord of 56461 (0xDC8D), which latin-1 cannot encode, so the call raises an exception. If the gdcm buffer is a char*, I am not sure how this is possible. I looked through some of the gdcm C++ code, and in places there is dynamic typing that might be difficult for SWIG to handle?
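One way to narrow this down is to scan the str for characters that latin-1 cannot encode and look at where their code points fall. The sketch below does not use gdcm at all: it builds a stand-in string by decoding a few raw bytes with errors="surrogateescape", on the assumption (not confirmed here) that the SWIG wrapper produces the str in a similar way. The helper name find_non_latin1 is made up for illustration. The value 56461 is 0xDC8D, which lies in the range U+DC80..U+DCFF that Python uses for escaped undecodable bytes, so it would correspond to an original byte of 0x8D.

```python
def find_non_latin1(text, limit=5):
    """Return up to `limit` (index, code point) pairs that latin-1 cannot encode."""
    hits = []
    for index, ch in enumerate(text):
        if ord(ch) > 0xFF:
            hits.append((index, ord(ch)))
            if len(hits) >= limit:
                break
    return hits


# Stand-in for image.GetBuffer(): ASSUMPTION that the wrapper decodes the
# raw char* data as UTF-8 with the surrogateescape error handler.
raw = bytes([0x00, 0x41, 0x8D, 0x7F])
text = raw.decode("utf-8", errors="surrogateescape")

hits = find_non_latin1(text)
for index, code_point in hits:
    # Code points U+DC80..U+DCFF are escaped bytes 0x80..0xFF.
    is_escaped_byte = 0xDC80 <= code_point <= 0xDCFF
    original_byte = code_point - 0xDC00 if is_escaped_byte else None
    print(index, hex(code_point), is_escaped_byte, original_byte)
```

If every offending character falls in that range, the str is carrying raw bytes rather than real text, and the choice of codec for the reverse conversion matters more than the data itself.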

I cannot provide my test DICOM image because it contains personal data. Using a C++ program I was successful at extracting a raw image and then converting the raw to PNG. Therefore I believe the input data is valid. I believe this test image in DICOM is JPEG 2000 Compression (Lossless only), 1036x825, dtype int8, buffer length 854700.
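The reported numbers are at least self-consistent: 1036 x 825 pixels at one byte per pixel gives exactly 854700 bytes. A minimal sketch of that sanity check follows. The helper name check_buffer_size is hypothetical, one sample per pixel is assumed, and treating 1036 as the number of columns is also an assumption.

```python
import numpy


def check_buffer_size(n_bytes, columns, rows, dtype, samples_per_pixel=1):
    """Compare a buffer length with the size implied by the image geometry."""
    expected = columns * rows * samples_per_pixel * numpy.dtype(dtype).itemsize
    return expected, n_bytes == expected


# Values reported in this issue: 1036x825, int8, buffer length 854700.
expected, matches = check_buffer_size(854700, 1036, 825, numpy.int8)
print(expected, matches)

# Placeholder pixel data of the right size, reshaped to (rows, columns).
# ASSUMPTION: 1036 is the width (columns) and 825 is the height (rows).
pixels = numpy.zeros(854700, dtype=numpy.int8).reshape(825, 1036)
print(pixels.shape)
```

A mismatch here would point at the pixel format or dimensions rather than at the str-to-bytes conversion.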

Any suggestions on how to debug this? Different codec? Different way to convert str to bytes?

bal-agates Jan 07 '25 23:01

I forgot my system info.

macOS 14.7.2 on Apple M1 Pro; Python 3.12.8; python-gdcm 3.0.24.1 installed with pip; gdcm 3.0.22_2 installed with MacPorts (what I linked against for the C++ test code).

bal-agates Jan 08 '25 00:01

Hi @bal-agates. Try using image.GetBuffer().encode("utf-8", errors="surrogateescape")
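A short sketch of why this can work, assuming the str was produced by decoding the raw bytes as UTF-8 with the surrogateescape error handler (an assumption about the wrapper, not something confirmed in this thread). Under that assumption, encoding with the same codec and error handler reverses the decoding exactly, including for byte runs that happen to form valid UTF-8 sequences, which latin-1 could not undo. The example uses made-up bytes instead of a real GDCM buffer.

```python
import numpy

# Made-up raw data: every possible byte value, plus a valid two-byte
# UTF-8 sequence (0xC3 0xA9) that decodes to a single character.
raw = bytes(range(256)) + "é".encode("utf-8")

# Simulated image.GetBuffer() result under the stated assumption.
text = raw.decode("utf-8", errors="surrogateescape")

# The suggested conversion back to bytes.
recovered = text.encode("utf-8", errors="surrogateescape")

# numpy.frombuffer accepts the recovered bytes object directly.
pixels = numpy.frombuffer(recovered, dtype=numpy.uint8)
print(recovered == raw, pixels.size)
```

In ConvertMPL.py the dtype would come from get_numpy_array_type(pf), as in the original example, rather than being fixed to uint8.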

tfmoraes Jan 08 '25 18:01

That worked. Thanks so much!!!

Along with the Python 2 -> 3 conversion changes, I also had to make changes to the Matplotlib code for this to work. If I were to submit updated Python examples, where should I do that? It looks like the master repository is on SourceForge. I have no account there and have only downloaded from SourceForge in the past. For now I might just make my own GitHub repository for updated examples.

bal-agates Jan 09 '25 15:01

You can also submit a PR in the GDCM GitHub repository: https://github.com/malaterre/GDCM/pulls. I submitted one PR there some months ago.

tfmoraes Jan 10 '25 18:01

I am working on updating the Python examples but am not ready for a pull request. I have started work in python-gdcm-exaples. So far I have only worked on ConvertMPL.py, which I believe is mostly working except for palette images. I have included some test data in this project, which might cause problems later with a pull request?

bal-agates Jan 14 '25 03:01