dicomParser Internationalization support

I cannot find anything related to internationalization support, either in the documentation or in the code itself. What is the status of internationalization support in dicomParser?

I have not checked, but I suspect some strings won't play well with JSON, which is limited to UTF-8/UTF-16/UTF-32 strings.

Ref:

Table D.6.2-1. Supported Specific Character Set Defined Terms

May 11 '20 09:05 malaterre

Here is what I get using the DICOM Dump to JSON live example:

while it should look like:

ref:

Internationalized character set test DICOM images

May 11 '20 09:05 malaterre

readFixedString does not seem to check the value of SpecificCharacterSet (0008,0005), as seen at:

https://github.com/cornerstonejs/dicomParser/blob/82573d94342e12b4bae4fcd2e93f91bb061474cf/src/byteArrayParser.js#L29-L37

refs:

http://dicom.nema.org/medical/dicom/current/output/chtml/part03/sect_C.12.html#table_C.12-1
http://dicom.nema.org/medical/dicom/current/output/chtml/part03/sect_C.12.html#sect_C.12.1.1.2

May 11 '20 14:05 malaterre

@malaterre Currently dicomParser itself doesn't do any character set decoding. However, if you need it now, you can pair dicomParser with the dicom-character-set library that I wrote.

May 11 '20 14:05 yagni

@yagni That looks pretty promising. One thing I still fail to understand. The original string function from dicomParser seems to be doing the following:

Take each raw byte
Consider it as an ISO-8859-1 character and turn it into UTF-16
Return a truncated string (stop at the first byte === 0)

So I am wondering what the expected input to your library is. Can I directly pass the output of the string element function?
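The three steps listed above can be sketched roughly as follows. This is a hypothetical re-creation for illustration, not dicomParser's actual source; the function name `latin1StringUntilNull` and the sample bytes are assumptions.

```javascript
// Hypothetical re-creation of the three steps listed above; not dicomParser's actual source.
function latin1StringUntilNull(byteArray, position, length) {
  let result = '';
  for (let i = 0; i < length; i++) {
    const byte = byteArray[position + i];
    if (byte === 0) {
      // Step 3: truncate at the first zero byte.
      break;
    }
    // Steps 1 and 2: each byte becomes the UTF-16 code unit with the same
    // numeric value, which matches ISO-8859-1 for values 0x00 to 0xFF.
    result += String.fromCharCode(byte);
  }
  return result;
}

// Made-up sample: 'A', 0xE9 ('é' in ISO-8859-1), a zero byte, then 'B'.
const sample = new Uint8Array([0x41, 0xe9, 0x00, 0x42]);
const decoded = latin1StringUntilNull(sample, 0, sample.length);
```

With this sample, everything after the zero byte is dropped, and any byte above 0x7F is interpreted as ISO-8859-1 regardless of the data set's Specific Character Set.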

May 11 '20 14:05 malaterre

I didn't have any experience with DICOM character sets, so I didn't factor them into the original design. I like the idea of putting this in a separate library like @yagni did, so it can be added in by those who need it. I specifically didn't add image decompression to this library for the same reason.

May 11 '20 14:05 chafey

@chafey I am pretty sure that this test is just wrong:

if (byte === 0) {

This feels like a C-string ASCII terminator. I am sure we can have byte === 0 in Unicode (we should only rely on the length).

May 11 '20 14:05 malaterre

@malaterre You'll need to pass the raw bytes to dicom-character-set. If I remember correctly, fromCharCode converts it to UTF so you end up with bytes not in the original data. So just slice the byteArray starting at the element's dataOffset and going for its length number of bytes, then pass that into dicom-character-set, along with the Specific Character Set and optional VR (see the readme for more details).
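As a minimal sketch of the slicing step described above: the helper name `rawElementBytes` and the stand-in `demoDataSet` object are hypothetical, and only the `byteArray`, `elements`, `dataOffset`, and `length` fields mentioned in this thread are assumed. The UTF-8 decode at the end uses the standard `TextDecoder` purely for illustration (it assumes the data is UTF-8, i.e. Specific Character Set ISO_IR 192); it is not a replacement for dicom-character-set.

```javascript
// Hypothetical helper: returns the element's raw value bytes without any character decoding.
// subarray() is relative to the view, so it stays correct even when
// dataSet.byteArray does not start at offset 0 of its underlying buffer.
function rawElementBytes(dataSet, tag) {
  const element = dataSet.elements[tag];
  if (!element) {
    return undefined;
  }
  return dataSet.byteArray.subarray(element.dataOffset, element.dataOffset + element.length);
}

// Stand-in for a parsed data set: two filler bytes, then "Ä" encoded as
// UTF-8 (0xC3 0x84) followed by "B".
const demoDataSet = {
  byteArray: new Uint8Array([0x00, 0x00, 0xc3, 0x84, 0x42]),
  elements: { x00100010: { dataOffset: 2, length: 3 } },
};

const rawBytes = rawElementBytes(demoDataSet, 'x00100010');

// Converting byte by byte produces three code units instead of the two intended characters.
const perByte = String.fromCharCode(...rawBytes);

// Decoding the untouched bytes with the right character set recovers the text.
const asUtf8 = new TextDecoder('utf-8').decode(rawBytes);
```

The point of keeping the bytes raw is that the decoder, whichever one is used, needs the original byte sequence together with the Specific Character Set, not a string that has already been through a byte-per-character conversion.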

May 11 '20 14:05 yagni

@yagni Thanks for the confirmation. @chafey it would be nice to document what string is actually doing. I hope the next version will offer a function rawString, that would be clearer (IMHO).

May 11 '20 14:05 malaterre

It probably makes sense to revisit the whole repo in light of non-ASCII character sets. Lots of code is using this library now, and we should not be propagating designs which are not character set aware.

May 11 '20 14:05 chafey

Hello! Is there any news about this feature? Or maybe someone can help me understand how to get the raw data from a tag, so I can parse it myself? For example, how can I get the binary data from "x00100010"?

Thanks a lot!

Nov 28 '22 09:11 creemer

@creemer To get the raw data, create a Uint8Array at the data offset of the element (like we do in the readme) of the appropriate length:

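// Raw value bytes of Patient's Name (0010,0010), with no character decoding applied.
const patientNameElement = dataSet.elements.x00100010;
// Add byteArray.byteOffset in case byteArray is a view that does not start at the beginning of its buffer.
const patientNameBytes = new Uint8Array(dataSet.byteArray.buffer, dataSet.byteArray.byteOffset + patientNameElement.dataOffset, patientNameElement.length);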

If you don't want to parse those bytes yourself at this point, you can pass them, along with the value of the Specific Character Set element, to my dicom-character-set library:

import { convertBytes } from 'dicom-character-set';
const str = convertBytes(dataSet.string('x00080005'), patientNameBytes, {vr: 'PN'});

Nov 28 '22 23:11 yagni

@yagni Thanks a lot! It is all I need :)

Nov 29 '22 09:11 creemer