Message Encoding and Decoding
Hi, I ran into an issue with the hl7apy library that I couldn't really understand, so I figured I'd bring it up here.
I have some function in my codebase that is defined like this:
def build_message(patient_data, report_data, <other_args...>):
    message = Message()
    set_msh_headers(message)
    patient_result_group = Group("PATIENT_RESULT")
    patient_group = Group("PATIENT")
    patient_group.add(Segment("PID"))
    # sets a bunch of attributes on the PID segment via setattr(obj, attr, value)
    return message
Anyway this works fine, but I encounter issues with the following code:
This seems to work great!
from hl7apy.parser import parse_message
message = build_message(<pass in relevant args>)
encoded_message = message.value.encode()
m = parse_message(encoded_message.decode("utf-8"))
print(m.PID.PATIENT_ID) # This prints the correct value
This doesn't seem to work:
m = build_message(<pass in relevant args>)
print(m.PID.PATIENT_ID) # This prints nothing
I don't understand. My function returns a Message object, and parse_message converts a string into a Message object. Why doesn't the second snippet print the PATIENT_ID?
It seems when I call type(), the types are very different:
<class 'bytes'> ### just returning message
<class 'hl7apy.core.Message'> ### calling parse_message
Any idea why?
I am very late to the party here, but I might have the start of an explanation. The difference between your first and second attempts might lie in whether msh_9 is set. At least, I think so.
Once the internal structure of the message has the groups, it looks like the "PID" accessor doesn't go down the tree / hierarchy to find the actual segment. This is the reason why you can't access it when using the message instance that comes out of your build_message function.
But if you don't set msh_9, it just adds a PID segment to the "root" of the message (without the full group structure). It looks like this is what's happening when you write your message to a string and parse it back.
If you set msh_9 to a valid value representing the message type, then the message is parsed with the group structure (better for validation, but then the "direct" accessors are broken). I don't know if this is the intended behavior or not. If not, I might submit a PR to fix it.
from hl7apy import VALIDATION_LEVEL
from hl7apy.core import Message, Group, Segment
from hl7apy.exceptions import ValidationError, UnsupportedVersion, HL7apyException, ChildNotFound
from hl7apy.parser import parse_message, parse_segment
def build_message(msh_9=None):
    message = Message("ORU_R01")
    if msh_9:
        message.msh.msh_9 = msh_9
    patient_result_group = Group("ORU_R01_PATIENT_RESULT")
    patient_group = Group("ORU_R01_PATIENT")
    patient_group.add(Segment("PID"))
    patient_group.pid.patient_id = "123456"
    patient_result_group.add(patient_group)
    message.add(patient_result_group)
    # sets a bunch of attributes on the PID segment via setattr(obj, attr, value)
    return message
print("--- BUILD MESSAGE WITHOUT msh 9 ---")
m = build_message()
from pprint import pprint
print("Original message:")
pprint(m.children)
print(m.pid.patient_id)
message_string = m.value.encode()
print(f"Encoded message: {message_string}")
print("Parsing after encoding and decoding message:")
parsed = parse_message(message_string.decode())
print("Children of parsed message:")
pprint(parsed.children)
print(f'Patient id : {parsed.pid.patient_id.value}')
print("---- WITH MSH 9----")
withmsh9 = build_message(msh_9='ORU^R01^ORU_R01')
pprint(withmsh9.children)
print(withmsh9.pid.patient_id.value)
print(f'Encoded message: {withmsh9.value.encode()}')
parsed_withmsh9 = parse_message(withmsh9.value.encode().decode())
print("Children of parsed message:")
pprint(parsed_withmsh9.children)
print(f'Patient id : {parsed_withmsh9.pid.patient_id.value}')
The code above gives the following output:
--- BUILD MESSAGE WITHOUT msh 9 ---
Original message:
[<Segment MSH>, <Group ORU_R01_PATIENT_RESULT>]
[]
Encoded message: b'MSH|^~\\&|||||20241122104135|||||2.5\rPID||123456'
Parsing after encoding and decoding message:
Children of parsed message:
[<Segment MSH>, <Segment PID>]
Patient id : 123456
---- WITH MSH 9----
[<Segment MSH>, <Group ORU_R01_PATIENT_RESULT>]
Encoded message: b'MSH|^~\\&|||||20241122104136||ORU^R01^ORU_R01|||2.5\rPID||123456'
Children of parsed message:
[<Segment MSH>, <Group ORU_R01_PATIENT_RESULT>]
Patient id :
Oh this is super strange. I'm not too familiar with the HL7 standards, but this does seem out of the ordinary.
Thanks for providing some clarification, I'll keep in mind this "quirk" (unclear if intended or not) going forward.
I'll probably keep this issue open until one of the main contributors responds.
Dear @Blackglade and @rmic, first of all, sorry for being late in answering. I hope and suppose that @Blackglade has already solved the issues but I'd like to give my two cents, which may clarify the problem better.
What @rmic said is correct, with one inaccuracy: the important part was not the assignment of message.msh.msh_9 (which must be done anyway) but the instantiation of the message with the message type "ORU_R01". This allows hl7apy to load the correct structure of the message, which expects the hierarchy using the groups.
I couldn't reproduce the original problem and, actually, I'm a bit surprised that the build_message function worked since the PATIENT_RESULT and PATIENT groups don't exist and should have ended up in an InvalidName exception.
The correct way to access the PID segment in the @rmic example is the following, which works the same in both cases:
m.oru_r01_patient_result.oru_r01_patient.pid.value
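If the exact group path is not known in advance, one possible workaround is to walk the tree and collect matching segments. The snippet below is only a sketch of that idea: find_segments is a hypothetical helper, not part of hl7apy, and it assumes that every element exposes name and children attributes (as the pprint(m.children) output earlier in the thread suggests). A small stand-in Node class is used instead of real hl7apy objects so the example runs without the library.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """Stand-in for an hl7apy element; only `name` and `children` are modelled."""
    name: str
    children: List["Node"] = field(default_factory=list)


def find_segments(element, name: str) -> Iterator:
    """Yield every descendant of `element` whose name matches, depth first.

    Hypothetical helper: relies only on `name` and `children` attributes,
    which is an assumption about the objects passed in.
    """
    for child in getattr(element, "children", []):
        child_name = getattr(child, "name", None) or ""
        if child_name.upper() == name.upper():
            yield child
        else:
            # Not a match: keep looking inside groups nested below this child.
            yield from find_segments(child, name)


# Same shape as the ORU_R01 message in the thread: the PID sits two groups deep.
message = Node("ORU_R01", [
    Node("MSH"),
    Node("ORU_R01_PATIENT_RESULT", [
        Node("ORU_R01_PATIENT", [Node("PID")]),
    ]),
])

# A lookup limited to direct children does not see the PID ...
top_level = [child.name for child in message.children]
# ... while the recursive walk does.
found = [segment.name for segment in find_segments(message, "PID")]

print(top_level)
print(found)
```

With a real message, the same helper could presumably be called as next(find_segments(m, "PID"), None), but that usage is untested here and depends on the attribute assumption above.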