python-hl7 is a simple library for parsing messages of Health Level 7
(HL7) version 2.x into Python objects. python-hl7 includes a simple
client that can send HL7 messages to a Minimal Lower Level Protocol (MLLP)
server (mllp_send).
HL7 is a communication protocol and message format for
health care data. It is the de-facto standard for transmitting data
between clinical information systems and between clinical devices.
The version 2.x series, which is often in a pipe-delimited format,
is currently the most widely accepted version of HL7 (there
is an alternative XML-based format).
python-hl7 currently only parses HL7 version 2.x messages into
an easy-to-access data structure. The library could eventually
also contain the ability to create HL7 v2.x messages.
HL7 Messages have a limited number of levels. The top level is a Message.
A Message is composed of a number of Segments (hl7.Segment), and each Segment
is composed of a number of Fields (hl7.Field).
Fields can repeat (hl7.Repetition). The content of a field
is either a primitive data type (such as a string) or a composite
data type composed of one or more Components (hl7.Component). Components
are in turn composed of Sub-Components (primitive data types).
The result of parsing is accessed as a tree using Python list conventions.
Note that since the first element of the segment is the segment name,
segments are effectively 1-based in Python as well (because the HL7 spec does
not count the segment name as part of the segment itself).
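As a library-independent illustration of this indexing, the sketch below splits a single segment into nested lists using only the standard library. The parse_segment helper and the sample PID line are assumptions made for this example; they are not part of python-hl7, and the sketch ignores the special handling that the MSH segment needs.

```python
def parse_segment(line, separators="|~^&"):
    """Split one segment into nested lists.

    Levels: fields > repetitions > components > sub-components.
    """
    field_sep, rep_sep, comp_sep, sub_sep = separators
    return [
        [
            [component.split(sub_sep) for component in repetition.split(comp_sep)]
            for repetition in field.split(rep_sep)
        ]
        for field in line.split(field_sep)
    ]


# Hypothetical sample segment used only for this sketch.
segment = parse_segment(
    "PID|Field1|Component1^Component2"
    "|Component1^Sub-Component1&Sub-Component2^Component3|Repeat1~Repeat2"
)

# Index 0 holds the segment name, so field N sits at list index N.
name = segment[0][0][0][0]
# Field 3, first repetition, second component, second sub-component.
sub_component = segment[3][0][1][1]
# Field 4, second repetition.
second_repeat = segment[4][1][0][0]
```

Because the segment name occupies index 0, the field numbers used in the HL7 specification line up with the list indices, while the levels below the field remain 0-based in this sketch.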
Since many types of segments only have a single instance in a message
(e.g. PID or MSH), hl7.Message.segment() provides a convenience
wrapper around hl7.Message.segments() that returns the first matching
hl7.Segment.
python-hl7 features a simple network client, mllp_send, which reads HL7
messages from a file or sys.stdin and posts them to an MLLP server.
mllp_send is a command-line wrapper around
hl7.client.MLLPClient. mllp_send is a useful tool for
testing HL7 interfaces or resending logged messages.
For receiving HL7 messages using the Minimal Lower Level Protocol (MLLP), take a
look at the related twisted-hl7 package.
If you do not want to use twisted and are looking to re-write some of twisted-hl7’s
functionality, please reach out to us. It is likely that some of the MLLP
parsing and formatting can be moved into python-hl7, which twisted-hl7 and other
libraries can depend upon.
python-hl7 supports Python 3.7+ and primarily deals with the unicode str type.
Passing bytes to hl7.parse() requires setting the
encoding parameter if using anything other than UTF-8. hl7.parse()
will always return a data structure containing unicode str objects.
hl7.Message can be forced back into a single string using
str(message).
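To illustrate why the encoding matters, the following sketch decodes a byte string the way a caller might before parsing. The decode_message helper and the sample data are assumptions made for this example; the sketch does not call python-hl7 and only mirrors the described behaviour (str passes through, bytes are decoded, UTF-8 is the default).

```python
def decode_message(data, encoding="utf-8"):
    """Return str input unchanged; decode bytes using the given encoding."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


# A hypothetical message fragment with a non-ASCII character, encoded as cp1252.
raw = "PID|||555-44-4444||MÜLLER^EVE".encode("cp1252")

# The default (UTF-8) cannot decode cp1252 bytes for this character.
try:
    decode_message(raw)
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Naming the real encoding recovers the original text.
text = decode_message(raw, encoding="cp1252")
```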
Returns an instance of hl7.Message that allows
indexed access to the data elements.
A custom hl7.Factory subclass can be passed in to be used when
constructing the message and its components.
Note
HL7 usually contains only ASCII, but can use other character
sets (HL7 Standards Document, Section 1.7.1). However, as of v2.8,
UTF-8 is the preferred character set [1].
python-hl7 works on Python unicode strings. hl7.parse()
will accept a unicode string or will attempt to convert bytestrings
into unicode strings using the optional encoding parameter.
encoding defaults to UTF-8, so no work is needed for bytestrings
in UTF-8, but for other character sets like ‘cp1252’ or ‘latin1’,
encoding must be set appropriately.
Returns an instance of hl7.Batch
that allows indexed access to the messages.
A custom hl7.Factory subclass can be passed in to be used when
constructing the batch and its components.
Note
HL7 usually contains only ASCII, but can use other character
sets (HL7 Standards Document, Section 1.7.1). However, as of v2.8,
UTF-8 is the preferred character set [2].
python-hl7 works on Python unicode strings. hl7.parse_batch()
will accept a unicode string or will attempt to convert bytestrings
into unicode strings using the optional encoding parameter.
encoding defaults to UTF-8, so no work is needed for bytestrings
in UTF-8, but for other character sets like ‘cp1252’ or ‘latin1’,
encoding must be set appropriately.
Returns an instance of hl7.File that allows
indexed access to the batches.
A custom hl7.Factory subclass can be passed in to be used when
constructing the file and its components.
Note
HL7 usually contains only ASCII, but can use other character
sets (HL7 Standards Document, Section 1.7.1). However, as of v2.8,
UTF-8 is the preferred character set [3].
python-hl7 works on Python unicode strings. hl7.parse_file()
will accept a unicode string or will attempt to convert bytestrings
into unicode strings using the optional encoding parameter.
encoding defaults to UTF-8, so no work is needed for bytestrings
in UTF-8, but for other character sets like ‘cp1252’ or ‘latin1’,
encoding must be set appropriately.
Returns an instance of hl7.Message, hl7.Batch
or hl7.File that allows indexed access to the data elements,
messages, or batches respectively.
A custom hl7.Factory subclass can be passed in to be used when
constructing the message/batch/file and its components.
Note
HL7 usually contains only ASCII, but can use other character
sets (HL7 Standards Document, Section 1.7.1). However, as of v2.8,
UTF-8 is the preferred character set [4].
python-hl7 works on Python unicode strings. hl7.parse_hl7()
will accept a unicode string or will attempt to convert bytestrings
into unicode strings using the optional encoding parameter.
encoding defaults to UTF-8, so no work is needed for bytestrings
in UTF-8, but for other character sets like ‘cp1252’ or ‘latin1’,
encoding must be set appropriately.
Join the child messages into a single string, separated
by the self.separator. This method acts recursively, calling
the children’s __str__ method. Thus str() is the
appropriate method for turning the python-hl7 representation of
HL7 into a standard string.
If this batch has BHS/BTS segments, they will be added to the
beginning/end of the returned string.
Join the child batches into a single string, separated
by the self.separator. This method acts recursively, calling
the children’s __str__ method. Thus str() is the
appropriate method for turning the python-hl7 representation of
HL7 into a standard string.
If this file has FHS/FTS segments, they will be added to the
beginning/end of the returned string.
If key is an integer, __getitem__ acts like a list, returning
the hl7.Segment held at that index:
>>> h[1]
[['PID'], ...]
If the key is a string of length 3, __getitem__ acts like a dictionary,
returning all segments whose segment_id is key
(alias of hl7.Message.segments()).
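The dual list/dictionary behaviour can be sketched with a small list subclass. MessageSketch is a hypothetical class written for this example and is not the library's implementation; in particular, raising KeyError when no segment matches is a choice made for the sketch.

```python
class MessageSketch(list):
    """A list of segments (each a list of field strings) with lookup by ID."""

    def segments(self, segment_id):
        """Return every segment whose first element equals segment_id."""
        matches = [segment for segment in self if segment[0] == segment_id]
        if not matches:
            raise KeyError(f"No {segment_id} segments")
        return matches

    def segment(self, segment_id):
        """Convenience wrapper: return only the first matching segment."""
        return self.segments(segment_id)[0]

    def __getitem__(self, key):
        # A three-character string is treated as a segment ID,
        # anything else falls back to normal list indexing.
        if isinstance(key, str) and len(key) == 3:
            return self.segments(key)
        return super().__getitem__(key)


# Hypothetical four-segment message, split naively on the separators.
raw = "MSH|^~\\&|LAB\rPID|1\rOBX|1|A\rOBX|2|B"
message = MessageSketch(line.split("|") for line in raw.split("\r"))

second_segment = message[1]        # list-style access
all_obx = message["OBX"]           # dictionary-style access
first_obx = message.segment("OBX")
```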
Join the child containers into a single string, separated
by the self.separator. This method acts recursively, calling
the children’s __str__ method. Thus str() is the
appropriate method for turning the python-hl7 representation of
HL7 into a standard string.
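The recursive join can be pictured with a minimal container hierarchy. The class names below are hypothetical and exist only for this sketch; each level simply joins the string form of its children with its own separator.

```python
class Node(list):
    """A list whose string form is its children joined by a separator."""

    separator = ""

    def __str__(self):
        # str(child) recurses into nested nodes and leaves plain strings as-is.
        return self.separator.join(str(child) for child in self)


class SegmentNode(Node):
    separator = "|"


class FieldNode(Node):
    separator = "~"


class RepetitionNode(Node):
    separator = "^"


class ComponentNode(Node):
    separator = "&"


segment = SegmentNode([
    FieldNode(["PID"]),
    FieldNode(["Field1"]),
    FieldNode([
        RepetitionNode([
            "Component1",
            ComponentNode(["Sub-Component1", "Sub-Component2"]),
            "Component3",
        ]),
    ]),
    FieldNode(["Repeat1", "Repeat2"]),
])

flattened = str(segment)
```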
ack_code: one of AA (Application Accept), AR (Application Reject),
AE (Application Error), CA (Commit Accept - Enhanced Mode),
CR (Commit Reject - Enhanced Mode), or CE (Commit Error - Enhanced Mode)
(see HL7 Table 0008 - Acknowledgment Code)
message_id: control message ID for the ACK; defaults to a unique generated ID
application: name of the sending application; defaults to the receiving application of the message
facility: name of the sending facility; defaults to the receiving facility of the message
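How these parameters shape an acknowledgment can be sketched without the library. The build_ack function below is a hypothetical helper, not the library's create_ack(); the uuid-based message ID, the timestamp format, and the choice to emit only MSH and MSA segments are assumptions made for this example.

```python
import uuid
from datetime import datetime

ACK_CODES = {"AA", "AR", "AE", "CA", "CR", "CE"}  # HL7 Table 0008


def build_ack(msh_line, ack_code="AA", message_id=None,
              application=None, facility=None):
    """Build a two-segment ACK (MSH + MSA) answering the given MSH line."""
    if ack_code not in ACK_CODES:
        raise ValueError(f"unknown ack_code: {ack_code!r}")
    f = msh_line.rstrip("\r").split("|")
    ack_msh = [
        "MSH",
        f[1],                                   # encoding characters, copied
        application or f[4],                    # sender = original receiver
        facility or f[5],
        f[2],                                   # receiver = original sender
        f[3],
        datetime.now().strftime("%Y%m%d%H%M%S"),
        "",
        "ACK",
        message_id or uuid.uuid4().hex[:20],    # assumed ID scheme
        f[10],                                  # processing ID
        f[11],                                  # version
    ]
    msa = ["MSA", ack_code, f[9]]               # echo original control ID
    return "|".join(ack_msh) + "\r" + "|".join(msa) + "\r"


msh = "MSH|^~\\&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01|CNTRL-3456|P|2.4\r"
ack = build_ack(msh, ack_code="AA", message_id="ACK-1")
ack_segments = ack.split("\r")
```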
If the parse tree is deeper than the specified path continue
following the first child branch until a leaf of the tree is
encountered and return that value (which could be blank).
Example:
PID.F3.R1.C2 = ‘Sub-Component1’ (assume .SC1)
If the parse tree terminates before the full path is satisfied
check each of the subsequent paths and if every one is specified
at position 1 then the leaf value reached can be returned as the
result.
Wraps a byte string, unicode string, or hl7.Message
in an MLLP container and sends the message to the server.
If message is a byte string, we assume it is already encoded properly.
If message is unicode or hl7.Message, it will be encoded
according to hl7.client.MLLPClient.encoding.
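The MLLP container itself is small: a start block byte, the payload, then an end block byte and a carriage return. The sketch below shows the framing and its inverse using only the standard library; mllp_frame and mllp_unframe are hypothetical helpers, not functions provided by python-hl7.

```python
SB = b"\x0b"  # start block
EB = b"\x1c"  # end block
CR = b"\x0d"  # carriage return


def mllp_frame(message, encoding="utf-8"):
    """Wrap a message in <SB>...<EB><CR>; bytes are assumed already encoded."""
    if isinstance(message, bytes):
        data = message
    else:
        data = str(message).encode(encoding)
    return SB + data + EB + CR


def mllp_unframe(stream):
    """Split a byte stream of MLLP blocks back into message payloads."""
    messages = []
    for block in stream.split(EB + CR):
        if not block:
            continue
        if not block.startswith(SB):
            raise ValueError("block does not start with <SB>")
        messages.append(block[len(SB):])
    return messages


framed = mllp_frame("MSH|^~\\&|A\r")
stream = framed + mllp_frame(b"MSH|^~\\&|B\r")
payloads = mllp_unframe(stream)
```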
The arguments are all the usual arguments to create_connection()
except protocol_factory; most common are positional host and port,
with various optional keyword arguments following.
Start a socket server, call back for each client connected.
The first parameter, client_connected_cb, takes two parameters:
client_reader, client_writer. client_reader is a
hl7.mllp.HL7StreamReader object, while client_writer
is a hl7.mllp.HL7StreamWriter object. This
parameter can either be a plain callback function or a coroutine;
if it is a coroutine, it will be automatically converted into a
Task.
The rest of the arguments are all the usual arguments to
loop.create_server() except protocol_factory; most common are
positional host and port, with various optional keyword arguments
following.
Additional optional keyword arguments are loop (to set the event loop
instance to use) and limit (to set the buffer limit passed to the
StreamReader).
The return value is the same as loop.create_server(), i.e. a
Server object which can be used to stop the service.
If the limit is reached, ValueError will be raised. In that case, if the
block termination separator was found, the complete line including the separator
will be removed from the internal buffer. Otherwise, the internal buffer will be cleared. The limit is
compared against the part of the line without the separator.
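The effect of the buffer limit can be observed with a plain asyncio.StreamReader. The read_block coroutine below is a hypothetical stand-in written for this sketch, not the hl7.mllp reader; it converts asyncio.LimitOverrunError into ValueError to mirror the description above, but unlike the behaviour described it does not clean up the internal buffer.

```python
import asyncio

SB, EB, CR = b"\x0b", b"\x1c", b"\x0d"


async def read_block(reader):
    """Read one <SB>...<EB><CR> block and return its payload."""
    try:
        block = await reader.readuntil(EB + CR)
    except asyncio.LimitOverrunError as exc:
        raise ValueError(f"block exceeds limit: {exc}") from exc
    if not block.startswith(SB):
        raise ValueError("block does not start with <SB>")
    return block[len(SB):-len(EB + CR)]


async def demo():
    framed = SB + b"MSH|^~\\&|A\r" + EB + CR

    # A generous limit: the block is returned without its framing bytes.
    roomy = asyncio.StreamReader(limit=64)
    roomy.feed_data(framed)
    payload = await read_block(roomy)

    # A tiny limit: the same block is rejected.
    cramped = asyncio.StreamReader(limit=4)
    cramped.feed_data(framed)
    try:
        await read_block(cramped)
        too_long = False
    except ValueError:
        too_long = True
    return payload, too_long


payload, too_long = asyncio.run(demo())
```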
By default, mllp_send expects the FILE or stdin input to be a properly
formatted HL7 message (carriage returns separating segments) wrapped in an MLLP
stream (<SB>message1<EB><CR><SB>message2<EB><CR>...).
However, it is common, especially if the file has been manually edited in
certain text editors, that the ASCII control characters will be lost and the
carriage returns will be replaced with the platform’s default line endings.
In this case, mllp_send provides the --loose option, which attempts
to take something that “looks like HL7” and convert it into a proper HL7
message.
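A rough picture of what such a conversion involves is given below. This is not the algorithm used by mllp_send; split_loose is a hypothetical helper that normalises line endings to carriage returns, drops any surviving MLLP control bytes, and starts a new message wherever the text MSH|^~\&| appears.

```python
import re

# Zero-width match in front of every presumed message start.
MESSAGE_START = re.compile(r"(?=MSH\|\^~\\&\|)")


def split_loose(text):
    """Turn loosely formatted text into a list of CR-terminated messages."""
    # Remove MLLP framing bytes if any survived editing.
    text = text.replace("\x0b", "").replace("\x1c", "")
    # Replace platform line endings with the HL7 segment terminator.
    text = text.replace("\r\n", "\r").replace("\n", "\r")
    messages = []
    for chunk in MESSAGE_START.split(text):
        chunk = chunk.strip()
        if chunk:
            messages.append(chunk + "\r")
    return messages


# Two messages saved with Windows newlines and no MLLP framing.
edited = "MSH|^~\\&|A\r\nPID|1\r\nMSH|^~\\&|B\r\nPID|2\r\n"
messages = split_loose(edited)
```

Splitting on an empty (lookahead) match relies on Python 3.7 or later, which matches the versions the library supports.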
The hl7.mllp package is currently experimental and subject to change.
It aims to replace txHL7.
python-hl7 includes classes for building HL7 clients and
servers using asyncio. The underlying protocol for these
clients and servers is MLLP.
The hl7.mllp package follows the same design as
the asyncio.streams package. Examples in that documentation
may be of assistance in writing production senders and
receivers.
    # Using the third party `aiorun` instead of the `asyncio.run()` to avoid
    # boilerplate.
    import asyncio

    import aiorun

    import hl7
    from hl7.mllp import open_hl7_connection


    async def main():
        message = 'MSH|^~\\&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01|CNTRL-3456|P|2.4\r'
        message += 'PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|196203520|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292||(206)3345232|(206)752-121||||AC555444444||67-A4335^OH^20030520\r'
        message += 'OBR|1|845439^GHH OE|1045813^GHH LAB|1554-5^GLUCOSE|||200202150730||||||||555-55-5555^PRIMARY^PATRICIA P^^^^MD^^LEVEL SEVEN HEALTHCARE, INC.|||||||||F||||||444-44-4444^HIPPOCRATES^HOWARD H^^^^MD\r'
        message += 'OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105|H|||F\r'

        # Open the connection to the HL7 receiver.
        # Using wait_for is optional, but recommended so
        # a dead receiver won't block you for long
        hl7_reader, hl7_writer = await asyncio.wait_for(
            open_hl7_connection("127.0.0.1", 2575),
            timeout=10,
        )

        hl7_message = hl7.parse(message)

        # Write the HL7 message, and then wait for the writer
        # to drain to actually send the message
        hl7_writer.writemessage(hl7_message)
        await hl7_writer.drain()
        print(f'Sent message\n{hl7_message}'.replace('\r', '\n'))

        # Now wait for the ACK message from the receiver
        hl7_ack = await asyncio.wait_for(
            hl7_reader.readmessage(),
            timeout=10,
        )
        print(f'Received ACK\n{hl7_ack}'.replace('\r', '\n'))


    aiorun.run(main(), stop_on_unhandled_errors=True)
    # Using the third party `aiorun` instead of the `asyncio.run()` to avoid
    # boilerplate.
    import asyncio

    import aiorun

    import hl7
    from hl7.mllp import start_hl7_server


    async def process_hl7_messages(hl7_reader, hl7_writer):
        """This will be called every time a socket connects with us."""
        peername = hl7_writer.get_extra_info("peername")
        print(f"Connection established {peername}")
        try:
            # We're going to keep listening until the writer
            # is closed. Only writers have closed status.
            while not hl7_writer.is_closing():
                hl7_message = await hl7_reader.readmessage()
                print(f'Received message\n{hl7_message}'.replace('\r', '\n'))
                # Now let's send the ACK and wait for the
                # writer to drain
                hl7_writer.writemessage(hl7_message.create_ack())
                await hl7_writer.drain()
        except asyncio.IncompleteReadError:
            # Oops, something went wrong, if the writer is not
            # closed or closing, close it.
            if not hl7_writer.is_closing():
                hl7_writer.close()
                await hl7_writer.wait_closed()
        print(f"Connection closed {peername}")


    async def main():
        try:
            # Start the server in a with clause to make sure we
            # close it
            async with await start_hl7_server(
                process_hl7_messages, port=2575
            ) as hl7_server:
                # And now we serve forever. Or until we are
                # cancelled...
                await hl7_server.serve_forever()
        except asyncio.CancelledError:
            # Cancelled errors are expected
            pass
        except Exception:
            print("Error occurred in main")


    aiorun.run(main(), stop_on_unhandled_errors=True)
A tree has leaf values and nodes. Only the leaves of the tree can have a value.
All data items in the message will be in a leaf node.
After parsing, the data items in the message are in position in the parse tree, but
they remain in their escaped form. To extract a value from the tree you start at the
root of the Segment and specify the details of which field value you want to extract.
The minimum specification is the field number and repeat number. If you are after a
component or sub-component value you also have to specify these values.
If, for instance, you want to read the value “Sub-Component2” from the example HL7
you need to specify: Field 3, Repeat 1, Component 2, Sub-Component 2 (PID.F3.R1.C2.S2).
Reading values from a tree structure in this manner is the only safe way to read data
from a message.
All values should be accessed in this manner. Even if a field is marked as being
non-repeating a repeat of “1” should be specified as later version messages
could have a repeating value.
To enable backward and forward compatibility there are rules for reading values when the
tree does not match the specification (e.g. PID.F1.R1.C2.S2). The common example of this is
expanding an HL7 “IS” Value into a Coded Value (“CE”). Systems reading an “IS” value would
read the Identifier field of a message with a “CE” value and systems expecting a “CE” value
would see a Coded Value with only the identifier specified. A common Australian example of
this is the OBX Units field, which was an “IS” value previously and became a “CE” Value
in later versions.
Old Version: “|mmol/l|” New Version: “|mmol/l^^ISO+|”
Systems expecting a simple “IS” value would read “OBX.F6.R1”. For an old message this
yields a value in the tree, but for a message with a Coded Value that tree node would
not have a value; instead it would have 3 child Components, with the “mmol/l” value in the
first. To resolve this case, where the tree is deeper than the specified path, the
first child of each node is traversed until a leaf node is found, and that value is
returned.
>>> h['PID.F3.R1.C2']
'Sub-Component1'
This is a general rule for reading values: If the parse tree is deeper than the specified
path continue following the first child branch until a leaf of the tree is encountered
and return that value (which could be blank).
Systems expecting a Coded Value (“CE”), but reading a message with a simple “IS” value in it
have the opposite problem. They have a deeper specification but have reached a leaf node and
cannot follow the path any further. Reading a “CE” value requires multiple reads for each
sub-component but for the “Identifier” in this example the specification would be “OBX.F6.R1.C1”.
The tree would stop at R1 so C1 would not exist. In this case the unsatisfied path elements
(C1 in this case) can be examined and if every one is position 1 then they can be ignored and
the leaf of the tree that was reached returned. If any of the unsatisfied paths are not in
position 1 then this cannot be done and the result is a blank string.
This is the second Rule for reading values: If the parse tree terminates before the full path
is satisfied check each of the subsequent paths and if every one is specified at position 1
then the leaf value reached can be returned as the result.
>>> h['PID.F1.R1.C1.S1']
'Field1'
In the second example every value that makes up the Coded Value, other than the identifier
has a component position greater than one and when reading a message with a simple “IS”
value in it, every value other than the identifier would return a blank string.
Following these rules will result in excellent backward and forward compatibility. It is
important to allow the reading of values that do not exist in the parse tree by simply
returning a blank string. The two rules detailed above, along with the full tree specification
for all values being read from a message will eliminate many of the errors seen when
handling earlier and later message versions.
>>> h['PID.F10.R1']
''
At this point the desired value has either been located, or is absent, in which case a blank
string is returned.
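The two reading rules and the blank-string fallback can be sketched over a plain nested-list tree. Everything in the block below (parse_segment, read_value, the path pattern and the sample PID line) is an assumption made for illustration rather than the library's accessor. In this representation every node is split to full depth, so both rules reduce to defaulting unspecified positions to 1 and returning a blank string when a position does not exist.

```python
import re

PATH = re.compile(
    r"^([A-Z][A-Z0-9]{2})\.F(\d+)(?:\.R(\d+))?(?:\.C(\d+))?(?:\.S(\d+))?$"
)


def parse_segment(line, separators="|~^&"):
    """Split a segment into fields > repetitions > components > sub-components."""
    field_sep, rep_sep, comp_sep, sub_sep = separators
    return [
        [[comp.split(sub_sep) for comp in rep.split(comp_sep)]
         for rep in field.split(rep_sep)]
        for field in line.split(field_sep)
    ]


def read_value(segment, path):
    """Read a leaf value, returning '' when the path leaves the tree."""
    match = PATH.match(path)
    if match is None:
        raise ValueError(f"unsupported path: {path!r}")
    name, field, *rest = match.groups()
    if segment[0][0][0][0] != name:
        raise KeyError(f"segment is not {name}")
    # Positions that are not specified default to 1 (first child).
    indices = [int(field)] + [int(part) if part else 1 for part in rest]
    if min(indices) < 1:
        raise ValueError("positions are 1-based")
    node = segment
    for depth, index in enumerate(indices):
        # Fields line up with list indices; lower levels are 1-based.
        position = index if depth == 0 else index - 1
        if position >= len(node):
            return ""
        node = node[position]
    return node


# Hypothetical sample segment used only for this sketch.
pid = parse_segment(
    "PID|Field1|Component1^Component2"
    "|Component1^Sub-Component1&Sub-Component2^Component3|Repeat1~Repeat2"
)
deeper_tree = read_value(pid, "PID.F3.R1.C2")       # tree deeper than path
longer_path = read_value(pid, "PID.F1.R1.C1.S1")    # path deeper than tree
missing = read_value(pid, "PID.F10.R1")             # field does not exist
not_first = read_value(pid, "PID.F1.R1.C2")         # unsatisfied position > 1
```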
HL7 messages are transported using the 7-bit ASCII character set. Only characters between
ASCII 32 and 127 are used. Characters which cannot be transported using this range
of values must be ‘escaped’, that is, replaced by a sequence of characters for transmission.
The library stores values internally in the escaped format. When the message is composed using
‘str’, the escaped value must be returned.
When the accessor is used to reference the field, the field is automatically unescaped.
>>> h['PID.F2.R1']
'|'
The escape/unescape mechanism supports replacing separator characters with their escaped
version and replacing non-ASCII characters with hexadecimal versions.
Both the escape and unescape methods return a str object.
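A minimal sketch of the separator and hexadecimal escapes follows, assuming the default HL7 delimiters and the standard sequences \F\, \S\, \T\, \R\, \E\ and \Xhh\. The escape and unescape functions here are written for illustration and are not the library's methods; this version only hex-escapes control characters and decodes hex pairs as single code points, which is a simplification.

```python
import re

# Default delimiters mapped to their escape codes.
ESCAPES = {"|": "F", "^": "S", "&": "T", "~": "R", "\\": "E"}
UNESCAPES = {code: char for char, code in ESCAPES.items()}
SEQUENCE = re.compile(r"\\(F|S|T|R|E|X(?:[0-9A-Fa-f]{2})+)\\")


def escape(value):
    """Replace delimiters and control characters with escape sequences."""
    out = []
    for char in value:
        if char in ESCAPES:
            out.append(f"\\{ESCAPES[char]}\\")
        elif ord(char) < 32:
            out.append(f"\\X{ord(char):02X}\\")
        else:
            out.append(char)
    return "".join(out)


def unescape(value):
    """Reverse escape(): turn escape sequences back into characters."""
    def replace(match):
        code = match.group(1)
        if code[0] == "X":
            # Each hex pair is read as one code point (sketch assumption).
            return bytes.fromhex(code[1:]).decode("latin-1")
        return UNESCAPES[code]

    return SEQUENCE.sub(replace, value)


escaped = escape("a|b^c")
pipe = unescape("\\F\\")
round_trip = unescape(escape("x\\y&z\r"))
```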
HL7 defines a protocol for encoding presentation characters. These include highlighting
and rich text functionality. The API does not currently allow for easy access to the
escape/unescape logic. You must override the message class escape and unescape methods
after parsing the message.
The test suite is located in tests/ and can be run several ways.
It is recommended to run the full tox suite so
that all supported Python versions are tested and the documentation is built
and tested. We provide a Makefile to create a virtualenv, install tox,
and run tox.
python-hl7 uses black (https://black.readthedocs.io/en/stable/)
to enforce a coding style. To automatically format using black and isort:
$ make format
It is also recommended to run the flake8 checks for PEP 8 and pyflakes
violations. Commits should be free of warnings.
Message now ends with trailing carriage return, to be consistent with Message
Construction Rules (Section 2.6, v2.8). [python-hl7#26 <https://github.com/johnpaulett/python-hl7/issues/26>]
0.3.0 breaks backwards compatibility by correcting
the indexing of the MSH segment and introducing improved parsing down to
the repetition and sub-component level.
Changed the numbering of fields in the MSH segment.
This breaks older code.
Parse all the elements of the message (i.e. down to sub-component). The
inclusion of repetitions will break older code.
Message (and Message.segments), Field, Repetition and Component can be
accessed using 1-based indices by using them as a callable.
Added Python 3 support. Python 2.6, 2.7, and 3.3 are officially supported.
hl7.parse() can now decode byte strings, using the encoding
parameter. hl7.client.MLLPClient can now encode unicode input
using the encoding parameter. To support Python 3, unicode is now
the primary string type used inside the library. bytestrings are only
allowed at the edge of the library now, with hl7.parse and sending
via hl7.client.MLLPClient. Refer to Python 2 vs Python 3 and Unicode vs Byte strings.
Testing via tox and travis CI added. See Contributing.
A massive thanks to Kevin Gill and
Emilien Klein for the initial code submissions
to add the improved parsing, and to
Andrew Wason for rebasing the initial pull
request and providing assistance in the transition.
mllp_send --loose algorithm modified to allow multiple messages per file.
The algorithm now splits messages based upon the presumed start of a message,
which must start with MSH|^~\&|.
mllp_send now takes the --loose option, which allows
sending HL7 messages that may not exactly meet the standard (Windows newlines
separating segments instead of carriage returns).
Converted hl7.segment and hl7.segments into methods on
hl7.Message.
Support dict-syntax for getting Segments from a Message (e.g. message['OBX'])
Use unicode throughout python-hl7 since the HL7 spec allows non-ASCII characters.
It is up to the caller of hl7.parse() to convert non-ASCII messages
into unicode.
Refactored from single hl7.py file into the hl7 module.
Copyright (C) 2009-2020 John Paulett (john -at- paulett.org)
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in
the documentation and/or other materials provided with the
distribution.
3. The name of the author may not be used to endorse or promote
products derived from this software without specific prior
written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.