Category Archives: v3

Question: HL7 Open source libraries


I work on an EHR, and I want to use HL7 to communicate with different systems in a medical environment. For that I found APIs (Hapi, nHapi, javasig, Everest), and I don’t know which is best or what the selection criteria are.


Well, what are your selection criteria? The first question is whether you are doing HL7 v2 or v3. Almost certainly, it’s v2. What language are you using? (or want to use).

Here’s some information about the libraries:

  • HAPI: an open-source, object-oriented HL7 2.x parser for Java – it also includes a number of other utilities for using HL7 v2
  • NHAPI: NHapi is a port of the original project HAPI. NHapi allows Microsoft .NET developers to easily use an HL7 2.x object model
  • Javasig: A library of code written to process V3 messages based on the MIF definitions
  • Everest:  Designed to ease the creation, formatting, and transmission of HL7v3 structures with remote systems. Supports CDA R2, and canadian messages

My evaluations:

  • HAPI – a solid well established community, with runs on the board and reliable code. I’d be happy to use this
  • nHAPI – as far as I can tell, not well established, and given it’s for an HIE, I question the long term viability of the community. Unlike HAPI, I never hear of people actually using this
  • Javasig: this is dead – note that the only link I found was to a web archive. You’d have to be desperate to try to use it, though the maintainers might get interested if you did
  • Everest: this has a community of users in Canada, but I’ve not heard of any use outside there. I’m not sure to what degree Mohawk are supporting it (I’ll ask)

You should consider one of the paid libraries – given the number of hours you’re going to invest in the work (1000s, I bet), a few thousand for a software library and related tools is peanuts. There are a lot of good choices there (btw, I have one of my own, which doesn’t cost that much).


CDA: What information from the Entries has to go in the narrative?

Most CDA implementation guides, and many tutorials – including some of both I wrote myself – say something like this:

it is an absolute requirement of all CDA documents that the section narrative text completely capture all the information represented in the structured elements within the section

(emphasis added) This example is from Brian’s article I linked to in my previous post, but the same language exists all over the place.

Only, it’s not true. What it should say is, “the section narrative must capture all the clinically relevant information represented in the structured elements within the section”. Which raises the obvious question, ‘well, what’s clinically relevant’?

For comparison, this is the definition of the underlying field that includes the words:

Act.text SHOULD NOT be used for the sharing of computable information

That shouldn’t be understood to mean that you can’t put information that is computable in Act.text, but that it’s meant for human consumption, and you put in there what’s relevant for a human.

What is “Clinically Relevant”?

The rest of this post gives a set of suggestions about what is clinically relevant in the entries. But it’s really important to understand that this is all just suggestions. The decision has to be made by some clinically aware person who knows what the data means – don’t let programmers do it (and I see a lot of documents where the programmers have done it, and what they did doesn’t make sense to a clinician).

First, I’ll look at the entries, and then I’ll make comments about data type rendering below that.

Entry – nothing to display in the narrative

Entry Acts (Observation, etc.) – each act should be shown in the narrative, except for RegionOfInterest, which is purely technical, and ObservationMedia, which is shown via a <renderMultimedia>. Where possible, I prefer the narrative to be a table, and there should be one row in the table for each entry (you can even make that rule a schematron assertion, and it picks up lots of errors, believe me).
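The one-row-per-entry rule can also be checked outside schematron. Here is a minimal Python sketch of the same idea, assuming the narrative is a single table directly under the section text and that entries are direct children of the section. The function name and the sample section are made up for illustration.

```python
import xml.etree.ElementTree as ET

CDA_NS = {"cda": "urn:hl7-org:v3"}

def rows_match_entries(section_xml: str) -> bool:
    """Return True when the narrative table has exactly one body row per entry."""
    section = ET.fromstring(section_xml.strip())
    entries = section.findall("cda:entry", CDA_NS)
    rows = section.findall("cda:text/cda:table/cda:tbody/cda:tr", CDA_NS)
    return len(rows) == len(entries)

# Hypothetical section: two procedures, two narrative rows.
EXAMPLE = """
<section xmlns="urn:hl7-org:v3">
  <title>Past Procedures</title>
  <text>
    <table>
      <thead><tr><th>Procedure</th><th>Date</th></tr></thead>
      <tbody>
        <tr><td>Appendectomy</td><td>12-Mar-2009</td></tr>
        <tr><td>Cholecystectomy</td><td>03-Jul-2011</td></tr>
      </tbody>
    </table>
  </text>
  <entry><procedure classCode="PROC" moodCode="EVN"/></entry>
  <entry><procedure classCode="PROC" moodCode="EVN"/></entry>
</section>
"""

print(rows_match_entries(EXAMPLE))
```

A real check would need to cope with sections that use lists or paragraphs instead of a table, and with entries that are deliberately left out of the narrative.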

classCode, moodCode, statusCode – these are critical to be in the narrative – but very often, they are implicit. For example, if the Section title is “Past Procedures”, and all the entries are procedures with moodCode = EVN, then there’s nothing to say in the narrative (and don’t, if you don’t have to, otherwise it’s just noise for already overwhelmed users). But if the section title is “procedures, past and planned”, then you’ll need to show the mood code for each procedure – and don’t just show the code for the moodCode – give it a meaningful human readable word (e.g. done or planned in this case).
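As a rough illustration of that rule, the sketch below only produces a mood column when the moods in a section actually differ, and translates the codes into words. The mapping table and function name are assumptions for the example. Whether a uniform mood really is implicit still depends on the section title, which code can't judge.

```python
# Hypothetical display words for the two moods used in the example above.
MOOD_DISPLAY = {"EVN": "done", "INT": "planned"}

def mood_labels(mood_codes):
    """Return one label per entry, or None when every entry has the same mood."""
    if len(set(mood_codes)) <= 1:
        return None  # nothing to say: the mood is implied by the section
    # Fall back to the raw code when no display word has been agreed.
    return [MOOD_DISPLAY.get(code, code) for code in mood_codes]

print(mood_labels(["EVN", "EVN"]))
print(mood_labels(["EVN", "INT"]))
```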

effectiveTime – This is generally necessary to display. It’s rare that the timing of some event simply doesn’t matter – except for future activities that are known not to have happened (the effective time then is the time of booking, and whether to show this is a clinical decision based on the context of use)

id – generally I would recommend not showing these. But it’s a matter of clinical judgement when identifiers are likely to matter to someone reading the  document (when they have to get on the phone and say, look, about diagnostic report XXXX, can you…)

code – generally, this needs to be shown

negationInd – this always needs to be shown if it’s negative. The only exception I could think of is if every entry in the section is negated, and the section title says “failed Medication Administrations” or something like that

text – any time an entry has text, I’d very much expect that the text would be in the narrative – but it might not be quite the same. For instance, the text might be generated using a variation of the method by which the whole narrative was generated, so the information would be similar, but laid out differently. Alternatively, the text could be additional independent information from everything else, and then would be found verbatim in the narrative.

Other entry attributes - well, there’s just too many to list here, and beyond the common elements above, there’s not much simple common guidance I can write. Some things like Observation.value, and SubstanceAdministration.dosage are obvious things to put in the narrative, and other things like Act.priority code are unlikely to be in the narrative unless the act is yet future. Most other things lie somewhere between this. Clinical judgement is always required.


  • Author: probably not worth showing
  • Informant: probably not worth showing? (Wouldn’t it be nice to know that there would always be a way to “unpeel the onion” when it is appropriate)
  • Subject: Always must be shown if it’s different to the section subject, but this is extremely rare. (which reminds me – if the section subject differs to the document subject, that absolutely must be shown in the section narrative)
  • Performer: usually worth showing (as a single piece of text, the name of the person, or the org if there’s no person)
  • Consumable/Product: Show the name and/or code of the DrugOrOtherMaterial
  • Specimen: really, I’m not sure. If it’s a diagnostic report, it’s not usually shown in the narrative (or if it is, it’s in the title or implied in the test code). I haven’t seen specimen in other contexts
  • Participant: it depends on what type of participant and why. Sometimes, this is audit trail type stuff, and you shouldn’t show it. Other times, it’s critical information


  • Reference – should be shown as a linkHtml if possible.
  • Precondition – don’t know.
  • ReferenceRange – usually shown, but the labs know what they want or have to do
  • Component (on organizer) – should always be shown


What to do with Entry Relationships all depends on what they say. These could be:

  • Entries in the narrative table in their own right (sometimes in place of the thing that contains them)
  • nested tables in the main table
  • a single line of text in the main table
  • ignored and not shown in the narrative

It’s a clinical judgement call. If you do show an entry relationship, pay attention to these elements:

  • typeCode / inversionInd – The meaning of the typeCode and inversionInd needs to be conveyed somehow. Very often, this is obvious in the context, and showing it is just noise. But sometimes it matters
  • sequenceNumber – any time this is present, it probably needs to be in the narrative
  • negationInd – see above

Data Type Guidance

  • nullFlavor: if the data type has a nullFlavor, then show some abbreviation for the nullFlavor. I use “–” for NI, n/a for NA, unk for UNK, or the code (the first three are the common ones). If the data type has additional properties as well as a nullFlavor, then generally these would be type/scope information, and something smart is required – e.g. <tel nullFlavor="UNK" use="H" value="tel:"/> should be shown as “Home phone number: unk”.
  • BL / INT / REAL / ST: just show the value directly
  • ED: These shouldn’t be rendered directly. You should use a linkHtml or renderMultimedia to refer to this indirectly
  • CD/CE/CV: if you have original text, then show that. If not, it’ll have to be a display. In a few circumstances, where humans use the codes directly (billing codes!) it’s worth showing the codes directly
  • CS: if you have to show a CS, you should know why, and you should have a representation table that translates the code to something good for a human
  • II: if you have to show one of these (try and avoid it), and it has an extension, show the extension. if it’s just a UUID – don’t show it. Humans hate UUIDS. Just like OIDs
  • TEL: I use a template like this: [use] [type] details, where use is a lookup from the use code, if there is one, type is phone or fax depending on the scheme in the value, and details is the rest of the value. If it’s a web url, I’ll drop the [type] and show the value as a linkHtml. But that doesn’t always suit (host systems might not know what to do with a mailto: url in a linkHtml)
  • EN: If you have to render this, use [use] [literal] where use is from the use code(s) and literal is the literal form defined in the abstract data types (the definition is obtuse, but the format is usually pretty good)
  • AD: same as EN
  • PQ: render as [value] [unit]. If you have a different human display unit to the UCUM unit, you can use that (or translate it to a human display; for instance, for the UCUM unit ug, some people prefer µg and some mandate mcg)
  • RTO: Use the literal form (though use human units if they are available)
  • TS: Depends on the culture. In Australia, we mandated dd-mmm-yyyy hh:nn:[ss][T] (no milliseconds). Other cultures will have their own rules
  • IVL<>: I use the literal form
  • GTS: good luck with this one… hopefully you have some system source text description as well as a GTS. I guess there’s always the literal form.
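To make a few of these rules concrete, here is a small Python sketch of the nullFlavor, TEL, PQ and II suggestions. The function names and the lookup tables are assumptions made for the example (the use-code table is only a subset), so treat it as an outline, not a complete renderer.

```python
from typing import Optional

NULL_FLAVOR_DISPLAY = {"NI": "–", "NA": "n/a", "UNK": "unk"}
TEL_USE_DISPLAY = {"H": "Home", "WP": "Work", "MC": "Mobile"}  # illustrative subset
TEL_TYPE_DISPLAY = {"tel": "phone", "fax": "fax"}

def render_null_flavor(code: str) -> str:
    """Abbreviation for the common nullFlavors, otherwise the code itself."""
    return NULL_FLAVOR_DISPLAY.get(code, code)

def render_tel(value: str = "", use: Optional[str] = None,
               null_flavor: Optional[str] = None) -> str:
    """Render a TEL as '[use] [type] details'."""
    scheme, _, details = value.partition(":")
    label = " ".join(
        part for part in (TEL_USE_DISPLAY.get(use or "", ""),
                          TEL_TYPE_DISPLAY.get(scheme, "")) if part
    )
    if null_flavor:
        # Keep the type/scope information and say that the value is missing.
        return f"{label} number: {render_null_flavor(null_flavor)}"
    return f"{label} {details}".strip()

def render_pq(value: str, unit: str, display_units: Optional[dict] = None) -> str:
    """Render a PQ as '[value] [unit]', swapping in a human display unit if given."""
    shown = (display_units or {}).get(unit, unit)
    return f"{value} {shown}"

def render_ii(root: str, extension: Optional[str] = None) -> str:
    """Show the extension if there is one; never show a bare UUID or OID."""
    return extension or ""

print(render_tel("tel:", use="H", null_flavor="UNK"))
print(render_tel("tel:+61-3-9000-1234", use="WP"))
print(render_pq("5", "ug", {"ug": "mcg"}))
```

The web url and mailto cases, which need a linkHtml in the narrative, are left out.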


Question: #FHIR, complexity, and modeling


HL7 V3 is known for increasing complexity up to the point where people give up. The RIM seems not adequate enough for modeling the world of clinical information (see. Barry Smith:

Is FHIR meant to be a cure? I understand that FHIR it is about using a RESTful architectural style in communication of clinical resources, but the resources themselves need to be modeled appropriately. Complexity is not going to go away. Thus, FHIR appears to be another way to slice the elephant, the elephant being the world of clinical information and the need for semantic interoperability. Is there a promise for a better modeling of resources in FHIR?


Well, this is not an easy question. The complexity we struggle with comes from several different sources:

  1. The inherent complexity in medicine, and the underlying biological processes
  2. The necessity to provide healthcare in the absence of clear information, and at the limits of human knowledge
  3. The wide variability in the processes around providing healthcare – including education and billing, both within and between countries
  4. Attempts to systematise information about all this to support computable information

Clearly, in FHIR, we could only attempt to reduce the complexity of the last point, and we have attempted to. We’ve also tried to limit the impact of the 3rd point in the way we handle extensions – we have to handle complexity, but we want to keep it in the right place. Still, this only deals with point #3.

These are the things we do to try to manage complexity properly:

  • We have a ruthless focus on what systems already do, partly because this is a yardstick of the complexity that people know how to make work
  • We have a strong focus on testing, testing, and more testing – we don’t iterate our designs in the absence of real world implementation experience
  • We use the language of the domain in our resource designs, rather than a standardised language (e.g. RIM) so that people who actually use it can understand what they see quickly
  • We map to other representations, including underlying ontologies and grammars, to ensure that the design is as well based in formal definitions and logic as possible (perhaps I should say that we aspire to do this; I think we have a long way to go yet)
  • We maintain our own ontology around how resources should be structured and broken up

Having said that, I don’t think any of this is a magic bullet, and FHIR is not trying to define a coherent information framework that allows for semantic computing – we’re simply defining pragmatic exchange formats based on today’s reality. I think that some of the problems that the RIM and ontologies are trying to solve, we’ve just punted them into the future in order to pursue a more practical course in the shorter term. For that reason, I try not to use the word “model” in association with FHIR, because we’re not “modeling healthcare” in the same sense that the RIM is trying to do, or that Barry is criticising it for.

BTW, having finally linked to Barry’s work from my blog, I think that Barry is largely mistaken in his criticisms of the RIM. It’s not that I think it’s beyond criticism (I have a number of my own, including this, though I’m not such a fan that I’d pose with it), but that I think he misunderstands its intent, what it does do, and what it’s not good for, and that he doesn’t understand the boundaries within which it works.

I actually think that modeling healthcare properly is beyond our abilities at this time – what is too complex for normal people to deal with is still too simple to express the things specialists know that computers need to say.

I expect a vigorous debate in the comments on this post… but anyone who comments, please keep my 3 laws of interoperability (It’s all about the people, you can’t get rid of complexity, and you can’t have it all) and my note about semantic interoperability (and #2) in mind.

HL7 Standards and rules for handling errors

It’s a pretty common question:

What’s the required behavior if a (message | document | resource) is not valid? Do you have to validate it? what are you supposed to do?

That question can – and has been – asked about HL7 v2 messaging, v3 messaging, CDA documents, and now the FHIR API.

The answer is that HL7 itself doesn’t say. There are several reasons for that:

  • HL7’s main focus is to define what is and isn’t valid, and how implementation guides and trading partners can define what is and isn’t valid in their contexts
  • HL7 generally doesn’t even say what your obligations are when the content is valid either – that’s nearly always delegated to implementation guides and trading partner agreements (such as, say, XDS, but I can’t even remember XDS making many rules about this)
  • We often discuss this – the problem is that there’s no right rule around what action to take. A system is allowed to choose to accept invalid content, and most choose to do that (some won’t report an error no matter what you send them). Others, on the other hand, reject the content outright. All that HL7 says is that you can choose to reject the content
  • In fact, you’re allowed to reject the content even if it’s valid with regard to the specification, because of other reasons (e.g. ward not known trying to admit a patient)
  • We believe in Postel’s law:

Be conservative in what you do, be liberal in what you accept from others

  • HL7 doesn’t know what you can accept – so it doesn’t try to make rules about that.

So what’s good advice?


I don’t think that there’s any single advice on this. Whether you should validate instances in practice depends on whether you can deal with the consequences of failure, and whether you can’t deal with the consequences of not validating. Here’s an incident to illustrate this point:

We set up our new (HL7 v2) ADT feed to validate all incoming messages, and tested the interface thoroughly during pre-production. There were no problems, and it all looked good. However, as soon as we put the system into production mode, we started getting messages rejected because the clerical staff were inputting data that was not valid. On investigation we found that the testing had used a set of data based on the formally documented practices of the institution, but the clerical staff had informal policies around date entry that we didn’t test. Rejected messages left the applications out of sync, and caused much worse issues. Rather than try to change the clerical staff, we ended up turning validation off

That was a real case. The question here is, what happens to the illegal dates now that you no longer reject them? If you hadn’t been validating, what would have happened? In fact, you have to validate the data; the only question is whether you validate everything up-front, or only what you need as you handle it.

Now consider the Australian PCEHR. It’s an XDS based system, and every submitted document is subjected to the full array of schema and schematron validation that we are able to devise. We do this because downstream processing of the documents – which happens to a limited degree (at the moment) – cannot be proven safe if the documents might be invalid. And we continuously add further validation around identified safety issues (at least, where we can, though many of the safety issues are not things that automated checks can do anything about).

But it has its problems too – because of Australian privacy laws, it’s really very difficult for vendors, let alone 3rd parties, to investigate incidents on site in production systems. The PCEHR has additional rules built around privacy and security which make it tougher (e.g. accidentally sharing the patient identifier with someone who is not providing healthcare services for the patient is a criminal offence). So in practice, when a document is rejected by the PCEHR, it’s the user’s problem. And the end-user has no idea what the problem is, or what to do about it (schematron errors are hard enough for programmers…).

So validation is a vexed question with no right answer. You have to do it to a degree, but you (or your users) will suffer for it too.

Handling Errors

You have to be able to reject content. You might choose to handle failed content in line (let the sender know) or out of line (put it in a queue for a system administrator). Both actions are thoroughly wrong and unsafe. And the unsafest thing about either is that they’ll both be ignored in practice – just another process failure in the degenerate process called “healthcare”.

When you reject content, provide as specific and verbose a message as you can, loaded with context, details, paths, reasons, etc – that’s for the person who debugs it. And also provide a human readable version for the users, something they could use to describe the problem to a patient (or even a manager).
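One way to keep the two audiences apart is to carry both messages in the same rejection record. This is only a sketch. The class, its field names and the sample values are invented for illustration and don't come from any HL7 specification.

```python
from dataclasses import dataclass

@dataclass
class Rejection:
    path: str          # where in the content the problem was found
    rule: str          # which validation rule failed
    detail: str        # what was expected and what was received
    user_message: str  # plain-language version for the end user

    def for_debugger(self) -> str:
        """Verbose, context-laden message for whoever has to debug the interface."""
        return f"rule '{self.rule}' failed at {self.path}: {self.detail}"

    def for_user(self) -> str:
        """Something a user could repeat to a patient or a manager."""
        return self.user_message

# Hypothetical rejection of an HL7 v2 ADT message with a badly formed birth date.
rejection = Rejection(
    path="PID-7",
    rule="date-of-birth-format",
    detail="expected YYYYMMDD, got '12/03/1975'",
    user_message="The date of birth could not be read. Please re-enter it and resend.",
)

print(rejection.for_debugger())
print(rejection.for_user())
```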

If you administer systems: it’s really good to be right on top of this and follow up every error, because they’re all serious – but my experience is that administrative teams are swamped under a stream of messages where the signal to noise ratio is low, but the real problems are beyond addressing anyway.


Question: Should value sets include nullFlavor Values?

Note – mostly, the questions I post on my blog are implementer focused questions. This is different – it’s from a discussion between myself and an HL7 vocab work group co-chair. It’s about a particularly unclear and relatively distant corner case in the CDA specification space, but I thought I’d post it here to get the answer into google. If you don’t even understand the question, don’t worry…


For a CDA value set, my understanding is that one should only include the domain specific values, and that the null flavors would not appear in the value set. Is this true?


This is a little understood corner of our specifications. As a data types guy, I’d say (and have said previously):

We expect people to use the value set machinery to constrain the nullFlavors that can be used for any particular element

This is because nullFlavors are part of the value set of the element – so if you were specifying a value set for an element, you’d therefore specify a value set that included the nullFlavors that were allowed. But I see that except for CD etc, that would have to be an adjunct to the value domain, expressed as a value set. You might say, for example, this element can have the values of 0.0-5.0 mg/dL, or NI or UNK, but nothing else. Clearly, you’ll have to specify the two parts of that differently, and the intention was that you’d use a value set for the nullFlavor part.

Only, to my knowledge, no one has ever done this – and my own v3 / cda tools have not done this; you can specify nullFlavors in a value set, but my tools – along with everyone else’s, as far as I know, would then expect that they’d appear like this: 

  <code code="NI" codeSystem="2.16.840.1.113883.5.1008"/>

instead of

  <code nullFlavor="NI"/>

(2.16.840.1.113883.5.1008 is the OID for the NullFlavor code system). Concerning these 2 representations, release 2 of the v3 data types says:

The general implication of this is that in a CD or descendant (usually CS), when the code for a nullFlavor is carried in the code/codeSystem (code = “NI” and codeSystem = “2.16.840.1.113883.5.1008”), the CD itself is not null. The CD is only null when its nullFlavor carries this code

That text was written to underline the special meaning of the nullFlavor attribute, but it does serve to say that the first variant above is different to the second, and therefore, it has no defined meaning, and shouldn’t be used.
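A tool that wants to honour that distinction has to look at the nullFlavor attribute, not at the code system. Here is a minimal Python sketch, assuming the element is available as a standalone XML fragment. The function name and the returned labels are made up for the example.

```python
import xml.etree.ElementTree as ET

NULL_FLAVOR_OID = "2.16.840.1.113883.5.1008"  # the NullFlavor code system

def describe_cd(cd_xml: str) -> str:
    """Classify a CD fragment according to the data types R2 wording quoted above."""
    cd = ET.fromstring(cd_xml)
    if cd.get("nullFlavor") is not None:
        return "null"
    if cd.get("codeSystem") == NULL_FLAVOR_OID:
        # A nullFlavor code carried as an ordinary code: the CD is not null,
        # and what it means is undefined.
        return "not null, nullFlavor carried as a code"
    return "not null"

print(describe_cd('<code code="NI" codeSystem="2.16.840.1.113883.5.1008"/>'))
print(describe_cd('<code nullFlavor="NI"/>'))
```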

So my tools would handle the appearance of nullFlavors in the value set wrongly, and I think everyone’s would too, and I’m sure that we never say anything about this in our methodology documentation (e.g. core principles). In fact, looking through the v3 standards, I found:

In many cases, a CD is created from a value set – either a code/code system pair is chosen from a valueSet, or one is not chosen and the CD has the exceptional value of NullFlavor.OTH

Well, I have no idea what that would mean if you put OTH in the value set!

Well, where does this leave us?

  • In practice, including nullFlavors in a value set won’t achieve anything useful
  • We haven’t really said quite what should happen if you do
  • So there’s no real solution for controlling nullFlavors properly

Note that in v2, and FHIR, nullFlavors are not included in the base data type – instead they are included in the value set explicitly, and represented like any other code

Update: There’s another factor to consider: the existing value sets are designed on the basis that nullFlavors are still allowed, and so they don’t go in the value set. So deciding now that value sets define which nullFlavors are allowed invalidates all the existing value sets. So really, to use value sets, we need to define a new binding path. Else you use constraints, as Sarah demonstrates in her comments below.


Set the RIM free!

Last week I met some enterprise architects who are doing a major project around enterprise modeling in a healthcare project. Without ever having learnt anything about the RIM, they had reproduced the Entity -> Role -> Participation -> Act -> Act Relationship cascade, though their pattern was not so solid, and they had slightly different names for these things.

But the really interesting bit was what they had along side this pattern: work flows, conditions, privileges, policies, and organizational goals. Their perspective on the problem was far wider than the RIM, because they are not focused on the informational content of a healthcare exchange, but on understanding the enterprise, the whole business of developing healthcare, and the patterns that they had that overlapped the RIM were just a part of their overall ontology.

For a long time, I’ve believed that the RIM is really misapplied – that the core strengths of the RIM are as a grammar, an ontology pattern for making structured sense out of the healthcare world. Limiting its scope to information exchange – what we tried to use it for – meant that there were certain things it could not address properly. For a start, it was compromised by certain patterns around exchange that aren’t sensible ways to handle processes (like trying to describe a building as a set of conversations between architect and builder…). Making it a class model, rather than a more pure ontology, limits what it can be too – it forces all the patterns to be pre-coordinated into a single structure. Something can be only one thing in that pattern. But there are other patterns to pick from, other patterns to add to the picture. Some of the underlying ontology arguments around the RIM are driven by this limitation – it isn’t a good ontology for everything because it wasn’t trying to be, it was only trying to cover what you might exchange about everything.

Still, seeing that approach used in a wider context was quite a revelation for me – there are a lot of possibilities for something like the RIM once you free it from its constraints. We don’t need it to be a class model. We don’t need it to only cover information exchange. So let’s change that – let’s open it up and have the o-RIM – the Ontological RIM, and let it aspire to be the underlying model of the healthcare enterprise, and see how that goes.

Where the RIM is really succeeding (and I don’t count CDA – that succeeds in spite of being hooked up to the RIM) is in application architecture usage – and I can only think that these would be empowered by having a wider focus.

So let’s free the RIM

Question: RIM XMI Files


The RIM spec has a zip containing a set of XMI files (rim0241i) presumably which contain UML data so that one can use the modeling information as a base for hl7v3 development.  I have not had a ton of luck importing and using these files – a lot of errors from the importing tools.  After some research I found this “At the moment there are several incompatibilities between different modeling tool vendor implementations of XMI, even between interchange of abstract model data. The usage of Diagram Interchange is almost nonexistent. Unfortunately this means exchanging files between UML modeling tools using XMI is rarely possible.” (from  So my question is how can I extract the information from these XMI files?  Is there a recommended tool to use?


These files were produced by an old version of Rational. There are plans to update to a more recent version, but even this won’t help much. The whole area of XMI support is very disappointing – it feels to me as if this was a spec that the tooling vendors were never serious about supporting: getting it right was never mission critical to any of them.

You could hand-write a transform from that version to an XMI that your tool supports (not that they really ever document that either). Alternatively, I’ve posted my working EAP file for this RIM diagram at



Unfortunately, I don’t know which RIM version this is.

I don’t know if anyone can suggest a better option in the comments – has anyone ever actually done anything with these XMI files?

More fun with original Text

As I’ve described before (and here) originalText is a challenge.

Firstly, a consistent pattern throughout the HL7 data types (v2, v3, and FHIR) is that a coded value is represented by some variation of a cluster that contains

  • (code / system / display) 0..*
  • original text

There are variations on this theme amongst the CE/CNE/CWE, CD/CE/CV, and Coding/CodeableConcept, but they all have the same basic pattern. The multiple codes are for equivalent codes that say the same thing in different coding systems. Here’s a v3 example (from the spec):

<code code='195967001' codeSystem='2.16.840.1.113883.19.6.96' 
    codeSystemName='SNOMED CT' displayName='Asthma'>
  <originalText>Mild Asthma</originalText>
  <translation code='49390' codeSystem='2.16.840.1.113883.19.6.2' 
   codeSystemName='ICD9CM' displayName='ASTHMA W/O STATUS ASTHMATICUS'/>
</code>

This has the original text “Mild Asthma”, and two different codes, one from ICD-9-CM, and one from Snomed-CT. That’s a pretty straight forward idea.
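For readers who think in code, here is a rough Python sketch of pulling that cluster apart on the receiving side: a list of (system, code, display) codings plus the original text. It assumes the element arrives as a standalone fragment without a namespace, and the function name is invented for the example.

```python
import xml.etree.ElementTree as ET

def codings_and_text(cd_xml: str):
    """Return ([(codeSystem, code, displayName), ...], originalText) for a CD fragment."""
    cd = ET.fromstring(cd_xml.strip())
    # The root code comes first, followed by each translation.
    codings = [(cd.get("codeSystem"), cd.get("code"), cd.get("displayName"))]
    for tr in cd.findall("translation"):
        codings.append((tr.get("codeSystem"), tr.get("code"), tr.get("displayName")))
    return codings, cd.findtext("originalText")

EXAMPLE = """
<code code='195967001' codeSystem='2.16.840.1.113883.19.6.96'
    codeSystemName='SNOMED CT' displayName='Asthma'>
  <originalText>Mild Asthma</originalText>
  <translation code='49390' codeSystem='2.16.840.1.113883.19.6.2'
   codeSystemName='ICD9CM' displayName='ASTHMA W/O STATUS ASTHMATICUS'/>
</code>
"""

codings, original_text = codings_and_text(EXAMPLE)
print(original_text)
for system, code, display in codings:
    print(system, code, display)
```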

The problem

But what if the original text is something a little more complex?

Diabetes & Hypertension

The problem with this original text is that there’s going to be two codes. If we stick to just Snomed-CT, this is codes 73211009: Diabetes mellitus, and 38341003: Hypertensive disorder, systemic arterial (I think). There’s no appropriate code that covers both. And this:

<code code='73211009' codeSystem='2.16.840.1.113883.19.6.96' 
    codeSystemName='SNOMED CT' displayName='Diabetes mellitus'>
  <originalText>Diabetes &amp; Hypertension</originalText>
  <translation code='38341003' codeSystem='2.16.840.1.113883.19.6.96' 
   codeSystemName='SNOMED CT' displayName='Hypertensive disorder'/>
</code>

is not legal, because there’s no way that these two codes fit anywhere near the definition of

The translations are quasi-synonyms of one real-world concept. Every translation in the set is supposed to express the same meaning “in other words.”

And this becomes more than a theoretical problem if you’re trying to provide translations to yet another code system as well. So this is a problem – you have to pick one of the codes.

I suppose you could tell the user that “Diabetes & Hypertension” is not a valid text to insert. I’m sure they’ll be fine with that.


2nd try

An alternative is to ensure that the element where this can arise – say, the reason for prescribing a medication – allows 0..* coded data types, not just 0..1. Then we can do this:

<code code='73211009' codeSystem='2.16.840.1.113883.19.6.96' 
    codeSystemName='SNOMED CT' displayName='Diabetes mellitus'/>
<code code='38341003' codeSystem='2.16.840.1.113883.19.6.96' 
   codeSystemName='SNOMED CT' displayName='Hypertensive disorder'/>

Only, where did the original text go? Well, you could repeat it, I suppose. That’d be technically correct:

<code code='73211009' codeSystem='2.16.840.1.113883.19.6.96' 
    codeSystemName='SNOMED CT' displayName='Diabetes mellitus'>
  <originalText>Diabetes &amp; Hypertension</originalText>
</code>
<code code='38341003' codeSystem='2.16.840.1.113883.19.6.96' 
    codeSystemName='SNOMED CT' displayName='Hypertensive disorder'>
  <originalText>Diabetes &amp; Hypertension</originalText>
</code>

So now, a receiving application has to rip through the codes and eliminate duplicate original texts – that’s not my favourite outcome.
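To show what that extra work looks like, here is a small Python sketch of a receiver collapsing the repeated original texts. It assumes each code element arrives as a separate well-formed fragment with the ampersand escaped. The function name is made up for the example.

```python
import xml.etree.ElementTree as ET

def distinct_original_texts(cd_fragments):
    """Collect the original texts from repeated codes, dropping duplicates in order."""
    seen = []
    for fragment in cd_fragments:
        text = ET.fromstring(fragment.strip()).findtext("originalText")
        if text and text not in seen:
            seen.append(text)
    return seen

REASONS = [
    """<code code='73211009' codeSystem='2.16.840.1.113883.19.6.96'
        codeSystemName='SNOMED CT' displayName='Diabetes mellitus'>
      <originalText>Diabetes &amp; Hypertension</originalText>
    </code>""",
    """<code code='38341003' codeSystem='2.16.840.1.113883.19.6.96'
        codeSystemName='SNOMED CT' displayName='Hypertensive disorder'>
      <originalText>Diabetes &amp; Hypertension</originalText>
    </code>""",
]

print(distinct_original_texts(REASONS))
```

Even then, the receiver can't tell whether two identical texts were one statement repeated or two statements that happened to match.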


We have a third option in FHIR: break apart the data type, and declare that the containing element (the reason for prescription in this case) isn’t a CodeableConcept, but a complex element that has a text element, and 0..* coding elements that don’t have to be equivalent:

   <code value='73211009'/>
   <system value="/>
   <display value='Diabetes mellitus'/>
   <code value='38341003'/>
   <system value="/>
   <display value='Hypertensive disorder'/>
 <text value="Diabetes & Hypertension"/>

Actually, this has exactly the same look on the wire, but it’s defined differently. So this is ok, but there’s no facility for providing translations to other code systems (and the RIM mapping has now become impossible because we haven’t solved this for the RIM)


Original Text continues to be a problem because it cross-cuts the coding structures. If only we could force end-users to value coding enough that we could get rid of text ;-)



On the subject of original text for Codes

In the various coded data types defined by HL7 across v2, v3, and FHIR, there’s a property named text or originalText that is defined using some variant of these words:

The text as seen and/or selected by the user who entered the data which represents the intended meaning of the user.

Original text can be used in a structured user interface to capture what the user saw as a representation of the code or expression on the data input screen, or in a situation where the user dictates or directly enters text, it is the text entered or uttered by the user.

Unfortunately, what this exactly means is a matter of interpretation. The key question is, to what degree does the context affect the interpretation of the text that represents the code, and therefore, to what degree does the context contribute to the original text?

I’ll illustrate the discussion with an example. In SNOMED CT, there’s a (large) hierarchy for organism type. Part of the hierarchy contains codes for viruses. A subset of this is found in the PHINVADS value set “Virus types answer list specific to Arbovirus/ArboNet reporting“. This lists 17 codes for type of virus. So you could easily imagine some kind of UI, for instance, where users would select one of the codes from a pick list:

In this case, the original text is the same as the SNOMED CT preferred name, and it’s pretty straightforward to understand. If, for instance, the user picked “Eastern equine encephalitis virus”, then that’s the original text, and nothing further is needed.

However, a lot of system designers will look at this and say, the word “virus” is repeated in every entry, and that’s just a tax on the users. We should get rid of it. That would give you an entry like this:

Actually, in this case, the example is pretty trivial. “Virus” isn’t hard to read. But how about this SNOMED CT preferred term: “Cholecystectomy with exploration of common bile duct and choledochoenterostomy” – there’s quite a lot of potential for useful simplification there, especially where the set of codes are all siblings, such as the variations of strength in a particular medication:

This somewhat extreme example is from AMT. I doubt any reader can even figure out the differences between those 4 codes. How much easier this is:

Hopefully that example will serve to illustrate that this isn’t just a UI best practices issue – as the codes become finer, it starts to become a clinical safety issue too.

Back to the virus case: if the user picks “Eastern equine encephalitis”, then is the original text “Virus: Eastern equine encephalitis” or just “Eastern equine encephalitis”? What actually works best depends on exactly how the original text is going to be used. If the original text is used as the faithful reproduction of the user’s meaning in a context similar to the one in which it was entered, then the minimal text the user actually picked is the useful original text – but how similar? If, on the other hand, the original text is used out of context, the full context of the data entry of the code should be represented – but this could be a combination of the text the user actually picked, the field name, additional words taken out of the explicit context on the screen, and even some text that is implicit in the clinical context.

To make things even more fun, a contributor on the HL7 vocabulary mailing list offered this example:



I’m not sure what the best way to resolve this is. How do you make original text reliably useful for both uses when the user interface isn’t nailed down?

Well, one way is to rely on the value set – the value set description should contain the information that is implicit in the context. So the true original text would be the value set description + the user picked text. Though I don’t think that any particular field in the value set (either v3 or FHIR) is defined with this purpose in mind. Perhaps that’s something we should address?
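As a rough sketch of that idea, the Python below keeps the picked text and the implied context separate and only joins them when the text is going to be read out of context. The `context_label` field is hypothetical: it stands in for whatever a value set might supply, and no such field is defined in v3 or FHIR for this purpose.

```python
from dataclasses import dataclass


@dataclass
class PickedEntry:
    picked_text: str    # text exactly as the user saw and selected it
    context_label: str  # hypothetical: context implied by the screen or value set


def original_text(entry, standalone):
    """Return original text suited to in-context or out-of-context use.

    standalone=False: the reader sees it in a context like the data entry screen.
    standalone=True:  the reader has no context, so the implied label is prefixed.
    """
    if standalone and entry.context_label:
        return f"{entry.context_label}: {entry.picked_text}"
    return entry.picked_text


if __name__ == "__main__":
    entry = PickedEntry("Eastern equine encephalitis", "Virus")
    print(original_text(entry, standalone=False))
    print(original_text(entry, standalone=True))
```

This only moves the problem: someone still has to decide what the context label is and where it is recorded, which is exactly the gap in the value set definitions.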


Question: v3 development resources


I am attempting to develop an implementation of the HL7 v3 standard in PHP (specifically the CDA messaging portion) for MU2 certification. I have been searching various sites and have downloaded the RIM specs. My problem is that the MIF seems to be somewhat proprietary, and the UML versions won’t open in any UML tool that I have tried (all error out). Where can I get a decent view of the models / model structure so that I can create a useful and meaningful implementation (so far the RIM.png is all I can find)?


You need the full v3 specification, which includes a number of processible resources – at least, that is, what’s possible to provide given the way v3 works.

You can get it here:

Seriously, the MIF is your best option. Sure, it’s proprietary (more than somewhat) – but that’s what there is. And there is a MIF for CDA. See more about the MIF here:

But why do all this for CDA? It’s just overkill – all stuff you ain’t ever going to need. Just do CDA – it’s about two orders of magnitude less complex than all of v3. It’s like building a bulldozer so you can plant a single flower. Use the schema, and go from there.
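To give a feel for how little machinery plain CDA handling needs, here is a minimal sketch that reads a document with nothing but a standard XML parser. It is written in Python for brevity, and the same idea should carry over to any language with an XML parser. The sample document is invented and heavily trimmed, so it is not schema-valid CDA; `urn:hl7-org:v3` is the namespace CDA R2 instances use.

```python
import xml.etree.ElementTree as ET

# CDA R2 instances live in the HL7 v3 XML namespace.
NS = {"cda": "urn:hl7-org:v3"}

# Made-up, heavily trimmed document for illustration only.
SAMPLE = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <title>Discharge Summary</title>
  <component>
    <structuredBody>
      <component>
        <section>
          <title>Medications</title>
          <text>Metformin 500 mg twice daily</text>
        </section>
      </component>
    </structuredBody>
  </component>
</ClinicalDocument>"""


def document_title(cda_xml):
    """Return the document-level title, or an empty string if absent."""
    root = ET.fromstring(cda_xml)
    return root.findtext("cda:title", default="", namespaces=NS)


def section_titles(cda_xml):
    """Return the title of every section in the structured body."""
    root = ET.fromstring(cda_xml)
    return [
        section.findtext("cda:title", default="", namespaces=NS)
        for section in root.iterfind(".//cda:section", NS)
    ]


if __name__ == "__main__":
    print(document_title(SAMPLE))
    print(section_titles(SAMPLE))
```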