Category Archives: v3

Question: Entering the #FHIR Bandwagon


HL7 V1 came first, followed by V2, then CDA and V3. For newer entrants into standards and interoperability, is the central dogma of S&I something like: learn V2 first, then CDA, then V3, then FHIR, then SMART on FHIR, and then whatever is newer? Or can a person just buy a FHIR textbook, or collect all the FHIR blogs in one place, and start reading from scratch?


V2, CDA, and FHIR + SMART on FHIR are all different approaches to solving various healthcare interoperability problems. If you’re deeply invested in the standards process, then learning all of v2, CDA, and FHIR will give you a deeper perspective on what has and hasn’t worked, and which ideas last across the various specifications. But if you’re just solving a particular problem, then you pick one, learn it, and get on with the job.

Which one to learn? 

  • Actually, this is simple. If you have to use one – because of legal requirements, or because your trading partners have already chosen one (by far the most likely situation) – then you don’t have to decide. Otherwise, you’d use FHIR.

How to learn? From a previous post, these are the books I recommend:

None of these cover FHIR. But, in fact, Tim invited me to join with him for the forthcoming 3rd edition of his book, to cover FHIR, and this should be available soon. In the meantime, as you say, there’s the FHIR blogs.

Question about storing/processing Coded values in CDA document

If a CDA code is represented with both a code and a translation, which of the following should be imported as the stored code into the CDR:
  1. a normalised version of the Translation element (using the clinical terminology service)
  2. the Code element exactly as it is in the CDA document
The argument for option 1 is that since the Clinical Terminology Service is the authoritative source of code translations, we do not need to pay any attention to a ‘code’ field, even though the authoring system has performed the translation itself (which may or may not be correct). The argument for option 2 is that clinicians sign off on the code and translation fields provided in a document. Ignoring the code field could potentially modify the intended meaning of the data being provided.
First, to clarify several assumptions:
“Clinicians sign off on the code and translation fields provided in a document”
Clinicians actually sign off on the narrative, not the data. Exactly how a code is represented in the data – translations or whatever – is not as important as maintaining the narrative. Obviously there should be some relationship between the two in the document, but exactly what that is is not obvious. And exactly what needs to be done with the document afterwards is even less obvious.
The second assumption concerns which is the ‘original’ code, and which is the ‘translation’. There’s actually a number of options:
  • The user picked code A, and some terminology server performed a translation to code B
    • The user picked code A, and some terminology server in a middle ware engine performed a translation to code B
  • The user picked code X, and some terminology server performed translations to both code A and B
  • The user was in some workflow, and this was manually associated with codes A and B in the source system configuration
  • The user used some words, and a language processor determined codes A and B
    • The user used some words, and two different language processors determined codes A and B

ok, the last is getting a bit fanciful – I doubt there’s a workable language processor out there yet, but there are definitely a bunch being evaluated. Anyway, the point is that the relationship between code A and code B isn’t automatically that one is translated from the other. The language in the data types specification is a little loose:

A CD represents any kind of concept usually by giving a code defined in a code system. A CD can contain the original text or phrase that served as the basis of the coding and one or more translations into different coding systems

It’s loose because it’s not exactly clear what the translations are of:

  • “a code defined in a code system”
  • “the original text”
  • the concept

The correct answer is the last – each code, and the text, are all representations of the concept. So the different codes may capture different nuances, and it may not be possible to prove that the translation between the two codes is valid.

Finally, either code A or code B might be the root, and the other the translation. The specification says 2 different things about which is root: the original one (if you know which it is), or the one that meets the conformance rule (e.g. if the IG says you have to use SNOMED CT, then you put that in the root, and put the other in the translation, irrespective of the relationship between them).
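That conformance rule is easy to sketch in code. Here's a hedged Python illustration (the CD is modelled as a plain list of (codeSystem, code) pairs, the function name is mine, and the code values are made up – only the SNOMED CT and LOINC code system OIDs are real):

```python
SNOMED_CT = "2.16.840.1.113883.6.96"   # SNOMED CT's code system OID
LOINC = "2.16.840.1.113883.6.1"        # LOINC's code system OID

def arrange_cd(codings, required_system=SNOMED_CT):
    """Return (root, translations): the coding from the IG-mandated
    system goes in the root, and everything else becomes a translation,
    irrespective of which coding was the 'original'."""
    root = next((c for c in codings if c[0] == required_system), None)
    if root is None:
        raise ValueError("IG requires a coding from " + required_system)
    return root, [c for c in codings if c is not root]

# illustrative, made-up code values
root, translations = arrange_cd([(LOINC, "1234-5"), (SNOMED_CT, "987654")])
```

Note that the original/translated relationship between the codings is simply lost here, which is exactly the tension the specification's two rules create.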

Actually, which people do depends on what their trading partner does. One major system that runs several important CDRs ignores the translations altogether….

Turning to the actual question: what should a CDR do?

I think that depends on who’s going to be consuming/processing the data. If the CDR is an analysis end point – data comes in, and analysis reports come out – and the use cases are closed, then it could be safe to mine the CD looking for the code your terminology server understands, and just store that as a reference.

But if the use cases aren’t closed, and it turns out that a particular analysis would be better performed against a different code system, then storing just the one understood reference turns out to be rather costly. A great example is lab data that is coded in both LOINC and SNOMED CT – each serves different purposes.

The same applies if other systems are expected to access the data to do their own analysis – they’ll be hamstrung without the full source codes from the original document.

So unless your CDR is a closed and sealed box – and I don’t believe such a thing exists at design time – it’s really rather a good idea to store the Code element exactly as it is in the CDA document (and if it references the narrative text for the originalText, make sure you store that too).



Observation of Titers in HL7 Content

Several important diagnostic measures take the form of a Titer. Quoting from Wikipedia:

A titer (or titre) is a way of expressing concentration. Titer testing employs serial dilution to obtain approximate quantitative information from an analytical procedure that inherently only evaluates as positive or negative. The titer corresponds to the highest dilution factor that still yields a positive reading. For example, positive readings in the first 8 serial twofold dilutions translate into a titer of 1:256 (i.e., 2^-8). Titers are sometimes expressed by the denominator only, for example 1:256 is written 256.

A specific example is a viral titer, which is the lowest concentration of virus that still infects cells. To determine the titer, several dilutions are prepared, such as 10^-1, 10^-2, 10^-3, … 10^-8.

So the higher the titer, the higher the concentration: 1:2 means a lower concentration than 1:128. (Note that the clinical reading runs opposite to the literal numeric value – as the numeric value of the ratio gets smaller, the concentration it represents gets higher.)
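That inversion is a classic source of off-by-one-mindset bugs, so here's a minimal Python sketch (function names are mine) of parsing a titer and comparing concentrations the clinically correct way round:

```python
def titer_denominator(titer: str) -> int:
    """Accepts '1:256' or the bare denominator form '256'."""
    if ":" in titer:
        num, denom = titer.split(":")
        if int(num) != 1:
            raise ValueError("expected a 1:N titer, got " + titer)
        return int(denom)
    return int(titer)

def more_concentrated(a: str, b: str) -> bool:
    """True if titer a indicates a higher concentration than titer b.
    The larger the denominator, the higher the concentration."""
    return titer_denominator(a) > titer_denominator(b)

print(more_concentrated("1:128", "1:2"))   # True: 1:128 is the stronger reading
```

Comparing the ratios as literal fractions (1/128 < 1/2) would give exactly the wrong answer, which is the point of the note above.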

Titers are pretty common in clinical diagnostics – I found about 2600 codes for titer type tests in LOINC v2.48 (e.g. Leptospira sp Ab.IgG).

Representing Titers in HL7 Content

In diagnostic reports, titers are usually presented in the text narrative (or the printed form) using the form 1:64, since this makes clear the somewhat arbitrary nature of the numbers in the value. However, it’s not unusual for labs to report just the denominator (e.g. “64”), and the person interpreting the report is required to understand that this is a titer test (this is usually stated in the name).

When it comes to reporting a Titer in structured computable content, there’s several general options:

  • represent it as a string, and leave it up to the recipient to parse that if they really want
  • represent it as an integer, the denominator
  • use a structured form for representing the content

Each of the main HL7 versions (v2, CDA, and FHIR) offer options for each of these approaches:

  • V2:
    • String: OBX||ST|{test}||1:64
    • Integer: OBX||NM|{test}||64
    • Structured: OBX||SN|{test}||^1^:^64
  • CDA:
    • String: <value xsi:type="ST">1:64</value>
    • Integer: <value xsi:type="INT" value="64"/>
    • Structured: <value xsi:type="RTO_INT_INT"><numerator value="1"/><denominator value="64"/></value>
  • FHIR:
    • String: "valueString": "1:64"
    • Integer: "valueInteger": 64
    • Structured: "valueRatio": { "numerator": { "value": 1 }, "denominator": { "value": 64 } }

(using the JSON form for FHIR here)
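As a worked example of generating the three forms, here's a small Python sketch (the element names follow the FHIR JSON above; the function name is my own, and this assumes a well-formed 1:N titer string):

```python
def titer_representations(titer: str) -> dict:
    """Return the string, integer and ratio forms of a '1:N' titer,
    using the FHIR JSON element names from the table above."""
    num, denom = (int(p) for p in titer.split(":"))
    return {
        "valueString": titer,                       # as reported
        "valueInteger": denom,                      # denominator-only form
        "valueRatio": {"numerator": {"value": num},
                       "denominator": {"value": denom}},
    }

forms = titer_representations("1:64")
```

A receiver that gets the ratio form can reconstruct either of the other two; the reverse is not reliably true, which is one argument for the structured form.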

One of the joys of titres is that there’s no consistency between the labs – some use one form, some another. A few even switch between representations for the same test (e.g. one LOINC code, different forms, for the same lab).

This is one area where there would definitely be some benefit in saying that all labs should use the same form. That’s easy to say, but it would be really hard to get the labs to agree, and I don’t know what the path to pushing for conformance would be (in the US, it might be CLIA; in Australia, it would be PITUS; for other countries, I don’t know).

One of the problems here is that v2 (in particular) is ambiguous about whether OBX-5 is for presentation or not – it depends on the use case. And labs are much more conservative about changing the human presentation than the computable form, for good safety reasons. (Here in Australia, OBX-5 should not be used for presentation if both sender and receiver are fully conformant to AS 4700.2, but I don’t think anyone would have much confidence in that.) In FHIR and CDA, the primary presentation is the narrative; the structured data only becomes the source of presentation for derived presentations, not the primary attested one, which generally allays the labs’ safety concerns about changing the content.

If that’s not enough, there’s a further issue…

Incomplete Titers

Recently I came across a set of lab data that included the titer “<1:64”. Note that because the sense of a titre is reversed, it’s not immediately clear what this means: is the titre less than 1:64, or is the dilution greater than 64? Fortunately, it’s the first. Quoting from the source:

There are several tests (titers for Rickettsia rickettsii, Bartonella, certain strains of Chlamydia in previously infected individuals, and other tests) for which a result that is less than 1:64 is considered Negative.  For these tests the testing begins at the 1:64 level and go up, 1:128, 1:256, etc.   If the 1:64 is negative then the titer is reported as less than this.

The test comes with this sample interpretation note:

Rickettsia rickettsii (Rocky Mtn. Spotted Fever) Ab, IgG:

  • Less than 1:64: Negative – No significant level of Rickettsia rickettsii IgG Antibody detected.
  • 1:64 – 1:128: Low Positive – Presence of Rickettsia rickettsii IgG Antibody detected
  • 1:256 or greater: Positive – Presence of Rickettsia rickettsii IgG Antibody, suggestive of recent or current infection.

So, how would you represent this one in the various HL7 specifications?

  • V2:
    • String: OBX||ST|{test}||<1:64
    • Integer: {can’t be done}
    • Structured: OBX||SN|{test}||<^1^:^64
  • CDA:
    • String: <value xsi:type="ST">&lt;1:64</value>
    • Integer: {can’t be done}
    • Structured: <value xsi:type="IVL_RTO_INT_INT"><high><numerator value="1"/><denominator value="64"/></high></value>
  • FHIR:
    • String: "valueString": "<1:64"
    • Integer: {can’t be done}
    • Structured: "valueRatio": { "numerator": { "comparator": "<", "value": 1 }, "denominator": { "value": 64 } }

This table shows how the structured/ratio form is better than the simple numeric – but there’s a problem: the CDA example, though legal in general v3, is illegal in CDA, because CDA documents are required to be valid against the CDA schema, and IVL_RTO_INT_INT was not generated into that schema. I guess that means that CDA will have to use the string form?
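For a receiver, the practical task is turning either form into something computable. Here's a Python sketch (the regex and function are mine) that parses both complete and incomplete titers into the FHIR-style ratio shown above, carrying any comparator on the numerator:

```python
import re

# optional comparator, then a numerator:denominator pair
TITER = re.compile(r"^\s*(<=?|>=?)?\s*(\d+):(\d+)\s*$")

def parse_titer(text: str) -> dict:
    """Parse '1:64' or '<1:64' into a FHIR-style Ratio dict."""
    m = TITER.match(text)
    if not m:
        raise ValueError("not a titer: " + text)
    comparator, num, denom = m.groups()
    numerator = {"value": int(num)}
    if comparator:
        numerator["comparator"] = comparator
    return {"numerator": numerator, "denominator": {"value": int(denom)}}
```

Of course, this only works once you know the value is a titer at all – which, as noted above, you often have to infer from the test name.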



#FHIR CDA Position Statement & Roadmap: Joint Statement with Lantana

Lantana Consulting Group invited me to take part in the Spring CDA Academy after the HL7 Working Group Meeting in Phoenix in May, which I enjoyed greatly. While I was there, we spent some time discussing the relationship between CDA and FHIR – both where things are today, and where we think they should be.

This is a pretty important subject; from the beginning of our work on FHIR, one of the most common questions we have been asked is “what about CDA?”. Sometimes, we get asked a more specific question: “What does Lantana think about FHIR?”.

Since the CDA Academy, we’ve been working on a joint statement that summarizes the outcome of our discussions, a shared expression of where we believe that we are, and should be. Today, Lantana Consulting Group have posted our position statement on FHIR and CDA (see their blog post):

This position statement addresses the relationship between HL7’s Clinical Document Architecture (CDA) product line and the Fast Health Interoperability Resource (FHIR) product line. It was prepared jointly by Lantana Consulting Group—a recognized leader in the CDA community—and Grahame Grieve, Health Intersections, the FHIR project lead. This statement is not official policy. It is our hope that it will stimulate discussion and possibly guide policy makers, architects, and implementers as well as standards developers.

An underlying key concept of this position statement is the difference between a “package of data and narrative” and interactive access to the narrative and data in a patient’s or institution’s record – both have their place in exchange. Quoting from the document:

CDA addresses interoperability for clinical documents, mixing narrative and structured data. FHIR provides granular access to data, a contemporary, streamlined approach to interoperability, and is easy to implement. FHIR can be the future of CDA, but it is not there yet.

FHIR offers considerable promise, but it’s certainly true that we have a long way to go yet. The joint statement issues a call to action:

FHIR DSTU 1 is not a replacement for CDA or C-CDA. Building out the specification so that it can represent existing documents as FHIR resources, and ensuring that FHIR resources can be integrated into CDA documents should be the focus of the next iteration of the DSTU

This is explicitly part of the scope of the next DSTU version of FHIR: to address the areas that C-CDA covers. Several Lantana employees have already been working with us on this, and I look forward to increased focus on this work.

Explaining the v3 II Type again

Several conversations in varying places make me think it’s worth explaining the v3 II type again, since there are some important subtleties about how it works that are often missed (though by now the essence of it is well understood by many implementers). The II data type is “An identifier that uniquely identifies a thing or object“, and it has two important properties:

  • root: A unique identifier that guarantees the global uniqueness of the instance identifier. The root alone may be the entire instance identifier.
  • extension: A character string as a unique identifier within the scope of the identifier root.

The combination of root and extension must be globally unique – in other words, if any system ever sees the same identifier elsewhere, it can know beyond any doubt that it is the same identifier. In practice, this fairly simple claim causes a few different issues:

  • How can you generate a globally unique identifier?
  • The difference between root only vs. root+extension trips people up
  • How do you know what the identifier is?
  • What if different roots are used for the same thing?
  • The difference between the identifier and the identified thing is an end-less source of confusion
  • How to handle incompletely known identifiers

How can you generate a globally unique identifier? 

At first glance, the rule that you have to generate a globally unique identifier that will never be used elsewhere is a bit daunting. How do you do that? However, in practice, it’s actually not that hard. The HL7 standard says that a root must be either a UUID, or an OID.


Wikipedia has this to say about UUIDs:

Anyone can create a UUID and use it to identify something with reasonable confidence that the same identifier will never be unintentionally created by anyone to identify something else. Information labeled with UUIDs can therefore be later combined into a single database without needing to resolve identifier (ID) conflicts.

UUIDs were originally used in the Apollo Network Computing System and later in the Open Software Foundation‘s (OSF) Distributed Computing Environment (DCE), and then in Microsoft Windows platforms as globally unique identifiers (GUIDs).

Most operating systems include a system facility to generate a UUID, or, alternatively, there are robust open source libraries available. Note that you can generate this with (more than) reasonable confidence that the same UUID will never be generated again – see Wikipedia or Google.

Note that UUIDs should be represented as uppercase (53C75E4C-B168-4E00-A238-8721D3579EA2), but lots of examples use lowercase (“53c75e4c-b168-4e00-a238-8721d3579ea2”). This has caused confusion in v3, because the specification didn’t say anything about it, so you should always use case-insensitive comparison. (Note: FHIR uses lowercase only, per the IETF UUID URN specification.)
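For instance, in Python (which, like most libraries, emits lowercase UUIDs), generating a root and comparing case-insensitively looks like this – a sketch, not anything normative:

```python
import uuid

root = str(uuid.uuid4())   # e.g. '53c75e4c-...' - Python emits lowercase

def same_id(a: str, b: str) -> bool:
    """v3 never pinned down the case, so always compare case-insensitively."""
    return a.lower() == b.lower()

# the uppercase and lowercase renderings are the same identifier
print(same_id(root, root.upper()))   # True
```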

For people for whom a one-in-10^50 chance of duplication is still too much, there’s an alternative method: use an OID.


Again, from Wikipedia:

In computing, an object identifier or OID is an identifier used to name an object (compare URN). Structurally, an OID consists of a node in a hierarchically-assigned namespace, formally defined using the ITU-T‘s ASN.1 standard, X.690. Successive numbers of the nodes, starting at the root of the tree, identify each node in the tree. Designers set up new nodes by registering them under the node’s registration authority.

The easiest way to illustrate this is to show how it works for my own OID, 1.2.36.146595217:

  • 1 – the ISO OID root
  • 2 – ISO issues each member (country) a number in this space
  • 36 – issued to Australia (see – you can use your Australian Company Number (ACN) in this OID space)
  • 146595217 – the Health Intersections ACN

All Australian companies automatically have their own OID, then. But if you aren’t an Australian company, you can ask HL7 or HL7 Australia to issue you an OID – HL7 charges $100 for this, but if you’re Australian (or you have reasons to use an OID in Australia), HL7 Australia will do it for free. Or you can get an OID from anyone else who’ll issue OIDs. In Canada, for instance, Canada Health Infoway issues them.

Within that space, I (as the company owner) can use it how I want. Let’s say I decide to use .1 for my clients, give each client a unique number, and then use .0 under that for their patient MRNs; then an MRN from client number X would be in the OID space 1.2.36.146595217.1.X.0.

So as long as each node within the tree of nodes is administered properly (each number only assigned/used once), there’ll never be any problems with uniqueness. (Actually, I think the chance of the OID space being properly administered is rather lower than 1 − 1/10^50, so UUIDs are probably better, in fact.)
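To make the layering concrete, here's a toy Python sketch of minting MRN spaces under such a root (the .1/.0 layout is the hypothetical scheme described above, not anything standardised):

```python
COMPANY_ROOT = "1.2.36.146595217"   # the ACN-derived OID from the table above

def client_mrn_root(client_no: int) -> str:
    """OID space for a given client's patient MRNs: <root>.1.<client>.0"""
    return f"{COMPANY_ROOT}.1.{client_no}.0"

print(client_mrn_root(3))   # 1.2.36.146595217.1.3.0
```

The uniqueness guarantee holds only as long as each node (here, the client number) is assigned exactly once – which is the administration burden the parenthetical above is sceptical about.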

Root vs Root + extension

In the II data type, you must have a root attribute (except in a special case – see the last point below). The extension is optional, but the combination of the two must be unique. So, for example, if the root is a UUID, and each object simply gets its own new UUID, then there’s no need for an extension:

<id root="53C75E4C-B168-4E00-A238-8721D3579EA2"/>

If, on the other hand, you’re going to use “53C75E4C-B168-4E00-A238-8721D3579EA2” to identify the set of objects, then you need to assign each object an extension:

<id root="53C75E4C-B168-4E00-A238-8721D3579EA2" extension="42"/>

You’ll need some other system for uniquely assigning the extension to an object. Customarily, a database primary key is used.

With OIDs, the same applies. If you assign a new OID to each object (as is usually done in DICOM systems for images), then all you need is a root:

<id root="1.2.840.113663.1100.16233472.1.911832595.119981123.1153052"/> <!-- picked at random from a sample DICOM image -->

If, on the other hand, you use the same OID for the set of objects, then you use an extension:

<id root="1.2.840.113663.1100.16233472.1.911832595.119981123" extension="1153052"/>

And you assign a new value for the extension for each object, again, probably using a database primary key.

Alert readers will now be wondering: what’s the difference between those two? The answer is that those two identifiers are *not* the same, even though they’re probably intended to be. When you use an OID, you have to decide: are sub-objects identified by new nodes on the OID, or by values in the extension? That decision can only be made once for the OID – and unhelpfully, only about 0.1% of OIDs actually say which in their registration (see below for discussion of registration). So why even allow both forms? Why not simply say that you can’t use an extension with an OID? Because not every extension is a simple number that can be an OID node. It’s common to see composite identifiers like “E34234” or “14P232344”, or simply MRNs with a leading 0, such as 004214 – none of these are valid OID node values.
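The point is easy to demonstrate in code – a sketch, using a hypothetical II model (the DICOM-style OID is the sample value from above):

```python
from typing import NamedTuple, Optional

class II(NamedTuple):
    """Hypothetical model of the v3 II type: root plus optional extension."""
    root: str
    extension: Optional[str] = None

a = II("1.2.840.113663.1100.16233472.1.911832595.119981123.1153052")
b = II("1.2.840.113663.1100.16233472.1.911832595.119981123", "1153052")

# Two distinct identifiers, even though they were probably intended
# to identify the same object
print(a == b)   # False
```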

Note for Australian readers: the Healthcare Identifiers OID requires that IHIs be represented as an OID node, not as an extension. Not everyone likes this (including me), but that’s how it is.

The notion that root + extension should be globally unique sounds simple, but I’ve found that it’s really tricky in practice for many implementers.

Identifier Type

When you look at an identifier, you have no idea what type of identifier it is:

<id root="53C75E4C-B168-4E00-A238-8721D3579EA2" extension="42"/>

What is this? You don’t know what kind of object it identifies, or what kind of identifier it is – you’ll have to determine that from something other than the identifier itself; the II is *only* the identifier, nothing else. However, it might be possible to look it up in a registry. For instance, the IHI OID is registered in the HL7 OID registry, and looking it up there will tell you what it is – but note two important caveats:

  • There’s no consistent definition of the type – it’s just narrative
  • There’s no machine API to the OID registry

For this reason, that’s a human-mediated process, not something done on the fly in an interface. I’d like to change that, but the infrastructure isn’t in place.

So, for a machine, there’s no way to determine what type of identifier this is just by reading it – the type has to be inferred from context, either from the content around the identifier, or from an implementation guide.

Different Roots for the same thing?

So what stops two different people assigning different roots to the same object? In this case, you’ll get two different identifiers:

<id root="53C75E4C-B168-4E00-A238-8721D3579EA2" extension="42"/>
<id root="" extension="42"/>

Even though these identify the same thing. What stops this happening?

Well, the answer is: not much. The point of registering identifiers is to discourage this – if the first person registers the identifier, and the second person looks for it before registering their own, then they should find it and use the same value. But if they’re lazy, or the description and/or the search is poor, then there’ll be duplicates. And, in fact, there are many duplicates in the OID registry, and fixing them (or even just preventing them) is beyond HL7.

Difference between identifier and identified thing

In theory, any time you see the same object, it should have the same identity, and any time you see the same identifier, it should be the same object. Unfortunately, however, life is far from that simple:

  • The original system might not store the identifier at all – it’s just assigned on the fly by middleware, so each time you see the same object, it gets a new identity. Personally, I think this is terrible, but it’s widely done.
  • The same identifier (e.g. lab report id) might be used for multiple different representations of the same report object
  • The same identifier (e.g. Patient MRN) might be used for multiple different objects that all represent the same real world thing
  • The same identifier (e.g. lab order number) might be used for multiple different real world entities that all connect to the same business process

R2 of the data types introduced a “scope” property so you can identify which of these an identifier is, but it can’t be used with CDA.

Incomplete identifiers

What if you know the value of the identifier, but you don’t know which identifier it actually is? This is not uncommon – here are two scenarios:

  • A patient registration system that knows the patient’s driver license number, but not which country/state issued it
  • A point of care testing device that knows what the patient bar-code is, but doesn’t know what institution it’s in

In both these cases, the root is not known by the information provider. So in this case, you mark the identifier as unknown, but provide the extension:

<id nullFlavor="UNK" extension="234234234"/>

Note: there’s been some controversy about this over the years, but this is in fact true, even if you read something else somewhere else.

Error in R2 Datatypes

Today, Rick Geimer and Austin Kreisler discovered a rather egregious error in the Data Types R2 specifications. The ISO data types say:

This specification defines the following extensions to the URN scheme:

hl7ii – a reference to an II value defined in this specification. The full syntax of the URN is urn:hl7ii:{root}[:{extension}] where {root} and {extension} (if present) are the values from the II that is being referenced. Full details of this protocol are defined in the HL7 Abstract Data Types Specification

Reference: Section of the ISO Data types

However, if you look through the Abstract Data Types R2 for information about this, you won’t find anything about hl7ii. But there is this, in the section on attachments:

The scheme hl7-att is used to make references to HL7 Attachments. HL7 attachments may be located in the instance itself as an attachment on the Message class, or in some wrapping entity such as a MIME package, or stored elsewhere.

The following rules are required to make the hl7-att scheme work:

  1. Attachments SHALL be globally uniquely identified. Attachment id is mandatory, and an ID SHALL never be re-used. Once assigned, an attachment id SHALL be associated with exactly one byte-stream as defined for
  2. When receiving an attachment, a receiver SHOULD store that attachment for later reference. A sender is not required to resend the same attachment if the attachment has already been sent.
  3. Attachment references SHALL be resolved against all stored attachments using the globally unique attachment identifier in the address.

(p.s. references are behind HL7 registration wall, but are free)

These are meant to be the same thing, but they are named differently: urn:hl7ii:… vs hl7-att:…

I guess this will have to be handled as a technical correction to one of the two specifications. I prefer urn:hl7ii:.
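For what it's worth, the urn:hl7ii syntax quoted above is trivial to produce – a sketch (the example root and extension values are made up, and note the spec doesn't say how to escape a ':' inside an extension):

```python
from typing import Optional

def hl7ii_urn(root: str, extension: Optional[str] = None) -> str:
    """urn:hl7ii:{root}[:{extension}], per the ISO data types text."""
    return f"urn:hl7ii:{root}:{extension}" if extension else f"urn:hl7ii:{root}"

print(hl7ii_urn("1.2.3.4", "42"))   # urn:hl7ii:1.2.3.4:42
```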

Comments welcome.

Question: HL7 Open source libraries


I work on an EHR, and I want to use HL7 to communicate with different systems in a medical environment. For that I found APIs (HAPI, nHapi, javasig, Everest), and I don’t know which is the best, or what the selection criteria should be.


Well, what are your selection criteria? The first question is whether you are doing HL7 v2 or v3 – almost certainly, it’s v2. And what language are you using (or do you want to use)?

Here’s some information about the libraries:

  • HAPI: an open-source, object-oriented HL7 2.x parser for Java – it also includes a number of other utilities for using HL7 v2
  • NHAPI: NHapi is a port of the original project HAPI. NHapi allows Microsoft .NET developers to easily use an HL7 2.x object model
  • Javasig: A library of code written to process V3 messages based on the MIF definitions
  • Everest: designed to ease the creation, formatting, and transmission of HL7v3 structures with remote systems. Supports CDA R2 and Canadian messages

My evaluations:

  • HAPI – a solid well established community, with runs on the board and reliable code. I’d be happy to use this
  • nHAPI – as far as I can tell, not well established, and given it’s for an HIE, I question the long term viability of the community. Unlike HAPI, I never hear of people actually using this
  • Javasig: this is dead – note that the only link I found was to a web archive. You’d have to be desperate to try to use it, though the maintainers might get interested if you did
  • Everest: this has a community of users in Canada, but I’ve not heard of any use outside there. I’m not sure to what degree Mohawk are supporting it (I’ll ask)

You should also consider one of the paid libraries – given the number of hours you’re going to invest in the work (thousands, I bet), a few thousand dollars for a software library and related tools is peanuts. There’s a lot of good choices there (btw, I have one of my own, which doesn’t cost that much).


CDA: What information from the Entries has to go in the narrative?

Most CDA implementation guides, and many tutorials – including some of each that I wrote myself – say something like this:

it is an absolute requirement of all CDA documents that the section narrative text completely capture all the information represented in the structured elements within the section

(emphasis added) This example is from Brian’s article I linked to in my previous post, but the same language exists all over the place.

Only, it’s not true. What it should say is: “the section narrative must capture all the clinically relevant information represented in the structured elements within the section”. Which raises the obvious question: what’s clinically relevant?

For comparison, this is the definition of the underlying field that includes the words:

Act.text SHOULD NOT be used for the sharing of computable information

That shouldn’t be understood to mean that you can’t put computable information in Act.text – rather, Act.text is meant for human consumption, and you put in it what’s relevant for a human.

What is “Clinically Relevant”?

The rest of this post gives a set of suggestions about what is clinically relevant in the entries. But it’s really important to understand that this is all just suggestions. The decision has to be made by some clinically aware person who knows what the data means – don’t let programmers do it (and I see a lot of documents where the programmers have done it, and what they did doesn’t make sense to a clinician).

First, I’ll look at the entries, and then I’ll make comments about data type rendering below that.

Entry – nothing to display in the narrative

Entry Acts (Observation, etc.) – each act should be shown in the narrative, except for RegionOfInterest, which is purely technical, and ObservationMedia, which is shown via a <renderMultiMedia> reference. Where possible, I prefer the narrative to be a table, with one row for each entry (you can even make that rule a schematron assertion, and it picks up lots of errors, believe me).
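That "one row per entry" rule is easy to approximate outside schematron too. Here's a Python sketch (my own, not the official assertion – it assumes a plain section in the urn:hl7-org:v3 namespace, and counts only <tbody> rows so a header row doesn't throw the count off):

```python
import xml.etree.ElementTree as ET

NS = {"v3": "urn:hl7-org:v3"}

def one_row_per_entry(section_xml: str) -> bool:
    """True if the section narrative has one table body row per <entry>."""
    section = ET.fromstring(section_xml)
    rows = section.findall(".//v3:tbody/v3:tr", NS)
    entries = section.findall("v3:entry", NS)
    return len(rows) == len(entries)

sample = """<section xmlns="urn:hl7-org:v3">
  <text><table><thead><tr><th>Procedure</th></tr></thead>
  <tbody><tr><td>one</td></tr><tr><td>two</td></tr></tbody></table></text>
  <entry/><entry/>
</section>"""
print(one_row_per_entry(sample))   # True
```

It's a blunt check, but as noted above, even a blunt check like this catches a surprising number of narrative/entry mismatches.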

classCode, moodCode, statusCode – it’s critical that these be in the narrative – but very often, they are implicit. For example, if the section title is “Past Procedures”, and all the entries are procedures with moodCode = EVN, then there’s nothing to say in the narrative (and don’t say it if you don’t have to – otherwise it’s just noise for already overwhelmed users). But if the section title is “Procedures, past and planned”, then you’ll need to show the mood for each procedure – and don’t just show the code for the moodCode; give it a meaningful, human-readable word (e.g. “done” or “planned” in this case).

effectiveTime – This is generally necessary to display. It’s rare that the timing of some event simply doesn’t matter – except for future activities that are known not to have happened (the effective time then is the time of booking, and whether to show this is a clinical decision based on the context of use)

id – generally I would recommend not showing these. But it’s a matter of clinical judgement when identifiers are likely to matter to someone reading the document (when they have to get on the phone and say, look, about diagnostic report XXXX, can you…)

code – generally, this needs to be shown

negationInd – this always needs to be shown if it’s negative. The only exception I could think of is if every entry in the section is negated, and the section title says “failed Medication Administrations” or something like that

text – any time an entry has text, I’d very much expect that text to be in the narrative – but it might not be quite the same. For instance, the text might be generated using a variation of the method by which the whole narrative was generated, so the information would be similar, but laid out differently. Alternatively, the text could be additional information, independent of everything else, and then it would be found verbatim in the narrative.

Other entry attributes – well, there are just too many to list here, and beyond the common elements above, there’s not much simple common guidance I can give. Some things, like Observation.value and SubstanceAdministration.dosage, are obvious things to put in the narrative, while other things, like Act.priorityCode, are unlikely to be in the narrative unless the act is still in the future. Most other things lie somewhere in between. Clinical judgement is always required.


  • Author: probably not worth showing
  • Informant: probably not worth showing? (Wouldn’t it be nice to know that there would always be a way to “unpeel the onion” when it is appropriate)
  • Subject: Always must be shown if it’s different to the section subject, but this is extremely rare. (which reminds me – if the section subject differs to the document subject, that absolutely must be shown in the section narrative)
  • Performer: usually worth showing (as a single piece of text, the name of the person, or the org if there’s no person)
  • Consumable/Product: Show the name and/or code of the DrugOrOtherMaterial
  • Specimen: really, I’m not sure. If it’s a diagnostic report, it’s not usually shown in the narrative (or if it is, it’s in the title or implied in the test code). I haven’t seen specimen in other contexts
  • Participant: it depends on what type of participant and why. Sometimes, this is audit trail type stuff, and you shouldn’t show it. Other times, it’s critical information


  • Reference – should be shown as a linkHtml if possible.
  • Precondition – don’t know.
  • ReferenceRange – usually shown, but the labs know what they want or have to do
  • Component (on organizer) – should always be shown


What to do with Entry Relationships all depends on what they say. These could be:

  • Entries in the narrative table in their own right (sometimes in place of the thing that contains them)
  • nested tables in the main table
  • a single line of text in the main table
  • ignored and not shown in the narrative

It’s a clinical judgement call. If you do show an entry relationship, pay attention to these elements:

  • typeCode / inversionInd – The meaning of the typeCode and inversionInd needs to be conveyed somehow. Very often, this is obvious in the context, and showing it is just noise. But sometimes it matters
  • sequenceNumber – any time this is present, it probably needs to be in the narrative
  • negationInd – see above

Data Type Guidance

  • nullFlavor: if the data type has a nullFlavor, then show some abbreviation for the nullFlavor. I use “–” for NI, “n/a” for NA, “unk” for UNK, or otherwise the code itself (the first three are the common ones). If the data type has additional properties as well as a nullFlavor, then generally these will be type/scope information, and something smart is required – e.g. <tel nullFlavor="UNK" use="H"/> should be shown as “Home phone number: unk”.
  • BL / INT / REAL / ST: just show the value directly
  • ED: These shouldn’t be rendered directly. You should use a linkHtml or renderMultimedia to refer to this indirectly
  • CD/CE/CV: if you have original text, then show that. If not, it’ll have to be a display. In a few circumstances, where humans use the codes directly (billing codes!) it’s worth showing the codes directly
  • CS: if you have to show a CS, you should know why, and you should have a representation table that translates the code to something good for a human
  • II: if you have to show one of these (try and avoid it), and it has an extension, show the extension. If it’s just a UUID – don’t show it. Humans hate UUIDs. Just like OIDs
  • TEL: I use a template like this: [use] [type] details, where use is a lookup from the use code, if there is one, type is phone or fax depending on the scheme in the value, and details is the rest of the value. If it’s a web URL, I’ll drop the [type] and show the value as a linkHtml. But that doesn’t always suit (host systems might not know what to do with a mailto: url in a linkHtml)
  • EN: If you have to render this, use [use] [literal] where use is from the use code(s) and literal is the literal form defined in the abstract data types (the definition is obtuse, but the format is usually pretty good)
  • AD: same as EN
  • PQ: render as [value] [unit]. If you have a different human display unit to the UCUM unit, you can use that (or translate the UCUM unit to a human display – for instance, for the UCUM unit ug, some people prefer µg and some mandate mcg)
  • RTO: Use the literal form (though use human units if they’re available)
  • TS: Depends on the culture. In Australia, we mandated dd-mmm-yyyy hh:nn:[ss][T] (no milliseconds). Other cultures will have their own rules
  • IVL<>: I use the literal form
  • GTS: good luck with this one… hopefully you have some system source text description as well as a GTS. I guess there’s always the literal form.
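To make a few of the rules above concrete, here’s a small Python sketch covering nullFlavor abbreviations, PQ as [value] [unit] with optional human display units, and TS as dd-mmm-yyyy. The function names and the display-unit map are invented for illustration:

```python
# Sketch of three of the data type rendering rules described above.
# Real CDA rendering needs to handle far more cases than this.
NULL_FLAVOR_LABELS = {"NI": "--", "NA": "n/a", "UNK": "unk"}

def render_null_flavor(code: str) -> str:
    # The three common flavors get abbreviations; otherwise show the code
    return NULL_FLAVOR_LABELS.get(code, code)

def render_pq(value: str, unit: str, display_units: dict = None) -> str:
    # [value] [unit], optionally translating UCUM to a human display unit
    display_units = display_units or {}
    return f"{value} {display_units.get(unit, unit)}"

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def render_ts(ts: str) -> str:
    # HL7 TS literal form starts yyyymmdd; render the date portion only
    year, month, day = ts[0:4], int(ts[4:6]), ts[6:8]
    return f"{day}-{MONTHS[month - 1]}-{year}"

print(render_null_flavor("UNK"))            # unk
print(render_pq("5", "ug", {"ug": "mcg"}))  # 5 mcg
print(render_ts("20140703120000"))          # 03-Jul-2014
```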


Question: #FHIR, complexity, and modeling


HL7 V3 is known for increasing complexity up to the point where people give up. The RIM seems not adequate for modeling the world of clinical information (see Barry Smith’s critique).

Is FHIR meant to be a cure? I understand that FHIR is about using a RESTful architectural style for the communication of clinical resources, but the resources themselves need to be modeled appropriately. Complexity is not going to go away. Thus, FHIR appears to be another way to slice the elephant, the elephant being the world of clinical information and the need for semantic interoperability. Is there a promise of better modeling of resources in FHIR?


Well, this is not an easy question. The complexity we struggle with comes from several different sources:

  1. The inherent complexity in medicine, and the underlying biological processes
  2. The necessity to provide healthcare in the absence of clear information, and at the limits of human knowledge
  3. The wide variability in the processes around providing healthcare – including education and billing, both within and between countries
  4. Attempts to systematise information about all this to support computable information

Clearly, in FHIR, we could only attempt to reduce the complexity of the last point, and we have attempted to. We’ve also tried to limit the impact of the 3rd point in the way we handle extensions – we have to handle that complexity, but we want to keep it in the right place. Still, this only deals with points 3 and 4.

These are the things we do to try to manage complexity properly:

  • We have a ruthless focus on what systems already do, partly because this is a yardstick of the complexity that people know how to make work
  • We have a strong focus on testing, testing, and more testing – we don’t iterate our designs in the absence of real world implementation experience
  • We use the language of the domain in our resource designs, rather than a standardised language (e.g. RIM) so that people who actually use it can understand what they see quickly
  • We map to other representations, including underlying ontologies and grammars, to ensure that the design is as well grounded in formal definitions and logic as possible (perhaps I should say that we aspire to do this; I think we have a long way to go yet)
  • We maintain our own ontology around how resources should be structured and broken up

Having said that, I don’t think any of this is a magic bullet, and FHIR is not trying to define a coherent information framework that allows for semantic computing – we’re simply defining pragmatic exchange formats based on today’s reality. I think that some of the problems that the RIM and ontologies are trying to solve, we’ve just punted them into the future in order to pursue a more practical course in the shorter term. For that reason, I try not to use the word “model” in association with FHIR, because we’re not “modeling healthcare” in the same sense that the RIM is trying to do, or that Barry is criticising it for.

BTW, having finally linked to Barry’s work from my blog, I think that Barry is largely mistaken in his criticisms of the RIM. It’s not that I think it’s beyond criticism (I have a number of criticisms of my own, including this, though I’m not such a fan that I’d pose with it), but I think he misunderstands its intent, what it does do and what it’s not good for, and the boundaries within which it works.

I actually think that modeling healthcare properly is beyond our abilities at this time – anything simple enough for normal people to deal with is too simple to express the things specialists know that computers need to say.

I expect a vigorous debate in the comments on this post… but anyone who comments, please keep my 3 laws of interoperability (It’s all about the people, you can’t get rid of complexity, and you can’t have it all) and my note about semantic interoperability (and #2) in mind.

HL7 Standards and rules for handling errors

It’s a pretty common question:

What’s the required behavior if a (message | document | resource) is not valid? Do you have to validate it? What are you supposed to do?

That question can – and has been – asked about HL7 v2 messaging, v3 messaging, CDA documents, and now the FHIR API.

The answer is that HL7 itself doesn’t say. There are several reasons for that:

  • HL7’s main focus is to define what is and isn’t valid, and how implementation guides and trading partners can define what is and isn’t valid in their contexts
  • HL7 generally doesn’t even say what your obligations are when the content is valid either – that’s nearly always delegated to implementation guides and trading partner agreements (such as, say, XDS, but I can’t even remember XDS making many rules about this)
  • We often discuss this – the problem is that there’s no right rule around what action to take. A system is allowed to choose to accept invalid content, and most choose to do that (some won’t report an error no matter what you send them). Others, on the other hand, reject the content outright. All that HL7 says is that you can choose to reject the content
  • In fact, you’re allowed to reject the content even if it’s valid with regard to the specification, because of other reasons (e.g. ward not known trying to admit a patient)
  • We believe in Postel’s law:

Be conservative in what you do, be liberal in what you accept from others

  • HL7 doesn’t know what you can accept – so it doesn’t try to make rules about that.

So what’s good advice?


I don’t think there’s any single piece of advice on this. Whether you should validate instances in practice depends on whether you can deal with the consequences of validation failure, and whether you can deal with the consequences of not validating. Here’s an incident to illustrate the point:

We set up our new (HL7 v2) ADT feed to validate all incoming messages, and tested the interface thoroughly during pre-production. There were no problems, and it all looked good. However, as soon as we put the system into production mode, we started getting messages rejected because the clerical staff were inputting data that was not valid. On investigation we found that the testing had used a set of data based on the formally documented practices of the institution, but the clerical staff had informal policies around date entry that we didn’t test. Rejected messages left the applications out of sync, and caused much worse issues. Rather than try to change the clerical staff, we ended up turning validation off

That was a real case. The question here is, what happens to the illegal dates now that you no longer reject them? If you hadn’t been validating, what would have happened? In fact, you have to validate the data; the only question is, do you validate everything up-front, or only what you need as you handle it?
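The difference between the two approaches can be sketched like this, with an invented message shape and a date rule like the one in the story:

```python
# Two validation strategies for an inbound message. Up-front validation
# rejects the whole message on any error; on-use validation only fails
# when a process actually touches the bad field. The message shape and
# field names here are invented for illustration.
from datetime import datetime

def parse_date(value: str) -> datetime:
    return datetime.strptime(value, "%Y%m%d")

def validate_up_front(message: dict) -> list:
    errors = []
    for field in ("patient_id", "admit_date"):
        if field not in message:
            errors.append(f"missing {field}")
    if "admit_date" in message:
        try:
            parse_date(message["admit_date"])
        except ValueError:
            errors.append(f"invalid admit_date: {message['admit_date']!r}")
    return errors  # any errors => reject the whole message

def admit_patient(message: dict):
    # On-use: this raises only if *this* process needs the bad field
    return message["patient_id"], parse_date(message["admit_date"])

msg = {"patient_id": "123", "admit_date": "12/03/2010"}  # informal format
print(validate_up_front(msg))  # ["invalid admit_date: '12/03/2010'"]
```

Up-front validation catches the bad date at the door – and leaves the applications out of sync, as in the story. On-use validation lets the message in, and defers the failure to whatever process eventually needs the field.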

Now consider the Australian PCEHR. It’s an XDS based system, and every submitted document is subjected to the full array of schema and schematron validation that we are able to devise. We do this because downstream processing of the documents – which happens to a limited degree (at the moment) – cannot be proven safe if the documents might be invalid. And we continuously add further validation around identified safety issues (at least, where we can, though many of the safety issues are not things that automated checks can do anything about).

But it has its problems too – because of Australian privacy laws, it’s really very difficult for vendors, let alone 3rd parties, to investigate incidents on site in production systems. The PCEHR has additional rules built around privacy and security which make it tougher (e.g. accidentally sharing the patient identifier with someone who is not providing healthcare services for the patient is a criminal offence). So in practice, when a document is rejected by the PCEHR, it’s the user’s problem. And the end-user has no idea what the problem is, or what to do about it (schematron errors are hard enough for programmers…).

So validation is a vexed question with no right answer. You have to do it to a degree, but you (or your users) will suffer for it too.

Handling Errors

You have to be able to reject content. You might choose to handle failed content in line (let the sender know) or out of line (put it in a queue for a system administrator). Both actions are thoroughly wrong and unsafe in their own ways. And the unsafest thing about either is that they’ll both be ignored in practice – just another process failure in the degenerate process called “healthcare”.

When you reject content, provide as specific and verbose a message as you can, loaded with context, details, paths, reasons, etc – that’s for the person who debugs it. And also provide a human readable version for the users, something they could use to describe the problem to a patient (or even a manager).
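In code, that means every rejection carries both messages. A sketch – the structure here is invented for illustration (in FHIR, OperationOutcome plays this role):

```python
# A rejection report carrying both a verbose technical message and a
# human-readable one. The field names and example content are invented.
from dataclasses import dataclass

@dataclass
class RejectionReport:
    path: str    # where in the content the problem is (for the debugger)
    detail: str  # verbose and specific (for the debugger)
    human: str   # plain language (for the end user)

report = RejectionReport(
    path="/ClinicalDocument/component/.../observation/value",
    detail="PQ unit 'mgs' is not a valid UCUM unit (did you mean 'mg'?)",
    human="A measurement in this document uses an unrecognised unit, so "
          "the document could not be accepted. Please contact the sender.",
)
print(report.detail)
```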

If you administer systems: it’s really good to be right on top of this and follow up every error, because they’re all serious – but my experience is that administrative teams are swamped by a stream of messages where the signal to noise ratio is low, and the real problems are beyond addressing anyway.