Monthly Archives: July 2012

The standards process

A quote from Tim Bray:

Standards-making is a boring, bureaucratic, unpleasant process, infested by difficult people and psychopathic institutions. Some “standards” turn out to be useful; most are ignored; some are actively harmful. And looking at which organization the standard came from turns out to not be very useful in predicting what’s going to happen.

That’s the process I know – it’s a good summary. And he’s not talking about the same standards bodies I work with – but people are the same everywhere.

PBS Codes in CDA Documents: codeSystemVersion

Many of the CDA documents that will be part of the overall Australian National Electronic Health Record eco-system include medication information (including dispensing information in some cases). Many of the source systems use PBS codes to represent the medications.

For my non-Australian readers, PBS codes are the billing codes for medications used by the national prescribing funding system here in Australia. You can download the definitions here:

When represented in a CDA document, a PBS code looks like this:

<code code="1471K" codeSystem="" codeSystemVersion="??" displayName="…"/>

What to put for codeSystemVersion? The answer is the date of release (as published at the downloads), in the format YYYYMMDD. As of today, the release is 1st July 2012, so the codeSystemVersion for this is 20120701. However, there’s no need to put the codeSystemVersion in the CDA document – in the long run, it only complicates matters, and the few use cases that it does support are very advanced ones which are beyond current practice here in Australia.
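If you do decide to populate codeSystemVersion, the formatting is trivial. A minimal sketch, where the function name is made up for illustration:

```python
from datetime import date


def pbs_code_system_version(release_date: date) -> str:
    """Format a PBS schedule release date as YYYYMMDD."""
    return release_date.strftime("%Y%m%d")


# The 1st July 2012 release mentioned above.
print(pbs_code_system_version(date(2012, 7, 1)))  # 20120701
```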

PBS Code Version Management

Several implementers have raised the prospect of PBS re-using old codes. If this were to happen, ongoing use of PBS codes in CDA documents would be unsafe, and there would be no practical workaround that would make them safe. Indeed, PBS has re-used old codes (from before 1987) in the past as a temporary workaround for running out of code space. However, the PBS team have advised me that this is no longer acceptable practice, and that when they next run out of codes – estimated to be in Dec 2012/Jan 2013 – they’ll start using longer codes.

Conclusion: there isn’t going to be any re-use of old PBS codes any more.


Identifiers in CDA Documents – Reporting Tool

This post is prompted by the intersection of two issues:

  • Conclusions from quality checking CDA documents in the Australian National EHR Program
  • A series of questions I took privately about how identifiers work in Consolidated CDA

This post explains how identifiers are supposed to work in a CDA document, and introduces a reporting tool to help implementers assess the quality of identifier usage in a CDA document.

@ID attribute

The first kind of identifier in a CDA document is the “ID” attribute that can appear in the following places:

  • Section
  • ObservationMedia
  • RegionOfInterest

These are added so they can be the target of a linkHtml or renderMultiMedia, i.e. the narrative element <linkHtml href="#a1"> points to the section <section ID="a1">. Note that the attribute is defined as an xml:id, and values of xml:id must be unique within the XML document that contains them. This makes it difficult to combine several CDA documents into a single XML container (an Atom feed, for instance), and impossible if they are signed.

The ID attribute also exists on most of the narrative elements, so that they can be the target of an originalText reference in a CD data type, to indicate that the source of this code is this particular text. This is a very advanced usage. It would also be possible, using this method, to make any narrative element the target of a linkHtml reference, but to my knowledge the CDA specification doesn’t say whether this is legal (I think it’s intended that it’s not).

The ID attribute is also used for references from footnoteRefs to footnotes.

These are the only allowed uses of xml:id attributes in a CDA document. It can’t be used to indicate that this [thing] here is the same as that [thing] over there (i.e. this section and that section share the same author). To do that, you have to use logical object references using the id element.
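Since the ID values must be unique within the XML document, a quick check for clashes is easy to write. The following is only a sketch using Python's standard library; the function name and the sample document are made up for illustration:

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import List


def duplicate_xml_ids(xml_text: str) -> List[str]:
    """Return the ID attribute values that occur more than once."""
    root = ET.fromstring(xml_text)
    # Unprefixed attributes carry no namespace, so a plain "ID" lookup
    # works even when the elements are in the HL7 v3 namespace.
    counts = Counter(el.attrib["ID"] for el in root.iter() if "ID" in el.attrib)
    return sorted(value for value, n in counts.items() if n > 1)


sample = (
    '<ClinicalDocument xmlns="urn:hl7-org:v3">'
    '<section ID="a1"/><section ID="a2"/><observationMedia ID="a1"/>'
    "</ClinicalDocument>"
)
print(duplicate_xml_ids(sample))  # ['a1']
```

The same check run over a container that holds two CDA documents shows why combining them is awkward: each document is fine on its own, but the combined XML is not.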

id element

Many elements in the CDA document have a child “id” element that serves to identify the class that contains them. Technically, this is the RIM classes Entity, Role, and Act, which generally are allowed to carry one or more identifiers in the id element.

Note that this means that some CDA elements have both a child element “id” and an attribute “ID”. Some tools struggle with this. I would’ve thought that such tools were long fixed – it’s not that uncommon to have duplicate names between attributes and elements, since it’s not wrong, but I’ve found a few dev tools that don’t cope with this in the last 12 months. The only solution is to get back to the maintainer of the tools, and screech loudly at them till they fix their tool.

The id element has two important attributes: root and extension. The root has to be either an OID or a UUID, and an extension – any string not including whitespace – may be present. The identifier (either the root alone if there is no extension, or the root+extension) must be globally unique.
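The syntactic part of these rules can be checked mechanically (global uniqueness can't). A rough sketch follows; the regular expressions are my own approximation of OID and UUID syntax, not copied from the CDA schema, and the function name is hypothetical:

```python
import re
from typing import List, Optional

# Approximate patterns (assumptions, not taken from the CDA schema).
OID_RE = re.compile(r"[0-2](\.(0|[1-9][0-9]*))+")
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def check_id(root: str, extension: Optional[str] = None) -> List[str]:
    """Return a list of syntax problems found in an id (empty if none)."""
    problems = []
    if not (OID_RE.fullmatch(root) or UUID_RE.fullmatch(root)):
        problems.append("root is neither an OID nor a UUID")
    if extension is not None and (extension == "" or re.search(r"\s", extension)):
        problems.append("extension is empty or contains whitespace")
    return problems


print(check_id("2.16.840.1.113883.19.1", "45235"))   # []
print(check_id("2.16.840.1.113883.19.1", "45 235"))  # one problem reported
```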

I’ve found that this root with optional extension business is a lot harder to grasp than it sounds, partly because OIDs have an internal root/extension structure, and so it’s really unclear whether your leaf concept should be in the root or the extension. Say, for example, you have a medical record number, a six digit number, and you assign an OID for it, 2.16.840.1.113883.19.1. Should you represent your MRNs as

<id root="2.16.840.1.113883.19.1.45235"/>

or as

<id root="2.16.840.1.113883.19.1" extension="45235"/>

Generally, I prefer the second (it allows leading zeros, alpha characters if they become required, and is easier to pull out just the MRN), but both forms are valid, and the decision rests with the person who first registers the OID. And if you look in the OID registry, registered OIDs rarely explain which form is correct.
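To make the difference concrete, here is a sketch of what a receiver has to do to pull the MRN out of either form. The function name is hypothetical and the OID is the example one from above:

```python
from typing import Optional


def mrn_from_id(root: str, extension: Optional[str] = None,
                mrn_oid: str = "2.16.840.1.113883.19.1") -> Optional[str]:
    """Return the MRN carried by an id, or None if it is not an MRN."""
    if root == mrn_oid:
        # Second form: the OID names the MRN space, the extension is the MRN.
        return extension
    prefix = mrn_oid + "."
    if extension is None and root.startswith(prefix):
        # First form: the MRN is the final arc of the OID, so digits only.
        leaf = root[len(prefix):]
        if leaf.isdigit():
            return leaf
    return None


print(mrn_from_id("2.16.840.1.113883.19.1.45235"))      # 45235
print(mrn_from_id("2.16.840.1.113883.19.1", "045235"))  # 045235
```

Note that the receiver has to know in advance which form the OID's owner chose, which is exactly the information the registry rarely provides.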

Another confusing thing is whether an extension is allowed or required if the root is a UUID. It’s allowed – and whether it’s required depends on where the unique part comes from. If I’m going to use a stream of unique numbers to actually make the value unique, and I’m just using the UUID to provide a globally unique space for them, then there’ll be an extension:

<id root="655f67b1-2b11-4038-b82f-f6ab2f566f87" extension="1234"/>

As a rule of thumb, if the UUID is registered in the HL7 OID registry (as 655F67B1-2B11-4038-B82F-F6AB2F566F87 is) then you need an extension for the actual unique part. (Note that the UUID is supposed to be represented in lowercase even though the schema doesn’t say so – and irrespective of what case is registered in the registry).
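Because the same UUID can turn up in either case, it pays to normalise the root before comparing identifiers. A minimal sketch (the function name is made up): lowercasing the root is harmless for OIDs, which contain only digits and dots, and the extension is left alone on the assumption that it is case sensitive.

```python
from typing import Optional, Tuple


def identifier_key(root: str, extension: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """Comparison key for an id: root lowercased, extension untouched."""
    return (root.lower(), extension)


a = identifier_key("655F67B1-2B11-4038-B82F-F6AB2F566F87", "1234")
b = identifier_key("655f67b1-2b11-4038-b82f-f6ab2f566f87", "1234")
print(a == b)  # True
```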

For any Australian readers, if you aren’t sure about this: consult the new Australian handbook on representing identifiers, see my earlier blog post, or ask me.

Unique Identifiers

The fact that identifiers are required to be unique means two things:

  • The identifier uses a properly allocated OID, or a generated UUID, so that no one else would accidentally use it. This sounds hard, but it’s actually relatively easy: generate a GUID (Ctrl-Alt-G in most IDEs), or just register an OID at the HL7 OID registry, but register it carefully, at a fine enough scope that it is what you want to use.
  • You have to use it in a disciplined fashion, so that you only use it for one thing.

The second part turns out to be harder than it sounds. The problem is that there’s no tool to alert you when you copy paste an identifier from one part of the document to another (or from one part of your code to another). I see too many documents that contain duplicate identifiers – that is, the same identifier is used on different elements that represent different objects.

One of my correspondents asked why we don’t simply make a rule that you can’t have duplicated identifiers in a CDA document, like we have with the ID attribute. This would prevent accidental or lazy use of the same identifier again – but it’s not possible, because there are valid cases for using the same identifier more than once.

Identifiers are not unique in a document

This occurs when the same concept can appear multiple times in the document. For example:

  • When the same template is used multiple times
  • When the same person is both author and legalAuthenticator
  • When the same organisation employs all the personnel and scopes the patient for the document

So these are all common cases. Other than the template id, a natural question that arises is about the relationship between two instances of the same object in the same document. Take, for example, this fragment of an author from an Australian CDA example:

  <id root="7FCB0EC4-0CD0-11E0-9DFC-8F50DFD72085" />
  <id root="" />
  <addr use="WP">
   <streetAddressLine>1 Clinician Street</streetAddressLine>
  </addr>
  <telecom use="WP" value="tel:0712341234" />

Note to alert Australian readers: yes, I moved HPI-I from its normal place, since this is for international readers.

This author has two identifiers: what we might call a technical identifier (the UUID), and the real-world identifier, which is the number by which the author is registered with the national authority. That’s an arbitrary distinction that’s not made in the document itself – the only way to know this is to consult the definitions of the identifiers.

For most documents, the author is also the legal authenticator, so we’re going to repeat all the same information there too:

  <id root="7FCB0EC4-0CD0-11E0-9DFC-8F50DFD72085" />
  <id root="" />
  <addr use="WP">
   <streetAddressLine>1 Clinician Street</streetAddressLine>
  </addr>
  <telecom use="WP" value="tel:0712341234" />

Note that the element is different, but everything else is the same. However, you could argue that this is redundant – we already provided all the information about the person the first time, and the second time, all we need to do is provide an identifier:

  <id root="7FCB0EC4-0CD0-11E0-9DFC-8F50DFD72085" />


On reaching the second case, you go and resolve the first identifier, know that this is referring to the same actual object as the first case, and fill in all the details accordingly. However this is complicated by the fact that in some cases where you can do this, the kind of information you can represent is different in each case (author and custodian, for instance), so you mightn’t be able to provide all the details in the first instance. So what do you do if the second case contains different details from the first? Is that an accident, or the correct way to represent it? Unfortunately, the only way to know is to examine the details on each instance, and reason from the underlying RIM classes – there’s no easy rule of thumb.

One notion that this section suggests is that you can extract these RIM entities, roles and acts out to a persistent data store, and use the identifiers to trace the objects across various documents as you see them. This should be safe, after all, because the identifiers are unique. Only, not so much.

Re-using identifiers between documents

Firstly, there’s no guarantee that a given object will have the same identifier across different CDA documents from the same source. Commonly, CDA documents are generated from some intermediary XML or v2 object that doesn’t have the underlying identifiers in it, even if they exist in the original source. In these cases, the objects may acquire a transient identifier that is used multiple times within each document, but is not maintained across the documents. It’s very difficult to consistently identify an object across documents in this case.

Another problem is that some identifiers actually identify the business process that the object represents, and may end up being attached to multiple different objects that all relate to the same real-world process. Lab Order Ids are a classic case here – they’ll be associated with the object that identifies their acknowledgement response to a request for tests, and to the results that represent the outcomes of the request. Driver’s licenses are another example – they’re used to identify multiple different objects that represent the same person (usually from different institutions).

The upshot of this is that even when done well by the author, you can’t simply rely on the identifiers behaving in any particular way.

Reporting Tool

But very often, identifiers aren’t done well. And there’s no conformance tooling that can automatically figure out whether identifiers are being done properly in a document. So I’ve created a little reporting service that takes a CDA document, scans all the identifiers in it, and produces a report that helps visualise the identifiers, and see whether they are being used properly. We’ll be using it in the Australian national program to help check that a document has good identifiers in it. Feel free to use it in other contexts, and I’d welcome suggestions for how to make it more useful (and crash reports for how to break it).
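This is not the reporting service itself, just a rough sketch of the kind of scan such a report might start from: collect every id element, key it by root (lowercased) and extension, and list the parent elements it appears on. The function names and the sample document are made up for illustration. An identifier that shows up on more than one kind of parent is not necessarily wrong (author and legalAuthenticator being the same person is the obvious case), so the result is a list for a human to review, not a list of errors.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def identifier_report(xml_text: str) -> dict:
    """Map (root, extension) to the names of the elements that carry it."""
    usage = defaultdict(list)
    for parent in ET.fromstring(xml_text).iter():
        for child in parent:
            # templateId and similar elements are skipped: only 'id' counts.
            if local_name(child.tag) == "id" and "root" in child.attrib:
                key = (child.attrib["root"].lower(), child.attrib.get("extension"))
                usage[key].append(local_name(parent.tag))
    return dict(usage)


def needs_review(report: dict) -> dict:
    """Identifiers used on more than one kind of parent element."""
    return {key: uses for key, uses in report.items() if len(set(uses)) > 1}


SAMPLE = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <author><assignedAuthor>
    <id root="7FCB0EC4-0CD0-11E0-9DFC-8F50DFD72085"/>
  </assignedAuthor></author>
  <legalAuthenticator><assignedEntity>
    <id root="7fcb0ec4-0cd0-11e0-9dfc-8f50dfd72085"/>
  </assignedEntity></legalAuthenticator>
  <component><section>
    <id root="2.16.840.1.113883.19.1" extension="45235"/>
  </section></component>
</ClinicalDocument>"""

for key, uses in needs_review(identifier_report(SAMPLE)).items():
    print(key, uses)
```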

Follow this link, paste your CDA document into the page, click the button, and then read the report… all the steps up to the last one are real easy. Good luck and happy CDA writing/reading…

Question: How to populate OBX-23?


I am working on creating a reportable labs HL7 message and am having great difficulty finding information on how to complete the OBX-23 component of the message.


OBX-23 is the Performing Organization Name of the laboratory. It has the following fields (v2.6):

ID  Name                         Data Type                                Table
1   Organization Name            ST: String Data
2   Organization Name Type Code  IS: Coded Value for User-Defined Tables  Organizational name type
3   ID Number                    NM: Numeric
4   Identifier Check Digit       NM: Numeric
5   Check Digit Scheme           ID: Coded Value for HL7 Defined Tables   Check digit scheme
6   Assigning Authority          HD: Hierarchic Designator                Assigning authority
7   Identifier Type Code         ID: Coded Value for HL7 Defined Tables   Identifier type
8   Assigning Facility           HD: Hierarchic Designator
9   Name Representation Code     ID: Coded Value for HL7 Defined Tables   Name/address representation
10  Organization Identifier      ST: String Data

There’s some somewhat confusing stuff here, a fusion of naming and identifying an organization. Here are my notes about the components of this data type:

  1. Sounds simple – the name of the organisation. But organisation names are slippery beasts, and subject to change. So there’s a lot more information to add to deal with the ins and outs of this.
  2. The type code can be Alias, Display, Legal, or a stock exchange code (not sure what the use case for that is). This isn’t that useful in OBX-23 where you can only have one name. Most likely, the system doesn’t know what kind of name the user entered, so it can’t populate a value. And you should only put a value in here if you *know* what the right value is.
  3. Components 3-8 match the equivalent components 1-6 of the CX data type, and represent *one* identifier that identifies the organization, i.e. you can have one name and one identifier. But see the notes about component 10 – this component should not be used.
  4. Identifier check digits should not be used.
  5. Identifier check digits should not be used.
  6. This is the identity of the authority that issued the identifier in component 10. Typically, you have either a public identifier (preferred) or a private key there. This component scopes the identifier by providing a code (in sub-component 1) or a more formal identifier in sub-component 2. Sub-component 3 provides more information about sub-component 2. It’s hard to provide general advice about the HD data type. If you are using an identifier issued by a public authority, there should be some advice somewhere about how to properly identify a particular authority. I say *should* because I have no idea where to look outside Australia (and here in Australia, my earlier blog post about this eventually developed into a handbook to be published soon by Standards Australia).
  7. The type of the identifier – if known. Only a few of the values in the table are applicable to organisations, such as NII, and the ubiquitous but uninformative default “XX” (Organization identifier).
  8. Where (place or location) the identifier was assigned. Usually this is not known, and mostly this is irrelevant, and I’d advise against using it.
  9. Don’t use outside Japan.
  10. This component replaces component 3. I don’t recall the discussion around this component, but the only apparent change is the type – that component 3 should have been ST, not NM, and this replaces component 3 with a type of ST, to better align with CX.

So, summary: org name in component 1. Only populate 2 if you’re sure. If you have an identifier, put it in component 10, and then populate 6 with a scope for that identifier. Populate 7 if you’re sure. Leave everything else blank.
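That summary can be sketched as a small helper that assembles the field value. This is an illustration under stated assumptions, not a reference encoder: the function names are made up, the default delimiters (^ and &) are assumed, and the assigning authority is assumed to be identified by an OID, which goes in sub-component 2 of the HD with “ISO” as its type.

```python
def hl7_escape(value: str) -> str:
    """Escape the default HL7 v2 delimiters; the escape character goes first."""
    for char, seq in (("\\", "\\E\\"), ("|", "\\F\\"), ("^", "\\S\\"),
                      ("&", "\\T\\"), ("~", "\\R\\")):
        value = value.replace(char, seq)
    return value


def build_obx23(name, identifier=None, authority_oid=None,
                id_type=None, name_type=None) -> str:
    """Assemble an OBX-23 (XON) value following the summary above."""
    components = [""] * 10
    components[0] = hl7_escape(name)               # 1: organization name
    if name_type:
        components[1] = hl7_escape(name_type)      # 2: only if you're sure
    if identifier:
        components[9] = hl7_escape(identifier)     # 10: the identifier itself
        if authority_oid:
            # 6: HD scoping the identifier (empty namespace, OID, type ISO)
            components[5] = "&" + hl7_escape(authority_oid) + "&ISO"
        if id_type:
            components[6] = hl7_escape(id_type)    # 7: only if you're sure
    # Everything else stays blank; drop trailing empty components.
    return "^".join(components).rstrip("^")


print(build_obx23("ACME Pathology", "12345", "2.16.840.1.113883.19.1", "XX"))
# ACME Pathology^^^^^&2.16.840.1.113883.19.1&ISO^XX^^^12345
```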


Question: Where do lab result interpretations go?


In an ORU^R01 message, where do you usually put the interpretation of lab result?


Well, that depends on what kind of interpretation you are talking about. Firstly, lab results are a combination of data and interpretations, and there is very often no clear division in the way the data should be understood, let alone the report (aside: doctors and formal modelers continue in the fantasy that lab results are data, not interpretation, but this is not the case. Interpretation starts in the lab before any numbers are released, though the main high-volume tests just go straight out).

So “Interpretation” can be one of the following things:

  • A flag appended to the numerical result (H, L, etc)
  • A comment provided in place of the result
  • An additional comment appended to the result
  • A whole set of pages of content which is interpretation as narrative, that may include some original data

Accordingly, HL7 provides several candidate locations for placing these interpretations:

  • OBX-8: Abnormal Flags: “a table lookup indicating the normalcy status of the result”
  • An NTE after the OBX
  • A separate OBX, or set of OBXs

I very much recommend against using the NTE approach – they are used in some areas around the world, but they just don’t carry semantics. So it really comes down to this: use OBX-8 for flags, and put other interpretations in an OBX of their own. Really, the key to this is a combination of OBX-3 and OBX-4. OBX-3 is the code that defines what the type of OBX is – it might be one of the LOINC codes for comment, such as 8251-1, or a local code. The problem with interpreting lab results mostly comes down to interpreting the OBX-3 codes correctly. OBX-4 may help, in that it can indicate that a particular OBX segment is a child segment of another, but there’s no real consistency in the syntax for using OBX-4, nor in quite what being a child implies.
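From the receiving side, the recommended layout can be read back with something like the sketch below. It is deliberately naive: it assumes the default | and ^ delimiters, does no escape handling, and the sample segments (including the local potassium code and the display text) are invented for illustration.

```python
# Codes that mark an OBX as a comment; 8251-1 is the LOINC code mentioned
# above. Local comment codes would need to be added here.
COMMENT_CODES = {"8251-1"}


def split_interpretations(obx_segments):
    """Return (flags, comments) found in a list of OBX segment strings."""
    flags, comments = [], []
    for segment in obx_segments:
        fields = segment.split("|")
        if fields[0] != "OBX":
            continue
        fields += [""] * (12 - len(fields))   # pad short segments
        code = fields[3].split("^")[0]        # OBX-3, first component
        if code in COMMENT_CODES:
            comments.append(fields[5])        # OBX-5 carries the comment text
        elif fields[8]:
            flags.append((code, fields[5], fields[8]))  # OBX-8 abnormal flag
    return flags, comments


SAMPLE = [
    "OBX|1|NM|K^Potassium^L||5.9|mmol/L|3.5-5.0|H|||F",
    "OBX|2|FT|8251-1^Comment^LN||Sample haemolysed; potassium may be falsely elevated.||||||F",
]

flags, comments = split_interpretations(SAMPLE)
print(flags)     # [('K', '5.9', 'H')]
print(comments)
```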



Free UCUM Validation Service

We needed a simple UCUM validation service to call from the FHIR schematron rules – i.e. is this UCUM code valid? Is it of the right dimension (time, length, etc.)? So I’ve put one up here:

It’s too much trouble to do anything but make it free for use, though if it starts getting absolutely hammered and blows my Amazon account usage through the roof I’ll have to reconsider.

There’s other UCUM services – see for more details. (at the bottom of the page)

Question: What happened to the Eclipse OHF code?

The Eclipse OHF project was retired after community interest was diverted to The Open Health Tools project. What happened to the source?

Well, of the sub-projects:

Then there’s the H3ET sub-project. This included HL7 v2/v3/CDA code, with the underlying support for UCUM.

Parts of the v3 code got forked. An earlier version of the code, with UK NHS customizations, can be found in the OHT Static Model Designer or the B2i Open HealthWorkbench. The OHF project team continued to maintain the code-base for longer, but that code hasn’t found a new home since OHF was retired. The current code can be found here as .zip. It’s still good code, and covered by the Eclipse IP rules and license. I really need to find a new home for it, and given its pedigree, the logical thing is for it to be hosted as an OHT charter project; however, this is blocked by the previous contribution of that earlier fork of the code, and OHT’s unwillingness to host duplicate projects. I don’t have the time to navigate the minefield associated with that, nor does anyone have the months (or more) required to unfork the code. So I’m not really sure what will happen there. Maybe I should just put it up on SourceForge.

The HL7 v2 code was used in the OHF bridge project, and really should become a part of that project – that’s waiting for me and the admins of the OHT project to get around to doing the paperwork, refactoring the code into a package namespace, and then making the contribution.

Finally, the underlying UCUM code was contributed to the Eclipse UOMO project.


Validating Name Characters

Well, the pcEHR go-live hasn’t gone that well. One particular feature that’s attracted some attention is the fact that the pcEHR won’t accept people with some unusual characters in their surnames.


Medical Observer has found patients with apostrophes or hyphens in their name cannot register for an e-health record, as the government scrambles to get the rest of the patient registration process working.

It sounds like a glaring oversight… only, just what characters do you need to allow in a patient’s surname? I suspect that real experts would be fairly circumspect in commenting on this – it’s harder than it looks.

Firstly, in general, this is a good assessment of requirements: But this is for the entire world. Do you have to support all these in an Australian context? What’s the point of supporting name characters that Medicare don’t support, for instance? (There’s some, but is it worth it?) Since most systems we have were designed pre-Unicode, the answer can never be as simple as “any Unicode character”.

Perhaps there’s an applicable Australian standard? How about AS 5017, “Health Care Client Identification”? Now what it has to say about this is:

Family Name Verification rules:  Alphabetic characters, tilde (~), punctuation (.,-) and spaces only (2006 version)

It shouldn’t come as a surprise to anyone that AS 5017 is the standard which has been followed in the design of the HI service and the pcEHR, and you’ll note that apostrophes aren’t on the list (though hyphens are, so I’m guessing that hyphens are actually ok, but the spokesman was confused on the fine details – pretty likely in my experience). Now it sounds like a glaring oversight not to accept apostrophes, but all you need is one developer to faithfully follow the rules – that is what you want, after all – and the testers to miss this minor point, and bingo – you have a public relations disaster…. (I must say, I have no idea why tildes are on the list – I’ve never seen them in a name?)
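To see how easily this happens, here is what a faithful implementation of the rule as quoted might look like. It’s a sketch only: I’m assuming “alphabetic characters” means unaccented A-Z and a-z, which the rule itself doesn’t say, and the names are invented.

```python
import re

# The quoted AS 5017 (2006) family name rule: alphabetic characters,
# tilde, punctuation (.,-) and spaces only.
# Assumption: "alphabetic" is read as ASCII letters only.
FAMILY_NAME_RE = re.compile(r"[A-Za-z~.,\- ]+")


def family_name_ok(name: str) -> bool:
    """True if every character in the name is on the permitted list."""
    return FAMILY_NAME_RE.fullmatch(name) is not None


for name in ["Smith-Jones", "O'Brien", "de la Cruz", "Nuñez"]:
    print(name, family_name_ok(name))
# Smith-Jones True
# O'Brien False
# de la Cruz True
# Nuñez False
```

The hyphenated name passes and the apostrophe fails, which matches my guess above about where the real problem lies.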

For given name, AS 5017 doesn’t specify any verification rules. There’s some good advice in Meteor for family name and given name, but again, no comprehensive list of characters there either. While writing this blog, I spent some time perusing the Medicare web site for the rules around name character verification – they obviously do have them, since there are error codes relating to invalid characters in names, but I couldn’t find what they are.

At an HL7 meeting, I heard about a US family that named their son φ, and expected systems to cater for this, but I’m pretty sure that you wouldn’t need to cater for this in Australia.

If anyone has more information, it’d be great to have it in the comments. Thanks